Hi Eric,
I have created static versions of several WordPress sites. Here's a link to
one of the sites:
http://futureofthebook.org/occurrence/
As you will see, some functionality is lost, such as the search and
commenting features. But the content is preserved, and I no longer have to
maintain WordPress for this site, whose need for interactivity is long
past.
Here is the wget command I used:
    wget \
        --recursive \
        --no-clobber \
        --page-requisites \
        --html-extension \
        --convert-links \
        --restrict-file-names=windows \
        --include /occurrence \
        --no-parent \
        http://www.futureofthebook.org/occurrence/ \
        --domains www.futureofthebook.org
I'm not certain that I needed all of these switches, but some of them were
necessary.
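
For reference, here is a rough sketch of the same command built up one switch
at a time, so that each switch can carry a note on what it does. The helper
name build_mirror_cmd and its three arguments are made up for illustration,
and the function only prints the command rather than running it. It uses the
spellings --adjust-extension and --include-directories, which, as far as I can
tell from the wget manual, are the current full names of --html-extension and
--include.

```shell
#!/bin/sh
# Sketch only: build_mirror_cmd is a hypothetical helper, not part of wget.
# It prints the mirror command for a site that lives in one directory of a host.
build_mirror_cmd() {
  url=$1    # start page, e.g. http://www.futureofthebook.org/occurrence/
  host=$2   # host to stay on, e.g. www.futureofthebook.org
  dir=$3    # directory to stay in, e.g. /occurrence

  set -- wget
  set -- "$@" --recursive                    # follow links and download the pages they point to
  set -- "$@" --no-clobber                   # do not re-download files that already exist locally
  set -- "$@" --page-requisites              # also fetch the images and stylesheets each page needs
  set -- "$@" --adjust-extension             # save HTML pages with an .html suffix
  set -- "$@" --convert-links                # rewrite links so the pages work from the local copy
  set -- "$@" --restrict-file-names=windows  # escape characters that Windows file names cannot hold
  set -- "$@" --include-directories "$dir"   # only follow links into this directory
  set -- "$@" --no-parent                    # never climb above the starting directory
  set -- "$@" --domains "$host"              # only follow links on this host
  set -- "$@" "$url"

  # Print instead of run; swap this line for "$@" to execute the command.
  printf '%s\n' "$*"
}

build_mirror_cmd http://www.futureofthebook.org/occurrence/ \
  www.futureofthebook.org /occurrence
```

Printing first makes it easy to drop one switch at a time and see which ones
a particular site actually needs.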
After I did the wget, I put the set of files into a new location and then
tested, tested, tested. Some links didn't work properly, and so I had to do
some manual work to get a fully functioning site. Nothing is perfect.
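
Part of that testing can be automated. The following is a minimal sketch, not
the procedure described above: a hypothetical check_local_links function that
walks a mirrored directory and reports relative href/src targets that do not
exist on disk. It assumes double-quoted attributes, skips absolute URLs and
root-relative paths, and does not decode percent-encoded names, so it can
report false positives and still needs a manual pass afterwards.

```shell
#!/bin/sh
# Sketch only: check_local_links is a hypothetical helper name.
# Usage: check_local_links /path/to/static/copy
check_local_links() {
  root=$1
  find "$root" -type f -name '*.html' | while IFS= read -r page; do
    dir=$(dirname "$page")
    # Pull out href="..." and src="..." values, stopping at any # or ? part.
    grep -o -E '(href|src)="[^"#?]+' "$page" |
      sed 's/^[a-z]*="//' |
      while IFS= read -r target; do
        case $target in
          # Skip anything with a scheme (http:, mailto:, ...), protocol-relative
          # URLs, and root-relative paths; only page-relative links are checked.
          *:*|//*|/*) continue ;;
        esac
        # Report the page and the link when the target is missing on disk.
        [ -e "$dir/$target" ] || printf '%s -> %s\n' "$page" "$target"
      done
  done
}
```

Each output line names the page and the link that could not be resolved, which
gives a starting list for the manual fixes.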
Once I had everything working the way I wanted, I pointed my Web server to
the new location of the site, backed up my WordPress database and files,
and saved everything as a tar file, just in case.
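
The backup step could look something like the sketch below. Everything
specific in it is an assumption: the function name backup_wordpress, the
argument order, the file names, and the use of mysqldump with credentials
coming from an option file such as ~/.my.cnf. It dumps the database, then puts
the WordPress directory and the dump into a single uncompressed tar file. Pass
absolute paths, and keep the output directory outside the WordPress directory.

```shell
#!/bin/sh
# Sketch only: backup_wordpress is a hypothetical helper name.
# Usage: backup_wordpress /abs/path/to/wordpress DB_NAME /abs/path/to/backups
backup_wordpress() {
  site_dir=$1   # WordPress install directory
  db_name=$2    # name of the WordPress database
  out_dir=$3    # where the dump and the tar file are written
  stamp=$(date +%Y%m%d)
  sql_file="$db_name-$stamp.sql"
  archive="$out_dir/wordpress-backup-$stamp.tar"

  mkdir -p "$out_dir" || return 1

  # Dump the database to a plain SQL file.
  mysqldump --single-transaction "$db_name" > "$out_dir/$sql_file" || return 1

  # Create the archive from the WordPress files, then append the SQL dump.
  tar -cf "$archive" -C "$(dirname "$site_dir")" "$(basename "$site_dir")" || return 1
  tar -rf "$archive" -C "$out_dir" "$sql_file" || return 1

  # Print the archive path so the caller can move or verify it.
  printf '%s\n' "$archive"
}
```

Listing the result with tar -tf before retiring the live site is a cheap way
to confirm that both the files and the dump made it into the archive.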
Good luck!
Best wishes,
Carol
On Mon, Oct 6, 2014 at 2:44 AM, Eric Phetteplace <[log in to unmask]> wrote:
> Hey C4L,
>
> If I wanted to archive a Wordpress site, how would I do so?
>
> More elaborate: our library recently got a "donation" of a remote Wordpress
> site, sitting one directory below the root of a domain. I can tell from a
> cursory look it's a Wordpress site. We've never archived a website before
> and I don't need to do anything fancy, just download a workable copy as it
> presently exists. I've heard this can be as simple as:
>
> wget -m $PATH_TO_SITE_ROOT
>
> but that's not working as planned. Wget's convert links feature doesn't
> seem to be quite so simple; if I download the site, disable my network
> connection, then host locally, some 20 resources aren't available. Mostly
> images which are under the same directory. Possibly loaded via AJAX.
> Advice?
>
> (Anticipated) pertinent advice: I shouldn't be doing this at all, we should
> outsource to Archive-It or similar, who actually know what they're doing.
> Yes/no?
>
> Best,
> Eric
>
--
Carol Kassel
NYU Digital Library Technology Services
(212) 992-9246
dlib.nyu.edu