Hi Alex,

This is really helpful to us. At my library we have only had a few sites to archive so far, but we are looking into this in case demand goes up in the future. The problem we are encountering is the expectation that the archived copy of a website should be easily viewable by patrons. Taking screenshots of every page was suggested, but we don't think that is a reasonable option. Archive-It seems to be a good fit for that kind of expectation, but it is not free. So we will have to see.

I might need to take you up on your offer of help with the local copy/WARC generation process if we decide to go that route.

Thank you!
Bohyun

-----Original Message-----
From: Code for Libraries [mailto:[log in to unmask]] On Behalf Of Alexander Duryee
Sent: Thursday, March 20, 2014 11:28 AM
To: [log in to unmask]
Subject: Re: [CODE4LIB] Archiving a website - best practices

Is this for a one-shot project, or will it be ongoing?  For a medium- to long-term initiative, I would suggest a subscription service like Archive-It; for a one-time effort, it would make more sense to use open-source tools like wget to generate a local copy + WARC.  If it's the latter, I'll be happy to take a look at the page and walk you through the process.

I'm not really aware of a set of best practices, beyond the usual tenets of digital preservation (show your work, maintain authenticity, do minimal harm, document, document, document, etc).  The model I've used in the past is generating a WARC alongside the access copy (using wget's WARC output), using that as the preservation master+technical metadata, and hosting the access copy on a front-facing machine.
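
As a rough illustration of that model, here is a minimal Python sketch that wraps wget to produce a browsable local mirror (the access copy) and a WARC (the preservation master) in one pass. It assumes wget 1.14 or later is installed, since that is where WARC output was added. The function names, the "access_copy" directory, and the example URL and WARC name are placeholders made up for this sketch, not part of any existing workflow, and the flags are only one reasonable combination.

```python
import subprocess
from pathlib import Path


def build_wget_command(url: str, warc_name: str, dest_dir: str = "access_copy") -> list[str]:
    """Build a wget command that mirrors `url` and records a WARC at the same time.

    `warc_name` and `dest_dir` are illustrative placeholders.
    """
    return [
        "wget",
        "--mirror",                       # recursive download with timestamping
        "--page-requisites",              # also fetch the CSS, images, and scripts pages need
        "--convert-links",                # rewrite links so the local copy is browsable
        "--adjust-extension",             # save HTML pages with an .html extension
        "--no-parent",                    # stay at or below the starting directory
        f"--directory-prefix={dest_dir}", # where the access copy is written
        f"--warc-file={warc_name}",       # writes <warc_name>.warc.gz in the working directory
        "--warc-cdx",                     # also write a CDX index for the WARC
        url,
    ]


def archive_site(url: str, warc_name: str, dest_dir: str = "access_copy") -> int:
    """Run wget and return its exit code.

    wget exits non-zero when any request gets a server error (for example a
    broken link on the old site), so the code is returned rather than raised.
    """
    Path(dest_dir).mkdir(parents=True, exist_ok=True)
    result = subprocess.run(build_wget_command(url, warc_name, dest_dir), check=False)
    return result.returncode
```

Calling archive_site("http://example.org/", "example-site") would leave the mirrored files under access_copy/ and the WARC plus its CDX index in the working directory. The WARC records the responses as they were received, while --convert-links only affects the local mirror, which is why the two copies can serve different purposes.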

--Alex


On Thu, Mar 20, 2014 at 11:05 AM, Kari R Smith <[log in to unmask]> wrote:

> Also, contact the SAA (Society of American Archivists)  Web Archiving 
> round table.  Lots of experience and help from that list of folks.
> I'm forwarding your question to that list.
>
> Kari
>
> -----Original Message-----
> From: Kim, Bohyun <[log in to unmask]>
> Sent: Thursday, March 20, 2014 8:26 AM
> To: [log in to unmask]
> Subject: [CODE4LIB] Archiving a website - best practices
>
> I am not up to date with archiving practices. So I may be asking about 
> a well-known problem.
>
> But is anyone archiving an old website, and if so, what method do you use?
> We are discussing taking screenshots and/or creating a zip file of the 
> whole site and uploading to a repository at MPOW. Both seem to have 
> some shortcomings.
>
> Thank you!
> Bohyun
>