As an archivist, I don't see any problem with using a PDF. Technically it
should be a PDF/A, but realistically it is usually a plain PDF.
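If you want a quick check of whether a given file at least claims to be
PDF/A, here is a minimal sketch in Python. It assumes the file carries an
uncompressed XMP metadata packet with a pdfaid:part entry, which is how PDF/A
files declare themselves. The function names are my own, and this is only a
heuristic: it reads the declaration and does not validate conformance.

```python
import re

# Heuristic only: look for the pdfaid:part entry in the XMP packet, written
# either as an element (<pdfaid:part>1</pdfaid:part>) or as an attribute
# (pdfaid:part="1"). This does not validate the file against the standard.
_PDFA_PART = re.compile(rb'pdfaid:part(?:>|=["\'])\s*(\d)')


def pdfa_part(data: bytes):
    """Return the declared PDF/A part as an int, or None if none is found."""
    match = _PDFA_PART.search(data)
    return int(match.group(1)) if match else None


def pdfa_part_of_file(path):
    """Read a file from disk and report its declared PDF/A part, if any."""
    with open(path, "rb") as handle:
        return pdfa_part(handle.read())
```

A result of None means the file is most likely an ordinary PDF. Anything else
still needs a real validator before you rely on it.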
I have done projects where I used PDFs for the archiving of full websites.
It can be quite handy, depending on needs of course. Sometimes it works
with the look and feel/design, and sometimes it doesn't. Content is pretty
good usually, in my experience.
Do a test and see whether your site crashes your Adobe product. Sometimes
the code, special effects, or just the size can crash it without a PDF being
made. Also look at the levels you want captured, that can also cause a
Electronic Records Archivist
Harry Ransom Center
The University of Texas at Austin
P.O. Box 7219
Austin, Texas 78713-7219
On Tue, Jan 14, 2014 at 12:48 PM, Kathryn Frederick (Library) wrote:
> Thanks for the thoughtful responses. We've been actively digitizing our
> print paper (which ceased publication in 2011) and I was thinking of this
> as an extension of that effort. Right now, I think capturing a monthly WARC
> file of the site is definitely a good idea no matter what. But beyond that,
> as Kyle pointed out, it's not really the web site I'm after but the
> content. I'd like to present this content alongside print issues in our IR
> (currently CONTENTdm). In one sense, I can see doing a weekly capture of
> the site which would equate to an issue in the old format. But, I could
> also do a PDF of the content. A PDF makes sense to me in the context of a
> collection that is largely print-based and gets at what I want (keyword
> searchable content, authors, dates), but is it disingenuous to
> fundamentally alter the format? Plus there's the work involved... This may
> be a question for archivists, but I'm not one so would appreciate any
> additional thoughts from this group.
> On Tue, Jan 14, 2014 at 10:48 AM, Kathryn Frederick (Library) wrote:
> > Hi,
> > I'm trying to develop a strategy for preserving issues of our school's
> > newspaper. Creating a WARC file of the content seems straightforward, but
> > how will that content fare long-term? Also, how is the WARC served to an
> > end-user? Is there some other method I should look at?
> > Thanks in advance for any advice!
> > Kathryn