There's always the option of capturing a WARC of the newspaper as the
preservation master for dark storage, and generating PDFs for access via
your CMS. If you're already in CONTENTdm, a PDF would be much easier
to use (on both the back end and the front end).
The provenance metadata of WARC is too important not to capture, but I
agree that it can be awkward to use for access. A hybrid approach of
generating WARCs and PDFs may be best - the PDF will handle most of your
use cases, and any further questions/issues (e.g. rendering questions,
research into interactive advertisements, etc.) can defer to the WARC.
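
To make the provenance point concrete, here is a minimal sketch (Python, standard library only) of the headers a single WARC/1.0 "resource" record carries: when the capture happened, which URI it came from, a unique record ID, and a digest of the bytes. The function name, the example.edu URL, and the sample payload are all hypothetical, made up for illustration. A real capture would come from a crawler or a tool such as wget with --warc-file, and would normally also include a warcinfo record and per-record gzip compression, which this sketch leaves out.

```python
import base64
import hashlib
import uuid
from datetime import datetime, timezone


def build_warc_resource_record(target_uri, payload, content_type, captured_at=None):
    """Return one uncompressed WARC/1.0 'resource' record as bytes.

    Hypothetical helper for illustration only; not a replacement for a
    real crawler or WARC library.
    """
    if captured_at is None:
        captured_at = datetime.now(timezone.utc)
    # WARC-Date is a UTC timestamp, e.g. 2014-01-15T16:52:00Z
    warc_date = captured_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    # Digest of the record block, written as "algorithm:value" (base32 SHA-1 here)
    digest = base64.b32encode(hashlib.sha1(payload).digest()).decode("ascii")
    header_lines = [
        "WARC/1.0",
        "WARC-Type: resource",
        f"WARC-Record-ID: <urn:uuid:{uuid.uuid4()}>",
        f"WARC-Date: {warc_date}",
        f"WARC-Target-URI: {target_uri}",
        f"WARC-Block-Digest: sha1:{digest}",
        f"Content-Type: {content_type}",
        f"Content-Length: {len(payload)}",
    ]
    # Header lines end with CRLF, a blank line separates header from block,
    # and two CRLFs close the record.
    head = "\r\n".join(header_lines).encode("utf-8") + b"\r\n\r\n"
    return head + payload + b"\r\n\r\n"


if __name__ == "__main__":
    # Assumed sample article; the URL and HTML are placeholders.
    record = build_warc_resource_record(
        "http://example.edu/news/2014/01/story.html",
        b"<html><body>Sample article</body></html>",
        "text/html",
        datetime(2014, 1, 15, 16, 52, tzinfo=timezone.utc),
    )
    print(record.decode("utf-8"))
```

The PDF in the CMS has none of these fields unless you add them by hand, which is the main reason to keep the WARC alongside it.
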
I've used this approach elsewhere, and it was a relief to know that we
could always go back to a WARC file to resolve issues of
On Wed, Jan 15, 2014 at 11:52 AM, Andrew Darby <[log in to unmask]> wrote:
> If it's doable, I think preserving the whole enchilada is desirable. For
> instance, at my last library, there was a regular assignment where students
> needed the print version of old periodicals because they were tasked with
> analysing the ads and layouts. Someone might be interested in web layouts
> from the 2000s, and there might be content (again, ads, but also masthead
> logos, ???) that might not otherwise be captured.
> On Wed, Jan 15, 2014 at 10:29 AM, Wilhelmina Randtke <[log in to unmask]> wrote:
> > Agreed, don't focus too much on preserving the presentation for an online
> > newspaper. The text and images are important, but the layout isn't so
> > important.
> > -Wilhelmina Randtke
> > On Tue, Jan 14, 2014 at 10:59 AM, Kyle Banerjee <[log in to unmask]> wrote:
> > > IMO, there are many web archiving situations where it is more
> > > to just focus on the content rather than the manifestation of the
> > > content.
> > > Just as you wouldn't expect a 1995 article from the NYT to be displayed
> > > as the website was in 1995 or an article in an online database to
> > > actually appear like it originally appeared online, it's the content
> > > rather than the skin that's relevant in the case of a newspaper. If you
> > > make sure it's in a format that can be migrated forward and added to
> > > standalone or union systems that provide access to this sort of stuff,
> > > you'll be fine.
> > >
> > > kyle
> > >
> > >
> > > On Tue, Jan 14, 2014 at 8:48 AM, Kathryn Frederick (Library) <
> > > [log in to unmask]> wrote:
> > >
> > > > Hi,
> > > > I'm trying to develop a strategy for preserving issues of our school's
> > > > online newspaper. Creating a WARC file of the content seems
> > > > straightforward, but how will that content fare long-term? Also, how is
> > > > the WARC served to an end-user? Is there some other method I should look
> > > > at?
> > > > Thanks in advance for any advice!
> > > > Kathryn
> > > >
> > >
> Andrew Darby
> Head, Web & Emerging Technologies
> University of Miami Libraries