Hi Nathan,
It looks like your architectural thinking includes a management layer (DSpace, Islandora, other...) and a storage layer.
For a storage layer that provides bit-level preservation, more copies are a good idea, up to a point. For much disk storage, people go with 2 copies: a copy on disk and a backup copy on tape. For our digital repository, we have decided that 3 copies is a better guarantee, if we can afford it (see the analysis done by the HathiTrust for some insight on this: http://www.hathitrust.org/documents/hathitrust-3rd-instance-recommendations.pdf). We have also added requirements that these copies must be in 3 different physical locations, and one copy must be "offline" (e.g., a tape in storage). For disaster recovery (DR) purposes, it's good to think about the physical locations being as widely separated as possible.
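To make the copy requirements above concrete, here is a minimal sketch in Python of how such a policy could be checked against an inventory of copies. The Copy class, the meets_policy function, and the site names are hypothetical and only illustrate the rule (at least 3 copies, in at least 3 physical locations, with at least one offline); they do not describe any particular repository's software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One stored copy of a preservation file (hypothetical model)."""
    location: str   # physical site holding the copy
    medium: str     # e.g. "disk" or "tape"
    online: bool    # False for a copy kept offline, such as a tape in storage


def meets_policy(copies, min_copies=3, min_locations=3):
    """Check the rule: enough copies, enough distinct sites, one offline copy."""
    distinct_locations = {c.location for c in copies}
    has_offline_copy = any(not c.online for c in copies)
    return (
        len(copies) >= min_copies
        and len(distinct_locations) >= min_locations
        and has_offline_copy
    )


# Example inventory with made-up site names.
inventory = [
    Copy(location="site-a", medium="disk", online=True),
    Copy(location="site-b", medium="tape", online=True),
    Copy(location="site-c", medium="tape", online=False),
]
print(meets_policy(inventory))
```

Counting distinct locations rather than copies is what separates this rule from a plain "keep 3 copies" rule: three copies in one machine room would fail the check.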
Our digital repository storage layer currently stores 3 copies of preservation files (one online disk copy, one online tape copy, and one offline tape copy), although we are considering moving to a model with 2 disk copies and 1 offline tape copy. Disk copies simplify fixity checking as a background task.
In addition, since our repository is both a preservation and access repository, we maintain a second disk copy of "high use" assets so as to provide high availability to online web applications.
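As a rough illustration of fixity checking on a disk copy, here is a minimal Python sketch that recomputes SHA-256 checksums and compares them with stored values. The manifest layout (a mapping of relative path to expected checksum), the function names, and the choice of SHA-256 are assumptions made for this example, not a description of our actual system.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large preservation files are not read into memory at once."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity(root, manifest):
    """Return {relative_path: problem} for every manifest entry that fails the check.

    manifest maps a relative path to the checksum recorded at ingest (assumed format).
    """
    problems = {}
    for rel_path, expected in manifest.items():
        target = Path(root) / rel_path
        if not target.is_file():
            problems[rel_path] = "missing"
        elif sha256_of(target) != expected:
            problems[rel_path] = "checksum mismatch"
    return problems
```

A scheduled job could call check_fixity over each disk copy and report any non-empty result. The same pass over a tape copy would require mounting and reading the tape, which is why disk copies make this easier to run in the background.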
Hope this helps.
- Randy Stern
Manager of Systems Development
Harvard Library, Office for Information Systems
Date: Tue, 24 Jan 2012 13:04:29 -0500
From: Nathan Tallman <[log in to unmask]>
Subject: Re: Preservation Server
Hi Adam,
I'll respond in-line below. Thanks!
Nathan
On Tue, Jan 24, 2012 at 12:38 PM, Adam Wead <[log in to unmask]> wrote:
> Hi Nathan,
>
> Can you tell us:
> - what kind of content you'll be ingesting (images, text, a/v)
>
Content will include all types of electronic files: images, text, video, audio, data sets, and more. We will normalize and convert to open formats as much as possible during the accessioning process.
> - how much of it do you expect you'll have (1TB, 100 TB, more?)
>
This is harder to answer, but ideally it will be scalable. We'll probably start off in the range of 5-10 TB, but as we migrate our analog media to digital formats, we are going to have some very large files in the future. Plus, who knows what will come in via new accessions.
> - what kind of access will you need to provide (world-wide or just local?)
>
Access is local only, but I will need to be able to run a WAMP-type configuration.
> - do you want off-site backups in one or more locations
>
Off-site backups will be handled by a vendor.
> - what systems, if any, do you currently have in place
>
This will be a brand new system. Currently we have some preservation files stored on a shared network location, but it's not working out for a myriad of reasons. We really want to have our own system that we control.
> - what software are you considering for the repository
>
Software hasn't been decided yet. I might start out just using the server with a simple folder hierarchy and store files that way. Should we choose to use repository software, options include Fedora (via Islandora) and DSpace.
> Hardware options are going to vary a lot depending on what your
> requirements are. There are lots and lots of options but you can find
> something that will fit your needs.
>
> ...adam
>
> Adam Wead | Systems and Digital Collections Librarian ROCK AND ROLL
> HALL OF FAME + MUSEUM Library and Archives
> 2809 Woodland Avenue | Cleveland, Ohio 44115-3216 216-515-1960 | FAX
> 216-515-1964
> Email: [log in to unmask]
> Follow us: rockhall.com | Membership | e-news | e-store | Facebook |
> Twitter
>
> On Jan 24, 2012, at 12:21 PM, Nathan Tallman wrote:
>
> > My institution is going to be purchasing a preservation server
> > sometime within the next year. I'd like to solicit advice on specs.
> > I know this is highly dependent on our collection, but I'm looking
> > for some baseline hardware recommendations. We'll be using it to
> > store preservation-copies of electronic files that belong to
> > archival collections. Most of our
> > electronic files are not born-digital, but we are preparing for an
> > influx of born-digital records.
> >
> > Any advice is appreciated! Apologies for cross-posting.
> >
> > Thanks,
> > Nathan Tallman
> > American Jewish Archives
>
> This communication is a confidential and proprietary business
> communication. It is intended solely for the use of the designated
> recipient(s). If this communication is received in error, please
> contact the sender and delete this communication.