I am getting ready to go to the LOC storage meeting and was reading through the discussions of a few weeks ago. If I compare the Utah State Archives storage cost of $5,000 per year per TB with Amazon S3 at $370/year/TB, the gap is so big that I have a hard time believing that central IT organizations will be sustainable in the long run. Not that Amazon is the answer to everything, but they have certainly put a stake in the ground regarding what spinning disk costs, fully loaded (meaning this includes utilities, building, and personnel). Amazon S3 also provides 3 copies, 2 onsite and one in another data center.

I am not advocating by any means that S3 is the answer to it all, but it is quite telling to compare the fully loaded TB cost from an internal IT shop vs. the fully loaded TB cost from Amazon.

I appreciate you sharing the numbers, Elizabeth, and it is great that your IT group has calculated what I am guessing is the true cost of managing data locally.


Michele Kimpton
CEO DuraSpace
On Aug 28, 2014, at 6:51 PM, Elizabeth Perkes <[log in to unmask]> wrote:

I agree with that. In our shop, we actually have two separate copies of the AIP. One is on m-disc and the other is on spinning disk (a relatively inexpensive NAS device connected to our server, for which we pay our IT department each month). Because our database can connect to the spinning disk version easily, we can run checksum audits on a regular basis. A while ago we had a situation where records had been ingested 18 months previously, and we ran a checksum audit as our IT department was swapping out NAS devices to make sure nothing got lost in the process. It turned out that several of the records had corrupted checksums and the disk-based backup that IT provided was also corrupt. IT never found out when or why the problem happened because their tools said everything was running fine. Fortunately, our software had recorded the checksum as each record was ingested into our preservation system, and we were able to pull the m-disc and verify that that copy did have the correct checksum, so we uploaded the m-disc version to our server to replace the corrupted network-attached storage version and all was well. I suspect that we won't run very frequent checksum audits on the m-discs, except maybe sample audits to see how well the media is holding up. It will all have to be replaced soon enough anyway as technology marches onward. Since we store the records in BagIt bags on m-disc, we can validate them without too much difficulty.
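A checksum audit of the kind described above can be sketched in a few lines of Python. This is only an illustration, not the software Utah State Archives runs: the hash algorithm (SHA-256), the function names, and the idea of holding the ingest-time checksums in a plain dictionary are all assumptions made for the example.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit_copy(recorded, root):
    """Compare one storage copy against checksums recorded at ingest.

    recorded: dict mapping a relative file path to the hex digest that
              was stored when the record was ingested (hypothetical layout).
    root:     directory holding the copy being audited (NAS, m-disc mount, ...).
    Returns a list of (relative path, problem) tuples; empty means clean.
    """
    problems = []
    for rel_path, expected in sorted(recorded.items()):
        target = Path(root) / rel_path
        if not target.is_file():
            problems.append((rel_path, "missing"))
        elif sha256_of(target) != expected:
            problems.append((rel_path, "mismatch"))
    return problems
```

Running audit_copy against the NAS copy and then against the mounted m-disc with the same recorded checksums would show which copy still matches what was ingested, which is the comparison that identified the good copy in the incident above.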

I think the bigger problem with relying solely on spinning disk is the economic one. We have centralized IT, where there is one big data center and servers are virtualized. Our IT charges us a monthly rate for not just storage, but also all of their overhead to exist as a department. It's a popular model in the IT world these days because it can save money in other areas, and we are required by statute to cooperate with IT in this model, so we can't just go out and buy/install whatever we want. For an archives, that's a problem, because our biggest need is storage but we are funded based upon the number of people we employ, not the quantity of data we need to store, and convincing the Legislature that we need $250,000/year for just one copy of 50 TB of data is a hard sell, never mind additional copies for SIP, AIP, and/or DIP. Yes, we are exploring cloud options to see if we can bring that price down. Still, any time your preservation strategy relies on ongoing funding, any kind of economic disruption will put the records in jeopardy. If we received a large budget cut like we did in 2008, I can well imagine someone saying, "unplug the server, we can't afford it anymore." Then where do you put the records? There just aren't that many media options to choose from, unfortunately, and each has its own costs, risks, and levels of expertise to implement. Depending on the length of the economic disruption, plus how much time after it is "over" before your budget bounces back, if ever, any preservation policy should incorporate some type of offline media to cover those funding breaks.
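For reference, the arithmetic behind the figures quoted in this thread works out as below. The rates are the ones stated in the messages ($5,000 per TB per year for the central IT charge, $370 per TB per year for Amazon S3, 50 TB of data); the function and variable names are just for illustration, and the comparison ignores anything not mentioned in the thread, such as retrieval or transfer fees.

```python
def annual_cost(terabytes, rate_per_tb_year, copies=1):
    """Yearly storage cost in dollars for a given volume and per-TB rate."""
    return terabytes * rate_per_tb_year * copies


# Figures quoted in the thread (2014 prices).
LOCAL_RATE = 5000   # central IT charge, dollars per TB per year, one copy
S3_RATE = 370       # Amazon S3, dollars per TB per year
DATA_TB = 50        # volume cited for Utah State Archives

local = annual_cost(DATA_TB, LOCAL_RATE)
s3 = annual_cost(DATA_TB, S3_RATE)
ratio = LOCAL_RATE / S3_RATE

print(f"Central IT, one copy: ${local:,}/year")   # $250,000/year
print(f"Amazon S3:            ${s3:,}/year")      # $18,500/year
print(f"Ratio: about {ratio:.1f} to 1")           # about 13.5 to 1
```

Note that the two rates are not quite like for like: the central IT figure is described as covering one copy, while the S3 figure is described as already including three.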

Elizabeth Perkes
Electronic Records Archivist
Utah State Archives
346 South Rio Grande
Salt Lake City, UT 84101
801-531-3852
[log in to unmask]




On Thu, Aug 28, 2014 at 2:36 PM, Owens, Trevor <[log in to unmask]> wrote:

It’s worth noting that if an organization tries to make substantive use of offline media as part of a preservation strategy it’s going to be challenging to make upward movement on something like the NDSA Levels of Preservation.

 

That is, if you have any significant number of pieces of external media, auditing fixity (or anything else for that matter) is going to be a significant challenge.  Along with that, one of the first steps in the levels is “For data on heterogeneous media (optical discs, hard drives, etc.) get the content off the medium and into your storage system.” To that end, the NDSA levels work suggests that “storage system should generally be understood as either a nearline or online system using either all spinning disk or some combination of spinning disk and magnetic tape.” Naturally, there is a time and place for any given media, but in terms of making headway into having control over your data it’s tough to do most of the things we want to do with our data if it isn’t online.

 

For anyone interested, those quotes come from The NDSA Levels of Digital Preservation : An Explanation and Uses http://www.digitalpreservation.gov/ndsa/working_groups/documents/NDSA_Levels_Archiving_2013.pdf

 

Side note: It is neat to see some back and forth on this list :)

 

 

From: The NDSA organization list [mailto:[log in to unmask]] On Behalf Of Allison Munsell
Sent: Thursday, August 28, 2014 3:23 PM


To: [log in to unmask]
Subject: Re: [NDSA-ALL] Story on CBS News

 

Yes, I’m thinking low-cost derivatives for storage, understanding that technology is meant to be short term.

 

I very much appreciate the knowledge.

 

Allison

 

Allison Munsell

Digitization Specialist, Rights &  Reproduction

Albany Institute of History & Art
125 Washington Avenue
Albany, NY  12210
T:  (518) 463-4478 ext. 424
F:  (518) 463-5506
[log in to unmask]
www.albanyinstitute.org

 

 

From: The NDSA organization list [mailto:[log in to unmask]] On Behalf Of Peter Krogh
Sent: Thursday, August 28, 2014 3:03 PM
To: [log in to unmask]
Subject: Re: [NDSA-ALL] Story on CBS News

 

Kara,

I think that most of us would recommend magnetic storage over optical for primary storage. Optical can be a good part of a backup plan, especially good quality optical like the m-disc or MAM-A. (I am pleased to see that the m-disc now comes in Blu-ray, which makes it more workable.) This has long been a pillar of 3-2-1 Backup (3 copies, 2 media types, 1 stored offsite.) It may also be useful for archived data, as long as one is mindful of the "tech debt" one is incurring in future migrations.
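The 3-2-1 rule mentioned above (3 copies, 2 media types, 1 stored offsite) is simple enough to express as a check. The sketch below is illustrative only; the StoredCopy class, its fields, and the example inventory are hypothetical and not taken from any particular tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredCopy:
    label: str      # free-text name for the copy
    media: str      # e.g. "spinning disk", "optical", "cloud"
    offsite: bool   # True if held away from the primary location


def satisfies_3_2_1(copies):
    """Return True if the copies meet 3 copies, 2 media types, 1 offsite."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.media for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite


# Hypothetical inventory: a working copy, a local optical copy,
# and a second optical copy kept at another site.
inventory = [
    StoredCopy("working copy on NAS", "spinning disk", offsite=False),
    StoredCopy("m-disc set A", "optical", offsite=False),
    StoredCopy("m-disc set B", "optical", offsite=True),
]
print(satisfies_3_2_1(inventory))  # True
```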

 

This landscape is changing, with good low-cost cloud and very high-capacity spinning disk, but "archival" optical has not yet become obsolete.

 

Magnetic media requires more frequent migration and verification, and typically has higher operating costs.

 

I'd also point out that optical is a part of some very wealthy and sophisticated operations. 

 

In the last year Facebook has announced that it has built large cold storage on optical disc arrays.

 

And some people speculate that Amazon Glacier is built on optical storage. 

 

Not cut and dried, I think.

 

In the end, storage is a process, not a place you put stuff. Optical can be part of that process.

 

Peter

 

 

 

This is just my $0.02, but I assume that the NDSA is fairly unified on this topic. Perhaps I’m wrong. In any case, I would not recommend gold DVDs or any optical discs, for that matter, for the long or short term.

 

Thanks,

Kara

 

 

Kara Van Malssen
AVPreserve
350 7th Ave., Suite 1605
New York, NY 10001
 
office: 917-475-9630 x 2

 

 

 

 

 

 

On Aug 28, 2014, at 12:59 PM, Allison Munsell <[log in to unmask]> wrote:

 

Hi All,

 

I’m assuming Archival Gold DVDs are still the choice for longevity?

 

Allison Munsell

Digitization Specialist, Rights &  Reproduction

Albany Institute of History & Art 
125 Washington Avenue 
Albany, NY  12210 
T:  (518) 463-4478 ext. 424 
F:  (518) 463-5506 
[log in to unmask]
 
www.albanyinstitute.org

 

 

From: The NDSA organization list [mailto:[log in to unmask]] On Behalf Of Margaret Hedstrom
Sent: Thursday, August 28, 2014 12:12 PM
To: [log in to unmask]
Subject: Re: [NDSA-ALL] Story on CBS News

 

Hi all,

 

Heard a similar story on NPR last week.

 

Great to see this in the popular media!

 

Except that it perpetuates the myth that not using labels or writing on CDs is the way to preserve digital information.  Were it so simple.

 

Margaret

 

Margaret Hedstrom

Principal Investigator, Sustainable Environment - Actionable Data (SEAD)

Professor 

School of Information, University of Michigan

 



 

On Aug 28, 2014, at 8:43 AM, Kimberly A. Schroeder <[log in to unmask]> wrote:



Good morning all!

CBS This Morning is currently running a story on preserving CDs.  They were at the Library of Congress lab and the story was titled "Destroy to Preserve".

It is not on their website yet, but keep your eyes open!  They gave some helpful hints about not using labels and not writing on CDs.  They also showed how conservators are testing longevity via aging tests.

Great to see this in the popular media!

Best,


Kim Schroeder
Coordinator, Archival Program
Lecturer and Career Advisor
Wayne State University
School of Library and Information Science
Faculty Advisor for National Digital Stewardship Alliance
http://wsustudentndsa.wordpress.com/
[log in to unmask]
313 577-9783
Career Advising Page
http://students.slis.wayne.edu/students/planning.php

 

 

Peter Krogh

Author, The DAM Book

Now available in PDF at www.theDAMbook.com

Multi-Catalog workflow with Lightroom 5 - Available now

Organizing Your Photos with Lightroom 5 - Available now

 

 

 

 

 

 



Michele Kimpton
Chief Executive Officer
DuraSpace organization
[log in to unmask]