Kathryn,
Bagger provides a way to validate stored Bags. You might need to write a script to run that as a batch. Also check out the AVPreserve tool Fixity, which is a fixity management and monitoring tool. Deciding on an appropriate schedule will be important if you're using the Amazon cloud to store one of your preservation copies (another copy should be kept outside the Amazon cloud), because of the cost of connecting to the data stored there and the transmission costs. Generally, storage in cloud services is not expensive; connecting to and using the digital objects is how they make their money.
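To give a rough idea of what such a batch script could look like, here is a minimal sketch in Python using only the standard library. It is not Bagger's own validation and not how Fixity works. It assumes each bag is a subdirectory of one parent folder and carries a payload manifest named manifest-<algorithm>.txt with "checksum path" lines, as in the BagIt layout. It only re-computes payload checksums; it does not check bagit.txt, tag manifests, or files that are present but unlisted. The function names (file_digest, validate_bag, validate_all) are made up for this example.

```python
import hashlib
from pathlib import Path


def file_digest(path, algorithm, chunk_size=1 << 20):
    """Hash a file in chunks so large TIFFs are not read into memory at once."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def validate_bag(bag_dir):
    """Return a list of problems found in one bag (an empty list means OK)."""
    bag_dir = Path(bag_dir)
    manifests = sorted(bag_dir.glob("manifest-*.txt"))
    if not manifests:
        return ["no payload manifest found"]
    problems = []
    for manifest in manifests:
        # Assumption: the algorithm is encoded in the file name,
        # e.g. manifest-md5.txt or manifest-sha256.txt.
        algorithm = manifest.stem.split("-", 1)[1]
        for line in manifest.read_text(encoding="utf-8").splitlines():
            if not line.strip():
                continue
            expected, rel_path = line.split(None, 1)
            rel_path = rel_path.strip()
            target = bag_dir / rel_path
            if not target.is_file():
                problems.append(f"missing: {rel_path}")
            elif file_digest(target, algorithm) != expected.lower():
                problems.append(f"checksum mismatch ({algorithm}): {rel_path}")
    return problems


def validate_all(root):
    """Validate every bag directory directly under root; map bag name to problems."""
    root = Path(root)
    return {p.name: validate_bag(p) for p in sorted(root.iterdir()) if p.is_dir()}
```

Calling validate_all on the folder that holds your bags returns a dictionary you can log or email after each audit run. Keep in mind that a run like this reads every byte of every file, so against the Amazon copy each audit incurs the connection and transmission costs mentioned above, which is why the schedule matters.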
Best,
Kari Smith
-----Original Message-----
From: Code for Libraries [mailto:[log in to unmask]] On Behalf Of Kathryn Frederick (Library)
Sent: Thursday, January 16, 2014 11:44 AM
To: [log in to unmask]
Subject: [CODE4LIB] long-term preservation of digital files
Hi,
I'm trying to develop a process for long-term preservation of the files we're creating through our digitization projects. My current plan is to bag groups of files using Bagger. Each bag would include all versions of the file (generally TIFF, JPEG, PDF and .txt transcript), a file of technical metadata (generated using exiftool), and .xml and MARC files of descriptive metadata. Bagger will generate the checksums and create a file manifest. Our IT department is providing 8 TB of Amazon S3 storage and has set up an AWS storage gateway. The storage will be dedicated to these files and access will be strictly limited. I'm planning to regularly audit what's been stored but haven't decided on a tool to do that. Any recommendations? Is there anything else I should consider doing?
Thanks in advance for any advice!
Kathryn