Well, the solutions most likely become less and less affordable.  [😉]

We are adding long-term digital (initially bit-wise) preservation as an offering for our customers and recently went through a costing exercise that I think is useful.

For our service, we will be utilizing:

1) local NAS appliances, with roughly 36TB of usable space in each one,

2) Amazon Glacier

3) either Backblaze B2 or Google Coldline storage.

For retrievals, our NAS is the first stop, but if we hit a checksum error, we would fall back to Glacier and the other cloud copy.  We would also keep the checksum and administrative information for all the files in each location and in a database, and periodically verify checksums across all storage locations.
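The retrieval order and the periodic check described above can be sketched roughly as follows. This is only an illustration, not our actual code: SHA-256 is an assumed checksum algorithm, and `retrieve_verified`, `audit_copies`, and the `fetch` callables are hypothetical names standing in for whatever reads a file from the NAS, Glacier, or the third provider (Glacier retrievals in particular are not immediate in practice).

```python
import hashlib
from typing import Callable, List, Optional, Sequence, Tuple

# A source is a (name, fetch) pair; fetch() returns the file's bytes.
# These callables are hypothetical placeholders for real NAS/cloud reads.
Source = Tuple[str, Callable[[], bytes]]


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def retrieve_verified(expected: str, sources: Sequence[Source]) -> Optional[Tuple[str, bytes]]:
    """Try each source in order; return the first copy whose checksum matches.

    Returns None if no source yields a matching copy.
    """
    for name, fetch in sources:
        try:
            data = fetch()
        except Exception:
            # Treat an unreachable source like a bad copy and move on.
            continue
        if sha256_hex(data) == expected:
            return name, data
    return None


def audit_copies(expected: str, sources: Sequence[Source]) -> List[str]:
    """Return the names of sources whose copy is missing or fails the checksum."""
    bad = []
    for name, fetch in sources:
        try:
            ok = sha256_hex(fetch()) == expected
        except Exception:
            ok = False
        if not ok:
            bad.append(name)
    return bad
```

Listing the NAS first in `sources` gives the "first stop" behavior; the cloud copies are only read when the local copy is unreachable or fails verification.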

Taking into account hardware refresh and such, the cost came to roughly $250 annually per terabyte just for maintaining the storage.  We included a slight profit margin on the NAS so that break-even was at 18TB, since we didn't know how quickly we would fill it completely.  Without a doubt, though, the local spinning-disk storage was the most expensive component once you include hardware refresh.
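To show how a break-even point like that falls out of the arithmetic, here is a minimal sketch. The annual NAS cost used below is a made-up placeholder, not our real figure, and the function names are hypothetical; only the 18TB break-even and 36TB capacity come from the numbers above.

```python
def break_even_price_per_tb(annual_cost: float, break_even_tb: float) -> float:
    """Annual price per TB at which revenue covers cost once break_even_tb is stored."""
    if break_even_tb <= 0:
        raise ValueError("break_even_tb must be positive")
    return annual_cost / break_even_tb


def annual_margin(annual_cost: float, price_per_tb: float, stored_tb: float) -> float:
    """Revenue minus cost for one year at a given fill level (negative means a loss)."""
    return price_per_tb * stored_tb - annual_cost


# HYPOTHETICAL annualized cost of one NAS appliance, including hardware refresh.
HYPOTHETICAL_ANNUAL_NAS_COST = 3600.0
BREAK_EVEN_TB = 18.0   # from the message above
CAPACITY_TB = 36.0     # from the message above

price = break_even_price_per_tb(HYPOTHETICAL_ANNUAL_NAS_COST, BREAK_EVEN_TB)
margin_when_full = annual_margin(HYPOTHETICAL_ANNUAL_NAS_COST, price, CAPACITY_TB)
```

Setting break-even at half the capacity means the per-terabyte price for the NAS portion is twice what it would cost at full utilization, which is the cushion against filling the appliance slowly.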

Just offering this in case it is useful.

Mark V. Sullivan
Application Architect
Sobek Digital Hosting and Consulting, LLC
[log in to unmask]
866-981-5016 (office)

From: Code for Libraries <[log in to unmask]> on behalf of Roy V Zimmer <[log in to unmask]>
Sent: Friday, January 13, 2017 2:00 PM
To: [log in to unmask]
Subject: Re: [CODE4LIB] What about 4 terabytes? RE: [CODE4LIB] 46 gigabytes

Step 2 should be relatively easy, Ray, as such drives are readily
available these days at decent prices.
Step 3 could be a stumbling block...AWS comes to mind, but I've no
experience with that.


On 1/13/2017 1:41 PM, Schwartz, Raymond wrote:
> I found this discussion very informative.  But I would like to change a parameter from 46gb to 4tb.  What affordable and simple options are there for that amount of data?
> /Ray
> -----Original Message-----
> From: Code for Libraries [mailto:[log in to unmask]] On Behalf Of Kyle Banerjee
> Sent: Tuesday, December 13, 2016 6:05 PM
> To: [log in to unmask]
> Subject: Re: [CODE4LIB] 46 gigabytes
>> Taking into account things like cost, convenience, and the knowledge that my
>> solution will always include migrating forward, here is what I think I will do:
>>    1. buy a pile o’ SD cards, put multiple copies
>>       of my data on each, and physically store
>>       some here and some there
>>    2. buy a networked drive, connect it to my
>>       hub, and use it locally
>>    3. break down and use some sort of cloud
>>       service to make yet more copies of my data
>>    4. re-evaluate in 365 days; this is a
>>       never-ending process
> As this is for personal data and there isn't that much of it, there are many paths that will work, including the above.
> We have a cultural bias towards squirreling away copies all over the place.
> The advantage is that it's impossible to lose everything. The disadvantage is that it's labor intensive, scales poorly, and synchronization as well as knowing what you can really trust (completeness, integrity, etc) is an issue.
> Given that you can recover deleted files as well as restore previous versions from services such as Google Drive, Dropbox, etc, there's no real reason to keep copies on so many cloud services. You can just use one which can be accessed from your personal computer, cell phone, and over the web
> -- this will be far more convenient and reliable/safe than any solution involving personal hardware.
> kyle