Hi, Kyle –
I would echo the sentiment to put a copy in the dark cloud (e.g.
Amazon Glacier). Last time I checked, Oracle had the cheapest dark
cloud. Go with whatever is cheap and have a plan to pull out when
There is a dizzying array of local storage options. For data this size
many will steer you towards "enterprise" storage solutions, such as a
SAN. For preservation/archiving alone, this is overkill.
What might work well is a computer with a ton of hard drives in it. I
recommend using consumer-level hard drives since they are cheap and
big. Like all hard drives, they are unreliable too. Plan for some
45drives.com sells machines based on the storage pod design that the
cloud storage company Backblaze publishes. Unlike most enterprise-level
products, you can see pricing
without asking for a quote.
200TB of drives would be about $10k, less if you use fewer/bigger
drives. Add in ~$5k for the machine itself.
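
As a rough sketch of that arithmetic: the 10TB drive size is purely my
assumption for illustration, and the $50/TB price is just what the
200TB for about $10k figure implies. No spare drives or redundancy are
counted here, so treat the result as a floor.

```python
import math

def local_storage_cost(capacity_tb, drive_tb=10, price_per_tb=50.0, chassis=5000.0):
    """Estimate the one-time hardware cost of a single box of consumer drives.

    drive_tb and price_per_tb are illustrative assumptions, not quotes.
    Returns (number_of_drives, total_cost_in_dollars).
    """
    # Round up: you cannot buy a fraction of a drive.
    drives = math.ceil(capacity_tb / drive_tb)
    drive_cost = drives * drive_tb * price_per_tb
    return drives, drive_cost + chassis

drives, total = local_storage_cost(200)
print(f"{drives} drives, about ${total:,.0f} total")
```

Raising drive_tb lowers the drive count, which matters mostly for how
big a chassis you need.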
It may make sense to take a hybrid approach. You could store locally
and put a copy in the dark cloud for safekeeping. If you are only
going to have one copy, a reputable cloud storage provider isn't a bad
choice. More than one copy is better.
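
To compare the two on cost alone, here is a back-of-the-envelope
break-even sketch. It plugs in the rough numbers from this thread
(about $15k one-time for local hardware, about $1k to load Glacier and
a little over $800 a month after that). It deliberately ignores labor,
power, drive replacement and retrieval fees, so it flatters the local
option.

```python
import math

def breakeven_months(local_one_time, cloud_setup, cloud_monthly):
    """First month in which cumulative cloud spend reaches the local one-time cost."""
    if cloud_monthly <= 0:
        raise ValueError("cloud_monthly must be positive")
    remaining = local_one_time - cloud_setup
    # If setup alone already costs as much as local hardware, break-even is month 0.
    return max(0, math.ceil(remaining / cloud_monthly))

# Rough figures taken from this thread, not from a price sheet.
print(breakeven_months(15000, 1000, 800))
```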
The best path forward depends on what your IT situation looks like.
What existing resources, if any, can be leveraged? (IT staff? server
Please report back with what you end up going with.
On Thu, Mar 1, 2018 at 2:36 PM, Kyle Banerjee <[log in to unmask]> wrote:
> On Thu, Mar 1, 2018 at 11:11 AM, Kyle Breneman <[log in to unmask]>
> wrote:
>> 1. A combination of both. Around 60TB already digitized, with over 100TB
>> more footage still in analog form.
>> 2. Portable hard drives, a RAID, a Mac Pro tower.
>> 3. No, we do not have a digitization workflow defined yet.
>> 4. Define "quick access." I think that we do want "quick access" unless
>> we also have something like a local NAS for immediate access to the files
>> and then AWS strictly as secondary backup copies.
> Unless you move your computing to Amazon as well, I wouldn't think quick
> access would be viable. First of all, S3 is slow and works very differently
> than filesystems people are accustomed to interacting with -- I'd never use
> it for video processing. Even if that weren't a factor, the network
> latency and high bandwidth charges would be deal busters. Even just storing
> the stuff in S3 so it can quickly be accessed is going to be high -- 200TB
> is going to run you over 5 grand a month.
> I'd be much more tempted to use Amazon for preservation only. I would guess
> the Snowball devices and service fees to get your data up there and sit in
> an S3 bucket for one day while it gets transitioned to Glacier to run about
> a grand, with monthly charges after that running a little over $800. In
> case of disaster, you can recover the videos affordably enough (a few
> grand) via Snowball.
> For ongoing workflow, there would be the question of how to get material up
> there. Uploading huge files doesn't work well so you'd probably want to
> ship drives or use Snowball which will be significant.
> You could do backup yourself as Chris describes, but don't underestimate
> labor, facilities, verifying backups, bandwidth/logistics of getting the
> info to multiple places, etc. Glacier will protect the integrity of your
> data which provides much more protection than simply backing up to tape or
> drive and trusting all will be good.
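
If you do keep local copies, the "verifying backups" step above can be
sketched as a checksum manifest: record a SHA-256 for every file once,
then recompute later and compare. The function names here are
hypothetical, and this is only a minimal illustration of the idea, not
a replacement for a real fixity tool.

```python
import hashlib
from pathlib import Path

def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large video files never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }

def verify(root, manifest):
    """Return the manifest entries that are now missing or have changed."""
    current = build_manifest(root)
    return sorted(name for name, digest in manifest.items()
                  if current.get(name) != digest)
```

Build the manifest when the files are first written, store it somewhere
other than the drives it describes, and run verify on a schedule. An
empty list means every recorded file still matches.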