If you are interested in Hadoop's distributed filesystem, HDFS [1], you
might also be interested in Tahoe [2].
The downside to things like Hadoop and Tahoe, as compared with S3, is
that you have to manage the machines and services yourself, rather
than paying someone else to do it in the cloud. But I guess for some
this is an upside.
Or has Yahoo built a service model around Hadoop?
//Ed
[1] http://hadoop.apache.org/core/docs/current/hdfs_design.html
[2] http://allmydata.org/trac/tahoe