Bill,

At Michigan, all large high-res files--in fact, all source files--are retained long-term for preservation purposes. Where possible, we build the system around the source files rather than derivatives, in order to simplify maintenance and ensure the integrity of the source. This strategy involves replication across multiple systems (and of course storage) wherever possible.

We store the high-res files in a variety of places, including DLT, CD-ROM (gold, with ISO 9660 file naming conventions), and, again where possible, disk (RAID). I should add that we are able to keep source files on disk in most cases, including for our Preservation-oriented page image conversion activities, though not yet for continuous tone images.

In the case of the Preservation activities, we convert and mount several million pages a year, and Moore's Law has been our friend in this regard for several years. We buy disk as we need it (or, actually, just before we need it), and have been able to keep the costs of disk purchases *and* maintenance down while keeping performance very high. We have not invested in hierarchical storage systems.

In the case of our continuous tone imaging operations, the relatively large number of images we create (thousands per year) at over 120 MB each has kept us from taking the same approach, but we are hopeful about trends in the area of imaging, and particularly JPEG2000. What we imagine might be possible here is storing a losslessly compressed continuous tone image as part of the access system and, because the compression is lossless, combining the access and master versions. We're only in the exploratory phases of this idea, however.

We do not compress masters, in general, though we do create derivatives of our continuous tone images to put them online. We are using MrSID wavelet compression for the continuous tone images in the online system, but store the uncompressed masters on gold CD-ROM using ISO 9660 file naming conventions.

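As a rough illustration of why the continuous tone masters are harder to keep on spinning disk, here is a back-of-the-envelope sketch in Python. The figure of 5,000 images per year is a hypothetical stand-in for "thousands per year", 120 MB is the lower bound given above (read as megabytes), and $7.50/GB is the RAID price quoted later in this message. Decimal units (1 GB = 1000 MB) are an assumption, and the estimate covers a single copy only, ignoring replication and derivatives.

```python
# Back-of-the-envelope estimate of yearly master-file storage and disk cost.
# All inputs are illustrative assumptions, not measured figures.

MB_PER_GB = 1000  # assumed decimal units


def yearly_storage_gb(images_per_year: int, mb_per_image: float) -> float:
    """Storage needed for one year of uncompressed masters, in GB."""
    return images_per_year * mb_per_image / MB_PER_GB


def disk_cost_usd(storage_gb: float, usd_per_gb: float) -> float:
    """Purchase cost of that much disk at a flat price per GB."""
    return storage_gb * usd_per_gb


if __name__ == "__main__":
    images = 5000  # hypothetical count for "thousands per year"
    gb = yearly_storage_gb(images, mb_per_image=120)
    cost = disk_cost_usd(gb, usd_per_gb=7.50)
    print(f"{gb:.0f} GB per year, about ${cost:,.0f} of disk for one copy")
```

Under these assumptions one year of masters comes to about 600 GB, which is a large share of a multi-terabyte installation before any replication is counted.
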
Finally, let me add that we're still in the process of developing a taxonomy of locally-stored digital files with associated responsibilities. Some material we serve up to the campus is not uniquely held by us, and in fact may be transitory. Other material is unique, and even in this context we may have more or less responsibility for the long-term maintenance. This sort of taxonomy will figure into our future storage decisions.

We have several terabytes of RAID online, and currently buy exclusively hardware RAID 5 subsystems for our production server. In particular, we have just installed two extremely cost-effective ($7.50/GB) SCSI-IDE hardware RAIDs with which we are very happy. However, given our strategy of keeping source materials on spinning disk (as opposed to nearline tape storage or something similar), we have grown to the point where just adding direct-attached storage like this and managing multiple filesystems is no longer scalable. We're currently planning to migrate to storage appliances beginning this year to solve this problem. Because these systems are either NAS or SAN based, this strategy has the added benefit of making it very simple to bring up as many web servers as we want in a single location to handle increased load and/or protect against web server outages, which is another critical goal.

---------------------------------------------------------------------------
John Price Wilkin                              Phone: 734.764.8016
Interim Associate Director                     Fax: 734.763.5080
Digital Library Services, University Library   email: [log in to unmask]
818 Hatcher South                              http://www.lib.umich.edu/dls/
University of Michigan
Ann Arbor, MI 48109-1205

On Fri, 10 Jan 2003, Bill Britten wrote:

> Colleagues,
>
> Here at Tennessee we seem to be expanding the amount of digital data at an exponential rate. A TB of storage lasted about a year. We are buying another 1.3 TB, knowing it may only get us through another 6 months.
> And there is the corresponding need for our backup system to keep up with all of this storage. This email is a reality-check for us, if you could let us know the state of digital storage at your institution.
>
> For digital projects, are master files (i.e. large high-res files)
> retained long-term for preservation purposes?
>
> If so, are these files stored on disk, or written off to CD, DVD, tape?
>
> Are you compressing these files ... using what?
>
> What level of storage do you currently maintain, and what are the immediate plans for purchase?
>
> thanks so much,
>
> Bill Britten
> Professor and Head, Library Systems
> 647 Hodges Library, UTK, Knoxville, TN 37996-1000
> voice: 865-974-1082 fax: 865-974-0626
> [log in to unmask]