Unfortunately, because it's an ISO standard, obtaining it costs 118 CHF
(about $110 USD). Hard to follow a standard you can't afford to read.
Is there an online version somewhere?

kc

[log in to unmask] wrote:
> hi code4lib,
>
> if you're archiving web content, please use the WARC format.
>
> thanks,
> [log in to unmask]
>
>
>
> WARC File Format Published as an International Standard
> http://netpreserve.org/press/pr20090601.php
>
> ISO 28500:2009 specifies the WARC file format:
>
>     * to store both the payload content and control information from
>       mainstream Internet application layer protocols, such as the
>       Hypertext Transfer Protocol (HTTP), Domain Name System (DNS),
>       and File Transfer Protocol (FTP);
>     * to store arbitrary metadata linked to other stored data
>       (e.g. subject classifier, discovered language, encoding);
>     * to support data compression and maintain data record integrity;
>     * to store all control information from the harvesting protocol
>       (e.g. request headers), not just response information;
>     * to store the results of data transformations linked to other
>       stored data;
>     * to store a duplicate detection event linked to other stored
>       data (to reduce storage in the presence of identical or
>       substantially similar resources);
>     * to be extended without disruption to existing functionality;
>     * to support handling of overly long records by truncation or
>       segmentation, where desired.
>
>
> more info here:
> http://www.digitalpreservation.gov/formats/fdd/fdd000236.shtml
>
>

-- 
-----------------------------------
Karen Coyle / Digital Library Consultant
[log in to unmask] http://www.kcoyle.net
ph.: 510-540-7596
skype: kcoylenet
fx.: 510-848-3913
mo.: 510-435-8234
------------------------------------