Hi Edward,

We're currently using the warc-tools library for WARC creation. It's written in Python, but there are a few pre-built utilities that come with the package that might suit your needs?

http://code.hanzoarchives.com/warc-tools

-Kurt

________________________________________
From: Code for Libraries on behalf of Edward M. Corrado
Sent: Wednesday, November 23, 2011 5:30 PM
Subject: [CODE4LIB] Web archiving and WARC

Hello All,

I need to harvest a few Web sites in order to preserve them. I'd really like to preserve them using the WARC file format [1] since it is a standard for digital preservation. I looked at Web Curator Tool (WCT) and Heritrix, and they seem to be good at what they do, but they are built to work on a much larger scale than what I'd like to do -- and that comes with a cost of increased complexity.

Tools like wget are simple to use and can easily be scripted to accomplish my limited task, except that the standard wget and similar tools I am familiar with do not support WARC. Also, I haven't been able to find a tool that can convert zipped files created with wget to WARC.

I did find a version of wget with WARC support built in [1] from the Archive Team, so that may be my solution, but compiling software with "dirty" written into the name of the zip file is maybe not the best long-term solution.

Does anyone know of any other simple tools to create a WARC file (either from harvesting or from converting a wget or similar mirror/archive)?

Edward

[1] http://archiveteam.org/index.php?title=Wget_with_WARC_output