We are working on a more automated process for our Electronic Theses and Dissertations (ETDs), and I'm wondering whether anyone here has already done this and is willing to share code and/or point out potholes to watch for.
The University Graduate Student office works with students to submit their final/official ETDs to ProQuest. ProQuest does some of its own processing and then FTPs the ETDs, as a zip file of PDFs and XML, to a drop zone we host. In addition to accessioning them into our digital archives, we want to automate pre-loading the metadata into Connexion so our Cataloging group can verify the data and add their local, human touch before pushing it up to OCLC.
Our thinking was to script a conversion from the ProQuest XML to MARCXML and import that into Connexion. Has anyone already written a tool to do that? Is there an alternative (or better) process?
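For concreteness, here is a minimal sketch of the kind of conversion described above, in Python using only the standard library. Everything specific in it is an assumption for illustration rather than a verified mapping: the ProQuest element names (DISS_submission, DISS_author, DISS_title, DISS_comp_date, DISS_para), the choice of MARC fields (100, 245, 264, 520), the placeholder leader, and the hypothetical function name proquest_to_marcxml.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"
ET.register_namespace("marc", MARC_NS)


def add_datafield(record, tag, ind1, ind2, subfields):
    """Append one MARCXML datafield with the given (code, value) subfields."""
    field = ET.SubElement(
        record, f"{{{MARC_NS}}}datafield", {"tag": tag, "ind1": ind1, "ind2": ind2}
    )
    for code, value in subfields:
        sub = ET.SubElement(field, f"{{{MARC_NS}}}subfield", {"code": code})
        sub.text = value


def text_of(node, path):
    """Return stripped text at path, or an empty string if it is missing."""
    found = node.find(path)
    return found.text.strip() if found is not None and found.text else ""


def proquest_to_marcxml(proquest_xml):
    """Convert one ProQuest ETD record (assumed layout) to a MARCXML string."""
    src = ET.fromstring(proquest_xml)
    record = ET.Element(f"{{{MARC_NS}}}record")

    # Placeholder 24-character leader; catalogers would correct it in Connexion.
    leader = ET.SubElement(record, f"{{{MARC_NS}}}leader")
    leader.text = "00000nam a2200000 a 4500"

    # Assumed ProQuest paths, relative to the DISS_submission root.
    name = "DISS_authorship/DISS_author/DISS_name/"
    surname = text_of(src, name + "DISS_surname")
    fname = text_of(src, name + "DISS_fname")
    title = text_of(src, "DISS_description/DISS_title")
    date = text_of(src, "DISS_description/DISS_dates/DISS_comp_date")
    paras = src.findall("DISS_content/DISS_abstract/DISS_para")
    abstract = " ".join(p.text.strip() for p in paras if p.text)

    # Only emit fields for which the source actually had data.
    if surname:
        author = f"{surname}, {fname}" if fname else surname
        add_datafield(record, "100", "1", " ", [("a", author)])
    if title:
        add_datafield(record, "245", "1", "0", [("a", title)])
    if date:
        add_datafield(record, "264", " ", "1", [("c", date)])
    if abstract:
        add_datafield(record, "520", "3", " ", [("a", abstract)])

    return ET.tostring(record, encoding="unicode")
```

A real mapping would also need control fields, the dissertation note, subject headings, and whatever local fields Cataloging expects, so this only shows the overall shape of such a script.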
Thanks,
Erich
--
Erich Hammer Head of Library Systems
[log in to unmask] University Libraries
518-442-3891 University @ Albany
"A man is accepted into a church for what he believes and
he is turned out for what he knows." -- Mark Twain