The recently released EEBO texts are available as TEI; I suggest you ask on the TEI list.

If you want a real vanilla, HTML-like conversion, TEI Boilerplate is probably a good place to start.

Cheers
Stuart

On Saturday, June 6, 2015, Eric Lease Morgan wrote:

> On Jun 5, 2015, at 8:20 AM, Ethan Gruber wrote:
>
> >> Does anybody here have experience reading the SGML/XML files representing
> >> the content of EEBO?
> >
> > Are these in TEI? Back when I worked for the University of Virginia
> > Library, I did a lot of clean up work and migration of Chadwyck-Healey
> > stuff into TEI-P4 compliant XML (thousands of files), but unfortunately all
> > of the Perl scripts to migrate old garbage SGML into XML are probably gone.
> >
> > How many of these things are really worth keeping, i.e., were not digitized
> > by any other organization that has freely published them online?
>
> The data I have comes in two flavors: 1) some flavor of SGML, and 2) some
> flavor of XML which is TEI-like, but not TEI. All of the files are worth
> keeping because I get the basic bibliographic information (id, author,
> title, date, keywords/subjects), as well as transcribed text. (No images.)
> Given such data, I think I can provide interesting, cool, and “kewl”
> services. Given the id number, I may then be able to link to the scanned
> image. Wish me luck. --ELM

--
...let us be heard from red core to black sky