As with my HathiTrust Research Center Workset Browser, I have been able to create a (fledgling) “browser” against the EEBO-TCP content:
I have begun creating a “browser” against content from EEBO-TCP
in the same way I have created a browser against worksets from
the HathiTrust. The goal is to provide “distant reading” services
against subsets of the Early English poetry and prose. You can
see these fledgling efforts against a complete set of Richard
Baxter’s works. Baxter was an English Puritan church leader,
poet, and hymn-writer. [1, 2, 3]...
The EEBO-TCP Workset Browser is not as mature as my HathiTrust
Workset Browser, but it is coming along. [15] Next steps include:
calculating an integer denoting the number of pages in an item,
implementing a Web-based search interface to a subset’s full text
as well as metadata, and putting the source code (written in Python
and Bash) on GitHub. After that I need to: identify more robust
ways to create subsets from the whole of EEBO, provide links to
the raw TEI/XML as well as HTML versions of items, implement
quite a number of cosmetic enhancements, and most importantly,
support the means to compare & contrast items of interest in each
subset. Wish me luck?
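
As an illustration of the first of those next steps, here is a minimal sketch of how a page count might be computed. It is not the browser's actual code. It assumes that page breaks in an item's TEI/XML are encoded as pb elements (written PB in some files), and the function name count_pages and the sample document are hypothetical:

```python
import xml.etree.ElementTree as ET


def count_pages(tei_xml: str) -> int:
    """Return the number of page-break (pb) elements in a TEI/XML string.

    Assumption: each pb element marks the start of one page. The tag is
    matched without regard to namespace or letter case.
    """
    root = ET.fromstring(tei_xml)
    total = 0
    for element in root.iter():
        # Strip any namespace prefix ("{uri}pb" -> "pb") and lowercase it.
        local_name = element.tag.rsplit("}", 1)[-1].lower()
        if local_name == "pb":
            total += 1
    return total


# Hypothetical two-page sample, used only for illustration.
sample = """<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body>
<pb n="1"/><p>First page.</p>
<pb n="2"/><p>Second page.</p>
</body></text></TEI>"""

print(count_pages(sample))  # prints 2
```

Counting page-break markers is only a proxy. An item whose transcription omits them, or records them differently, would need another approach.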
1. Richard Baxter (the person) – http://en.wikipedia.org/wiki/Richard_Baxter
2. Richard Baxter (works) – http://bit.ly/ebbo-browser-baxter-works
3. Richard Baxter (analysis of works) – http://bit.ly/eebo-browser-baxter-analysis
15. HathiTrust Workset Browser – https://github.com/ericleasemorgan/HTRC-Workset-Browser
For more detail, please see the blog posting — http://bit.ly/emorgan-eebo-browser
Fun with well-structured data, open access content, and the definition of librarianship?
—
Eric Lease Morgan
University of Notre Dame