I have put on GitHub a thing I call the EEBO-TCP Workset Browser. [1] From the README file:
The EEBO-TCP Workset Browser is a suite of software designed to support
"distant reading" against the corpus called the Early English Books
Online - Text Creation Partnership corpus. Using the Browser it is
possible to: 1) search a "catalog" of the corpus's metadata, 2) create a
list of identifiers representing a subset of content for study, 3) feed
the identifiers to a set of files which will mirror the content locally,
index it, and do some rudimentary analysis, outputting a set of HTML
files, structured data, and graphs. The reader is then expected to
examine the output more "closely" (all puns intended) using their
favorite Web browser, text editor, spreadsheet, database, or statistical
application. The purpose and functionality of this suite is very similar
to the purpose and functionality of the HathiTrust Research Center Workset
Browser.
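
To make the first two steps concrete (searching a catalog, then building a list of identifiers), here is a minimal sketch in Python. It is not the Browser's own code: the catalog layout, the field names, the identifiers, and the search_catalog function are all hypothetical placeholders invented for illustration, and the real suite may work quite differently.

```python
import csv
import io

# Hypothetical catalog: placeholder rows, not real EEBO-TCP records.
# The actual metadata fields used by the Browser may differ.
CATALOG_TSV = (
    "id\ttitle\tauthor\tyear\n"
    "X0001\tExample sermon\tDoe, John\t1641\n"
    "X0002\tExample almanac\tRoe, Jane\t1655\n"
    "X0003\tAnother sermon\tDoe, John\t1660\n"
)


def search_catalog(catalog_text, field, query):
    """Return the ids of rows whose `field` contains `query`, ignoring case."""
    reader = csv.DictReader(io.StringIO(catalog_text), delimiter="\t")
    needle = query.lower()
    return [row["id"] for row in reader if needle in row[field].lower()]


# Step 1: search the catalog. Step 2: keep the matching identifiers as a workset.
workset = search_catalog(CATALOG_TSV, "title", "sermon")

# One identifier per line, ready to hand to whatever mirrors and indexes the texts.
print("\n".join(workset))
```

The resulting list of identifiers is the "workset"; in the workflow described above, it would then be fed to the mirroring, indexing, and analysis steps.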
[1] EEBO-TCP Workset Browser - https://github.com/ericleasemorgan/EEBO-TCP-Workset-Browser
—
Eric Lease Morgan, Librarian
University of Notre Dame