I believe I have created a repository of my HTRC Workset Browser code (shell and Python scripts) on GitHub. [1] From the Quick Start section of the README:
1. Download the software, putting the bin and etc directories in the same directory.
2. Change to the directory where the bin and etc directories have been saved.
3. Build a collection by issuing the following command:
./bin/build-corpus.sh thoreau etc/rsync-thoreau.sh
If all goes well, the Browser will create a new directory named thoreau,
rsync a bunch o' JSON files from the HathiTrust to your computer, index
the JSON files, do some textual analysis against the corpus, create a
simple database ("catalog"), and create a few more reports. You can then
peruse the files in the newly created thoreau directory. If this worked,
then repeat the process for the other rsync files found in the etc
directory.
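Repeating the process for each rsync file could be scripted along these lines. This is only a sketch: it assumes every rsync file follows the etc/rsync-NAME.sh pattern seen in the thoreau example, which the README does not confirm, and the function names and the DRY_RUN variable are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: build one collection per rsync file found in etc.
# Assumption: each rsync file is named etc/rsync-NAME.sh, as in the
# thoreau example. collection_name, build_all, and DRY_RUN are
# hypothetical helpers, not part of the Browser itself.

# Derive a collection name from an rsync file path,
# e.g. etc/rsync-thoreau.sh becomes thoreau.
collection_name() {
  local file
  file=$(basename "$1" .sh)
  printf '%s\n' "${file#rsync-}"
}

# Run the build for each rsync file in etc.
# Set DRY_RUN=1 to print the commands instead of running them.
build_all() {
  local rsync_file name
  for rsync_file in etc/rsync-*.sh; do
    [ -e "$rsync_file" ] || continue   # the glob matched nothing
    name=$(collection_name "$rsync_file")
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "./bin/build-corpus.sh $name $rsync_file"
    else
      ./bin/build-corpus.sh "$name" "$rsync_file"
    fi
  done
}
```

From the directory holding bin and etc, `DRY_RUN=1 build_all` lists what would be run, and plain `build_all` runs the builds one after another.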
Probably the first issue people will have is the path to their version of Python. (Sigh.)
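One way to spot the Python path problem before running anything is to check which interpreter each script asks for. The sketch below assumes the scripts name their interpreter on a first "#!" line, which I have not confirmed for every file in bin; shebang_of and check_shebangs are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Sketch: report whether the interpreter named on a script's first
# line exists on this machine.
# Assumption: the scripts start with a "#!" line. shebang_of and
# check_shebangs are hypothetical helpers, not part of the Browser.

# Print the interpreter line of a script, without the leading "#!".
# Returns non-zero when the file has no such line.
shebang_of() {
  local first
  IFS= read -r first < "$1" || true
  case "$first" in
    '#!'*)
      first=${first#??}   # drop the "#!"
      first=${first# }    # drop one optional leading space
      printf '%s\n' "$first"
      ;;
    *) return 1 ;;
  esac
}

# For each script given, say whether its interpreter path is executable.
check_shebangs() {
  local script interp
  for script in "$@"; do
    interp=$(shebang_of "$script") || continue
    interp=${interp%% *}   # keep the path, drop any arguments
    if [ -x "$interp" ]; then
      echo "ok: $script uses $interp"
    else
      echo "missing: $script expects $interp"
    fi
  done
}
```

Calling `check_shebangs bin/*` lists any script whose interpreter is not where it expects, and `command -v python` shows where Python actually lives on your computer, so the first line of the script can be edited to match.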
[1] repository - https://github.com/ericleasemorgan/HTRC-Workset-Browser
—
Eric “Git Ignorant” Morgan