I don't have anything for you, but I wanted to say that the project sounds
severely cool!
Best regards,
*Jason Bengtson, MLIS, MA*
Innovation Architect
*Houston Academy of Medicine - The Texas Medical Center Library*
1133 John Freeman Blvd
Houston, TX 77030
http://library.tmc.edu/
www.jasonbengtson.com
On Sat, Jul 11, 2015 at 11:06 AM, Eric Lease Morgan wrote:
> I have begun working on a suite of software designed to enable a person to
> “read” the full text of hundreds (if not a thousand) articles from JSTOR
> simultaneously, and I call this software the JSTOR Workset Browser. [1]
>
> Using JSTOR’s Data For Research service, it is possible for anybody to
> first search & browse the totality of JSTOR. [2] The reader is then able to
> create and download a “dataset” describing found items of interest. This
> dataset includes a citations.xml file. The Browser takes this citations.xml
> file as input and then: 1) harvests the content, 2) indexes it, 3) does
> some analysis against the content, 4) creates a few graphs illustrating
> characteristics of the dataset, and finally 5) generates a browsable
> “catalog” in the form of an HTML table. The table includes columns for
> things like authors, titles, dates as well as page lengths, number of
> words, and coefficients denoting the use of color words, “big” names, and
> “great” ideas. In the near future the Browser will support search as well
> as the generation of a report describing each reader-generated (curated)
> collection. You can see a number of collections created to date, including
> writings about Thoreau, Emerson, Dickinson, Longfellow, and Poe. [3]
>
> Combined with similar tools designed to work against the HathiTrust and/or
> EEBO-TCP, the ultimate goal is to enable students and scholars to do
> research against massive amounts of content quickly and easily. [4, 5]
>
> I’m looking for additional sample content. If you create a dataset from
> DFR, then send me the citations.xml file, and I will use it as input for
> the Browser. “Wanna play?”
>
>
> [1] Browser on GitHub - http://bit.ly/jstor-workset-browser
> [2] Data For Research - http://dfr.jstor.org
> [3] sample collections -
> http://dh.crc.nd.edu/sandbox/jstor-workset-browser/
> [4] HathiTrust Workset Browser -
> https://github.com/ericleasemorgan/HTRC-Workset-Browser
> [5] EEBO-TCP Workset Browser -
> https://github.com/ericleasemorgan/EEBO-TCP-Workset-Browser
>
>
> —
> Eric Lease Morgan, Librarian
>