Hi Eric

I used an OCLC number match to get a sense of overlap at WFU -
http://www.erikmitchell.info/2011/05/06/how-much-overlap-do-we-have-with-the-hathitrust/,
http://www.erikmitchell.info/2011/05/07/more-on-hathitrust-overlap/.
As I recall, I simply pulled the OCLC numbers from the MARC files
(perhaps even just from their spreadsheets) and did some simple database
querying.
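
A minimal sketch of that kind of OCLC-number match, not the exact process used at WFU. It assumes the local numbers are available one per line, and that the Hathi file is tab-delimited with the OCLC numbers in the eighth column (index 7), comma-separated when a record carries more than one. The function names and the column index are illustrative assumptions, so check them against the field list published with the Hathi files before relying on the result.

```python
import csv
import re


def normalize_oclc(raw):
    """Reduce an OCLC number to bare digits with no leading zeros.

    Drops prefixes such as "(OCoLC)", "ocm" and "ocn" by keeping digits only.
    """
    return re.sub(r"\D", "", raw).lstrip("0")


def local_oclc_set(lines):
    """Build a set of normalized OCLC numbers from an iterable of strings."""
    return {n for n in (normalize_oclc(line) for line in lines) if n}


def hathi_rows_from_tsv(handle):
    """Yield rows from an open tab-delimited Hathi file (no quoting assumed)."""
    return csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)


def overlap(local_numbers, hathi_rows, oclc_col=7):
    """Return the local OCLC numbers that also appear in the Hathi rows.

    oclc_col is an assumed column position, not a confirmed one.
    """
    matched = set()
    for row in hathi_rows:
        if len(row) <= oclc_col:
            continue  # skip short or malformed rows
        for raw in row[oclc_col].split(","):
            number = normalize_oclc(raw)
            if number in local_numbers:
                matched.add(number)
    return matched
```

With the local numbers exported to a text file and a Hathi file on disk, usage would be roughly `overlap(local_oclc_set(open("local.txt")), hathi_rows_from_tsv(open("hathi_full.txt", encoding="utf-8")))`, where both file names are placeholders.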

More recently I have been working with the HT files using text
similarity measures (e.g. pylevenshtein) to compare holdings across
libraries.  This takes a lot of CPU time but has proven to be a pretty
good way to compare holdings at a title level.  I suppose that with a
detailed enough text string (title, pub date, publisher...) you could
focus the comparison on expressions/manifestations rather than just
titles.
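
A rough sketch of the similarity approach, under stated assumptions. A pure-Python edit distance stands in here for the C-backed Levenshtein library, so this version is self-contained but slower. The `match_key`, `similarity` and `best_matches` helpers, the ASCII-only normalization, and the 0.9 threshold are all illustrative choices rather than what was actually run.

```python
import re


def match_key(title, pub_date="", publisher=""):
    """Build a normalized comparison string from bibliographic fields.

    Lowercases, replaces anything other than a-z, 0-9 and space with a
    space, and collapses runs of whitespace (an ASCII-only simplification).
    """
    text = " ".join([title, pub_date, publisher]).lower()
    text = re.sub(r"[^a-z0-9 ]+", " ", text)
    return " ".join(text.split())


def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a, b):
    """Scale edit distance to 0..1, where 1.0 means identical strings."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def best_matches(local_keys, hathi_keys, threshold=0.9):
    """Pair each local key with its closest Hathi key at or above threshold."""
    results = []
    for local_key in local_keys:
        best = max(hathi_keys, key=lambda k: similarity(local_key, k),
                   default=None)
        if best is not None:
            score = similarity(local_key, best)
            if score >= threshold:
                results.append((local_key, best, score))
    return results
```

Every local key is compared against every Hathi key, which is where the CPU time goes. Building the key from title alone compares at roughly the title level, while adding pub date and publisher narrows it toward a particular edition.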

Erik

On Fri, Aug 3, 2012 at 11:15 AM, Jon Stroop <[log in to unmask]> wrote:
> You can do an empty query in their catalog, and use the "Original Location"
> facet to filter to a holding library. Programmatically, I'm not sure, but
> you'd probably need to use the Hathi files:
> http://www.hathitrust.org/hathifiles.
>
> -Jon
>
>
> On 08/03/2012 11:07 AM, Eric Lease Morgan wrote:
>>
>> If I needed/wanted to know what materials held by my library were also in
>> the HathiTrust, then programmatically how could I figure this out? In other
>> words, do you know of a way to query the HathiTrust and limit the results to
>> items my library owns? --Eric Lease Morgan