On a whim I created a torrent of the concatenated MARC files
donated to the Internet Archive by Scriblio (7,030,372 records):

  http://inkdroid.org/torrents/lc-bib.torrent

Feel free to download them, and please consider running your client to
help seed the data.

//Ed

On Tue, Apr 29, 2008 at 2:02 PM, Godmar Back <[log in to unmask]> wrote:
> Thank you all for the replies.
>
>  To summarize:
>
>  - Tim Spalding offered LibraryThing's database at
>  http://www.librarything.com/wiki/index.php/LibraryThing_APIs
>  - Roy Tennant pointed to MIT's Barton dump, available at
>  <http://simile.mit.edu/rdf-test-data/>
>
>  but the winner is probably this python script based on Ed's suggestion:
>
>  -----
>  #!/usr/bin/python
>
>  from urllib import urlopen
>  from pymarc import MARCReader
>
>  locrecordspattern = (
>      'http://www.archive.org/download/marc_records_scriblio_net/part%02d.dat')
>
>  for part in range(1, 30):
>     for record in MARCReader(urlopen(locrecordspattern % part)):
>
>         if record['020'] and record['020']['a']:
>             print record['020']['a']
>  ------
>
>  Now if I could only figure out how to install "easy_install" on FC8 so
>  I didn't have to run it with:
>  env PYTHONPATH=`pwd`/pymarc-2.21 ./readloc.py
>
>   - Godmar
>
>
>
>  On Tue, Apr 29, 2008 at 8:20 AM, Ed Summers <[log in to unmask]> wrote:
>  > You could download a snapshot of the full LC back file at the Internet
>  >  Archive (kindly donated by Scriblio).
>  >
>  >   http://www.archive.org/details/marc_records_scriblio_net
>  >
>  >  Then run a script using your favorite MARC parsing library (mine
>  >  currently is pymarc):
>  >
>  >   from pymarc import MARCReader
>  >
>  >   for record in MARCReader(file('part01.dat')):
>  >       if record['020'] and record['020']['a']:
>  >           print record['020']['a']
>  >
>  >  //Ed
>  >
>  >
>  >
>  >  On Mon, Apr 28, 2008 at 9:35 AM, Godmar Back <[log in to unmask]> wrote:
>  >  > Hi,
>  >  >
>  >  >  for an investigation/study, I'm looking to obtain a representative
>  >  >  sample set (say a few hundred) of ISBNs. For instance, the sample
>  >  >  could represent LoC's holdings (or some other acceptable/meaningful
>  >  >  population in the library world).
>  >  >
>  >  >  Does anybody have any pointers/ideas on how I might go about this?
>  >  >
>  >  >  Thanks!
>  >  >
>  >  >   - Godmar
>  >  >
>  >
>
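
The scripts in this thread print every ISBN they find, while the original
question asked for a sample of a few hundred. A minimal sketch of one way to
draw such a sample in a single pass is reservoir sampling, shown here in
Python 3 with only the standard library. The names reservoir_sample and
clean_isbn are hypothetical helpers, not part of pymarc, and the cleanup
assumes that 020$a values may carry trailing qualifiers such as "(pbk.)".

```python
import random


def reservoir_sample(items, k, seed=None):
    """Return up to k items chosen uniformly at random from an iterable.

    The iterable is consumed once and its length need not be known in
    advance, so it can be a stream of millions of values.
    """
    rng = random.Random(seed)
    sample = []
    for n, item in enumerate(items, start=1):
        if n <= k:
            # Fill the reservoir with the first k items.
            sample.append(item)
        else:
            # Keep the n-th item with probability k/n by overwriting
            # a randomly chosen slot.
            j = rng.randrange(n)
            if j < k:
                sample[j] = item
    return sample


def clean_isbn(raw):
    """Reduce an 020$a value to its leading token (assumed to be the ISBN)."""
    parts = raw.split()
    token = parts[0] if parts else ""
    # Drop hyphens and any trailing punctuation left by the cataloguing.
    return token.replace("-", "").rstrip(":;,")
```

With the loop from the thread, one could pass reservoir_sample a generator
that yields clean_isbn(record['020']['a']) for each record that has an 020$a,
with k set to the sample size wanted. Passing a fixed seed makes the sample
repeatable.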