Thank you all for the replies.
To summarize:
- Tim Spalding offered LibraryThing's database at
http://www.librarything.com/wiki/index.php/LibraryThing_APIs
- Roy Tennant pointed to MIT's Barton dump, available at
<http://simile.mit.edu/rdf-test-data/>
But the winner is probably this Python script, based on Ed's suggestion:
-----
#!/usr/bin/python
from urllib import urlopen
from pymarc import MARCReader

locrecordspattern = 'http://www.archive.org/download/marc_records_scriblio_net/part%02d.dat'
for part in range(1, 30):
    for record in MARCReader(urlopen(locrecordspattern % part)):
        if record['020'] and record['020']['a']:
            print record['020']['a']
------
Now if I could only figure out how to install "easy_install" on FC8 so
I didn't have to run it with:
env PYTHONPATH=`pwd`/pymarc-2.21 ./readloc.py
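
The script above prints every 020$a value it finds, which is far more than the few hundred ISBNs originally asked for, and 020$a values often carry qualifiers such as "(pbk.)" after the number. A minimal sketch of one way to trim the output down, assuming Python 3: clean_isbn and reservoir_sample are hypothetical helper names, not part of pymarc, and reservoir sampling is a technique swapped in here for illustration rather than anything suggested in the thread. The helpers work on plain strings, so they make no assumptions about the pymarc API.

```python
import random
import re

# A 13-digit ISBN, or a 10-character ISBN whose last character may be X.
# Applied after hyphens are removed, anchored at the start of the value.
ISBN_RE = re.compile(r"^(\d{13}|\d{9}[\dXx])")


def clean_isbn(raw):
    """Return the bare ISBN from an 020$a value, or None if none is found."""
    if not raw:
        return None
    match = ISBN_RE.match(raw.replace("-", "").strip())
    return match.group(1).upper() if match else None


def reservoir_sample(items, k, rng=random):
    """Return up to k items chosen uniformly at random from an iterable.

    The iterable is read once and only k items are held in memory,
    so the total number of records need not be known in advance.
    """
    sample = []
    for n, item in enumerate(items):
        if n < k:
            # Fill the reservoir with the first k items.
            sample.append(item)
        else:
            # Keep the new item with probability k / (n + 1).
            j = rng.randint(0, n)  # inclusive on both ends
            if j < k:
                sample[j] = item
    return sample
```

Fed the values the script prints, something like reservoir_sample((i for i in map(clean_isbn, values) if i), 300) would give 300 ISBNs drawn evenly from all parts rather than the first 300 in part01.dat. Here values stands for any iterable of 020$a strings.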
- Godmar
On Tue, Apr 29, 2008 at 8:20 AM, Ed Summers wrote:
> You could download a snapshot of the full LC back file at the Internet
> Archive (kindly donated by Scriblio).
>
> http://www.archive.org/details/marc_records_scriblio_net
>
> Then run a script using your favorite MARC parsing library (mine
> currently is pymarc):
>
> from pymarc import MARCReader
>
> for record in MARCReader(file('part01.dat')):
>     if record['020'] and record['020']['a']:
>         print record['020']['a']
>
> //Ed
>
>
>
> On Mon, Apr 28, 2008 at 9:35 AM, Godmar Back wrote:
> > Hi,
> >
> > for an investigation/study, I'm looking to obtain a representative
> > sample set (say a few hundred) of ISBNs. For instance, the sample
> > could represent LoC's holdings (or some other acceptable/meaningful
> > population in the library world).
> >
> > Does anybody have any pointers/ideas on how I might go about this?
> >
> > Thanks!
> >
> > - Godmar
> >
>