Thanks for all the pointers; just what I wanted, and it gives me plenty of
ways to test the generic meta data handling. Great!
On Jan 12, 2012 3:19 AM, "Simon Spero" <[log in to unmask]> wrote:
> You can get anything you want
> At Brewster Kahle's restaurant.
> On Wed, Jan 11, 2012 at 10:55 AM, LeVan,Ralph <[log in to unmask]> wrote:
> > http://staff.oclc.org/~levan/PearsTraining/scifi.usmarc has 10,000 MARC
> > records in it. They are part of the old SiteSearch system that OCLC
> > released as open source. They date back to 2002 and will not contain
> > any Unicode, if you were hoping to include that as part of your testing.
> > Ralph
> > -----Original Message-----
> > From: Code for Libraries [mailto:[log in to unmask]] On Behalf Of
> > Alexander Johannesen
> > Sent: Wednesday, January 11, 2012 5:36 AM
> > To: [log in to unmask]
> > Subject: Open datasets
> > Hiya,
> > I'm in the middle of creating a meta data management system (including
> > merging and persistent identifier management) for a somewhat different
> > domain (intranets and business integration), but it's based on Topic Maps
> > and so is well suited to other means of meta data handling / mangling. It's
> > also going to be open-source, and it might be well-suited to library tasks
> > as well.
> > So in order to test the integrity and performance of my system so far, I'm
> > wondering if there's a suitable open dataset of bibliographic records that
> > aren't too obscure (meaning, I can find the titles at Amazon or Open
> > Library) that you could recommend? More than 1000 records, but less than a
> > million, maybe?
> > Regards,
> > Alex