On Fri, May 9, 2008 at 2:23 PM, Joe Hourcle wrote:
> OpenLibrary has other datasets that you might be able to use / combine /
> whatever to meet your requirements:
>
> http://openlibrary.org/dev/docs/data
This'll get you the other MARC dumps that have been made available to
IA through OL:
http://www.archive.org/search.php?query=collection%3Aol_data%20marc
Lots to work with here.
I also wonder whether, rather than one large test set, it would be
better to have smaller test sets, each of which exhibits a particular
problem or is of a particular type (e.g. music).
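
A minimal sketch of how type-specific sets might be carved out of a
larger dump, assuming the files are plain MARC 21 transmission records
(24-byte leader, record length in leader positions 0-4, type of record
in position 6). The function names and the MUSIC_TYPES grouping are
made up for illustration and are not part of any OL or IA tooling:

```python
from collections import defaultdict

# Leader/06 values that the MARC 21 bibliographic format assigns to music:
# c = notated music, d = manuscript notated music, j = musical sound recording.
# Grouping them under one name is an assumption made for this sketch.
MUSIC_TYPES = {"c", "d", "j"}

LEADER_LENGTH = 24


def iter_marc_records(data):
    """Yield each raw record (as bytes) from a MARC 21 transmission blob."""
    pos = 0
    while pos + LEADER_LENGTH <= len(data):
        # Leader positions 0-4 hold the total record length as ASCII digits.
        length = int(data[pos:pos + 5].decode("ascii"))
        if length < LEADER_LENGTH:
            raise ValueError("implausible record length at offset %d" % pos)
        yield data[pos:pos + length]
        pos += length


def split_by_type(data):
    """Group raw records by their Leader/06 'type of record' code."""
    buckets = defaultdict(list)
    for record in iter_marc_records(data):
        buckets[chr(record[6])].append(record)
    return buckets


def music_records(data):
    """Return only the records whose type code is in MUSIC_TYPES."""
    return [r for r in iter_marc_records(data) if chr(r[6]) in MUSIC_TYPES]
```

Each bucket could then be written to its own file and used as a small,
focused test set. Sets built around a particular problem rather than a
type would need a different test in place of the Leader/06 check.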
Jason