Godmar Back wrote:

> ps: the distribution of the full text availability for the sample
> considered was as follows:
>
> No preview: 797 (93.5%)

> > For 1000 randomly drawn ISBN from 3,192,809 ISBN extracted from a
> > snapshot of LoC's records [2], Google Books returned results for 852
> > ISBN.
> > I found the results (85.2% recall and >99% precision, if you allow for
> > the ISBN substitution; with a 3.1% margin of error) surprisingly high.
> >
> > [2] http://www.archive.org/details/marc_records_scriblio_net

But doesn't "no preview" mean books that Google haven't scanned?
If Google had downloaded [2] and incorporated the bibliographic
records in their collection, then the recall would have gone
from 85 to 100 %. How impressive is that really?

I'm prepared to be impressed if they have indeed scanned books
for 6.5% of all ISBNs in the Library of Congress. But that's
not really 85% recall.

-- 
Lars Aronsson
Aronsson Datateknik - http://aronsson.se