Godmar Back wrote:

> ps: the distribution of the full text availability for the sample
> considered was as follows:
>
> No preview: 797 (93.5%)

> >  For 1000 randomly drawn ISBN from 3,192,809 ISBN extracted from a
> >  snapshot of LoC's records [2], Google Books returned results for 852
> >  ISBN.

> >  I found the results (85.2% recall and >99% precision, if you allow for
> >  the ISBN substitution; with a 3.1% margin of error) surprisingly high.
> >
> >  [2] http://www.archive.org/details/marc_records_scriblio_net


But doesn't "no preview" mean books that Google hasn't scanned?
If Google had simply downloaded [2] and incorporated the
bibliographic records into their collection, then the recall would
have gone from 85% to 100%.  How impressive is that really?

I'm prepared to be impressed if they have indeed scanned books for
the remaining 6.5% of those results, which is roughly 5.5% of the
ISBNs sampled from the Library of Congress.  But that's not
really 85% recall.
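
The arithmetic behind these percentages can be checked with a short
Python sketch.  The three counts come from the quoted message.  Two
things are assumptions: that every result other than "no preview" has
some kind of preview, and that the margin of error uses a 95%
confidence level with the worst-case proportion of 0.5 (which happens
to reproduce the quoted 3.1%).

```python
import math

# Counts taken from the quoted message.
SAMPLE_SIZE = 1000   # ISBNs drawn at random from the LoC snapshot
RETURNED = 852       # ISBNs for which Google Books returned a result
NO_PREVIEW = 797     # returned results marked "No preview"


def share(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


def margin_of_error(n, p=0.5, z=1.96):
    """Normal-approximation margin of error, in percentage points.

    Assumption: z=1.96 (95% confidence) and worst-case p=0.5;
    the quoted message does not say how its 3.1% was derived.
    """
    return 100.0 * z * math.sqrt(p * (1.0 - p) / n)


recall = share(RETURNED, SAMPLE_SIZE)
no_preview_share = share(NO_PREVIEW, RETURNED)

# Assumption: anything not marked "No preview" has some preview.
with_preview = RETURNED - NO_PREVIEW
preview_of_returned = share(with_preview, RETURNED)
preview_of_sample = share(with_preview, SAMPLE_SIZE)

moe = margin_of_error(SAMPLE_SIZE)

print(f"results returned:           {recall:.1f}% of sample")
print(f"no preview:                 {no_preview_share:.1f}% of results")
print(f"some preview (of results):  {preview_of_returned:.1f}%")
print(f"some preview (of sample):   {preview_of_sample:.1f}%")
print(f"margin of error (n={SAMPLE_SIZE}):  +/- {moe:.1f} points")
```

Note that the 93.5% figure is relative to the 852 results returned,
not to the 1000 ISBNs sampled, so the share of sampled ISBNs with any
preview is somewhat lower than 6.5%.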


--
  Lars Aronsson
  Aronsson Datateknik - http://aronsson.se