> From: Code for Libraries [mailto:[log in to unmask]] On
> Behalf Of Keith Jenkins
> Sent: 05 September, 2006 12:42
> To: [log in to unmask]
> Subject: Re: [CODE4LIB] LC MARC records?
>
> Worldcat.org, as nice as it is, only offers a limited subset
> of bibliographic data.  There's no way to get to the
> underlying MARC record, as far as I can see.  Or am I missing
> something?

People seem to be confused about the difference between Open
WorldCat and WorldCat.org.  That is understandable, since both
names contain "WorldCat".

Open WorldCat is a 4 million record set that search engines,
such as Google and Yahoo, get from OCLC.  The reason for this
reduced set is that the search engine folks couldn't ingest
all of WorldCat.  We really wanted them to ingest the whole
thing, but they balked.  In Google's case, due to how they
view resources, many bibliographic records looked like
duplicates and so they told OCLC that they didn't want the
whole thing.  So we basically gave them a FRBR-ized view of
the most commonly held records.

Due to a number of factors, one being the search engines'
inability to expose all of WorldCat, OCLC created the
WorldCat.org search portal, which does expose all 80
million-plus bibliographic records.  So when you go to
<http://worldcat.org/> you get access to all the records.  If
you come in from an Internet search engine, you will only see
the 4 million records they ingested.

Actually, most people didn't realize that even though the
search engines only incorporated 4 million records into their
result lists, people had access to the full 80 million records
through URL manipulation.

It is true that, today, you do not have access to the underlying
MARC record.  However, the document returned from WorldCat.org is
a well-formed XHTML document that can be run through an XSLT to
be used in a mash-up.  The Open WorldCat team, which is now
responsible for both Open WorldCat and WorldCat.org, is working
on an SRU/W interface to WorldCat.org.  I'm not sure what record
formats will be delivered (e.g., MARC, Dublin Core), since
this is on their TODO list and that's about all I know.
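
To illustrate the mash-up idea: because the page is well-formed
XHTML, any XML tool can read it.  Here is a rough sketch that uses
Python's standard-library XML parser as a stand-in for an XSLT.
The sample markup, the class names ("title", "author") and the
function name are all made up for illustration; the real
WorldCat.org page structure will differ, so treat this as a
starting point only.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample page. The real WorldCat.org markup is NOT
# known here; element names and class values are assumptions.
SAMPLE_PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Example record</title></head>
  <body>
    <div class="record">
      <h1 class="title">An Example Title</h1>
      <span class="author">Doe, Jane</span>
    </div>
  </body>
</html>
"""


def extract_fields(xhtml_text, wanted=("title", "author")):
    """Return {class_name: text} for the first element carrying
    each wanted class attribute (hypothetical class names)."""
    # Parsing only succeeds because the input is well-formed XML.
    root = ET.fromstring(xhtml_text)
    fields = {}
    for element in root.iter():
        css_class = element.get("class")
        if css_class in wanted and css_class not in fields:
            # itertext() also gathers text from nested child elements.
            fields[css_class] = "".join(element.itertext()).strip()
    return fields


if __name__ == "__main__":
    print(extract_fields(SAMPLE_PAGE))
```

Matching on the class attribute sidesteps the XHTML namespace that
would otherwise have to prefix every element name.  A real XSLT
would do the same selection with XPath match patterns.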


Andy.