LISTSERV 16.5 - CODE4LIB Archives

Ross, it might not be Yahoo, but that doesn't mean I know what it is.
The pyRDFa utility returns garbage for RDF/XML and TTL, but not for 
JSON. It's only in the JSON output that I am getting any bibliographic 
data. The other two send me back a bunch of links to css files. I guess 
this is good news for folks who prefer JSON. Also, I see the OCLC number 
in the JSON, but not the URI, although the URI appears in the div with 
the RDFa:

<div itemid="http://www.worldcat.org/oclc/527725" itemscope="" 
itemtype="http://schema.org/Book" 
resource="http://www.worldcat.org/oclc/527725" 
typeof="http://schema.org/Book"><<a 
href="http://www.worldcat.org/oclc/527725">http://www.worldcat.org/oclc/527725</a>>
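
A quick way to check what a microdata-aware parser will see in that div is to read the itemscope attributes directly. The following is only a sketch using Python's standard html.parser; the class name ItemIdFinder and the SAMPLE string (a trimmed copy of the div above, without the extra angle brackets) are made up for illustration, and nothing here fetches the live page:

```python
from html.parser import HTMLParser


class ItemIdFinder(HTMLParser):
    """Collect (itemtype, itemid) pairs from elements that carry itemscope."""

    def __init__(self):
        super().__init__()
        self.items = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        # An empty itemscope="" still counts, so test for the key, not the value.
        if "itemscope" in attr_map:
            self.items.append((attr_map.get("itemtype"), attr_map.get("itemid")))


# Trimmed copy of the div quoted above (hypothetical test input, not a live fetch).
SAMPLE = (
    '<div itemid="http://www.worldcat.org/oclc/527725" itemscope="" '
    'itemtype="http://schema.org/Book" '
    'resource="http://www.worldcat.org/oclc/527725" '
    'typeof="http://schema.org/Book">'
    '<a href="http://www.worldcat.org/oclc/527725">'
    'http://www.worldcat.org/oclc/527725</a></div>'
)

finder = ItemIdFinder()
finder.feed(SAMPLE)
print(finder.items)
```

If the item type and item ID come back as expected, the markup itself is probably fine and the problem lies in how a given tool serializes it.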

I must say I wonder a bit about those double "<<>>" but what do I know? 
Anyway, here's what I get from pyRDFa:

RDF/XML:

<rdf:RDF><_4:Book rdf:about="http://schema.org/Book"/><rdf:Description 
rdf:about="http://www.worldcat.org/title/selection-of-early-statistical-papers-of-j-neyman/oclc/527725"><xhv:stylesheet 
rdf:resource="http://static1.worldcat.org/wcpa/rel20120711/css/loginpopup.css"/><xhv:stylesheet 
rdf:resource="http://static1.worldcat.org/wcpa/rel20120711/html/masthead.css"/><xhv:stylesheet 
rdf:resource="http://static1.worldcat.org/wcpa/rel20120711/css/alerts.css"/><xhv:stylesheet 
rdf:resource="http://static1.worldcat.org/wcpa/rel20120711/css/modals_jquery.css"/><xhv:stylesheet 
rdf:resource="http://static1.worldcat.org/wcpa/rel20120711/html/layered_divs.css"/><xhv:stylesheet 
rdf:resource="http://static1.worldcat.org/wcpa/cssj/N245213502/bundles/print-min.css"/><xhv:stylesheet 
rdf:resource="http://static1.worldcat.org/wcpa/rel20120711/css/cr_print.css"/><xhv:stylesheet 
rdf:resource="http://static.weread.com/css/booksiread/relbookswidget.css?0:5"/><xhv:stylesheet 
rdf:resource="http://static1.worldcat.org/wcpa/rel20120711/css/itemformat.css"/><xhv:stylesheet 
rdf:resource="http://static1.worldcat.org/wcpa/cssj/N1807112156/bundles/screen-min.css"/><xhv:stylesheet 
rdf:resource="http://static1.worldcat.org/wcpa/rel20120711/html/record.css"/><xhv:stylesheet 
rdf:resource="http://static1.worldcat.org/wcpa/rel20120711/html/yui/build/reset-fonts-grids/reset-fonts-grids.css"/><xhv:stylesheet 
rdf:resource="http://static1.worldcat.org/wcpa/rel20120711/html/new_wcorg.css"/></rdf:Description></rdf:RDF>

JSON:

{
"@context": {
"library": "http://purl.org/library/",
"oclc": "http://www.worldcat.org/oclc/",
"skos": "http://www.w3.org/2004/02/skos/core#",
"madsrdf": "http://www.loc.gov/mads/rdf/v1#",
"schema": "http://schema.org/",
"http://purl.org/library/placeOfPublication": {
"@type": "@id"
},
"http://schema.org/about": {
"@type": "@id"
},
"http://schema.org/publisher": {
"@type": "@id"
},
"http://schema.org/author": {
"@type": "@id"
},
"http://www.w3.org/2004/02/skos/core#inScheme": {
"@type": "@id"
},
"http://www.loc.gov/mads/rdf/v1#isIdentifiedByAuthority": {
"@type": "@id"
}
},
"@id": "oclc:527725",
"@type": "schema:Book",
"schema:inLanguage": {
"@value": "en",
"@language": "en"
},
"library:holdingsCount": {
"@value": "285",
"@language": "en"
},
"schema:author": {
"@id": "http://viaf.org/viaf/24666861",
"@type": "schema:Person",
"madsrdf:isIdentifiedByAuthority": 
"http://id.loc.gov/authorities/names/n50066374",
"schema:name": {
"@value": "Neyman, Jerzy, 1894-1981.",
"@language": "en"
}
},
"schema:name": {
"@value": "A selection of early statistical papers of J. Neyman.",
"@language": "en"
},
"schema:datePublished": {
"@value": "1967.",
"@language": "en"
},
"schema:numberOfPages": {
"@value": "429",
"@language": "en"
},
"library:oclcnum": {
"@value": "527725",
"@language": "en"
},
"schema:about": [
{
"@type": "skos:Concept",
"madsrdf:isIdentifiedByAuthority": 
"http://id.loc.gov/authorities/subjects/sh85082133",
"schema:name": {
"@value": "Mathematical statistics.",
"@language": "en"
}
},
{
"@id": "http://dewey.info/class/519/",
"@type": "skos:Concept",
"skos:inScheme": "http://dewey.info/scheme/"
},
{
"@type": "skos:Concept",
"schema:name": {
"@value": "Statistique mathématique.",
"@language": "en"
}
},
{
"@id": "http://id.worldcat.org/fast/1012127",
"@type": "skos:Concept",
"schema:name": {
"@value": "Mathematical statistics‍",
"@language": "en"
}
}
],
"schema:publisher": {
"@type": "schema:Organization",
"schema:name": {
"@value": "University of California Press",
"@language": "en"
}
},
"library:placeOfPublication": {
"@type": "schema:Place",
"schema:name": {
"@value": "Berkeley,",
"@language": "en"
}
}
}
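
On the URI question: in JSON-LD the "@id" value "oclc:527725" is a compact IRI, and the "oclc" prefix declared in "@context" expands it to http://www.worldcat.org/oclc/527725, so the URI appears to be present in abbreviated form. A minimal sketch of that expansion, assuming a trimmed copy of the record above and a hypothetical expand() helper (a real JSON-LD processor handles many more cases):

```python
import json

# Trimmed copy of the JSON above; only the keys used below are kept.
RECORD = json.loads("""
{
  "@context": {
    "library": "http://purl.org/library/",
    "oclc": "http://www.worldcat.org/oclc/",
    "schema": "http://schema.org/"
  },
  "@id": "oclc:527725",
  "@type": "schema:Book",
  "schema:name": {
    "@value": "A selection of early statistical papers of J. Neyman.",
    "@language": "en"
  },
  "schema:author": {
    "@id": "http://viaf.org/viaf/24666861",
    "schema:name": {"@value": "Neyman, Jerzy, 1894-1981.", "@language": "en"}
  },
  "library:oclcnum": {"@value": "527725", "@language": "en"}
}
""")


def expand(curie, context):
    """Expand prefix:suffix using a string-valued @context entry (hypothetical helper)."""
    prefix, sep, suffix = curie.partition(":")
    base = context.get(prefix)
    # Leave absolute IRIs such as http://... and unknown prefixes untouched.
    if sep and isinstance(base, str) and not suffix.startswith("//"):
        return base + suffix
    return curie


def literal(node):
    """Return the plain string from a {"@value": ...} node, or the node itself."""
    return node["@value"] if isinstance(node, dict) else node


ctx = RECORD["@context"]
summary = {
    "uri": expand(RECORD["@id"], ctx),
    "type": expand(RECORD["@type"], ctx),
    "title": literal(RECORD["schema:name"]),
    "author": literal(RECORD["schema:author"]["schema:name"]),
    "oclcnum": literal(RECORD["library:oclcnum"]),
}
print(summary)
```

The same prefix lookup applies to the property names ("schema:name", "library:oclcnum"), which is why the JSON reads as terse while still carrying full URIs.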

kc

On 7/12/12 2:13 PM, Ross Singer wrote:
> Ok, the Pipe didn't quite work as planned.  Yahoo! is stripping out
> all of the relevant html attributes when it's converting the WC
> microdata html to a string, which renders the whole thing useless.
>
> If I don't convert it to a string, it maintains all of the necessary
> attributes in the JSON output, but it strips them from the RSS and
> html outputs.
>
> I mean, it's hard to complain about "free thing doesn't handle my
> niche problem", but when has that ever stopped me?
>
> Anyway, it's there for somebody to clone and poke around with.  Maybe
> somebody more familiar with Pipes can figure a way around this
> problem.
>
> -Ross.
>
> On Thu, Jul 12, 2012 at 3:03 PM, Ross Singer <[log in to unmask]> wrote:
>> I made a Yahoo Pipe that merges the WorldCat Basic OpenSearch RSS
>> result with the microdata div in the Worldcat pages referred to in the
>> search results:
>>
>> http://pipes.yahoo.com/pipes/pipe.info?_id=05ae2a7bc180f3abe36b11bcaf1adc52
>>
>> You'll need to enter your wskey for it to work.
>>
>> You can get the output as RSS (which will require the item/description
>> to be unescaped to use) or JSON (which wouldn't require unescaping).
>>
>> It's not terribly fast, but it should at least help somebody get started.
>>
>> -Ross.
>>
>> On Thu, Jul 12, 2012 at 1:09 PM, Karen Coyle <[log in to unmask]> wrote:
>>> It isn't unfortunate, it was deliberate. I have a key for the basic api, but
>>> I was being advised that I had overlooked the obvious answer of the worldcat
>>> search API. I have no confusion between the two, except for the confusion
>>> that seems to be promulgated by OCLC itself.
>>>
>>> kc
>>>
>>>
>>>
>>> On 7/12/12 9:46 AM, Karen Coombs wrote:
>>>> Karen,
>>>>
>>>> Unfortunately it looks like you requested a key for the WorldCat
>>>> Search API which does have specific eligibility criteria. The WorldCat
>>>> Basic API which Ross mentions is available to anyone -
>>>> http://www.oclc.org/developer/services/worldcat-basic-api
>>>>
>>>> It allows you to do an OpenSearch keyword query of WorldCat and get
>>>> back basic metadata including the link to the worldcat.org page for
>>>> each record returned.
>>>>
>>>> The easiest way to get a key is to go to http://worldcat.org/config/
>>>> and login with a WorldCat username/password. You should see a link
>>>> that says WorldCat Basic API Key which you can use to get a key.
>>>>
>>>> I apologize for the confusion between the two APIs (WorldCat Search
>>>> and WorldCat Basic). The difference is something we've tried to make
>>>> clearer in our documentation but unfortunately given your experience
>>>> it is still an issue.
>>>>
>>>> Karen
>>>>
>>>>
>>>> On Thu, Jul 12, 2012 at 11:33 AM, Karen Coyle <[log in to unmask]> wrote:
>>>>> On 7/10/12 5:07 PM, Karen Coyle wrote:
>>>>>> On 7/10/12 4:02 PM, Richard Wallis wrote:
>>>>>>>
>>>>>>> But is it available to everyone, and is the data retrieved also usable as
>>>>>>> ODC-BY by any member of the Web public?
>>>>>>>
>>>>>>> Yes it is, and at this stage it is only available from within a html page.
>>>>>>
>>>>>> The "it" I was referring to was the API. Roy is telling me that people
>>>>>> should use the API, as if that is an obvious option that I am overlooking. I
>>>>>> am asking if the general web public can use the API to get this data. I
>>>>>> believe that should be a yes/no question/answer.
>>>>>
>>>>> Since no one here from OCLC had the integrity to answer this question, I
>>>>> went ahead and applied for a Worldcat API key, and here is the reply:
>>>>>
>>>>> *****
>>>>>
>>>>> Hello,
>>>>>
>>>>> Thank you for your interest in the WorldCat Search API, however at this time
>>>>> the web service is only available to institutions, primarily libraries, that
>>>>> have a specific relationship with OCLC and then only for work related to
>>>>> that library's services. The specific relationship is explained further here,
>>>>> http://oclc.org/developer/documentation/worldcat-search-api/who-can-use.
>>>>>
>>>>> However, there are other OCLC services that are available to individual's
>>>>> non-commercial use.  Looking at the list of services available on
>>>>> http://www.worldcat.org/wcpa/content/affiliate/ you'll see that the WorldCat
>>>>> search box and WorldCat links with embedded searches are available to
>>>>> anyone.   You may also be interested in checking out the WorldCat Registry,
>>>>> or low-volume use of the xISBN and xISSN services.
>>>>>
>>>>> If you have questions about the service, please contact the product manager,
>>>>> Dawn Hendricks at [log in to unmask].
>>>>>
>>>>> *****
>>>>>
>>>>> There is nothing wrong with having a proprietary API; but pretending that it
>>>>> isn't (either directly or through omission), or being afraid to say it, is
>>>>> the kind of thing that has caused me to lose respect for OCLC. Nothing
>>>>> should be declared "open" that isn't available to all, not just members. And
>>>>> advertisements for WC API classes should state "members only." That would be
>>>>> honest. And telling folks on a wide-open list that they should use the
>>>>> Worldcat API (without mentioning "if you are in a member institution and
>>>>> using this for library services") is at best deceiving, at worst dishonest.
>>>>>
>>>>> I, for one, am tired of OCLC's lies, and I'm not afraid to say it.
>>>>> Fortunately for me, retirement is looming and I don't need to care who likes
>>>>> what I say. This is a relief, to say the least.
>>>>>
>>>>> kc
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> kc
>>>>>>
>>>>>>> This experiment is the first step in a process to make linked data about
>>>>>>> WorldCat resources available.  As it evolves over time, other areas such
>>>>>>> as API access, content-negotiation, search & other query methods,
>>>>>>> additional RDF data vocabularies, etc., etc., will be considered in concert
>>>>>>> with community feedback (such as this thread) as to the way forward.
>>>>>>>
>>>>>>> Karen, I know you are eager to work with and demonstrate the benefits of
>>>>>>> this way of publishing data.  But these things take time and effort, so
>>>>>>> please be a little patient, and keep firing off these use cases and issues;
>>>>>>> they are all valuable input.
>>>>>>>
>>>>>>> ~Richard.
>>>>>>>
>>>>>>>> kc
>>>>>>>>
>>>>>>>>
>>>>>>>>     Roy
>>>>>>>>> On Tue, Jul 10, 2012 at 2:08 PM, Kevin Ford <[log in to unmask]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> The use case clarifies perfectly.
>>>>>>>>>>
>>>>>>>>>> Totally feasible.  Well, I should say "totally feasible" with the caveat
>>>>>>>>>> that I've never used the Worldcat Search API.  Not letting that stop me, so
>>>>>>>>>> long as it is what I imagine it is, then a developer should be able to
>>>>>>>>>> perform a search, retrieve the response, and, by integrating one of the
>>>>>>>>>> tools advertised on the schema.org website into his/her code, then retrieve
>>>>>>>>>> the microdata for each resource returned from the search (and save it as RDF
>>>>>>>>>> or whatever).
>>>>>>>>>>
>>>>>>>>>> If someone has created something like this, do speak up.
>>>>>>>>>>
>>>>>>>>>> Yours,
>>>>>>>>>>
>>>>>>>>>> Kevin
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 07/10/2012 04:48 PM, Karen Coyle wrote:
>>>>>>>>>>
>>>>>>>>>>> Kevin, if you misunderstand then I undoubtedly haven't been clear (let's
>>>>>>>>>>> at least share the confusion :-)). Here's the use case:
>>>>>>>>>>>
>>>>>>>>>>> PersonA wants to create a comprehensive bibliography of works by
>>>>>>>>>>> AuthorB. The goal is to do a search on AuthorB in WorldCat and extract
>>>>>>>>>>> the RDFa data from those pages in order to populate the bibliography.
>>>>>>>>>>>
>>>>>>>>>>> Apart from all of the issues of getting a perfect match on authors and
>>>>>>>>>>> of manifestation duplicates (there would need to be editing of the
>>>>>>>>>>> results after retrieval at the user's end), how feasible is this? Assume
>>>>>>>>>>> that the author is prolific enough that one wouldn't want to look up all
>>>>>>>>>>> of the records by hand.
>>>>>>>>>>>
>>>>>>>>>>> kc
>>>>>>>>>>>
>>>>>>>>>>> On 7/10/12 1:43 PM, Kevin Ford wrote:
>>>>>>>>>>>
>>>>>>>>>>>> As for someone who might want to do this programmatically, he/she
>>>>>>>>>>>> should take a look at the "Programming languages" section of the
>>>>>>>>>>>> second link I sent along:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> http://schema.rdfs.org/tools.html
>>>>>>>>>>>>
>>>>>>>>>>>> There one can find Ruby, Python, and Java extractors and parsers
>>>>>>>>>>>> capable of outputting RDF.  A developer can take one of these and
>>>>>>>>>>>> programmatically get at the data.
>>>>>>>>>>>>
>>>>>>>>>>>> Apologies if I am misunderstanding your intent.
>>>>>>>>>>>>
>>>>>>>>>>>> Yours,
>>>>>>>>>>>>
>>>>>>>>>>>> Kevin
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 07/10/2012 04:34 PM, Karen Coyle wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks, Kevin! And Richard!
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm thinking we need a good web site with links to tools. I had already
>>>>>>>>>>>>> been introduced to
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://www.w3.org/2012/pyRdfa/
>>>>>>>>>>>>>
>>>>>>>>>>>>> where you can paste a URI and get ttl or rdf/xml. These are all good
>>>>>>>>>>>>> resources. But what about someone who wants to do this programmatically,
>>>>>>>>>>>>> not through a web site? Richard's message indicates that this isn't yet
>>>>>>>>>>>>> available, so perhaps we should be gathering use cases to support the
>>>>>>>>>>>>> need? And have a place to post various solutions, even ones that are not
>>>>>>>>>>>>> OCLC-specific? (Because I am hoping that the use of microformats will
>>>>>>>>>>>>> increase in general.)
>>>>>>>>>>>>>
>>>>>>>>>>>>> kc
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 7/10/12 12:12 PM, Kevin Ford wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>>> is there an open search to get one to the desired records in the
>>>>>>>>>>>>>>> first place?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -- I'm not certain this will fully address your question, but try
>>>>>>>>>>>>>> these two sites:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Website:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> http://www.google.com/webmasters/tools/richsnippets
>>>>>>>>>>>>>> Example: http://tinyurl.com/dx3h5bg
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Website:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> http://linter.structured-data.org/
>>>>>>>>>>>>>> Example: http://tinyurl.com/bmm8bbc
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> These sites will extract the data, but I don't think you get your
>>>>>>>>>>>>>> choice of serialization.  The data are extracted and displayed on the
>>>>>>>>>>>>>> resulting page in the HTML, but at least you can *see* the data.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Additionally, there are a number of "tools" to help with microdata
>>>>>>>>>>>>>> extraction here:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> http://schema.rdfs.org/tools.html
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Some of these will allow you to output specific (RDF) serializations.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> HTH,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Kevin
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 07/10/2012 02:42 PM, Karen Coyle wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I have demonstrated the schema.org/RDFa microdata in the WC database to
>>>>>>>>>>>>>>> various folks and the question always is: how do I get access to this?
>>>>>>>>>>>>>>> (The only source I have is the Facebook API, me being a "user" rather
>>>>>>>>>>>>>>> than a "maker".) The microdata is CC-BY once you get a Worldcat URI, but
>>>>>>>>>>>>>>> is there an open search to get one to the desired records in the first
>>>>>>>>>>>>>>> place? I'm poorly-versed in WC APIs so I'm hoping others have a better
>>>>>>>>>>>>>>> grasp.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> @rjw: the OCLC website does a thorough job of hiding email addresses or
>>>>>>>>>>>>>>> I would have asked this directly. Then again, a discussion here could
>>>>>>>>>>>>>>> have added value.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> kc
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>> --
>>>>>>>> Karen Coyle
>>>>>>>> [log in to unmask] http://kcoyle.net
>>>>>>>> ph: 1-510-540-7596
>>>>>>>> m: 1-510-435-8234
>>>>>>>> skype: kcoylenet
>>>>>>>>

-- 
Karen Coyle
[log in to unmask] http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet