LISTSERV 16.5 - CODE4LIB Archives

I think we have a catch-22 here. You need an OCLC developer license to 
use WC to "discover" WC URIs using an application; you need WC URIs (or 
other URIs that are not widely available on the Web) to make use of the 
OCLC linked data. The OCLC linked data is ODC-BY for anyone wishing to 
use the data, but, if I'm not mistaken, the APIs are not open to the 
general Web public. Thus the schema.org data is ODC-BY but most applications 
on the web will have little opportunity to discover the OCLC-specific 
URIs. So the gatekeeper is the API access, that is, the ability to 
search WC for URI discovery (e.g. with an author's name). So you can 
link, but you can't easily discover the linking URIs.
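
To make the "you can link" half concrete: once a WorldCat URI is in hand, 
pulling the embedded properties out of the page takes very little code. 
The following is a rough sketch in Python using only the standard library. 
It is not a conforming RDFa processor (no prefix or vocab resolution, no 
subject tracking), the sample markup is invented for illustration rather 
than copied from WorldCat, and the names PropertyExtractor and 
extract_properties are hypothetical.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they never hold text content.
VOID_TAGS = {"meta", "link", "img", "br", "hr", "input"}


class PropertyExtractor(HTMLParser):
    """Collect (property, value) pairs from RDFa-style `property` attributes.

    Simplified on purpose: prefixes are kept as written and subjects
    (`resource`/`about` on a parent element) are ignored.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.pairs = []
        self._stack = []  # open elements as (tag, property or None, text chunks)

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        prop = a.get("property")
        # A value carried in an attribute wins over the element's text.
        value = next(
            (a[k] for k in ("content", "resource", "href", "src") if a.get(k)),
            None,
        )
        if prop and value is not None:
            self.pairs.append((prop, value))
            prop = None  # nothing left to collect from the text
        if tag not in VOID_TAGS:
            self._stack.append((tag, prop, []))

    def handle_data(self, data):
        # Text belongs to every open element that is still waiting for a value.
        for _, prop, chunks in self._stack:
            if prop:
                chunks.append(data)

    def handle_endtag(self, tag):
        # Find the nearest open element with this name; ignore stray end tags.
        for i in range(len(self._stack) - 1, -1, -1):
            if self._stack[i][0] == tag:
                break
        else:
            return
        for _, prop, chunks in reversed(self._stack[i:]):
            if prop:
                text = " ".join("".join(chunks).split())
                self.pairs.append((prop, text))
        del self._stack[i:]


def extract_properties(html):
    """Return the (property, value) pairs found in an HTML string."""
    parser = PropertyExtractor()
    parser.feed(html)
    parser.close()
    return parser.pairs


# Invented sample markup with placeholder URIs (assumption, not real data).
SAMPLE = """
<div typeof="schema:Book" resource="http://www.worldcat.org/oclc/0000000">
  <h1 property="schema:name">An Example Title</h1>
  <a property="schema:author" href="http://viaf.org/viaf/0000000">Example Author</a>
  <meta property="schema:datePublished" content="2012">
</div>
"""

if __name__ == "__main__":
    for prop, value in extract_properties(SAMPLE):
        print(prop, "=", value)
```

For real pages a full RDFa parser would be the better choice; this only 
shows the shape of the task.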

I suppose that one could discover publications as linked data using the 
topical access of LCSH, the VIAF links in Wikipedia, or by going through 
databases like Open Library, which has some OCLC numbers associated with 
bibliographic data. All of these are accessible via open APIs, I 
believe, and are linked to DBpedia. I understand that "linking is linking" 
but unless we are developing data for SkyNet, somewhere along the way 
the user needs to begin with a human-understandable query. Searching and 
linked data are not in conflict with each other; they support each 
other. It only makes sense that URIs will be discovered through 
searching at some point in the process of access, as applications like 
Wikipedia illustrate. (As does the Facebook API, which is a search.)
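
As an illustration of that route, here is a small sketch that turns the 
OCLC numbers in a bibliographic record into candidate WorldCat URIs. It 
assumes an Open Library-style edition record that carries an 
"oclc_numbers" list and assumes the http://www.worldcat.org/oclc/<number> 
URI pattern; the sample record and the function name worldcat_uris are 
made up, and nothing here calls a live API.

```python
import json

# Assumed URI pattern for a WorldCat record identified by OCLC number.
WORLDCAT_OCLC_BASE = "http://www.worldcat.org/oclc/"


def worldcat_uris(edition_record):
    """Build WorldCat URIs from the OCLC numbers in an edition record.

    `edition_record` is a dict shaped like an Open Library edition
    (assumption: OCLC numbers live under the "oclc_numbers" key).
    Entries that are not plain digits are skipped.
    """
    uris = []
    for number in edition_record.get("oclc_numbers", []):
        number = str(number).strip()
        if number.isdigit():
            uris.append(WORLDCAT_OCLC_BASE + number)
    return uris


# Invented record for illustration; the numbers are placeholders.
SAMPLE_EDITION = json.loads(
    '{"title": "An Example Title",'
    ' "oclc_numbers": ["1234567", " 7654321 ", "n/a"]}'
)

if __name__ == "__main__":
    for uri in worldcat_uris(SAMPLE_EDITION):
        print(uri)
```

The search still has to happen somewhere else first (here, in whatever 
produced the record), which is the point: the URI is the end of the 
discovery step, not the start of it.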

I've tried to find a clear statement of who can get access to the OCLC 
APIs, but I'm afraid that I can't find a page that clarifies that. I 
guess one is expected to apply for a developer key in order to find out if 
they qualify. I'll pass that information along.

kc


On 7/10/12 2:32 PM, Kevin Ford wrote:
> Does the worldcat search api return the data as described with the 
> schema.org and OCLC extension vocabularies?
>
> The use case mentioned extracting the RDFa data from those pages. 
> Without knowing the answer to the leading question above, the mock 
> solution addressed that condition.  If one simply wanted "to create a 
> comprehensive bibliography of works" by a particular author, then, 
> yes, the search response would suffice.
>
> Kevin
>
>
> On 07/10/2012 05:10 PM, Roy Tennant wrote:
>> Uh...what? For the given use case you would be much better off simply
>> using the WorldCat Search API response. Using it only to retrieve an
>> identifier and then going and scraping the Linked Data out of a
>> WorldCat.org page is, at best, redundant.
>>
>> As Richard pointed out, some use cases -- like the one Karen provided
>> -- are not really a good use case for linked data. It's a better use
>> case for an API, which has been available for years.
>> Roy
>>
>> On Tue, Jul 10, 2012 at 2:08 PM, Kevin Ford wrote:
>>> The use case clarifies perfectly.
>>>
>>> Totally feasible.  Well, I should say "totally feasible" with the caveat
>>> that I've never used the Worldcat Search API.  Not letting that stop me, so
>>> long as it is what I imagine it is, then a developer should be able to
>>> perform a search, retrieve the response, and, by integrating one of the
>>> tools advertised on the schema.org website into his/her code, then retrieve
>>> the microdata for each resource returned from the search (and save it as RDF
>>> or whatever).
>>>
>>> If someone has created something like this, do speak up.
>>>
>>> Yours,
>>>
>>> Kevin
>>>
>>>
>>>
>>>
>>>
>>> On 07/10/2012 04:48 PM, Karen Coyle wrote:
>>>>
>>>> Kevin, if you misunderstand then I undoubtedly haven't been clear (let's
>>>> at least share the confusion :-)). Here's the use case:
>>>>
>>>> PersonA wants to create a comprehensive bibliography of works by
>>>> AuthorB. The goal is to do a search on AuthorB in WorldCat and extract
>>>> the RDFa data from those pages in order to populate the bibliography.
>>>>
>>>> Apart from all of the issues of getting a perfect match on authors and
>>>> of manifestation duplicates (there would need to be editing of the
>>>> results after retrieval at the user's end), how feasible is this? Assume
>>>> that the author is prolific enough that one wouldn't want to look up all
>>>> of the records by hand.
>>>>
>>>> kc
>>>>
>>>> On 7/10/12 1:43 PM, Kevin Ford wrote:
>>>>>
>>>>> As for someone who might want to do this programmatically, he/she
>>>>> should take a look at the "Programming languages" section of the
>>>>> second link I sent along:
>>>>>
>>>>> http://schema.rdfs.org/tools.html
>>>>>
>>>>> There one can find Ruby, Python, and Java extractors and parsers
>>>>> capable of outputting RDF.  A developer can take one of these and
>>>>> programmatically get at the data.
>>>>>
>>>>> Apologies if I am misunderstanding your intent.
>>>>>
>>>>> Yours,
>>>>>
>>>>> Kevin
>>>>>
>>>>>
>>>>>
>>>>> On 07/10/2012 04:34 PM, Karen Coyle wrote:
>>>>>>
>>>>>> Thanks, Kevin! And Richard!
>>>>>>
>>>>>> I'm thinking we need a good web site with links to tools. I had already
>>>>>> been introduced to
>>>>>>
>>>>>> http://www.w3.org/2012/pyRdfa/
>>>>>>
>>>>>> where you can paste a URI and get ttl or rdf/xml. These are all good
>>>>>> resources. But what about someone who wants to do this programmatically,
>>>>>> not through a web site? Richard's message indicates that this isn't yet
>>>>>> available, so perhaps we should be gathering use cases to support the
>>>>>> need? And have a place to post various solutions, even ones that are not
>>>>>> OCLC-specific? (Because I am hoping that the use of microformats will
>>>>>> increase in general.)
>>>>>>
>>>>>> kc
>>>>>>
>>>>>>
>>>>>> On 7/10/12 12:12 PM, Kevin Ford wrote:
>>>>>>>
>>>>>>>> is there an open search to get one to the desired records in the first
>>>>>>>> place?
>>>>>>> -- I'm not certain this will fully address your question, but try
>>>>>>> these two sites:
>>>>>>>
>>>>>>> Website: http://www.google.com/webmasters/tools/richsnippets
>>>>>>> Example: http://tinyurl.com/dx3h5bg
>>>>>>>
>>>>>>> Website: http://linter.structured-data.org/
>>>>>>> Example: http://tinyurl.com/bmm8bbc
>>>>>>>
>>>>>>> These sites will extract the data, but I don't think you get your
>>>>>>> choice of serialization.  The data are extracted and displayed on the
>>>>>>> resulting page in the HTML, but at least you can *see* the data.
>>>>>>>
>>>>>>> Additionally, there are a number of "tools" to help with microdata
>>>>>>> extraction here:
>>>>>>>
>>>>>>> http://schema.rdfs.org/tools.html
>>>>>>>
>>>>>>> Some of these will allow you to output specific (RDF) serializations.
>>>>>>>
>>>>>>>
>>>>>>> HTH,
>>>>>>>
>>>>>>> Kevin
>>>>>>>
>>>>>>>
>>>>>>> On 07/10/2012 02:42 PM, Karen Coyle wrote:
>>>>>>>>
>>>>>>>> I have demonstrated the schema.org/RDFa microdata in the WC database to
>>>>>>>> various folks and the question always is: how do I get access to this?
>>>>>>>> (The only source I have is the Facebook API, me being a "user" rather
>>>>>>>> than a "maker".) The microdata is CC-BY once you get a Worldcat URI, but
>>>>>>>> is there an open search to get one to the desired records in the first
>>>>>>>> place? I'm poorly-versed in WC APIs so I'm hoping others have a better
>>>>>>>> grasp.
>>>>>>>>
>>>>>>>> @rjw: the OCLC website does a thorough job of hiding email addresses or
>>>>>>>> I would have asked this directly. Then again, a discussion here could
>>>>>>>> have added value.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> kc
>>>>>>>>
>>>>>>
>>>>
>>>

-- 
Karen Coyle
http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet