Have you had a gander at what's being done with DOI-harvested RDF data
by WikiCite? Here are some links:

The second one links to tools, including their DOI tool and data
cleanup. This might not be what you want to do but there's a lot of rich
experience there. If you are interested, do check the Discussion pages
because that's where stuff gets hashed out - questions asked and
answered, wild ideas proffered.


On 1/15/19 6:37 AM, Eric Lease Morgan wrote:
> How might I exploit & learn from a set of RDF files harvested from DOIs?
> For a good time, I have written a suite of software to harvest bibliographic data from Web of Science, cache the results, and report on the whole. [1] Along the way I programmatically collect DOIs and then resolve them. The results include RDF streams. ("Thanks, Kevin Ford!") For example:
>   curl -i -L -H "Accept: application/rdf+xml"
> And:
>   <rdf:RDF
>     xmlns:rdf=""
>     xmlns:j.0=""
>     xmlns:j.1=""
>     xmlns:owl=""
>     xmlns:j.2=""
>     xmlns:j.3="">
>   <rdf:Description rdf:about="">
>     <j.0:isPartOf>
>     <j.2:Journal rdf:about="">
>       <owl:sameAs>urn:issn:1975-5937</owl:sameAs>
>       <j.0:title>Journal of Educational Evaluation for Health Professions</j.0:title>
>       <j.1:issn>1975-5937</j.1:issn>
>       <j.2:issn>1975-5937</j.2:issn>
>     </j.2:Journal>
>     </j.0:isPartOf>
>     <j.0:creator>
>     <j.3:Person rdf:about="">
>       <j.3:name>Sun Huh</j.3:name>
>       <j.3:familyName>Huh</j.3:familyName>
>       <j.3:givenName>Sun</j.3:givenName>
>     </j.3:Person>
>     </j.0:creator>
>     <j.0:title>Revision of the instructions to authors to require... </j.0:title>
>     <j.1:doi>10.3352/jeehp.2013.10.3</j.1:doi>
>     <j.0:date rdf:datatype=""
>     >2013-04-30</j.0:date>
>     <owl:sameAs rdf:resource="info:doi/10.3352/jeehp.2013.10.3"/>
>     <j.0:identifier>10.3352/jeehp.2013.10.3</j.0:identifier>
>     <j.2:volume>10</j.2:volume>
>     <j.2:pageStart>3</j.2:pageStart>
>     <j.1:startingPage>3</j.1:startingPage>
>     <j.0:publisher>XMLArchive</j.0:publisher>
>     <owl:sameAs rdf:resource="doi:10.3352/jeehp.2013.10.3"/>
>     <j.1:volume>10</j.1:volume>
>     <j.2:doi>10.3352/jeehp.2013.10.3</j.2:doi>
>   </rdf:Description>
>   </rdf:RDF>
> That's a pretty rich RDF stream! [2]
> As of right now, I have about 8000 of these streams representing publications of faculty here at my university. I can easily get tens of thousands more. How might I take advantage of this data? How can I go beyond parsing the RDF with XPath, stuffing the results into a database, and applying SQL to the result? How can I truly exploit the nature of the RDF and possibly manifest it as linked data?
> To answer my own question, I might put the data into a triple store, and then try to answer questions such as: what authors are central, what journals are central, what authors are "related" to whom, etc.
> What do you think?
> [1]
> [2] And this rich data does not even take into account the cool, sometimes full text URLs/URIs found in the HTTP link header!
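
On the triple store idea, here is a minimal sketch (Python, standard
library only) of the kind of "who is central" question you mention. It
assumes the RDF has already been reduced to (subject, predicate, object)
triples. The identifiers, the short predicate names, and the function
names are all made up for illustration, and in practice a real triple
store queried with SPARQL would likely replace the list and the loops.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (subject, predicate, object) triples standing in for what a
# triple store would hold after loading the harvested RDF. The identifiers
# and predicate names are invented for this sketch.
TRIPLES = [
    ("doi:10.1000/a", "creator", "person:1"),
    ("doi:10.1000/a", "creator", "person:2"),
    ("doi:10.1000/b", "creator", "person:1"),
    ("doi:10.1000/b", "creator", "person:3"),
    ("doi:10.1000/c", "creator", "person:1"),
    ("doi:10.1000/a", "isPartOf", "journal:x"),
    ("doi:10.1000/b", "isPartOf", "journal:x"),
    ("doi:10.1000/c", "isPartOf", "journal:y"),
]


def objects_by_subject(triples, predicate):
    """Group the objects of one predicate by their subject."""
    grouped = defaultdict(set)
    for s, p, o in triples:
        if p == predicate:
            grouped[s].add(o)
    return grouped


def coauthor_degree(triples):
    """Count distinct co-authors per author (degree in a co-authorship graph)."""
    neighbors = defaultdict(set)
    for authors in objects_by_subject(triples, "creator").values():
        for a in authors:
            neighbors[a]  # touch the key so solo authors appear with degree 0
        for a, b in combinations(sorted(authors), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return {a: len(n) for a, n in neighbors.items()}


def journal_counts(triples):
    """Count how many publications point at each journal."""
    counts = defaultdict(int)
    for journals in objects_by_subject(triples, "isPartOf").values():
        for j in journals:
            counts[j] += 1
    return dict(counts)


if __name__ == "__main__":
    print(coauthor_degree(TRIPLES))
    print(journal_counts(TRIPLES))
```

With the sample data, person:1 ends up with two distinct co-authors and
the other two authors with one each. Degree is only the crudest notion
of "central", but the same triples could feed a proper graph library if
you want something richer.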

Karen Coyle
m: +1-510-435-8234
skype: kcoylenet/+1-510-984-3600