LISTSERV 16.5 - CODE4LIB Archives

Really helpful responses all. Moving forward with a plan that is much simpler than before. Thanks so much!

Mike Beccaria
Systems Librarian
Head of Digital Initiative
Paul Smith's College
518.327.6376
[log in to unmask]
Become a friend of Paul Smith's Library on Facebook today!

-----Original Message-----
From: Code for Libraries [mailto:[log in to unmask]] On Behalf Of Dan Scott
Sent: Saturday, August 09, 2014 1:41 AM
To: [log in to unmask]
Subject: Re: [CODE4LIB] Creating a Linked Data Service

On Wed, Aug 6, 2014 at 2:45 PM, Michael Beccaria <[log in to unmask]> wrote:

> I have recently had the opportunity to create a new library web page 
> and host it on my own servers. One of the elements of the new page 
> that I want to improve upon is providing live or near live information 
> on technology availability (10 of 12 laptops available, etc.). That 
> data resides on my ILS server and I thought it might be a good time to 
> upgrade the bubble gum and duct tape solution I now have to creating a 
> real linked data service that would provide that availability information to the web server.
>
> The problem is there is a lot of overly complex and complicated 
> information out there on linked data and RDF and the semantic web etc.


Yes... this is where I was a year or two ago. Content negotiation / triple stores / ontologies / Turtle / n-quads / blah blah blah / head hits desk.


> and I'm looking for a simple guide to creating a very simple linked 
> data service with php or python or whatever. Does such a resource 
> exist? Any advice on where to start?
>

Adding to the barrage of suggestions, I would suggest a simple structured data approach:

a) Get your web page working first, clearly showing the availability of the hardware: make the humans happy!
b) Enhance the markup of your web page to use microdata or RDFa to provide structured data around the web page content: make the machines happy!

Let's assume your web page lists hardware as follows:

<h1>Laptops</h1>
<ul>
  <li>Laptop 1: available (circulation desk)</li>
  <li>Laptop 2: loaned out</li>
   ...
</ul>

Assuming your hardware has the general attributes of "type", "location", "name", and "status", you could use microdata to mark this up like so:

<h1>Laptops</h1>
<ul>
  <li itemscope itemtype="http://example.org/laptop"><span itemprop="name">Laptop 1</span>: <span itemprop="status">available</span> (<span itemprop="location">circulation desk</span>)</li>
  <li itemscope itemtype="http://example.org/laptop"><span itemprop="name">Laptop 2</span>: <span itemprop="status">loaned out</span></li>
   ...
</ul>

(We're using the itemtype attribute to specify the type of the object, using a made-up vocabulary... which is fine to start with).

Toss that into the structured data linter at http://linter.structured-data.org and you can see (roughly) what any microdata parser will spit out. That's already fairly useful to machines that would want to parse the page for their own purposes (mobile apps, or aggregators of all available library hardware across public and academic libraries in your area, or whatever). The advantage of using structured data is that you can later on decide to use <div> or <table> markup, and as long as you keep the itemscope/itemtype/itemprop attributes generating the same output, any clients using microdata parsers are going to just keep on working... whereas screen-scraping approaches will generally crash and burn if you change the HTML out from underneath them.
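
To give a rough idea of what a client does with that markup, here is a minimal Python sketch using only the standard library. It is not a real microdata parser: it assumes flat (non-nested) items, well-formed HTML with no void tags such as <br> inside an item, and text-only property values. The class and function names are made up for illustration.

```python
from html.parser import HTMLParser


class MicrodataSketch(HTMLParser):
    """Collects flat microdata items; a sketch, not a spec-compliant parser."""

    def __init__(self):
        super().__init__()
        self.items = []          # completed items
        self._current = None     # item currently being read, if any
        self._depth = 0          # open-tag depth inside the current item
        self._prop = None        # itemprop whose text is being collected
        self._prop_depth = None  # depth at which that itemprop was opened

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self._current is None:
            if "itemscope" in attrs:
                self._current = {
                    "type": attrs.get("itemtype"),
                    "id": attrs.get("itemid"),
                    "properties": {},
                }
                self._depth = 1
            return
        self._depth += 1
        if "itemprop" in attrs:
            self._prop = attrs["itemprop"]
            self._prop_depth = self._depth
            self._current["properties"][self._prop] = ""

    def handle_data(self, data):
        # Only text inside an open itemprop element is kept.
        if self._current is not None and self._prop is not None:
            self._current["properties"][self._prop] += data

    def handle_endtag(self, tag):
        if self._current is None:
            return
        if self._prop is not None and self._depth == self._prop_depth:
            props = self._current["properties"]
            props[self._prop] = props[self._prop].strip()
            self._prop = None
        self._depth -= 1
        if self._depth == 0:
            # The element carrying itemscope has closed: the item is done.
            self.items.append(self._current)
            self._current = None


def extract_items(html):
    parser = MicrodataSketch()
    parser.feed(html)
    parser.close()
    return parser.items


SAMPLE = """
<ul>
  <li itemscope itemtype="http://example.org/laptop"><span itemprop="name">Laptop 1</span>: <span itemprop="status">available</span> (<span itemprop="location">circulation desk</span>)</li>
  <li itemscope itemtype="http://example.org/laptop"><span itemprop="name">Laptop 2</span>: <span itemprop="status">loaned out</span></li>
</ul>
"""

for item in extract_items(SAMPLE):
    print(item["type"], item["properties"])
```

Swap the <ul>/<li> for <div> or <table> markup in SAMPLE and the extracted properties stay the same, which is the whole point.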

For what it's worth, you're not serving up linked data at this point, because you're not really linking to anything, and you're not providing any identifiers to which others could link. You can add itemid attributes to satisfy the latter goal:

<h1>Laptops</h1>
<ul>
  <li itemscope itemtype="http://example.org/laptop" itemid="#laptop1"><span itemprop="name">Laptop 1</span>: <span itemprop="status">available</span> (<span itemprop="location">circulation desk</span>)</li>
  <li itemscope itemtype="http://example.org/laptop" itemid="#laptop2"><span itemprop="name">Laptop 2</span>: <span itemprop="status">loaned out</span></li>
   ...
</ul>
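
Those itemid values are relative, so a client resolves them against the URL of the page to get an identifier others can link to. A small Python sketch of that step, assuming a made-up page URL (the function name is hypothetical too):

```python
from urllib.parse import urljoin

# Hypothetical address of the hardware page; substitute your own.
PAGE_URL = "http://library.example.org/hardware"


def absolute_item_id(page_url, itemid):
    """Resolve an itemid (possibly relative) against the page's URL."""
    return urljoin(page_url, itemid)


print(absolute_item_id(PAGE_URL, "#laptop1"))
# http://library.example.org/hardware#laptop1
```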

I guess if you wanted to avoid this being a linked data silo, you could link out from the web page to the manufacturer's page to identify the make/model of each piece of hardware; but realistically that's probably not going to help anyone, so why bother?

Long story short, you can achieve a lot of linked data / semantic web goals by simply generating basic structured data without having to worry about content negotiation to serve up RDF/XML and JSON-LD and Turtle, setting up triple stores, or other such nonsense. You can use whatever technology you're using to generate your web pages (assuming they're dynamically generated) to add in this structured data.
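
As a rough sketch of what that could look like in Python: the data shape (dicts with "id", "name", "status", and an optional "location") and the function names are assumptions for illustration, standing in for whatever your ILS query actually returns. The itemtype is the made-up vocabulary from the examples above.

```python
from html import escape

# Made-up vocabulary URL from the examples above.
ITEM_TYPE = "http://example.org/laptop"


def render_item(item):
    """Render one piece of hardware as an <li> with microdata attributes."""
    body = (
        f'<span itemprop="name">{escape(item["name"])}</span>: '
        f'<span itemprop="status">{escape(item["status"])}</span>'
    )
    if item.get("location"):
        body += f' (<span itemprop="location">{escape(item["location"])}</span>)'
    return (
        f'  <li itemscope itemtype="{ITEM_TYPE}" '
        f'itemid="#{escape(item["id"])}">{body}</li>'
    )


def render_list(items):
    """Render the whole availability list as a <ul>."""
    return "\n".join(["<ul>"] + [render_item(i) for i in items] + ["</ul>"])


# Hypothetical data; in practice this would come from the ILS.
hardware = [
    {"id": "laptop1", "name": "Laptop 1", "status": "available",
     "location": "circulation desk"},
    {"id": "laptop2", "name": "Laptop 2", "status": "loaned out"},
]

print(render_list(hardware))
```

Escaping the values keeps stray characters in an item name or location from breaking the markup.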

If you're interested, over the last year I've put together a couple of gentle self-guiding tutorials on using RDFa (fulfills roughly the same role as microdata) with schema.org (a general vocabulary of types and their properties). The shorter one is at https://coffeecode.net/rdfa/codelab/ and the longer, library-specific one is at http://stuff.coffeecode.net/2014/lld_preconference/

Hope this helps!