LISTSERV 16.5 - CODE4LIB Archives

On Wed, Aug 6, 2014 at 2:45 PM, Michael Beccaria wrote:

> I have recently had the opportunity to create a new library web page and
> host it on my own servers. One of the elements of the new page that I want
> to improve upon is providing live or near live information on technology
> availability (10 of 12 laptops available, etc.). That data resides on my
> ILS server and I thought it might be a good time to upgrade the bubble gum
> and duct tape solution I now have to creating a real linked data service
> that would provide that availability information to the web server.
>
> The problem is there is a lot of overly complex and complicated
> information out there on linked data and RDF and the semantic web etc.


Yes... this is where I was a year or two ago. Content negotiation / triple
stores / ontologies / Turtle / n-quads / blah blah blah / head hits desk.


> and I'm looking for a simple guide to creating a very simple linked data
> service with php or python or whatever. Does such a resource exist? Any
> advice on where to start?
>

Adding to the barrage of suggestions, I would recommend a simple structured
data approach:

a) Get your web page working first, clearly showing the availability of the
hardware: make the humans happy!
b) Enhance the markup of your web page to use microdata or RDFa to provide
structured data around the web page content: make the machines happy!

Let's assume your web page lists hardware as follows:

<h1>Laptops</h1>
<ul>
  <li>Laptop 1: available (circulation desk)</li>
  <li>Laptop 2: loaned out</li>
   ...
</ul>

Assuming your hardware has the general attributes of "type", "location",
"name", and "status", you could use microdata to mark this up like so:

<h1>Laptops</h1>
<ul>
  <li itemscope itemtype="http://example.org/laptop">
    <span itemprop="name">Laptop 1</span>:
    <span itemprop="status">available</span>
    (<span itemprop="location">circulation desk</span>)
  </li>
  <li itemscope itemtype="http://example.org/laptop">
    <span itemprop="name">Laptop 2</span>:
    <span itemprop="status">loaned out</span>
  </li>
   ...
</ul>

(We're using the itemtype attribute to specify the type of the object,
using a made-up vocabulary... which is fine to start with).

Toss that into the structured data linter at
http://linter.structured-data.org and you can see (roughly) what any
microdata parser will spit out. That's already fairly useful to machines
that would want to parse the page for their own purposes (mobile apps, or
aggregators of all available library hardware across public and academic
libraries in your area, or whatever). The advantage of using structured
data is that you can later on decide to use <div> or <table> markup, and as
long as you keep the itemscope/itemtype/itemprop properties generating the
same output, any clients using microdata parsers are going to just keep on
working... whereas screen-scraping approaches will generally crash and burn
if you change the HTML out from underneath them.
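
To make that concrete, here is a rough sketch in Python of what a client
might do. It is deliberately not a complete microdata parser: it ignores
nested items, itemref, and properties whose value lives in an attribute such
as href. The class and function names are my own, made up for illustration.
The point is only that the same item comes out whether the page uses a list
or a table:

```python
from html.parser import HTMLParser

VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class FlatMicrodataParser(HTMLParser):
    """Collect the text of each itemprop inside each (non-nested) itemscope."""

    def __init__(self):
        super().__init__()
        self.items = []    # one dict per itemscope element
        self._stack = []   # (tag, role); role is "item", "prop", or None
        self._item = None  # item currently being filled in
        self._prop = None  # property name currently being read
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return  # void elements never get a matching end tag
        attrs = dict(attrs)
        role = None
        if "itemscope" in attrs:
            self._item = {"type": attrs.get("itemtype"),
                          "id": attrs.get("itemid"),
                          "properties": {}}
            self.items.append(self._item)
            role = "item"
        elif "itemprop" in attrs and self._item is not None:
            self._prop, self._buf = attrs["itemprop"], []
            role = "prop"
        self._stack.append((tag, role))

    def handle_data(self, data):
        if self._prop is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or tag not in [t for t, _ in self._stack]:
            return  # ignore stray end tags
        while self._stack:
            open_tag, role = self._stack.pop()
            if role == "prop":
                # Collapse whitespace so wrapped text compares equal.
                text = " ".join("".join(self._buf).split())
                self._item["properties"][self._prop] = text
                self._prop = None
            elif role == "item":
                self._item = None
            if open_tag == tag:
                break


def parse_items(html):
    parser = FlatMicrodataParser()
    parser.feed(html)
    parser.close()
    return parser.items


LIST_HTML = """
<ul>
  <li itemscope itemtype="http://example.org/laptop">
    <span itemprop="name">Laptop 1</span>:
    <span itemprop="status">available</span>
    (<span itemprop="location">circulation desk</span>)
  </li>
</ul>
"""

TABLE_HTML = """
<table>
  <tr itemscope itemtype="http://example.org/laptop">
    <td itemprop="name">Laptop 1</td>
    <td itemprop="status">available</td>
    <td itemprop="location">circulation desk</td>
  </tr>
</table>
"""

# Different HTML, same structured data.
print(parse_items(LIST_HTML) == parse_items(TABLE_HTML))
```

For anything real you would probably want an existing microdata library
rather than a hand-rolled parser like this one.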

For what it's worth, you're not serving up linked data at this point,
because you're not really linking to anything, and you're not providing any
identifiers to which others could link. You can add itemid attributes to
satisfy the latter goal:

<h1>Laptops</h1>
<ul>
  <li itemscope itemtype="http://example.org/laptop" itemid="#laptop1">
    <span itemprop="name">Laptop 1</span>:
    <span itemprop="status">available</span>
    (<span itemprop="location">circulation desk</span>)
  </li>
  <li itemscope itemtype="http://example.org/laptop" itemid="#laptop2">
    <span itemprop="name">Laptop 2</span>:
    <span itemprop="status">loaned out</span>
  </li>
   ...
</ul>
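
A relative itemid like "#laptop1" is resolved against the URL of the page
that contains it, which is what gives others something to link to. A tiny
sketch of that resolution in Python, where the page URL is a made-up
placeholder and the helper name is mine:

```python
from urllib.parse import urljoin

# Hypothetical address of the page that lists the laptops.
PAGE_URL = "http://library.example.org/laptops"


def absolute_item_id(page_url, itemid):
    """Resolve an itemid value against the URL of the page it appears on."""
    return urljoin(page_url, itemid)


print(absolute_item_id(PAGE_URL, "#laptop1"))
# prints http://library.example.org/laptops#laptop1
```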

I guess if you wanted to avoid this being a linked data silo, you could
link out from the web page to the manufacturer's page to identify the
make/model of each piece of hardware; but realistically that's probably not
going to help anyone, so why bother?

Long story short, you can achieve a lot of linked data / semantic web goals
by simply generating basic structured data without having to worry about
content negotiation to serve up RDF/XML and JSON-LD and Turtle, setting up
triple stores, or other such nonsense. You can use whatever technology
you're using to generate your web pages (assuming they're dynamically
generated) to add in this structured data.
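
For example, if your pages happen to be generated with Python, a minimal
sketch could look like the following. The record fields ("id", "name",
"status", "location") and the function names are assumptions for
illustration; how you pull the availability data out of your ILS is up to
you and is not shown here.

```python
from html import escape

ITEM_TYPE = "http://example.org/laptop"  # made-up vocabulary, as above


def render_item(item):
    """Render one piece of hardware as an <li> carrying microdata."""
    parts = [
        f'<span itemprop="name">{escape(item["name"])}</span>: ',
        f'<span itemprop="status">{escape(item["status"])}</span>',
    ]
    if item.get("location"):  # loaned-out items may have no location
        parts.append(
            f' (<span itemprop="location">{escape(item["location"])}</span>)'
        )
    return (
        f'  <li itemscope itemtype="{ITEM_TYPE}" '
        f'itemid="#{escape(item["id"])}">' + "".join(parts) + "</li>"
    )


def render_list(title, items):
    """Render a heading plus a list of hardware items."""
    lines = [f"<h1>{escape(title)}</h1>", "<ul>"]
    lines += [render_item(item) for item in items]
    lines.append("</ul>")
    return "\n".join(lines)


# Hypothetical records, standing in for whatever your ILS returns.
laptops = [
    {"id": "laptop1", "name": "Laptop 1", "status": "available",
     "location": "circulation desk"},
    {"id": "laptop2", "name": "Laptop 2", "status": "loaned out"},
]

print(render_list("Laptops", laptops))
```

Escaping the values keeps stray "<" or "&" characters in the data from
breaking the markup.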

If you're interested, over the last year I've put together a couple of
gentle self-guiding tutorials on using RDFa (fulfills roughly the same role
as microdata) with schema.org (a general vocabulary of types and their
properties). The shorter one is at https://coffeecode.net/rdfa/codelab/ and
the longer, library-specific one is at
http://stuff.coffeecode.net/2014/lld_preconference/

Hope this helps!