I am kind of new to this linked data thing, but it seems like the real
power of it is not full-text search, but linking through the use of shared
vocabularies. So if you have data about Jane Austen in your database and
you are using the same URI as other databases to represent Jane Austen in
your data (say http://dbpedia.org/resource/Jane_Austen), then you (or
rather, your software) can do an exact search on that URI in remote
resources vs. a fuzzy text search. In other words, linked data is really
supposed to be linked by machines and discoverable through URIs. If you
visit the URL http://dbpedia.org/page/Jane_Austen you can see a
human-readable rendering of the data a SPARQL endpoint would return for
the triple pattern {<http://dbpedia.org/resource/Jane_Austen> ?p ?o}.
(Note that the /resource/ URI is the identifier used in the data; the
/page/ URL is just the HTML view of it.) This is essentially asking the
database for all subject-predicate-object facts it contains where Jane
Austen is the subject. (Sorry if this is stuff you already know.)
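
For what it's worth, here is a rough sketch of that lookup in Python
(standard library only). The endpoint URL https://dbpedia.org/sparql and
the helper names build_query, build_request_url and fetch_facts are my
own assumptions for illustration, not anything prescribed by DBpedia.
It only shows the shape of an exact-URI lookup.

```python
import json
import urllib.parse
import urllib.request

# Assumption: DBpedia's public SPARQL endpoint lives at this address.
DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"
JANE_AUSTEN = "http://dbpedia.org/resource/Jane_Austen"


def build_query(subject_uri, limit=25):
    """Return a SPARQL query for every (predicate, object) of one subject."""
    return f"SELECT ?p ?o WHERE {{ <{subject_uri}> ?p ?o }} LIMIT {limit}"


def build_request_url(endpoint, query):
    """Encode the query as a GET request, per the SPARQL protocol."""
    return f"{endpoint}?{urllib.parse.urlencode({'query': query})}"


def fetch_facts(subject_uri, endpoint=DBPEDIA_ENDPOINT, limit=25):
    """Run the query over the network and return (predicate, object) pairs."""
    url = build_request_url(endpoint, build_query(subject_uri, limit))
    request = urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        data = json.load(response)
    return [
        (row["p"]["value"], row["o"]["value"])
        for row in data["results"]["bindings"]
    ]


if __name__ == "__main__":
    # Printing the query needs no network access; fetch_facts() does.
    print(build_query(JANE_AUSTEN))
```

Calling fetch_facts(JANE_AUSTEN) would go out to the remote endpoint, so
it depends on that service being up. The point is that the match is on
the exact URI, not on the string "Jane Austen".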

So yeah, to get full-text search, I think you'd need to both cache and
index the data locally. I believe most triplestore implementations index on
subject and object URIs to make lookups like the one mentioned above
relatively efficient, but most would not offer efficient full-text search
without some external indexing application like Solr.
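
To make the "cache and index locally" idea concrete, here is a toy
sketch in Python. The triples in CACHED_TRIPLES are made-up sample data
(not actual DBpedia output), and tokenize, build_index and search are
hypothetical helper names. A real system would hand this job to
something like Solr; this only illustrates why the index ends up local.

```python
import re
from collections import defaultdict

RDFS = "http://www.w3.org/2000/01/rdf-schema#"

# Illustrative sample data standing in for triples harvested from remote
# sources and cached locally. These are NOT real query results.
CACHED_TRIPLES = [
    ("http://dbpedia.org/resource/Jane_Austen", RDFS + "label",
     "Jane Austen"),
    ("http://dbpedia.org/resource/Jane_Austen", RDFS + "comment",
     "English novelist known for Pride and Prejudice"),
    ("http://example.org/resource/Example_Author", RDFS + "comment",
     "A fictional novelist used only for this example"),
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(triples):
    """Map each word in an object literal to the subject URIs using it."""
    index = defaultdict(set)
    for subject, _predicate, obj in triples:
        for token in tokenize(obj):
            index[token].add(subject)
    return index


def search(index, query):
    """Return subject URIs whose cached text contains every query word."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    matches = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        matches &= index.get(token, set())
    return matches


INDEX = build_index(CACHED_TRIPLES)
```

A text search such as search(INDEX, "pride novelist") then hands back
URIs, which can in turn be used for exact lookups against the remote
source when displaying the record.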


On Wed, Feb 25, 2015 at 2:30 PM, Harper, Cynthia <[log in to unmask]> wrote:

> Well, that's my question.  I have the micro view of linked data, I think -
> it's a distribution/self-describing format. But I don't see the big picture.
>
> In the non-techie library world, linked data is being talked about
> (perhaps only in listserv traffic) as if the data (bibliographic data, for
> instance) will reside on remote sites (as a SPARQL endpoint??? We don't
> know the technical implications of that), and be displayed by <your local
> catalog/the centralized inter-national catalog> by calling data from that
> remote site. But the original question was how the data on those remote
> sites would be <access points> - how can I start my search by searching for
> that remote content?  I assume there has to be a database implementation
> that visits that data and pre-indexes it for it to be searchable, and
> therefore the index has to be local (or global a la Google or OCLC or its
> bibliographic-linked-data equivalent).
>
> All of the above parenthesized or bracketed concepts are nebulous to me.
>
> Cindy
>
> -----Original Message-----
> From: Code for Libraries [mailto:[log in to unmask]] On Behalf Of
> Sarah Weissman
> Sent: Tuesday, February 24, 2015 11:02 PM
> To: [log in to unmask]
> Subject: Re: [CODE4LIB] linked data question
>
> > I think Code4libbers will know more about my question about
> > distributed INDEXES?  This is my rudimentary knowledge of linked data
> > - that the indexing process will have to transit the links, and build
> > a local index to the data, even if in displaying the individual
> > "records", it goes again out to the source.  But are there examples of
> > distributed systems that have distributed INDEXES?  Or am I wrong in
> > envisioning an index as a separate entity from the data in today's
> > technology?
> >
> >
> I'm a little confused by what you mean by distributed index in a linked
> data context. I assume an index would have to be specific to a database
> implementation, while data is typically exposed for external consumption via
> implementation-agnostic protocols/formats, like a SPARQL endpoint or a REST
> API. How do you locally index something remote under these constraints?
>
> -Sarah
>
>
>
> > Cindy Harper
> >
> > -----Original Message-----
> > From: Harper, Cynthia
> > Sent: Tuesday, February 24, 2015 1:20 PM
> > To: [log in to unmask]; 'Williams, Ann'
> > Subject: RE: linked data question
> >
> > What I haven't read, but what I have wondered about, is whether so
> > far, linked DATA is distributed, but the INDEXES are local?  Is there
> > any example of a system with distributed INDEXES?
> >
> > Cindy Harper
> > [log in to unmask]
> >
> > -----Original Message-----
> > From: AUTOCAT [mailto:[log in to unmask]] On Behalf Of Williams,
> > Ann
> > Sent: Tuesday, February 24, 2015 10:26 AM
> > To: [log in to unmask]
> > Subject: [ACAT] linked data question
> >
> > I was just wondering how linked data will affect OPAC searching and
> > discovery vs. a record with text approach. For example, we have
> > various 856 links to publisher, summary and biographical information
> > in our OPAC as well as ISBNs linking to ContentCafe. But none of that
> > content is discoverable in the OPAC and it requires a further click on
> > the part of patrons (many of whom won't click).
> >
> > Ann Williams
> > USJ
>