Hi Cindy. I don't think that you have offended anyone. (??) But I'm mostly
a CODE4LIB lurker and don't pretend to speak for anyone else on the list.

I think I see what you mean now about a local or centralized index vs. a
distributed index. It seems like this would correspond to pattern 3 (query
federation) that Owen listed above. In that case I guess a distributed
index would be a set of independent indexes, each local to its own site
and remote from one another, that share an agreed-upon indexing scheme so
that you could retrieve search results from each of them and combine them
for your local user. Even though this is less efficient than a centralized
index, it seems like it could be more cost-effective than replicating huge
numbers of records locally or paying to have your records included in a
centralized indexing service. (I have no idea what services like WorldCat
cost.) Results could even be returned asynchronously to users.
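
To make the idea concrete, here is a rough Python sketch of that kind of
query federation. Everything in it is an assumption for illustration: the
three "remote indexes" are in-memory dictionaries with made-up names and
contents, and a short sleep stands in for network latency. It only shows
the shape of the pattern: query every index concurrently, then merge the
results as each one comes back.

```python
import asyncio

# Hypothetical stand-ins for remote indexes. Each maps a shared key
# (here a URI) to the titles that index holds for it.
REMOTE_INDEXES = {
    "library_a": {"http://dbpedia.org/resource/Jane_Austen": ["Emma", "Persuasion"]},
    "library_b": {"http://dbpedia.org/resource/Jane_Austen": ["Emma", "Sanditon"]},
    "library_c": {},
}


async def query_index(name, key, delay):
    """Pretend to query one remote index; the sleep simulates latency."""
    await asyncio.sleep(delay)
    return name, REMOTE_INDEXES[name].get(key, [])


async def federated_search(key):
    """Query all indexes concurrently and merge results as they arrive."""
    tasks = [
        asyncio.create_task(query_index(name, key, delay=0.01 * i))
        for i, name in enumerate(REMOTE_INDEXES)
    ]
    merged = {}  # title -> list of indexes that returned it
    for finished in asyncio.as_completed(tasks):
        name, titles = await finished
        for title in titles:
            merged.setdefault(title, []).append(name)
    return merged


results = asyncio.run(federated_search("http://dbpedia.org/resource/Jane_Austen"))
for title in sorted(results):
    print(title, sorted(results[title]))
```

In a real system the merge step could push partial results to the user as
each index answers, instead of waiting for the slowest one.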

But this kind of federation of remote search results isn't unique to linked
data. Linked data standards might make this type of federation easier
because triple stores are schemaless, so you can throw a bunch of linked
data together even if it is completely unrelated without breaking your
system (although it wouldn't be very useful). I work in an astronomy
archive and it seems like most of the headache for us in retrieving and
presenting data from remote resources to our users comes from incompatible
schemas, rather than the speed of querying remote resources.
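
As a toy illustration of what I mean by "schemaless", here is a small
Python sketch. It is not a real triple store: the triples are plain tuples
in a set, and the example.org predicates and the astronomy record are
invented for the example. The point is that two unrelated data sets can be
merged with a simple union, with no schema migration, and still be queried
by exact URI.

```python
JANE = "http://dbpedia.org/resource/Jane_Austen"

# Two hypothetical sources with nothing in common. The example.org
# predicates are placeholders, not terms from a real vocabulary.
library_triples = {
    (JANE, "http://example.org/library/wrote", "Emma"),
    (JANE, "http://example.org/library/wrote", "Persuasion"),
}
astronomy_triples = {
    ("http://example.org/astro/M31", "http://example.org/astro/observedBy", "Hubble"),
}

# Merging is just a set union; neither source has to change shape.
store = library_triples | astronomy_triples


def match(triples, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Equivalent in spirit to the pattern {<Jane_Austen> ?p ?o}.
jane_facts = match(store, s=JANE)
for fact in jane_facts:
    print(fact)
```

The merged store works, but as I said above, unrelated data thrown
together this way is not very useful until the sources share URIs.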



On Thu, Feb 26, 2015 at 10:45 AM, Harper, Cynthia <[log in to unmask]> wrote:

> I apologize to both lists for this observation. I don't mean to offend
> anyone, and now it's clear to me that this will potentially do so.  I don't
> plan on commenting further.  I do hold both new technologists and
> traditional librarians in respect - I just may generalize too much in
> trying to describe to myself where the viewpoints differ.
>
> Cindy Harper
>
> -----Original Message-----
> From: Harper, Cynthia
> Sent: Thursday, February 26, 2015 10:22 AM
> To: [log in to unmask]
> Cc: 'AUTOCAT'
> Subject: RE: [CODE4LIB] linked data question
>
> So the issue being discussed on AUTOCAT was the availability/fault
> tolerance of the database, given that it's spread over numerous remote
> systems, and I suppose local caching and mirroring are the answers there.
>
> The other issue was skepticism about the feasibility of indexing all these
> remote sources, which led me to thinking about remote indexes, but I see
> the answer is that that's why we won't be using single-site local systems
> so much, but instead using Google-like web-scale indexes.  That's putting
> pressure on the old vision of "the library catalog" as "our database".
>
> Is that a fair understanding?
>
> Cindy Harper
> [log in to unmask]
>
> -----Original Message-----
> From: Code for Libraries [mailto:[log in to unmask]] On Behalf Of
> Eric Lease Morgan
> Sent: Thursday, February 26, 2015 9:44 AM
> To: [log in to unmask]
> Subject: Re: [CODE4LIB] linked data question
>
> On Feb 25, 2015, at 3:12 PM, Sarah Weissman <[log in to unmask]> wrote:
>
> > I am kind of new to this linked data thing, but it seems like the real
> > power of it is not full-text search, but linking through the use of
> > shared vocabularies. So if you have data about Jane Austen in your
> > database and you are using the same URI as other databases to
> > represent Jane Austen in your data (say
> > http://dbpedia.org/resource/Jane_Austen), then you (or rather, your
> > software) can do an exact search on that URI in remote resources vs. a
> > fuzzy text search. In other words, linked data is really
>                                                     ^^^^^^^^^^^^^^^^^^^^^
> > supposed to be linked by machines and discoverable through URIs. If
> > you
>  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > visit the URL: http://dbpedia.org/page/Jane_Austen you can see a
> > human-interpretable representation of the data a SPARQL endpoint would
> > return for a query for triples {http://dbpedia.org/page/Jane_Austen ?p ?o}.
> > This is essentially asking the database for all
> > subject-predicate-object facts it contains where Jane Austen is the
> > subject.
>
>
> Again, seweissman++  The implementation of linked data is VERY much like
> the implementation of a relational database over HTTP, and in such a
> scenario, the URIs are the database keys. —ELM
>