
> ...you'd want to create a caching service...


One solution to a related, narrower problem (not full-blown linked-data caching):

http://en.wikipedia.org/wiki/XML_Catalog

excerpt: "However, if they are absolute URLs, they only work when your network can reach them. Relying on remote resources makes XML processing susceptible to both planned and unplanned network downtime."

We'd heard about this a while ago, but, Jodi, you, David Riordan, and Congress have caused a temporary retreat from normal sprint-work here at Brown today to investigate implementing this!  :/

The particular problem that would affect us: if your processing tool checks, say, a loc.gov MODS namespace URL, that processing will fail whenever the loc.gov URL is unavailable, unless you've implemented an XML Catalog, which is a formal way to resolve such external references locally.
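
For concreteness, a minimal catalog file might look something like this
(the local path is hypothetical; libxml2-based tools such as xmllint and
lxml will consult it via the XML_CATALOG_FILES environment variable):

    <?xml version="1.0"?>
    <catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
      <!-- resolve any http://www.loc.gov/standards/mods/ reference
           to a locally mirrored copy of the schema files -->
      <rewriteURI uriStartString="http://www.loc.gov/standards/mods/"
                  rewritePrefix="file:///opt/schemas/mods/"/>
    </catalog>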

-b
---
Birkin James Diana
Programmer, Digital Technologies
Brown University Library
[log in to unmask]


On Sep 30, 2013, at 7:15 AM, Uldis Bojars <[log in to unmask]> wrote:

> What are best practices for preventing problems in cases like this when an
> important Linked Data service may go offline?
> 
> --- originally this was a reply to Jodi which she suggested to post on the
> list too ---
> 
> A safe [pessimistic?] approach would be to say "we don't trust [reliability
> of] linked data on the Web as services can and will go down" and to cache
> everything.
> 
> In that case you'd want to create a caching service that would keep updated
> copies of all important Linked Data sources and a fall-back strategy for
> switching to this caching service when needed. Like archive.org for Linked
> Data.
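> 
> As a minimal sketch of that fall-back (in Python with the requests
> library; the cache location and naming scheme here are made up):
> 
>     import os
>     import requests
> 
>     CACHE_DIR = "/var/cache/linked-data"  # hypothetical local mirror
> 
>     def fetch(uri):
>         """Try the live source first; fall back to the cached copy."""
>         cache_path = os.path.join(CACHE_DIR, uri.replace("/", "_"))
>         try:
>             resp = requests.get(uri, timeout=10,
>                                 headers={"Accept": "application/rdf+xml"})
>             resp.raise_for_status()
>             with open(cache_path, "w") as f:  # refresh the cached copy
>                 f.write(resp.text)
>             return resp.text
>         except (requests.RequestException, IOError):
>             with open(cache_path) as f:       # source is down: use cache
>                 return f.read()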
> 
> Some semantic web search engines might already have subsets of the
> Linked Data web cached, but it's not clear how much they cover (e.g.,
> whether they have all of the LoC data, and whether it's up to date).
> 
> If one were to create such a service, how would one best keep it
> updated, considering you'd otherwise be requesting *all* Linked Data
> URIs from each source? An efficient approach would be to regularly
> load RDF dumps from every major source, if available (e.g., LoC says:
> here's a full dump of all our RDF data ... and a .torrent too).
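> 
> As a rough sketch of the dump route (Python with rdflib; the dump URL
> is made up):
> 
>     import gzip
>     import shutil
>     import requests
>     from rdflib import Graph
> 
>     DUMP_URL = "http://example.org/loc-dump.nt.gz"  # hypothetical
> 
>     # fetch the whole dump once instead of dereferencing every URI
>     resp = requests.get(DUMP_URL, stream=True, timeout=60)
>     resp.raise_for_status()
>     with open("loc-dump.nt.gz", "wb") as out:
>         shutil.copyfileobj(resp.raw, out)
> 
>     # load it into a local graph the caching service can serve from
>     graph = Graph()
>     with gzip.open("loc-dump.nt.gz", "rt") as f:
>         graph.parse(f, format="nt")
>     print(len(graph), "triples mirrored")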
> 
> What do you think?
> 
> Uldis
> 
> 
> On 29 September 2013 12:33, Jodi Schneider <[log in to unmask]> wrote:
> 
>> Any best practices for caching authorities/vocabs to suggest for this
>> thread on the Code4Lib list?
>> 
>> Linked Data authorities & vocabularies at the Library of Congress
>> (id.loc.gov) are going to be affected by the website shutdown --
>> because of the lapse in government funding.
>> 
>> -Jodi