My opinion... Of course it's possible to mirror a purl server, although
I can't say how easy it is for that particular software. And yes, this
problem could still occur with a different architecture that
accomplishes the same basic thing, like FDsys.
If GPO is going to encourage linking via purl, as they have been for
years, then GPO needs to take responsibility for mirroring their purl
server and generally maintaining it with a very high level of
reliability. This can be done with failover backup servers with mirrored
data, ideally failover backup servers at distinct physical locations
with separate power and network connections. Corporate IT does this all
the time for important services that demand 99.9%+ uptime. There's
nothing especially novel about doing it for the GPO purl server; the
'best practices' are established. That's all there is to it.
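To make the failover idea concrete, here is a rough sketch in Python. It is only an illustration, not how GPO's setup or the purl software works: the resolver functions, the mirrored table, and the example.org URL are all made up, and real failover would normally live in DNS or a load balancer rather than in client code.

```python
from typing import Callable, Optional, Sequence

# A resolver takes a purl path and returns the target URL, or raises on failure.
Resolver = Callable[[str], str]

def resolve_with_failover(purl: str, resolvers: Sequence[Resolver]) -> str:
    """Try each resolver in order and return the first successful answer."""
    last_error: Optional[Exception] = None
    for resolve in resolvers:
        try:
            return resolve(purl)
        except Exception as exc:  # a real setup would catch narrower errors
            last_error = exc
    raise RuntimeError(f"all resolvers failed for {purl}") from last_error

# Hypothetical mirrored data: the same purl table copied to a second site.
MIRRORED_TABLE = {"/example/123": "http://example.org/doc.pdf"}

def primary_down(purl: str) -> str:
    # Simulates the primary server being unreachable.
    raise ConnectionError("primary unreachable")

def mirror(purl: str) -> str:
    # Simulates a mirror at another location answering from its own copy.
    return MIRRORED_TABLE[purl]

print(resolve_with_failover("/example/123", [primary_down, mirror]))
```

The point is just that the mirror holds its own copy of the data, so losing the primary doesn't take resolution down with it.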
Of course, the organization involved needs to define what kind of uptime
they expect (99.999%? 95%?) and provision resources (staff time,
hardware, etc.) accordingly. If GPO lacks the resources to maintain
the kind of uptime the community thinks should be maintained, then....
well, that's how it goes. Or if GPO hasn't even thought about it like
this, because they lack the staff expertise to even plan accordingly...
again, them's the breaks. Doing things right takes money, including
money for staff with appropriate expertise.
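For a sense of what those uptime numbers mean in practice, here is a quick back-of-the-envelope calculation in Python. The function name is mine, and it assumes a plain 365-day year.

```python
# Allowed downtime per year for a given availability target.
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years

def allowed_downtime_hours(availability_percent: float) -> float:
    """Hours per year a service may be down and still meet the target."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)

for target in (95.0, 99.9, 99.999):
    hours = allowed_downtime_hours(target)
    print(f"{target}% uptime allows {hours:.2f} hours of downtime per year")
```

Roughly: 95% allows about 438 hours (over 18 days) a year, 99.9% allows under 9 hours, and 99.999% allows only about 5 minutes. Those are very different staffing and hardware commitments.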
Of course, one failure in X (10?) years is fairly good reliability...
depending on how long it takes them to get everything back working 100%.
If it's back by tomorrow, one outage in 10 years is pretty good. If it
takes a week to get back, not so good.
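The same arithmetic can be run the other way. This sketch takes the 10-year figure above as given (it is a guess, not a known number) and assumes a single outage in that period; the function name is made up for illustration.

```python
def availability_percent(outage_days: float, period_years: float = 10) -> float:
    """Availability over the period, given total days of outage in it."""
    total_days = period_years * 365  # ignoring leap years
    return 100 * (1 - outage_days / total_days)

for days in (1, 7):
    print(f"{days}-day outage in 10 years: {availability_percent(days):.3f}% uptime")
```

Under those assumptions a one-day outage still leaves the server above 99.9% over the decade, while a week-long outage drops it below that line.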
I am not going to get on the 'see, this shows why purl is bad'
bandwagon. Sure, it's a single organizational point of failure. But
there are many single points of failure in GPO's services, aside from
purl. If it weren't purl breaking, the actual server that hosts the
docs, the one the purls point to, could still go down. Does
GPO maintain appropriate failover backup for the actual document hosting
servers? Who knows. So if you want to talk about 'distributed
infrastructure', it's not just about the purl server, it's about document
hosting, and probably a dozen other things GPO does.
Sure, distributed infrastructure would be cool, and would, for instance,
help maintain access even if somehow GPO itself disappeared or became
evil or something. But geeks like to talk about distributed
infrastructure because it's an 'interesting problem', meaning that the
standard 'best practice' way to do it _isn't_ entirely clear -- it's a
more complicated problem than, for example, simply maintaining robust
failover backup infrastructure. If GPO lacks the resources/expertise to
even maintain robust failover backup infrastructure, then pressuring GPO
(and/or the federal government that funds/oversees them!) to adequately
maintain a failover backup infrastructure seems a much more realistic way
to achieve reliable access than pressuring them to do R&D in as-yet
unclear methods of creating a distributed infrastructure!
James Jacobs wrote:
> Hi all, (cross-posted to purl-dev)
> I'm a documents librarian (and member of the Depository Library Council)
> and usually just a lurker over here. Thanks Keith and Patricia for the
> easy workaround. I shared this with govdoc-l and on my blog:
> See especially the comment that as of today, only 3,677 PURLs out of
> 116,237 have been restored (3.1%). I would love to hear your
> thoughts/ideas for how this kind of critical system failure can be
> averted in the future from a technological standpoint. Is it possible to
> mirror a purl server? Will the same issue occur when GPO moves to
> handles in FDsys (http://www.handle.net/)? Will a distributed
> infrastructure as I've briefly mapped out be able to handle these types
> of critical system crashes better?
> Please let me know and I'd be happy to share your ideas with GPO and the
> documents community.
> James Jacobs
> Keith Jenkins wrote:
>> Thanks to everyone who helped me confirm that the GPO PURL server is
>> down. An official announcement on the GPO Listserv said:
>> "The PURL Server is currently inaccessible. GPO is working with IT
>> staff to restore service as soon as possible. We regret any
>> inconvenience caused by the server problems. An updated listserv will
>> be sent once service is restored."
>> While the server is down, here is one workaround (thanks to Patricia Duplantis):
>> 1. Go to http://catalog.gpo.gov/
>> 2. Click "Advanced Search"
>> 3. Search for word in "URL/PURL", enter the PURL
>> 4. Click "Go"
>> 5. The original URL at the time of cataloging should appear in a 53x note.
>> This incident, however, illuminates a weakness in PURL systems: access
>> is broken when the PURL server breaks, even though the documents are
>> still online at their original URLs.
>> Maybe someone more familiar with PURL systems can tell me... is there
>> any way to harvest data from a PURL server, so that a backup/mirror
>> can be available?