Actually, Ed, this would not only make for a good blog post (please, so
it doesn't get lost in email space), but I would love to see a
discussion of what kind of revision control would work:
1) for libraries (git is gawdawful nerdy)
2) for linked data
kc
p.s. the Ramsay book is now showing on Open Library, and the subtitle is
correct... perhaps because the record is from the LC MARC service :-)
http://openlibrary.org/works/OL16528530W/Reading_machines
On 8/26/12 6:32 PM, Ed Summers wrote:
> Thanks for sharing this bit of detective work. I noticed something
> similar fairly recently myself [1], but didn't arrive at as plausible
> a scenario for what had happened as you did. I imagine others have
> noticed this network effect before as well.
>
> On Tue, Aug 21, 2012 at 11:42 AM, Lars Aronsson <[log in to unmask]> wrote:
>> And sure enough, there it is,
>> http://clio.cul.columbia.edu:7018/vwebv/holdingsInfo?bibId=1439352
>> But will my error report to Worldcat find its way back
>> to CLIO? Or if I report the error to Columbia University,
>> will the correction propagate to Google, Hathi and Worldcat?
>> (Columbia asks me for a student ID when I want to give
>> feedback, so that removes this option for me.)
> I realize this probably will sound flippant (or overly grandiose), but
> devising solutions to this problem, where there isn't necessarily
> one metadata master that everyone is slaved to, seems to be one of the
> more important and interesting problems our sector faces.
>
> When Columbia University can become the source of a bibliographic
> record for Google Books, HathiTrust, OpenLibrary, etc., how does this
> change the hub and spoke workflows (with OCLC as the hub) that we are
> more familiar with? I think this topic is what's at the heart of the
> discussions about a "github-for-data" [2,3], since decentralized
> version control systems [4] allow for the evolution of more organic,
> push/pull, multimaster workflows...and platforms like Github make them
> socially feasible, easy and fun.
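[Editor's sketch, not part of the original thread.] The push/pull, multimaster idea above can be made concrete. The following is a minimal sketch assuming each institution publishes its records as a git repository; all repository and file names are invented for illustration:

```python
# A minimal sketch of a "multimaster" record workflow, assuming each
# institution publishes its records as a git repository. All repo and
# file names here are invented for illustration.
import pathlib
import subprocess
import tempfile

def git(*args, cwd):
    """Run a git command in the given directory, failing loudly."""
    subprocess.run(["git", *args], cwd=cwd, check=True,
                   capture_output=True)

root = pathlib.Path(tempfile.mkdtemp())

# "Columbia" publishes a record that is missing its subtitle.
columbia = root / "columbia"
columbia.mkdir()
git("init", "-q", cwd=columbia)
git("config", "user.email", "demo@example.org", cwd=columbia)
git("config", "user.name", "Demo User", cwd=columbia)
(columbia / "record.yaml").write_text("title: Reading Machines\n")
git("add", "record.yaml", cwd=columbia)
git("commit", "-qm", "initial record", cwd=columbia)

# "Open Library" clones the record and adds the missing subtitle.
openlibrary = root / "openlibrary"
git("clone", "-q", str(columbia), str(openlibrary), cwd=root)
git("config", "user.email", "demo@example.org", cwd=openlibrary)
git("config", "user.name", "Demo User", cwd=openlibrary)
(openlibrary / "record.yaml").write_text(
    "title: Reading Machines\n"
    "subtitle: Toward an Algorithmic Criticism\n")
git("commit", "-qam", "add missing subtitle", cwd=openlibrary)

# "Columbia" pulls the correction straight back -- no hub required.
git("pull", "-q", str(openlibrary), "HEAD", cwd=columbia)
print((columbia / "record.yaml").read_text())
```

The same push/pull pattern would let any pair of institutions exchange corrections directly, with ordinary merges resolving concurrent edits instead of routing everything through one hub.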
>
> I also think Linked Library Data, where bibliographic descriptions are
> REST-enabled Web resources identified by URLs, could be part of an
> answer, along with patterns such as webhooks [5] that make it easy to
> trigger update events. Feed technologies like Atom and RSS, and the
> work being done on ResourceSync [6], also seem like important tools
> for letting people poll for changes. And being able to say where you
> obtained data from, possibly using something like the W3C Provenance
> vocabulary [7], also seems like an important part of the puzzle.
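[Editor's sketch, not part of the original thread.] To show the shape of the provenance piece, here is a small example that serializes a PROV derivation statement as Turtle. The two URLs are the catalogue pages mentioned in this thread, but the derivation claim itself is hypothetical, purely for illustration:

```python
# A sketch of saying where a bibliographic description came from, using
# the W3C PROV vocabulary. The prov:wasDerivedFrom statement below is
# hypothetical -- it only illustrates the shape of such a record.
record = "http://openlibrary.org/works/OL16528530W/Reading_machines"
source = "http://clio.cul.columbia.edu:7018/vwebv/holdingsInfo?bibId=1439352"

turtle = (
    "@prefix prov: <http://www.w3.org/ns/prov#> .\n"
    "\n"
    f"<{record}> a prov:Entity ;\n"
    f"    prov:wasDerivedFrom <{source}> .\n"
)
print(turtle)
```

A downstream aggregator that stored statements like this could answer Lars's question mechanically: given a corrected record, follow the derivation chain to find every copy that needs the fix.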
>
> I'm sure there are other (and perhaps better) creative analogies or
> tools that could help solve this problem. I think you're probably
> right that we are starting to see the errors more now that more
> library data is becoming part of the visible Web via projects like
> Google Books, HathiTrust, OpenLibrary, and other enterprising libraries
> that design their catalogs to be crawlable and indexable by search
> engines.
>
> But I think it's more fun to think about (and hack on) what grassroots
> things we could be doing to help these new bibliographic data
> workflows grow and flourish than to be buried under the errors and a
> sense of futility...
>
> Or it might make for a good article or dissertation topic :-)
>
> //Ed
>
> [1] http://inkdroid.org/journal/2011/12/25/genealogy-of-a-typo/
> [2] http://www.informationdiet.com/blog/read/we-need-a-github-for-data
> [3] http://sunlightlabs.com/blog/2010/we-dont-need-a-github-for-data/
> [4] http://en.wikipedia.org/wiki/Distributed_revision_control
> [5] https://help.github.com/articles/post-receive-hooks
> [6] http://www.niso.org/workrooms/resourcesync/
> [7] http://www.w3.org/TR/prov-primer/
--
Karen Coyle
[log in to unmask] http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet