Karen,

I think there's a useful distinction here. Ed can correct me if I'm
wrong, but I suspect he was not actually suggesting that Git itself be
the user-interface to a github-for-data type service, but rather that
such a service can be built *on top* of an infrastructure component
like GitHub.

I agree that there's a barrier to use if we just plunk a bunch of our
bib data in GitHub and call it done, but the version control model and
implementation there could definitely provide a good bit of the
library-version-control stack down below the UI layer.

Cheers,
-Corey

On Mon, Aug 27, 2012 at 8:49 AM, Karen Coyle wrote:
> Actually, Ed, this would not only make for a good blog post (please, so it
> doesn't get lost in email space), but I would love to see a discussion of
> what kind of revision control would work:
>
> 1) for libraries (git is gawdawful nerdy)
> 2) for linked data
>
> kc
> p.s. the Ramsay book is now showing on Open Library, and the subtitle is
> correct... perhaps because the record is from the LC MARC service :-)
> http://openlibrary.org/works/OL16528530W/Reading_machines
>
>
> On 8/26/12 6:32 PM, Ed Summers wrote:
>>
>> Thanks for sharing this bit of detective work. I noticed something
>> similar fairly recently myself [1], but didn't discover as plausible
>> a scenario for what had happened as you did. I imagine others have
>> noticed this network effect before as well.
>>
>> On Tue, Aug 21, 2012 at 11:42 AM, Lars Aronsson wrote:
>>>
>>> And sure enough, there it is,
>>> http://clio.cul.columbia.edu:7018/vwebv/holdingsInfo?bibId=1439352
>>> But will my error report to Worldcat find its way back
>>> to CLIO? Or if I report the error to Columbia University,
>>> will the correction propagate to Google, Hathi and Worldcat?
>>> (Columbia asks me for a student ID when I want to give
>>> feedback, so that removes this option for me.)
>>
>> I realize this probably will sound flippant (or overly grandiose), but
>> innovating solutions to this problem, where there isn't necessarily
>> one metadata master that everyone is slaved to, seems to be one of the
>> more important and interesting problems that our sector faces.
>>
>> When Columbia University can become the source of a bibliographic
>> record for Google Books, HathiTrust and OpenLibrary, etc., how does this
>> change the hub and spoke workflows (with OCLC as the hub) that we are
>> more familiar with? I think this topic is what's at the heart of the
>> discussions about a "github-for-data" [2,3], since decentralized
>> version control systems [4] allow for the evolution of more organic,
>> push/pull, multimaster workflows...and platforms like GitHub make them
>> socially feasible, easy and fun.
>>
>> I also think Linked Library Data, where bibliographic descriptions are
>> REST-enabled Web resources identified with URLs, and patterns such as
>> webhooks [5], which make it easy to trigger update events, could be
>> part of an answer. Feed technologies like Atom and RSS, and the work
>> being done on ResourceSync, also seem important for letting people
>> poll for changes [6]. And being able to say where you have
>> obtained data from, possibly using something like the W3C Provenance
>> vocabulary [7] also seems like an important part of the puzzle.
>>
>> I'm sure there are other (and perhaps better) creative analogies or
>> tools that could help solve this problem. I think you're probably
>> right that we are starting to see the errors more now that more
>> library data is becoming part of the visible Web via projects like
>> Google Books, HathiTrust, OpenLibrary and other enterprising libraries
>> that design their catalogs to be crawlable and indexable by search
>> engines.
>>
>> But I think it's more fun to think about (and hack on) what grassroots
>> things we could be doing to help these new bibliographic data
>> workflows to grow and flourish than to get piled under by the errors,
>> and a sense of futility...
>>
>> Or it might make for a good article or dissertation topic :-)
>>
>> //Ed
>>
>> [1] http://inkdroid.org/journal/2011/12/25/genealogy-of-a-typo/
>> [2] http://www.informationdiet.com/blog/read/we-need-a-github-for-data
>> [3] http://sunlightlabs.com/blog/2010/we-dont-need-a-github-for-data/
>> [4] http://en.wikipedia.org/wiki/Distributed_revision_control
>> [5] https://help.github.com/articles/post-receive-hooks
>> [6] http://www.niso.org/workrooms/resourcesync/
>> [7] http://www.w3.org/TR/prov-primer/
>
>
> --
> Karen Coyle
> http://kcoyle.net
> ph: 1-510-540-7596
> m: 1-510-435-8234
> skype: kcoylenet



-- 
Corey A Harper
Metadata Services Librarian
New York University Libraries
20 Cooper Square, 3rd Floor
New York, NY 10003-7112
212.998.2479