On Aug 28, 2012, at 12:05 PM, Galen Charlton wrote:
> Hi,
>
> On 08/27/2012 04:36 PM, Karen Coyle wrote:
>> I also assumed that Ed wasn't suggesting that we literally use github as
>> our platform, but I do want to remind folks how far we are from having
>> "people friendly" versioning software -- at least, none that I have seen
>> has felt "intuitive." The features of git are great, and people have
>> built interfaces to it, but as Galen's question brings forth, the very
>> *idea* of versioning doesn't exist in library data processing, even
>> though having central-system based versions of MARC records (with a
>> single time line) is at least conceptually simple.
>
> What's interesting, however, is that at least a couple parts of the concept of distributed version control, viewed broadly, have been used in traditional library cataloging.
>
> For example, RLIN had a concept of a "cluster" of MARC records for the same title, with each library having its own record in the cluster. I don't know if RLIN kept track of previous versions of a library's record in a cluster as it got edited, but it does mean that there was the concept of a "spatial" distribution of record versions, if not a temporal one. I've never used RLIN myself, but I'd be curious to know whether it provided any tools to readily compare records in the same cluster, and whether there were any mechanisms (formal or informal) for a library to grab improvements from another library's record and apply them to its own.
>
> As another example, the MARC cataloging source field has long been used, particularly in central utilities, to record institution-level attribution for changes to a MARC record. I think that's mostly been used by catalogers to help decide which version of a record to start from when copy cataloging, but I suppose it's possible that some catalogers were also looking at the list of modifying agencies ("library A touched this record and is particularly good at subject analysis, so I'll grab their 650s").
I seem to recall seeing a presentation a couple of years ago from someone in the intelligence community, where they'd keep all of their intelligence as RDF quads so that they could track the source of each statement.
They'd then assign a confidence level to each source, so they could get an overall level of confidence in their inferences.
... it'd get a bit messier if you have to do some sort of analysis of which sources are good for what type of information, but it might be a start.
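Roughly the sort of thing I have in mind, as a quick Python sketch (all the names and confidence numbers below are mine, not from that presentation):

# Rough sketch of the idea: quads carry a source (the fourth element),
# each source gets a confidence score, and an inference is only as
# trustworthy as the statements it was derived from.

# (subject, predicate, object, source) -- the source is what the quad
# buys you over a plain triple.
quads = [
    ("ex:Book1",   "dc:creator", "ex:PersonA", "src:LibraryA"),
    ("ex:PersonA", "owl:sameAs", "ex:PersonB", "src:Aggregator"),
]

# Confidence assigned to each source (values are invented).
source_confidence = {
    "src:LibraryA":   0.95,
    "src:Aggregator": 0.60,
}

def inference_confidence(supporting_quads):
    # Overall confidence in a conclusion drawn from several quads.
    # Here it's the product of the source confidences, so one weak
    # source drags the whole inference down; min() is another option.
    conf = 1.0
    for (_s, _p, _o, source) in supporting_quads:
        conf *= source_confidence.get(source, 0.0)
    return conf

# e.g. "Book1 was created by PersonB" follows from both quads above:
print(inference_confidence(quads))   # 0.95 * 0.60 = 0.57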
Unfortunately, I'm not having luck finding the reference again.
It's possible that it was in the context of provenance, but I'm getting bogged down in too many articles about people storing provenance information using RDF triples (without actually tracking the provenance of the triple itself).
-Joe
ps. I just realized this discussion's been on CODE4LIB, and not NGC4LIB ... would it make sense to move it over there?