On Wed, Jul 7, 2010 at 7:00 PM, Doran, Michael D wrote:
> Of course, subfield $3 values are not any kind of controlled vocabulary, so it's hard to do much with them programmatically.
A few years ago I analyzed the subfield 3 values in the Library of
Congress data up at the Internet Archive. Of course it's really
simple to extract, but I just pushed it up to GitHub, mainly to share
the results.
I extracted all the subfield 3 values from the 12M(?) records, and then
counted them up to see how often they repeated. As you can see,
it's hardly controlled, but it might be worthwhile coming up with some
simple heuristics and properties for the familiar ones. For example,
you could imagine dcterms:description being used for "Publisher
description".
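
For anyone who wants to try the same count on their own data, here is a
minimal sketch of the tallying step. It is not the code I pushed to
GitHub. It assumes you have already pulled the $3 strings out of your
records by whatever means you like, and the names normalize and
count_subfield3 are made up for illustration. The light normalization
(whitespace, trailing punctuation, case) is also an assumption; drop it
if you want raw counts.

```python
from collections import Counter


def normalize(value):
    """Collapse runs of whitespace, strip trailing punctuation, lowercase.

    This normalization is an assumption for the sketch, not something
    the original count necessarily did.
    """
    return " ".join(value.split()).rstrip(" :.,;").lower()


def count_subfield3(values):
    """Tally how often each (normalized) $3 value occurs.

    `values` is any iterable of strings already extracted from records.
    """
    counts = Counter()
    for value in values:
        key = normalize(value)
        if key:  # skip values that are empty after normalization
            counts[key] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample values, not real counts from the LC data.
    sample = [
        "Publisher description",
        "Publisher description:",
        "Table of contents",
        "  table of  contents ",
    ]
    for value, n in count_subfield3(sample).most_common():
        print(n, value)
```

Sorting with most_common() puts the familiar values at the top, which is
where any heuristics would pay off first.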
Of course the $3 in your catalog data might be different from LC's, but
maybe we could come up with a list of common ones on a wiki somewhere,
and publish a little vocabulary that covered the important relations?
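
To make the idea concrete, such a list could start life as a plain
lookup table. This is only a sketch: the one mapping taken from above
is "Publisher description" to dcterms:description, the "table of
contents" entry is my own guess at a plausible second one, and the
names KNOWN_RELATIONS and property_for are hypothetical.

```python
# Sketch of a tiny $3-to-property lookup. Only the first entry comes
# from the discussion above; the second is an assumed example.
KNOWN_RELATIONS = {
    "publisher description": "dcterms:description",
    "table of contents": "dcterms:tableOfContents",  # assumption
}


def property_for(subfield3):
    """Return the mapped property for a $3 value, or None if unknown.

    Matching is done on a lightly normalized form of the value
    (whitespace collapsed, trailing punctuation removed, lowercased).
    """
    key = " ".join(subfield3.split()).rstrip(" :.,;").lower()
    return KNOWN_RELATIONS.get(key)


if __name__ == "__main__":
    for value in ["Publisher description", "Sample text"]:
        print(value, "->", property_for(value))
```

Anything that comes back as None is a candidate for the wiki list, or
simply gets left alone.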