At one point, much to my surprise, someone told me that 050 is defined for
numbers assigned by LC, not for LCC numbers per se. The current definition
(http://www.loc.gov/marc/bibliographic/bd050.html) doesn't really read that
way, but if you look at the ITS page
(http://www.itsmarc.com/crs/edit7592.htm), which I think is not up to date,
you'll see a discussion of "Pseudo call numbers and other forms of LC call
numbers".
As someone pointed out, only a very few classes start with three letters
(off the top of my head; a couple in D and a number in K; see
there are more in K than are listed here).
The pseudo or shelf numbers I've seen most often in 050 are MLC and SD
(which unfortunately is also the class for forestry). Look for SD on
musical recording records; it used to really mess up attempts to facet music
CDs on LC class in the catalog where I used to work. There were a few other
common ones, but I've forgotten them.
Depending on what you're doing, you might prefer a call number in 090 if
there is one. These are more likely to reflect local preference.
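A minimal Python sketch of that preference, assuming the record has already
been reduced to a plain mapping of MARC tag to a list of call number strings.
That shape and the pick_call_number name are hypothetical, not any particular
MARC library's API:

```python
from typing import Mapping, Optional, Sequence


def pick_call_number(fields: Mapping[str, Sequence[str]]) -> Optional[str]:
    """Return the first non-empty 090 value, falling back to 050."""
    # Tags are tried in order of preference: local (090) before LC-assigned (050).
    for tag in ("090", "050"):
        for value in fields.get(tag, ()):
            cleaned = value.strip()
            if cleaned:
                return cleaned
    # Neither field carried a usable call number.
    return None


record = {"050": ["MLCS 83/5180 (P)"], "090": ["PS3552 .A45"]}
print(pick_call_number(record))
```

Whatever comes back still needs to be checked for pseudo call numbers, since
nothing guarantees a 090 holds a classified number either.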
Looking up 090 (http://www.oclc.org/bibformats/en/0xx/090.shtm) produced
some other examples of non-LCC 050s: PAR, Newspaper, UNC, and NOT IN LC.
Except now I wonder if those annoying MLCS call numbers might actually be
properly MATCHED by this regex, when I need them excluded. They are
annoyingly _similar_ to a classified call number. Well, one way to find out.
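For what it's worth, here is a minimal Python sketch of that kind of check.
It is NOT the regex linked in the thread. The pattern, the PSEUDO_PREFIXES
list and the looks_like_lcc name are my own assumptions for illustration, and
the prefix list only covers the labels mentioned in this message:

```python
import re

# Labels seen in 050 that are not from the classified schedule.
# Assumption: this list is illustrative, not exhaustive.
PSEUDO_PREFIXES = ("MLC", "PAR", "UNC", "NEWSPAPER", "MICROFILM", "NOT IN LC")

# Rough shape of the classification part of an LC call number:
# one to three letters, then a one-to-four digit number with an
# optional decimal, not followed by another digit or a slash.
LCC_PATTERN = re.compile(
    r"""
    ^\s*
    (?P<letters>[A-Z]{1,3})          # class letters
    \s*
    (?P<number>\d{1,4}(?:\.\d+)?)    # class number, optional decimal
    (?![\d/])                        # rejects forms like 83/5180
    (?P<rest>.*)$                    # cutters, dates, etc. (not validated)
    """,
    re.VERBOSE,
)


def looks_like_lcc(call_number: str) -> bool:
    """True if the string has the shape of a classified LC call number."""
    text = call_number.strip().upper()
    if text.startswith(PSEUDO_PREFIXES):
        return False
    return LCC_PATTERN.match(text) is not None


for sample in ("MLCS 83/5180 (P)", "Microfilm 19072 E", "NOT IN LC",
               "QA76.73.P98 L88 2011"):
    print(sample, looks_like_lcc(sample))
```

With this pattern the four-letter MLCS prefix already fails the
one-to-three-letter class test on its own, so the prefix list is a second line
of defense. An SD shelf number written as letters plus digits would still
pass, because in form it is indistinguishable from the forestry class.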
And the reason this matters is to try and use an LCC to map to a
'discipline' or other broad category, either directly from the LCC schedule
labels, or using a mapping like umich's:
But if it's not really an LCC at all, and you try to map it, you'll get bad
On 3/31/2011 1:03 PM, Jonathan Rochkind wrote:
> Thanks, that looks good!
> It's hosted on Google Code, but I don't think that code is anything
> "Google uses", it looks like it's from our very own Bill Dueber.
> On 3/31/2011 12:38 PM, Tod Olson wrote:
>> Check the regexp that Google uses in their call number normalization:
>> You may want to remove the prefix part, and allow for a fourth cutter.
>> The folks at UNC pointed me to this a few months ago.
>> On Mar 31, 2011, at 11:29 AM, Jonathan Rochkind wrote:
>>> Does anyone have a good regular expression that will match all legal
>>> LC Call Numbers from the LC Classified Schedule, but will generally
>>> not match things that could not possibly be an LC Call Number from
>>> the LC Classified Schedule?
>>> In particular, I need it to NOT match an "MLC" call number, which is
>>> an LC assigned call number that shows up in an 050 with no way to
>>> distinguish based on indicators, but isn't actually from the LC
>>> Schedules. Here's an example of an "MLC" call number:
>>> "MLCS 83/5180 (P)"
>>> Hmm, maybe all MLC call numbers begin with MLC, okay I guess I can
>>> exclude them just like that. But it looks like there are also OTHER
>>> things that can show up in the 050 but aren't actually from the
>>> classified schedule, the OCLC documentation even contains an example
>>> of "Microfilm 19072 E".
>>> What a mess, huh? So, yeah, regex anyone?
>>> [You can probably guess why I care if it's from the LC Classified
>>> Schedule or not].
>> Tod Olson
>> Systems Librarian
>> University of Chicago Library