Except now I wonder if those annoying MLCS call numbers might actually
be properly MATCHED by this regex, when I need them excluded. They are
annoyingly _similar_ to a classified call number. Well, one way to find out.
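
A rough sketch of that kind of check, in Python. To be clear, the pattern
here is NOT the regex linked in this thread; it is a deliberately
simplified stand-in I am assuming for illustration (1-3 class letters, a
1-4 digit class number, an optional decimal, then anything introduced by
a space or period), and the name looks_like_lcc is made up. The sample
call numbers other than the two quoted from this thread are just
illustrative strings.

```python
import re

# Simplified, ASSUMED pattern for a classified LC call number.
# It is not the regex discussed in the thread, only a stand-in.
LC_CALL_NUMBER = re.compile(
    r"""^
    (?!MLC)                    # explicitly reject MLC, MLCS, MLCM, ...
    (?P<letters>[A-Z]{1,3})    # class letters
    \s*
    (?P<digits>\d{1,4})        # class number
    (?P<decimal>\.\d+)?        # optional decimal extension
    (?P<rest>[\s.].*)?         # cutters, dates, etc. (not validated here)
    $""",
    re.VERBOSE,
)


def looks_like_lcc(call_number):
    """Return True if the string has the rough shape of a classified LCC."""
    return LC_CALL_NUMBER.match(call_number.strip()) is not None


if __name__ == "__main__":
    samples = [
        "QA76.73.R83 T46 2009",   # illustrative classified-style number
        "PS3537.A832 Z85",        # illustrative classified-style number
        "MLCS 83/5180 (P)",       # MLC example quoted in this thread
        "Microfilm 19072 E",      # OCLC example quoted in this thread
    ]
    for s in samples:
        print(s, "->", looks_like_lcc(s))
```

With this particular pattern the four-letter "MLCS" prefix already fails
the 1-3 letter limit, so the (?!MLC) lookahead is belt and braces; it only
matters if a bare "MLC 83/..." form ever turns up, which I have not
verified. The real regex may well behave differently, so the same test
strings should be run against it.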
And the reason this matters is to try and use an LCC to map to a
'discipline' or other broad category, either directly from the LCC
schedule labels, or using a mapping like umich's:
But if it's not really an LCC at all, and you try to map it, you'll get
On 3/31/2011 1:03 PM, Jonathan Rochkind wrote:
> Thanks, that looks good!
> It's hosted on Google Code, but I don't think that code is anything
> "Google uses", it looks like it's from our very own Bill Dueber.
> On 3/31/2011 12:38 PM, Tod Olson wrote:
>> Check the regexp that Google uses in their call number normalization:
>> You may want to remove the prefix part, and allow for a fourth cutter.
>> The folks at UNC pointed me to this a few months ago.
>> On Mar 31, 2011, at 11:29 AM, Jonathan Rochkind wrote:
>>> Does anyone have a good regular expression that will match all legal LC
>>> Call Numbers from the LC Classified Schedule, but will generally not
>>> match things that could not possibly be an LC Call Number from the LC
>>> Classified Schedule?
>>> In particular, I need it to NOT match an "MLC" call number, which is an
>>> LC assigned call number that shows up in an 050 with no way to
>>> distinguish based on indicators, but isn't actually from the LC
>>> Schedules. Here's an example of an "MLC" call number:
>>> "MLCS 83/5180 (P)"
>>> Hmm, maybe all MLC call numbers begin with MLC, okay I guess I can
>>> exclude them just like that. But it looks like there are also OTHER
>>> things that can show up in the 050 but aren't actually from the
>>> classified schedule; the OCLC documentation even contains an example of
>>> "Microfilm 19072 E".
>>> What a mess, huh? So, yeah, regex anyone?
>>> [You can probably guess why I care if it's from the LC Classified
>>> Schedule or not].
>> Tod Olson
>> Systems Librarian
>> University of Chicago Library