Thanks, that looks good!
It's hosted on Google Code, but I don't think that code is anything
"Google uses"; it looks like it's from our very own Bill Dueber.
On 3/31/2011 12:38 PM, Tod Olson wrote:
> Check the regexp that Google uses in their call number normalization:
> You may want to remove the prefix part, and allow for a fourth cutter.
> The folks at UNC pointed me to this a few months ago.
> On Mar 31, 2011, at 11:29 AM, Jonathan Rochkind wrote:
>> Does anyone have a good regular expression that will match all legal LC
>> Call Numbers from the LC Classified Schedule, but will generally not
>> match things that could not possibly be an LC Call Number from the LC
>> Classified Schedule?
>> In particular, I need it to NOT match an "MLC" call number, which is an
>> LC assigned call number that shows up in an 050 with no way to
>> distinguish based on indicators, but isn't actually from the LC
>> Schedules. Here's an example of an "MLC" call number:
>> "MLCS 83/5180 (P)"
>> Hmm, maybe all MLC call numbers begin with MLC; okay, I guess I can
>> exclude them just like that. But it looks like there are also OTHER
>> things that can show up in the 050 but aren't actually from the
>> classified schedule; the OCLC documentation even contains an example of
>> "Microfilm 19072 E".
>> What a mess, huh? So, yeah, regex anyone?
>> [You can probably guess why I care if it's from the LC Classified
>> Schedule or not].
> Tod Olson
> Systems Librarian
> University of Chicago Library
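
For anyone finding this thread later, here is a rough Python sketch of the
kind of pattern being asked for. It is NOT the regexp from the Google Code
project mentioned above. It is an illustration built on assumptions: one to
three capital class letters, a class number of one to four digits with an
optional decimal part, up to four cutters (per Tod's suggestion), and
free-form trailing text for dates and volume information. The names
LC_CALL_NUMBER and looks_like_lc_call_number are made up for this example,
and real 050 data will almost certainly need further tuning.

```python
import re

# Hypothetical pattern; the assumed shape of an LC call number is described
# in the comments below and is a simplification, not an official grammar.
LC_CALL_NUMBER = re.compile(
    r"""
    ^\s*
    (?!MLC)                        # explicitly exclude "MLC..." call numbers
    [A-Z]{1,3}                     # class letters, e.g. "QA"
    \s*
    \d{1,4}                        # class number, e.g. "76"
    (?:\.\d+)?                     # optional decimal part, e.g. ".73"
    (?:\s*\.?\s*[A-Z]\d+){0,4}     # up to four cutters, e.g. ".R83 T46"
    (?:\s+\S.*)?                   # free-form trailing text (date, volume)
    \s*$
    """,
    re.VERBOSE,
)


def looks_like_lc_call_number(value: str) -> bool:
    """Return True if value fits the assumed LC call number shape."""
    return LC_CALL_NUMBER.match(value) is not None


if __name__ == "__main__":
    samples = [
        "QA76.73.R83 T46 2009",
        "PS3545.I5365 Z46 1990",
        "MLCS 83/5180 (P)",
        "Microfilm 19072 E",
    ]
    for sample in samples:
        print(sample, "->", looks_like_lc_call_number(sample))
```

The two non-schedule examples from the original question are rejected for
different reasons: "MLCS 83/5180 (P)" fails on the MLC prefix check, and
"Microfilm 19072 E" fails because its capital letters are not followed
directly by a class number.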