At one point, much to my surprise, someone told me that 050 is defined for numbers assigned by LC, not for LCC numbers per se. It doesn't really sound like that from the current definition (http://www.loc.gov/marc/bibliographic/bd050.html), but if you look on the ITS page (http://www.itsmarc.com/crs/edit7592.htm), which I think is not up-to-date, you'll see a discussion of "Pseudo call numbers and other forms of LC call numbers".

As someone pointed out, only a very few classes start with three letters (off the top of my head: a couple in D and a number in K; see http://library.duke.edu/services/instruction/libraryguide/lcclass.html, but there are more in K than are listed there).

The pseudo or shelf numbers I've seen most often in 050 are MLC and SD (which unfortunately is the same as the class for forestry). Look for SD on musical recording records. It used to really mess up the attempts of the catalog where I used to work to facet music CDs on LC class. There were a few other common ones, but I've forgotten them.

Depending on what you're doing, you might try to prefer a call number in 090 if there is one. These are more likely to reflect local preference.

Looking up 090 (http://www.oclc.org/bibformats/en/0xx/090.shtm) produced some other examples of non-LCC 050s: PAR, Newspaper, UNC, or NOT IN LC.

Good luck!

Kelley

***

Except now I wonder if those annoying MLCS call numbers might actually be properly MATCHED by this regex, when I need them excluded. They are annoyingly _similar_ to a classified call number. Well, one way to find out.

And the reason this matters is to try to use an LCC to map to a 'discipline' or other broad category, either directly from the LCC schedule labels, or using a mapping like umich's: http://www.lib.umich.edu/browse/categories/

But if it's not really an LCC at all, and you try to map it, you'll get bad postings.

On 3/31/2011 1:03 PM, Jonathan Rochkind wrote:
>
> Thanks, that looks good!
>
> It's hosted on Google Code, but I don't think that code is anything
> "Google uses", it looks like it's from our very own Bill Dueber.
>
> On 3/31/2011 12:38 PM, Tod Olson wrote:
>>
>> Check the regexp that Google uses in their call number normalization:
>>
>> http://code.google.com/p/library-callnumber-lc/wiki/Home
>>
>> You may want to remove the prefix part, and allow for a fourth cutter.
>>
>> The folks at UNC pointed me to this a few months ago.
>>
>> -Tod
>>
>> On Mar 31, 2011, at 11:29 AM, Jonathan Rochkind wrote:
>>
>>> Does anyone have a good regular expression that will match all legal
>>> LC Call Numbers from the LC Classified Schedule, but will generally
>>> not match things that could not possibly be an LC Call Number from
>>> the LC Classified Schedule?
>>>
>>> In particular, I need it to NOT match an "MLC" call number, which is
>>> an LC assigned call number that shows up in an 050 with no way to
>>> distinguish based on indicators, but isn't actually from the LC
>>> Schedules. Here's an example of an "MLC" call number:
>>>
>>> "MLCS 83/5180 (P)"
>>>
>>> Hmm, maybe all MLC call numbers begin with MLC, okay I guess I can
>>> exclude them just like that. But it looks like there are also OTHER
>>> things that can show up in the 050 but aren't actually from the
>>> classified schedule, the OCLC documentation even contains an example
>>> of "Microfilm 19072 E".
>>>
>>> What a mess, huh? So, yeah, regex anyone?
>>>
>>> [You can probably guess why I care if it's from the LC Classified
>>> Schedule or not].
>>
>> Tod Olson<[log in to unmask]>
>> Systems Librarian
>> University of Chicago Library
>>
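For anyone landing on this thread later, here is a rough Python sketch of the kind of filter being discussed. It is not the regex from library-callnumber-lc; the names LCC_RE, PSEUDO_PREFIXES and looks_like_lcc are made up for illustration, and the pattern rests on a few assumptions: one to three capital class letters (LCC does not use I, O, W, X or Y as a first letter), a class number of one to four digits with an optional decimal, and anything at all after that (cutters, dates). MLC-style numbers are rejected by prefix, as suggested in the original question.

```python
import re

# Hypothetical pattern, written for this example only.
LCC_RE = re.compile(
    r"""
    ^
    (?P<letters>[A-HJ-NP-VZ][A-Z]{0,2})   # 1-3 class letters; first letter never I, O, W, X, Y
    \s*
    (?P<number>\d{1,4}(?:\.\d+)?)         # class number, 1-4 digits, optional decimal part
    (?!\d)                                # reject 5+ digit runs such as "19072"
    (?P<rest>.*)                          # cutters, dates, etc. are not validated here
    $
    """,
    re.VERBOSE,
)

# Shelf-number prefixes to reject outright (assumption: extend as you meet more).
PSEUDO_PREFIXES = ("MLC",)


def looks_like_lcc(call_number: str) -> bool:
    """Return True if the string has the general shape of a classified LC call number."""
    s = call_number.strip()
    if s.upper().startswith(PSEUDO_PREFIXES):
        return False
    return LCC_RE.match(s) is not None


if __name__ == "__main__":
    for value in ("QA76.73.P98 L88 2011", "MLCS 83/5180 (P)",
                  "Microfilm 19072 E", "NOT IN LC", "SD 397 .P55"):
        print(repr(value), looks_like_lcc(value))
```

Note that a shape test like this cannot tell an SD shelf number on a sound recording from a real forestry number in class SD. That case needs something outside the regex, such as checking the record's format before trusting the 050.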