I did some mining of DDC/LCSH correlations from the scriblio.net LC
files (Dec 2006, ~7m records) as an experiment a long time ago; I should
see if I can find the right repo.
I was looking at 1st 6XX <-> 082 associations as evidence for or against
LCSH syndetic structure. I didn't do anything with LCC, as at the time
there wasn't even partial open LCC information available, and the
randomness of the coding of the LCC hierarchy needs the class-specific
decoder rings. I didn't do any intelligent decomposition of subdivisions
on either side, but the results weren't as far off as I thought they would
be.
The classification is supposed to map to the first subject heading, which
simplifies learning. This dataset is very tractable, although binary
classifiers aren't a good fit.
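
As a rough illustration of the first-heading-to-DDC counting described above (not the code from the original experiment), here is a minimal Python sketch. It assumes the records have already been reduced to (first 6XX heading, 082 number) pairs; the function names `correlate` and `suggest` are made up for this example, and the sample pairs simply reuse the Disease management counts quoted further down in this thread.

```python
from collections import Counter, defaultdict


def correlate(pairs):
    """Count how often each DDC number co-occurs with each first subject heading.

    `pairs` is any iterable of (heading, ddc) string tuples. Extracting
    them from MARC records is assumed to have happened elsewhere.
    """
    table = defaultdict(Counter)
    for heading, ddc in pairs:
        table[heading.strip()][ddc.strip()] += 1
    return table


def suggest(table, heading, limit=3):
    """Return up to `limit` (ddc, count) tuples for a heading, most frequent first."""
    counts = table.get(heading)
    if counts is None:
        return []  # heading never seen in the training pairs
    return counts.most_common(limit)


# Illustrative input only: the Disease management counts quoted in this thread.
pairs = (
    [("Disease management", "362.1")] * 4
    + [
        ("Disease management", "610.285"),
        ("Disease management", "615.1"),
        ("Disease management", "615.5071"),
        ("Disease management", "616.89142"),
    ]
)

table = correlate(pairs)
top = suggest(table, "Disease management", limit=1)
```

A ranked list of candidates like this is a suggestion aid for a cataloger rather than an automatic assignment, since one heading often spreads across several Dewey numbers.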
I would suggest that Irina ask someone in Gerhard Weikum's group or any
contacts in the DFKI LT group if they have something that could be tweaked
quickly to learn the structured mappings.
On Wed, Dec 11, 2013 at 10:49 AM, Jonathan Rochkind <[log in to unmask]> wrote:
> Ah right it's ClassificationWeb that has this. Alas, ClassificationWeb is
> both not open (requires a subscription), and also, as far as I know, offers
> no machine API, it's purely manual human access.
> This would definitely be an interesting project for someone to do to
> create a source of open data on LCSH/LCC correlations. Somewhat
> challenging, but definitely not 'rocket science' as they say. I think the
> data would be desirable by many.
> Also, as I mentioned in another email, some but not all LCSH authority
> records already identify suggested corresponding LCCs; LCSH authority
> data can be obtained from id.loc.gov among other places.
> On 12/10/13 5:02 PM, Bryan Baldus wrote:
>> On Tuesday, December 10, 2013 7:18 AM, Irina Arndt wrote:
>>> we would like to add DDC classes to a bunch of MARC records, which
>>> contain only LoC Subject Headings. Does anybody know if a mapping between
>>> LCSH and DDC exists anywhere (and is available)?
>>> I'm thinking of a tool, where I can upload my list of subject headings
>>> and get back a list, where the matching Dewey classes have been added (but
>>> a 'simple' csv file with LCSH terms and DDC classes would be helpful as
>>> well- I am fully aware, that neither LCSH nor DDC are simple at all...) .
>>> Naïve idea...?
>> Classification Web offers a correlations feature between Dewey and the
>> 1st LCSH, based on usage in LC's database (as well as correlations between
>> LCC and LCSH, and DDC and LCC). It is of some use in helping the cataloger
>> determine possible classifications or subject headings to use.
>> Unfortunately, I don't believe ClassWeb is easily accessible by automated
>> processes (even for subscribers). Even if it were, I doubt it is possible
>> to automate a process of assigning Dewey based on 1st LCSH. As mentioned,
>> the 1st LCSH and classification are generally supposed to be
>> similar/linked, but that applies more to LCC/LCSH than DDC to LCSH, due to
>> the way Dewey works. For example, ClassWeb correlation between LCSH Disease
>> management (chosen while looking at Health, then Disease, then looking for
>> an example showing a better variety of Deweys than the 1st 2) shows DDCs
>> used by LC (counts of records in parentheses):
>> Disease management [Topical]
>> 362.1 (4)
>> 610.285 (1)
>> 615.1 (1)
>> 615.5071 (1)
>> 616.89142 (1)
>> That said, as Ed mentioned, given a large set of records for training,
>> you should be able to develop something to help local catalogers determine
>> possible Deweys record-by-record.
>> I hope this helps,
>> Bryan Baldus
>> Senior Cataloger
>> Quality Books Inc.
>> The Best of America's Independent Presses
>> [log in to unmask]