Yeah, Ed! I'm totally looking forward to results. Unlikely as it is, if there's anything I can do.... and I understand about limiting to 650, but ... well, let's see how it goes.

kc

On 12/10/13, 1:37 PM, Edward Summers wrote:
> I was going to try to reduce the space a bit by focusing on 650 fields. Each record with a Dewey number will be a tab separated line, that will include each 650 field in order. So something like:
>
> 305.42/0973 <tab> Women's rights -- United States -- History -- Sources. <tab> Women -- United States -- History -- Sources <tab> Manuscripts, American -- Facsimiles.
>
> I thought it might be a place to start at least … it’s running on an ec2 instance right now :-)
>
> //Ed
>
> On Dec 10, 2013, at 4:26 PM, Karen Coyle wrote:
>
>> I've often thought that this would be an interesting exercise if someone would undertake it.
>>
>> Just a reminder: in theory (IN THEORY) the first subject heading in an LC record is the one most semantically close to the assigned subject classification. So perhaps a first pass with the FIRST 6xx might give a more refined matching. And then it would be interesting to compare that with the results using all 600-651's.
>>
>> kc
>>
>> On 12/10/13, 1:18 PM, Edward Summers wrote:
>>> Not a naive idea at all. If you have the stomach for it, you could extract the Subject Heading / Dewey combinations out of say the LC Catalog MARC data [1] to use as training data for some kind of clustering [2] algorithm. You might even be able to do something simple like keep a count of the Dewey ranges associated with each subject heading.
>>>
>>> I’m kind of curious myself, so I could work on getting the subject heading / dewey combinations if you want?
>>>
>>> //Ed
>>>
>>> [1] https://archive.org/details/marc_records_scriblio_net
>>> [2] https://en.wikipedia.org/wiki/Cluster_analysis
>>>
>>> On Dec 10, 2013, at 8:18 AM, Irina Arndt wrote:
>>>
>>>> Hi CODE4LIB,
>>>>
>>>> we would like to add DDC classes to a bunch of MARC records, which contains only LoC Subject Headings.
>>>> Does anybody know, if a mapping between LCSH and DDC is anywhere existent (and available)?
>>>>
>>>> I understood, that WebDewey http://www.oclc.org/dewey/versions/webdewey.en.html might provide such a service, but
>>>>
>>>> · we are no OCLC customers or subscribers to WebDewey
>>>>
>>>> · even if we were, I'm not sure, if the service matches our needs
>>>>
>>>> I'm thinking of a tool, where I can upload my list of subject headings and get back a list, where the matching Dewey classes have been added (but a 'simple' csv file with LCSH terms and DDC classes would be helpful as well- I am fully aware, that neither LCSH nor DDC are simple at all...) . Naïve idea...?
>>>>
>>>> Thanks for any clues,
>>>> Irina
>>>>
>>>>
>>>> -------
>>>>
>>>> Irina Arndt
>>>> Max Planck Digital Library (MPDL)
>>>> Library System Coordinator
>>>> Amalienstr. 33
>>>> D-80799 Muenchen, Germany
>>>>
>>>> Tel. +49 89 38602-254
>>>> Fax +49 89 38602-290
>>>>
>>>> http://www.mpdl.mpg.de
>>
>> --
>> Karen Coyle
>> http://kcoyle.net
>> m: 1-510-435-8234
>> skype: kcoylenet
>
>

--
Karen Coyle
http://kcoyle.net
m: 1-510-435-8234
skype: kcoylenet
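
For anyone following along, here is a rough sketch of the "keep a count of the Dewey ranges associated with each subject heading" idea, run against the tab-separated layout Ed describes (Dewey number first, then each 650 heading in order). Nothing here is Ed's actual code: the function names, the choice to collapse a Dewey number to its three-digit class, and the `first_only` switch (a stand-in for Karen's "first heading only" pass) are all assumptions made for illustration, and the second and third sample lines are invented.

```python
from collections import Counter, defaultdict


def dewey_class(number):
    """Reduce a Dewey number such as '305.42/0973' to its three-digit class ('305').

    Assumption: the three-digit class is a useful 'range'; other groupings are possible.
    """
    return number.strip().split(".")[0].split("/")[0][:3]


def count_dewey_by_heading(lines, first_only=False):
    """Tally Dewey classes per subject heading.

    Each line is expected to look like: dewey <tab> heading <tab> heading ...
    With first_only=True only the first heading on each line is counted.
    """
    counts = defaultdict(Counter)
    for line in lines:
        fields = [f.strip() for f in line.rstrip("\n").split("\t")]
        if len(fields) < 2 or not fields[0]:
            continue  # skip lines with no Dewey number or no headings
        headings = [h for h in fields[1:] if h]
        if first_only:
            headings = headings[:1]
        for heading in headings:
            counts[heading][dewey_class(fields[0])] += 1
    return counts


# First line is the example from the thread; the other two are made up.
sample = [
    "305.42/0973\tWomen's rights -- United States -- History -- Sources."
    "\tWomen -- United States -- History -- Sources"
    "\tManuscripts, American -- Facsimiles.",
    "323.34\tWomen's rights -- United States -- History -- Sources.",
    "305.4\tWomen's rights -- United States -- History -- Sources.",
]

all_counts = count_dewey_by_heading(sample)
first_counts = count_dewey_by_heading(sample, first_only=True)

heading = "Women's rights -- United States -- History -- Sources."
print(all_counts[heading].most_common())
```

Comparing `all_counts` with `first_counts` on real data would be one way to test whether the first heading really does track the classification more closely than the rest.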