Nice, that might be what I need. Maybe I'll take a look at the LibX
code; it's open source, right? Google Scholar has no API--you're screen
scraping it?

Jonathan

Godmar Back wrote:
> A year or so ago a couple of students looked into this for LibX. There
> are a number of systems that people have published about, although
> some are not available and none worked very well or were easy to get
> to work. The systems also varied in their computational complexity,
> with some not suitable for interactive use. Google for "libx citation
> sensing", or generally for citation extraction, automatic record
> boundary detection or extraction. (Unfortunately, pubs.dlib.vt.edu
> appears to be down at the moment - otherwise, Suresh Menon's report
> contains a useful bibliography of work. I'll ping them.)
>
> For citations that contain item titles (which is true for a majority,
> but definitely not all, citation styles) LibX's magic button uses
> Scholar as a hidden backend to produce an actionable OpenURL. Combined
> with a similarity analysis, this "magic button" functionality
> produces a usable OpenURL in (on average) 81% of cases for a set of
> 400 randomly chosen citations from 4 widely read journals from 4
> different areas published in 2006 [1]. With some fixes, we could
> probably get this number up to 90%. Obviously, this approach only
> works for individual use; Google would object to large-scale batch
> use.
>
> - Godmar
>
> [1] Annette Bailey and Godmar Back, Retrieving Known Items with LibX.
> The Serials Librarian, 2007. To appear.
>
> On 7/17/07, Jonathan Rochkind wrote:
>> Does anyone have any decent open source code to parse a citation? I'm
>> talking about a completely narrative citation like someone might
>> cut-and-paste from a bibliography or web page. I realize there are a
>> number of different formats this could be in (not to mention the
>> errors that always occur in human-entered free text)--but
>> thinking about it, I suspect that with some work you could get something
>> that worked reasonably well (if not perfectly). So I'm wondering if anyone
>> has done this work.
>>
>> (One of the commercial legal products--I forget if it's Lexis or
>> West--does this with legal citations--a more limited domain--quite
>> well. I'm not sure if any of the commercial bibliographic citation
>> management software does this?)
>>
>> The goal, as you can probably guess, is a box that the user can paste a
>> citation into; make an OpenURL out of it; show the user where to get the
>> cited item. I'm pretty confident something useful could be created here,
>> with enough time put into it. But sadly, it's probably more time than
>> anyone has individually. Unless someone's done it already?
>>
>> Hopefully,
>> Jonathan

--
Jonathan Rochkind
Sr. Programmer/Analyst
The Sheridan Libraries
Johns Hopkins University
410.516.8886
rochkind (at) jhu.edu