Nice, that might be what I need. Maybe I'll take a look at the LibX
code, it's open source, right?
Google Scholar has no API--you're screen scraping it?
Jonathan
Godmar Back wrote:
> A year or so ago a couple of students looked into this for LibX. There
> are a number of systems that people have published about, although
> some are not available and none worked very well or were easy to get
> to work. The systems also varied in their computational complexity,
> with some not suitable for interactive use. Google for "libx citation
> sensing", or generally for citation extraction, automatic record
> boundary detection or extraction. (Unfortunately, pubs.dlib.vt.edu
> appears to be down at the moment - otherwise, Suresh Menon's report
> contains a useful bibliography of work. I'll ping them.)
>
> For citations that contain item titles (which is true for a majority,
> but definitely not all citation styles) LibX's magic button uses
> Scholar as a hidden backend to produce an actionable OpenURL. Combined
> with a similarity analysis, this "magic button" functionality
> produces a usable OpenURL in (on average) 81% of cases for a set of
> 400 randomly chosen citations from 4 widely read journals from 4
> different areas published in 2006 [1]. With some fixes, we could
> probably get this number up to 90%. Obviously, this approach only
> works for individual use; Google would object to large-scale batch
> use.
>
> - Godmar
>
> [1] Annette Bailey and Godmar Back, Retrieving Known Items with LibX.
> The Serials Librarian, 2007. To appear.
>
> On 7/17/07, Jonathan Rochkind <[log in to unmask]> wrote:
>> Does anyone have any decent open source code to parse a citation? I'm
>> talking about a completely narrative citation like someone might
>> cut-and-paste from a bibliography or web page. I realize there are a
>> number of different formats this could be in (not to mention the human
>> error problems that always occur with human-entered free text)--but
>> thinking about it, I suspect that with some work you could get something
>> that worked reasonably well (if not perfect). So I'm wondering if anyone
>> has done this work.
>>
>> (One of the commercial legal products--I forget if it's Lexis or
>> West--does this with legal citations--a more limited domain--quite
>> well. I'm not sure if any of the commercial bibliographic citation
>> management software does this?)
>>
>> The goal, as you can probably guess, is a box that the user can paste a
>> citation into; make an OpenURL out of it; show the user where to get the
>> citation. I'm pretty confident something useful could be created here,
>> with enough time put into it. But sadly, it's probably more time than
>> anyone has individually. Unless someone's done it already?
>>
>> Hopefully,
>> Jonathan
>>
>
--
Jonathan Rochkind
Sr. Programmer/Analyst
The Sheridan Libraries
Johns Hopkins University
410.516.8886
rochkind (at) jhu.edu
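
The approach discussed in this thread (look up candidate records for a pasted citation, keep one only if its title closely matches the citation text, then build an OpenURL from it) can be sketched roughly as follows. This is a minimal illustration, not LibX's actual code: the resolver address, the function names, the 0.9 threshold, and the use of Python's difflib as the "similarity analysis" are all assumptions made for the example. The candidate records are assumed to come from some search backend that is not shown here.

```python
import re
from difflib import SequenceMatcher
from urllib.parse import urlencode

# Hypothetical resolver address; substitute your institution's link resolver.
RESOLVER_BASE = "https://resolver.example.edu/openurl"


def normalize(text):
    """Lowercase and reduce punctuation/whitespace runs to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def title_similarity(citation, candidate_title):
    """Best match (0.0 to 1.0) between the title and any same-length
    run of words inside the pasted citation."""
    words = normalize(citation).split()
    title = normalize(candidate_title)
    n = len(title.split())
    if n == 0 or not words:
        return 0.0
    if len(words) < n:
        return SequenceMatcher(None, " ".join(words), title).ratio()
    windows = (" ".join(words[i:i + n]) for i in range(len(words) - n + 1))
    return max(SequenceMatcher(None, w, title).ratio() for w in windows)


def build_openurl(record, base=RESOLVER_BASE):
    """Build an OpenURL 1.0 (KEV, journal format) link from a record dict."""
    params = {
        "url_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
        "rft.genre": "article",
    }
    for key in ("atitle", "jtitle", "aulast", "date"):
        if record.get(key):
            params["rft." + key] = record[key]
    return base + "?" + urlencode(params)


def citation_to_openurl(citation, candidates, threshold=0.9):
    """Return an OpenURL for the best-matching candidate, or None if no
    candidate title is similar enough to the citation text.
    The threshold is an illustrative guess, not a tuned value."""
    if not candidates:
        return None
    scored = [(title_similarity(citation, c.get("atitle", "")), c)
              for c in candidates]
    best_score, best = max(scored, key=lambda pair: pair[0])
    return build_openurl(best) if best_score >= threshold else None


if __name__ == "__main__":
    pasted = ("Annette Bailey and Godmar Back, Retrieving Known Items "
              "with LibX. The Serials Librarian, 2007.")
    found = [{"atitle": "Retrieving Known Items with LibX",
              "jtitle": "The Serials Librarian",
              "aulast": "Bailey", "date": "2007"}]
    print(citation_to_openurl(pasted, found))
```

Comparing the title against word windows of the citation, rather than against the whole string, keeps author names and journal details from dragging the score down. As noted in the thread, this only helps for citation styles that include the item title.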