Ha! If it's not too difficult, then with all the time you've spent
"looking at it extensively", how come you don't have a solution yet?

Just kidding. :)

Jonathan

Nathan Vack wrote:
> We've looked at this pretty extensively, and we're pretty certain
> there's nothing downloadable that does a "good enough" job. However,
> it's by no means impossible -- it seems to be undergrad thesis-level
> work in Singapore:
>
> http://wing.comp.nus.edu.sg/parsCit/
>
> There used to be a paper describing this approach (essentially
> treating citation parsing as a natural language processing task and
> using a maximum entropy algorithm) online... the page even cites
> it... but it seems to be gone now.
>
> FWIW, it didn't look too difficult.
>
> -Nate
>
> On Jul 17, 2007, at 6:16 PM, Jonathan Rochkind wrote:
>
>> Does anyone have any decent open source code to parse a citation? I'm
>> talking about a completely narrative citation like someone might
>> cut-and-paste from a bibliography or web page. I realize there are a
>> number of different formats this could be in (not to mention the
>> human error problems that always occur with human-entered free
>> text)--but thinking about it, I suspect that with some work you could
>> get something that worked reasonably well (if not perfectly). So I'm
>> wondering if anyone has done this work.
>>
>> (One of the commercial legal products--I forget if it's Lexis or
>> West--does this with legal citations--a more limited domain--quite
>> well.  I'm not sure if any of the commercial bibliographic citation
>> management software does this?)
>>
>> The goal, as you can probably guess, is a box that the user can
>> paste a citation into; make an OpenURL out of it; show the user
>> where to get the citation.  I'm pretty confident something useful
>> could be created here, with enough time put into it. But sadly, it's
>> probably more time than anyone has individually. Unless someone's
>> done it already?
>>
>> Hopefully,
>> Jonathan
>>
>
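
For what it's worth, the second half of the goal quoted above (turning
an already-parsed citation into an OpenURL) is the easy part. Below is a
minimal sketch in Python, assuming the hard part (parsing) has already
produced a dict of fields. The resolver address, the function name
build_openurl, and the sample citation are all made up for illustration;
the key names follow the OpenURL 1.0 (Z39.88-2004) key/encoded-value
journal format.

```python
from urllib.parse import urlencode

# Hypothetical resolver address; substitute your own link resolver.
RESOLVER_BASE = "https://resolver.example.edu/openurl"


def build_openurl(citation, base=RESOLVER_BASE):
    """Build an OpenURL 1.0 KEV query for a journal article.

    `citation` maps journal-format field names (atitle, jtitle,
    aulast, date, volume, spage, ...) to strings. Empty or missing
    fields are left out of the URL.
    """
    params = {
        "url_ver": "Z39.88-2004",
        "ctx_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
    }
    for key, value in citation.items():
        if value:
            params["rft." + key] = value
    # urlencode percent-escapes spaces, ampersands, etc. in the values.
    return base + "?" + urlencode(params)


# Made-up citation fields, standing in for the output of a parser.
example = {
    "aulast": "Doe",
    "atitle": "An Example Article",
    "jtitle": "Journal of Examples",
    "date": "2007",
    "volume": "12",
    "spage": "34",
}
print(build_openurl(example))
```

Getting from the pasted free text to that dict is the part that needs
the real work.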

--
Jonathan Rochkind
Sr. Programmer/Analyst
The Sheridan Libraries
Johns Hopkins University
410.516.8886
rochkind (at) jhu.edu