Ha! If it's not too difficult, then with all the time you've spent
"looking at it extensively", how come you don't have a solution yet?
Just kidding. :)
Nathan Vack wrote:
> We've looked at this pretty extensively, and we're pretty certain
> there's nothing downloadable that does a "good enough" job. However,
> it's by no means impossible -- it seems to be undergrad thesis-level
> work in Singapore:
> There used to be a paper describing this approach (essentially
> treating citation parsing as a natural language processing task and
> using a maximum entropy algorithm) online... the page even cites
> it... but it seems to be gone now.
> FWIW, it didn't look too difficult.
> On Jul 17, 2007, at 6:16 PM, Jonathan Rochkind wrote:
>> Does anyone have any decent open source code to parse a citation? I'm
>> talking about a completely narrative citation like someone might
>> cut-and-paste from a bibliography or web page. I realize there are a
>> number of different formats this could be in (not to mention the human
>> error problems that always occur with human-entered free text)--but
>> thinking about it, I suspect that with some work you could get something
>> that worked reasonably well (if not perfectly). So I'm wondering if
>> anyone has done this work.
>> (One of the commercial legal products--I forget if it's Lexis or
>> West--does this with legal citations--a more limited domain--quite
>> well. I'm not sure if any of the commercial bibliographic citation
>> management software does this?)
>> The goal, as you can probably guess, is a box that the user can paste a
>> citation into; make an OpenURL out of it; show the user where to get the
>> citation. I'm pretty confident something useful could be created
>> with enough time put into it. But sadly, it's probably more time than
>> anyone has individually. Unless someone's done it already?
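
To make the "paste box" goal above concrete, here is a minimal sketch of the
easy half: turning already-separated citation fields into an OpenURL 1.0
query string. The parsing step is a naive regular expression that only
handles one citation style (Last, F. (Year). Title. Journal, Vol(Issue),
pages.), which is an assumption made for illustration; it is not the maximum
entropy approach mentioned in the thread, and it would not cope with the
range of formats discussed. The resolver address and the function name are
hypothetical.

```python
import re
from urllib.parse import urlencode

# Hypothetical resolver address; substitute your institution's link resolver.
RESOLVER_BASE = "https://resolver.example.edu/openurl"

# Naive pattern for ONE citation style only (an illustrative assumption):
#   Last, F. (Year). Article title. Journal Name, Volume(Issue), Start-End.
CITATION_RE = re.compile(
    r"^(?P<aulast>[^,]+),\s*(?P<auinit>[A-Z][A-Z. ]*?)\s*"
    r"\((?P<date>\d{4})\)\.\s*"
    r"(?P<atitle>.+?)\.\s*"
    r"(?P<jtitle>[^,]+),\s*"
    r"(?P<volume>\d+)(?:\((?P<issue>\d+)\))?,\s*"
    r"(?P<spage>\d+)(?:\s*-\s*(?P<epage>\d+))?\.?\s*$"
)


def citation_to_openurl(citation, base=RESOLVER_BASE):
    """Return an OpenURL for the citation, or None if the pattern fails."""
    match = CITATION_RE.match(citation.strip())
    if match is None:
        return None  # unrecognized style; a real tool needs a fallback here

    # Fixed keys describing an OpenURL 1.0 (Z39.88-2004) journal article.
    params = [
        ("url_ver", "Z39.88-2004"),
        ("ctx_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),
        ("rft.genre", "article"),
    ]
    # Add whichever citation fields the pattern captured, skipping empty ones.
    for key, value in match.groupdict().items():
        if value:
            params.append(("rft." + key, value.strip()))

    return base + "?" + urlencode(params)


if __name__ == "__main__":
    sample = ("Smith, J. (2005). Parsing free-text citations. "
              "Journal of Examples, 12(3), 45-67.")
    print(citation_to_openurl(sample))
```

Returning None on a failed match is deliberate: the hard part of the problem
is exactly the citations this pattern rejects, so a caller can see where a
smarter parser would have to take over.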
The Sheridan Libraries
Johns Hopkins University
rochkind (at) jhu.edu