It's on our list of Big Problems To Solve; I'm hoping to have time to
tackle it later this year :)
On Jul 18, 2007, at 12:57 PM, Jonathan Rochkind wrote:
> Ha! If it's not too difficult, then with all the time you've spent
> "looking at it extensively", how come you don't have a solution yet?
> Just kidding. :)
> Nathan Vack wrote:
>> We've looked at this pretty extensively, and we're pretty certain
>> there's nothing downloadable that does a "good enough" job. However,
>> it's by no means impossible -- it seems to be undergrad thesis-level
>> work in Singapore:
>> There used to be a paper describing this approach (essentially
>> treating citation parsing as a natural language processing task and
>> using a maximum entropy algorithm) online... the page even cites
>> it... but it seems to be gone now.
>> FWIW, it didn't look too difficult.
>> On Jul 17, 2007, at 6:16 PM, Jonathan Rochkind wrote:
>>> Does anyone have any decent open source code to parse a citation? I'm
>>> talking about a completely narrative citation like someone might
>>> cut-and-paste from a bibliography or web page. I realize there are a
>>> number of different formats this could be in (not to mention the
>>> error problems that always occur from human-entered free text)--but
>>> thinking about it, I suspect that with some work you could get
>>> something that worked reasonably well (if not perfectly). So I'm
>>> wondering if anyone has done this work.
>>> (One of the commercial legal products--I forget if it's Lexis or
>>> West--does this with legal citations--a more limited domain--quite
>>> well. I'm not sure if any of the commercial bibliographic citation
>>> management software does this?)
>>> The goal, as you can probably guess, is a box that the user can
>>> paste a citation into; make an OpenURL out of it; show the user
>>> where to get the citation. I'm pretty confident something useful
>>> could be created with enough time put into it. But sadly, it's
>>> probably more time than anyone has individually. Unless someone's
>>> done it already?
> Jonathan Rochkind
> Sr. Programmer/Analyst
> The Sheridan Libraries
> Johns Hopkins University
> rochkind (at) jhu.edu
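
For what it's worth, here is a minimal sketch of the "paste a citation, get an OpenURL" box described above. It is not the maximum entropy approach mentioned earlier in the thread, and it is nobody's actual implementation: it is a single regular expression that recognizes one journal-citation layout only, which is about the simplest thing that could work. The resolver address, the function names, and the sample citation are all made up for illustration. The OpenURL keys are the standard KEV journal-format ones.

```python
import re
from urllib.parse import urlencode

# Hypothetical resolver address; substitute your own link resolver.
RESOLVER_BASE = "https://resolver.example.edu/openurl"

# Recognizes exactly one citation layout (an assumption, not a general parser):
#   Last, F. (Year). Article title. Journal Name, Vol(Issue), Start-End.
CITATION_RE = re.compile(
    r"^(?P<aulast>[^,]+),\s*(?P<initials>[^()]+?)\s*"
    r"\((?P<date>\d{4})\)\.\s*"
    r"(?P<atitle>.+?)\.\s+"
    r"(?P<jtitle>[^,]+),\s*"
    r"(?P<volume>\d+)(?:\((?P<issue>\d+)\))?,\s*"
    r"(?P<spage>\d+)\s*-\s*(?P<epage>\d+)\.?\s*$"
)


def parse_citation(text):
    """Return a dict of citation fields, or None if the layout is not recognized."""
    match = CITATION_RE.match(text.strip())
    if match is None:
        return None
    # Drop the initials (not sent to the resolver here) and any empty groups.
    return {k: v for k, v in match.groupdict().items() if v and k != "initials"}


def to_openurl(fields, base=RESOLVER_BASE):
    """Build an OpenURL 1.0 key/value query string for a journal article."""
    params = [
        ("url_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),
        ("rft.genre", "article"),
    ]
    params += [("rft." + key, value) for key, value in fields.items()]
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    # Made-up citation, used only to exercise the sketch.
    sample = "Doe, J. (2005). Parsing free text citations. Journal of Examples, 12(3), 45-67."
    fields = parse_citation(sample)
    print(to_openurl(fields) if fields else "Could not parse citation")
```

Anything that does not fit the one layout comes back as None, which is exactly the gap a statistical parser would have to fill: the different formats and the human-entered errors are where the real work is.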