Ha! If it's not too difficult, then with all the time you've spent
"looking at it extensively", how come you don't have a solution yet?

Just kidding. :)

Jonathan

Nathan Vack wrote:
> We've looked at this pretty extensively, and we're pretty certain
> there's nothing downloadable that does a "good enough" job. However,
> it's by no means impossible -- it seems to be undergrad thesis-level
> work in Singapore:
>
> http://wing.comp.nus.edu.sg/parsCit/
>
> There used to be a paper describing this approach (essentially
> treating citation parsing as a natural language processing task and
> using a maximum entropy algorithm) online... the page even cites
> it... but it seems to be gone now.
>
> FWIW, it didn't look too difficult.
>
> -Nate
>
> On Jul 17, 2007, at 6:16 PM, Jonathan Rochkind wrote:
>
>> Does anyone have any decent open source code to parse a citation? I'm
>> talking about a completely narrative citation like someone might
>> cut-and-paste from a bibliography or web page. I realize there are a
>> number of different formats this could be in (not to mention the human
>> error problems that always occur from human-entered free text)--but
>> thinking about it, I suspect that with some work you could get
>> something that worked reasonably well (if not perfectly). So I'm
>> wondering if anyone has done this work.
>>
>> (One of the commercial legal products--I forget if it's Lexis or
>> West--does this with legal citations--a more limited domain--quite
>> well. I'm not sure if any of the commercial bibliographic citation
>> management software does this?)
>>
>> The goal, as you can probably guess, is a box that the user can
>> paste a citation into; make an OpenURL out of it; show the user where
>> to get the citation. I'm pretty confident something useful could be
>> created here, with enough time put into it. But sadly, it's probably
>> more time than anyone has individually. Unless someone's done it
>> already?
>>
>> Hopefully,
>> Jonathan
>>
>

-- 
Jonathan Rochkind
Sr. Programmer/Analyst
The Sheridan Libraries
Johns Hopkins University
410.516.8886
rochkind (at) jhu.edu