A suggestion: you might also want to add Biblio-Citation-Parser by Mike Jewell
(http://search.cpan.org/~mjewell/Biblio-Citation-Parser-1.10/).

Steve

On Tue, Sep 16, 2008 at 11:11 AM, Miriam Goldberg <[log in to unmask]> wrote:
> Thanks for pointing out these other parsing tools. I've added them to
> the list on our website (see under heading "Other Citation Tools" at
> http://freecite.library.brown.edu/).
>
> Citation metadata extraction is a difficult open problem whose
> potential solutions are based on continually-developing technologies.
> So I think it's important that we approach this task from many diverse
> angles. If our project makes a little headway here, ParsCit makes some
> headway there, and five other groups make their own advancements,
> hopefully we'll be able to pool our findings into a viable
> application.
>
> > Anyone want to compare and contrast these three projects? Might make a
> > good very short article/review for the Code4Lib Journal if you wanted to.
>
> Agreed. I'd love to see this. Another idea might be to write an
> application that takes the output of multiple parsers and assembles
> the best answer.
>
> On Fri, Sep 12, 2008 at 3:50 PM, Jonathan Rochkind <[log in to unmask]> wrote:
> > This is the third open source citation parser I know of now. A welcome
> > change from a year ago when I needed one and didn't know of any! But I
> > can't help but think maybe people should be cooperating more instead of
> > engineering their own wheels. Also curious if anyone has looked at all
> > three and can compare and contrast and make a recommendation.
> >
> > The other two I know about are:
> >
> > ParsCit -- http://wing.comp.nus.edu.sg/parsCit/
> > A CDL project I don't have a good home page for, but code is here:
> > http://gales.cdlib.org/~egh/hmm-citation-extractor/
> >
> > I've been keeping track because I have a use for this, although haven't
> > had time to make use of any of them yet.
> >
> > Anyone want to compare and contrast these three projects? Might make a
> > good very short article/review for the Code4Lib Journal if you wanted to.
> >
> > Jonathan
> >
> >
> >>>> jean rainwater <[log in to unmask]> 09/12/08 2:25 PM >>>
> > Please help us beta test "FreeCite", a new citation parser for
> > non-structured bibliographic data. FreeCite is the result of
> > collaboration between the Brown University Library and Public Display,
> > a Providence-based software company founded by and employing many
> > Brown grads. Public Display's core business is information
> > extraction. Partial funding for this project was provided by the
> > Andrew W. Mellon Foundation.
> >
> > FreeCite is implemented in Ruby on Rails and uses the CRF++ library
> > implementation of conditional random fields. The model is trained on
> > the CORA dataset with lexical augmentation from the Directory of
> > Research and Researchers at Brown (DRR-B). The API and code are
> > available at: http://freecite.library.brown.edu.
> >
> > Jean Rainwater
> > Co-Leader, Integrated Technology Services
> > Brown University Library
> > Providence, RI 02912
> > 401.863.9031
> > [log in to unmask]