Dear all:
The ParsCit team has been updating the ParsCit package and is happy to
announce a new version that improves classification accuracy. This
version also adds a fully integrated module for document logical
structure parsing, in which each line of the input is classified into
one of 23 logical structure categories (e.g., page number, title,
section header, figure, table, figureCaption). This structure can be
extracted from either plain text or the XML output files of an OCR
engine. The version also benefits from a number of user-contributed
fixes and training data.
You can either download a copy of ParsCit for your own use or use it
through a web services interface. We welcome your feedback and hope
that, if you use ParsCit or any other freely available reference
string parsing tool, you can contribute annotated data to help make
these models more robust.
ParsCit and its online demos are available from:
http://wing.comp.nus.edu.sg/parsCit/
Current Distribution: http://wing.comp.nus.edu.sg/parsCit/parscit-100401.zip
Cheers,
Min