Dear all:
The ParsCit team has been updating the ParsCit package and is happy
to announce a new version that improves classification accuracy,
especially for general science journals. This version also adds a
module that processes the XML output of the commercial OmniPage OCR
engine. It further benefits from a number of user-contributed fixes
and training data, including the separation of volume and issue
numbers for journals and the export of parsed reference strings to
EndNote, MODS, BibTeX or other metadata formats via the BiblioScript
library.
You can either download a copy of ParsCit for your own use or use it
through a web services interface. We welcome your feedback, and we
hope that if you use ParsCit or any other freely available reference
string parsing tool, you will contribute annotated data to help make
these models more robust.
ParsCit and its online demos are available from:
http://wing.comp.nus.edu.sg/parsCit/
ParsCit is open source software used by many projects worldwide, not
only in experimental, research and academic settings but in commercial
enterprises as well. Mendeley uses ParsCit to parse references from
contributed papers, as does the Citations in Economics (CitEc)
project.
CHANGELOG:
http://wing.comp.nus.edu.sg/parsCit/CHANGELOG.txt
The bleeding-edge, unsupported alpha version (with development bugs)
is also available from GitHub:
http://github.com/knmnyn/parsCit
Regards,
Huy
Research Assistant
for the ParsCit team at the Web IR / NLP Group (WING) of the National
University of Singapore (NUS), headed by Min-Yen Kan