I worked a lot with GATE in a previous position (not in a library, but in a
research position at the Univ. of Texas at Austin). It's handy in that
there is both a UI version (GATE Developer) and a set of APIs (GATE
Embedded); those are the only two versions I worked with. Also nice is the
reasonably good documentation from the Univ. of Sheffield, including some
basic video tutorials and slides from recent training courses that you can
step through.

Pretty much all the standard text-mining tools can be accessed through
GATE, by creating a pipeline that incorporates the tools you need. There
are also some default machine learning options if you don't want to roll
your own. There's even a UIMA plug-in if you'd like to use it inside a GATE



> There have been some great software recommendations in this thread that I
> really don't want to quibble with. What I'd like to quibble with is the
> software-first approach. We've all tried the software-first approach; how
> many of us were happy with it?
> There is a standard in this area and that standard appears to have at
> least two non-trivial implementations, including from one software
> distributor whose name we all recognise.
> SPEC:**uima/v1.0/uima-v1.0.html
> Anyone have experience using the standard or these two implementations?
> cheers
> stuart
