I worked a lot with GATE in a previous position (not in a library, but in a research position at the Univ. of Texas at Austin). It's handy in that there is both a UI version (GATE Developer) and a set of APIs (GATE Embedded); those two are the only versions I worked with. Also nice is the reasonably good documentation from the Univ. of Sheffield (http://gate.ac.uk/), including some basic video tutorials and slides from recent training courses that you can step through (http://gate.ac.uk/wiki/TrainingCourseJune2013/).

Pretty much all the standard text-mining tools can be accessed through GATE by creating a pipeline that incorporates the tools you need. There are also some default machine learning options if you don't want to roll your own. There's even a UIMA plug-in if you'd like to use UIMA inside a GATE pipeline.

Danielle

--
Danielle Cunniff Plumer
dcplumer associates
www.dcplumer.com

On Tue, Aug 27, 2013 at 5:15 PM, stuart yeates wrote:

> There have been some great software recommendations in this thread that I
> really don't want to quibble with. What I'd like to quibble with is the
> software-first approach. We've all tried the software-first approach; how
> many of us were happy with it?
>
> There is a standard in this area, and that standard appears to have at
> least two non-trivial implementations, including one from a software
> distributor whose name we all recognise.
>
> SPEC: http://docs.oasis-open.org/uima/v1.0/uima-v1.0.html
> APACHE UIMA: http://uima.apache.org/
> GATE: http://gate.ac.uk/
>
> Anyone have experience using the standard or these two implementations?
>
> cheers
> stuart
>
> --
> Stuart Yeates
> Library Technology Services http://www.victoria.ac.nz/library/