Do any of these work in Hadoop using MapReduce as a programming model? It seems like text mining and analysis would be a natural use case for Hadoop.

Alan

On Aug 27, 2013, at 7:44 PM, "Riley, Jenn" <[log in to unmask]> wrote:

> This is still command-line, but Mallet is heavily used in the DH
> community: http://mallet.cs.umass.edu/. I think MONK
> (http://monkproject.org/) has a UI, but I'm not overly familiar with its
> features.
> 
> Jenn
> 
> --------------------------------
> Jenn Riley
> Head, Carolina Digital Library and Archives
> The University of North Carolina at Chapel Hill
> http://cdla.unc.edu/
> http://www.lib.unc.edu/users/jlriley
> 
> [log in to unmask]
> (919) 843-5910
> 
> 
> 
> 
> 
> On 8/27/13 11:24 AM, "Eric Lease Morgan" <[log in to unmask]> wrote:
> 
>> What sorts of text mining software do y'all support / use in your
>> libraries?
>> 
>> We here in the Hesburgh Libraries at the University of Notre Dame have
>> all but opened a place called the Center For Digital Scholarship. We are
>> / will be providing a number of different services to a number of
>> different audiences. These services include but are not necessarily
>> limited exactly to:
>> 
>> * data management consultation
>> * data analysis and visualization
>> * geographic information systems support
>> * text mining investigations
>> * referrals to other "centers" across campus
>> 
>> I am expected to support the text mining investigations. I have
>> traditionally used open source tools to do my work. Many of these tools
>> require some sort of programming in order to exploit. To some degree I am
>> expected to mount text mining software on our local Windows and Macintosh
>> computers here in our Center. I am familiar with the lists of tools
>> available at Bamboo as well as Hermeneuti.ca. [0, 1] TAPoRware is good
>> too, but a bit long in the tooth. [2]
>> 
>> Do you know of other sets of tools to choose from? Are you familiar with
>> SAS® Text Analytics, STATISTICA Data Miner, or RapidMiner? [3, 4, 5]
>> 
>> [0] Bamboo Dirt - http://dirt.projectbamboo.org
>> [1] Hermeneuti.ca - http://hermeneuti.ca/voyeur/tools
>> [2] TAPoRware - http://taporware.ualberta.ca
>> [3] Text Analytics - http://www.sas.com/text-analytics/
>> [4] Data Miner - http://www.statsoft.com/Products/STATISTICA/Data-Miner/
>> [5] RapidMiner - http://rapid-i.com/content/view/181/190/
>> 
>> --
>> Eric Lease Morgan, Digital Initiatives Librarian
>> Hesburgh Libraries
>> University of Notre Dame
>> 
>> 574/631-8604