What they said - +1!
From: Code for Libraries [mailto:[log in to unmask]] On Behalf Of Dhanushka Samarakoon
Sent: Saturday, November 11, 2017 3:17 PM
To: [log in to unmask]
Subject: Re: [CODE4LIB] hands-on workshop on natural language processing & text mining
I'm interested too.
On Thu, Nov 9, 2017 at 11:45 AM, Haitz, Lisa (haitzlm) < [log in to unmask]> wrote:
> I love it!
> On 11/9/17, 1:13 PM, "Code for Libraries on behalf of Eric Lease
> Morgan" < [log in to unmask] on behalf of [log in to unmask]> wrote:
> I’m thinking about a hands-on workshop on natural language
> processing & text mining, below, and your feedback is desired. —ELM
> Natural language processing & text mining using freely available
> tools: "No programming necessary"
> This text outlines a hands-on natural language processing & text
> mining workshop.
> It is possible to do simple & rudimentary natural language
> processing & text mining with a set of freely available tools. No
> programming is necessary. This workshop facilitates hands-on exercises
> demonstrating how this can be done. By participating in this workshop,
> students & researchers will be able to:
> * identify patterns, anomalies, and trends in their texts
> * practice both "distant" and "scalable" reading
> * enhance & complement their ability to do "close" reading
> * use & understand a corpus of poetry or prose at scale
> Activities in the workshop include:
> * learning what natural language processing is, and why you
> should care
> * articulating a research question
> * creating a corpus
> * creating a plain text version of a corpus with Tika 
> * using Voyant Tools to do some "distant" reading
> * using a concordance (AntConc) to facilitate searching keywords
> in context 
> * creating a simple word list with a text editor
> * cleaning & analyzing word lists with OpenRefine 
> * charting & graphing word lists with Tableau Public 
> * increasing meaning by extracting parts-of-speech with the
> Stanford POS Tagger
> * increasing meaning by extracting named entities with the
> Stanford NER
> * identifying themes and clustering documents using MALLET 
> Anybody with sets of texts can benefit from this workshop. Any
> corpus of textual content is apropos: journal articles, books, the
> complete run of a magazine, blog postings, Tweets, press releases,
> conference proceedings, websites, poetry, etc. This workshop is
> computer (Windows, Linux, Macintosh) agnostic. All the software used
> in this workshop is freely available on the 'Net, or it is already
> installed on one's computer.
> Active participation requires zero programming, but students must
> bring their own computer, and they must not be afraid of their
> computer's command line interface.
> This workshop will not make participants experts in natural
> language processing, but it will empower them to make better sense of
> large sets of textual information.
>  Tika - http://tika.apache.org
>  Voyant - http://voyant-tools.org
>  AntConc - http://www.laurenceanthony.net/software/antconc/
>  OpenRefine - http://openrefine.org
>  Tableau Public - https://public.tableau.com/
>  POS Tagger - https://nlp.stanford.edu/software/tagger.shtml
>  NER - https://nlp.stanford.edu/software/CRF-NER.shtml
>  MALLET - http://mallet.cs.umass.edu
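The workshop itself requires no programming, but for anyone curious what the "simple word list" activity amounts to, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not the workshop's method: the function name word_list, the sample sentence, and the rule for what counts as a word (runs of letters and apostrophes, lowercased) are all hypothetical choices made for this example.

```python
import re
from collections import Counter


def word_list(text: str) -> list[tuple[str, int]]:
    """Return (word, count) pairs, most frequent first.

    Assumption for this sketch: a "word" is any run of ASCII letters
    or apostrophes after lowercasing. Real corpora usually need more
    care (accents, hyphens, stop words).
    """
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common()


if __name__ == "__main__":
    # Hypothetical sample text; in practice this would be the plain
    # text version of a corpus.
    sample = "The cat sat. The cat ran."
    for word, count in word_list(sample):
        print(f"{count}\t{word}")
```

The tab-delimited output of a sketch like this is the kind of word list that could then be cleaned in OpenRefine or charted in Tableau Public, as the announcement describes.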