LISTSERV 16.5 - CODE4LIB Archives

What they said - +1!


-----Original Message-----
From: Code for Libraries [mailto:[log in to unmask]] On Behalf Of Dhanushka Samarakoon
Sent: Saturday, November 11, 2017 3:17 PM
To: [log in to unmask]
Subject: Re: [CODE4LIB] hands-on workshop on natural language processing & text mining

I'm interested too.

On Thu, Nov 9, 2017 at 11:45 AM, Haitz, Lisa (haitzlm) < [log in to unmask]> wrote:

> I love it!
>
> On 11/9/17, 1:13 PM, "Code for Libraries on behalf of Eric Lease 
> Morgan" < [log in to unmask] on behalf of [log in to unmask]> wrote:
>
>     I’m thinking about a hands-on workshop on natural language 
> processing & text mining, below, and your feedback is desired.  —ELM
>
>
>     Natural language processing & text mining using freely available
> tools: "No programming necessary"
>
>     This text outlines a hands-on natural language processing & text mining workshop.
>
>     It is possible to do simple & rudimentary natural language 
> processing & text mining with a set of freely available tools. No 
> programming is necessary. This workshop facilitates hands-on exercises 
> demonstrating how this can be done. By participating in this workshop, 
> students & researchers will be able to:
>
>      * identify patterns, anomalies, and trends in their texts
>      * practice both "distant" and "scalable" reading
>      * enhance & complement their ability to do "close" reading
>      * use & understand a corpus of poetry or prose at scale
>
>     Activities in the workshop include:
>
>      * learning what natural language processing is, and why you 
> should care
>      * articulating a research question
>      * creating a corpus
>      * creating a plain text version of a corpus with Tika [1]
>      * using Voyant Tools to do some "distant" reading [2]
>      * using a concordance (AntConc) to facilitate searching keywords 
> in context [3]
>      * creating a simple word list with a text editor
>      * cleaning & analyzing word lists with OpenRefine [4]
>      * charting & graphing word lists with Tableau Public [5]
>      * increasing meaning by extracting parts-of-speech with the 
> Stanford POS Tagger [6]
>      * increasing meaning further by extracting named entities with the 
> Stanford NER [7]
>      * identifying themes and clustering documents using MALLET [8]
>
>     Anybody with sets of texts can benefit from this workshop. Any 
> corpus of textual content is apropos: journal articles, books, the 
> complete run of a magazine, blog postings, Tweets, press releases, 
> conference proceedings, websites, poetry, etc. This workshop is 
> computer (Windows, Linux, Macintosh) agnostic. All the software used
> in this workshop is freely 
> available on the 'Net, or it is already installed on one's computer. 
> Active participation requires zero programming, but students must 
> bring their own computer, and they must not be afraid of their 
> computer's command line interface.
>
>     This workshop will not make participants experts in natural 
> language processing, but it will empower them to make better sense of 
> large sets of textual information.
>
>     [1] Tika - http://tika.apache.org
>     [2] Voyant - http://voyant-tools.org
>     [3] AntConc - http://www.laurenceanthony.net/software/antconc/
>     [4] OpenRefine - http://openrefine.org
>     [5] Tableau Public - https://public.tableau.com/
>     [6] POS Tagger - https://nlp.stanford.edu/software/tagger.shtml
>     [7] NER - https://nlp.stanford.edu/software/CRF-NER.shtml
>     [8] MALLET - http://mallet.cs.umass.edu
>
>
>
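For anyone curious what the word-list and concordance activities in the outline amount to, here is a rough Python sketch. It is only an illustration: the workshop itself needs no programming, the function names (tokenize, word_list, concordance) and the sample sentence are made up for this example, and this is not how AntConc or Voyant Tools work internally.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_list(text, top=10):
    """Return the `top` most frequent words as (word, count) pairs."""
    return Counter(tokenize(text)).most_common(top)


def concordance(text, keyword, width=2):
    """Return keyword-in-context lines: `width` words on each side of every hit."""
    tokens = tokenize(text)
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines


if __name__ == "__main__":
    # Made-up sample text; in practice this would be the plain text of a corpus.
    sample = "The whale swam. The whale dove deep, and the ship followed the whale."
    for word, count in word_list(sample, top=3):
        print(count, word)
    for line in concordance(sample, "whale"):
        print(line)
```

The tokenizer here is deliberately crude (ASCII letters and apostrophes only), so accented characters and hyphenated words will be split or dropped; a real corpus would likely need something more careful.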