I worked on a text mining project last semester where I had a bunch of
magazines with text that was totally unstructured (from IA). I would have
really liked to know how to work entity matching into such a project. Are
there text mining projects out there that demonstrate doing this?

On Fri, Apr 8, 2016 at 11:08 AM, diego ferreyra <[log in to unmask]>
wrote:

> I think controlled vocabularies can be used to improve the text-mining
> process, for entity recognition (persons, institutions, and critical
> concepts) ... I think that's... but I'm not neutral about this... because
> I am the developer of a controlled vocabularies tool :)
>
> Sorry about my English :/
>
> 2016-04-08 3:24 GMT-03:00 Eric Lease Morgan <[log in to unmask]>:
>
> > On Apr 7, 2016, at 4:24 PM, Gregory Markus <[log in to unmask]>
> > wrote:
> >
> > >> from one of the New York Times stories on the Panama Papers: "The
> > >> ICIJ made a number of powerful research tools available to the
> > >> consortium that the group had developed for previous leak
> > >> investigations. Those included a secure, Facebook-type forum
> > >> where reporters could post the fruits of their research, as well
> > >> as database search program called “Blacklight” that allowed the
> > >> teams to hunt for specific names, countries or sources."
> > >>
> > >> http://www.nytimes.com/2016/04/06/business/media/how-a-cryptic-message-interested-in-data-led-to-the-panama-papers.html
> > >
> > > https://ijnet.org/en/blog/how-icij-pulled-large-scale-cross-border-investigative-collaboration
> >
> >
> > Based on my VERY quick read of the articles linked above, a group of
> > people created a collaborative system for collecting, indexing, searching,
> > and analyzing data/information. In the end, they facilitated the creation
> > of knowledge. That sure sounds like a library to me. Kudos! I believe our
> > profession has many things to learn from this example, and two of those
> > things include: 1) you need full text content, and 2) controlled
> > vocabularies are not a necessary component of the system. —ELM
> >
>
>
>
> --
> Diego Ferreyra
>