Hey code4lib!
I've been working on some text visualizations recently and realized that
there are probably at least a few people on this list who might find this
work interesting.
The first is a visualization of line similarity in T.S. Eliot's "Four
Quartets": http://willkurt.github.io/four_quartets_visualized/
This project also contains a detailed description of what you're actually
seeing (essentially cosine similarity matrices).
The second is a similar visualization of line repetition/similarity in the
lyrics of every Vampire Weekend song (broken out by album):
http://imgur.com/a/ZFxFo
A super brief explanation of what you're seeing: each square represents the
similarity between any two lines. For example, [1,2] shows how similar
line 1 is to line 2. This explains the diagonal lines (line x is always
identical to line x) as well as the symmetry (the similarity [x,y] is the
same as [y,x]).
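
For anyone who wants to try this on their own text, here is a minimal
Python sketch of a line-by-line cosine similarity matrix. It is an
illustration only, not the code behind the visualizations above: the
function names, the simple word-count vectors, and the regex tokenizer
are all assumptions on my part, and the sample lines are made up.

```python
import math
import re
from collections import Counter


def line_vector(line):
    """Bag-of-words count vector for one line (assumed tokenization:
    lowercase runs of letters and apostrophes)."""
    return Counter(re.findall(r"[a-z']+", line.lower()))


def cosine(a, b):
    """Cosine similarity between two count vectors; 0.0 if either is empty."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_matrix(lines):
    """Entry [x][y] is the similarity between line x and line y."""
    vectors = [line_vector(line) for line in lines]
    return [[cosine(u, v) for v in vectors] for u in vectors]


# Hypothetical sample lines, not taken from either data set.
sample = ["the sea is calm", "the sea is wide", "nothing alike here"]
matrix = similarity_matrix(sample)
```

In the resulting matrix, the diagonal entries are 1 for any non-empty
line, and matrix[x][y] equals matrix[y][x], which is the symmetry
described above. Drawing each entry as a shaded square gives the kind of
grid shown in the images.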
As a next step I plan to create an interactive version of this, as well as
an easy way to automate the creation of this type of visualization.
Enjoy!
--Will