# Animated WordCloud for Alice in Wonderland

Posted 6 years ago | 10669 Views | 5 Replies | 19 Total Likes
I am trying to find a good way to visualize the evolution of subjects, ideas, and characters in a text using WordCloud. Below is the simple idea:

- Get the text of the book and apply DeleteStopwords.
- Delete obvious non-informative top words, such as "Alice" in our case.
- Build the frames of the animation, with each frame covering about a page of the book (~500 words).
- For smoother animation transitions, make the shifts between frames much smaller than a page, say 20 words.

So we are basically scanning the text with a window of 500 words shifted in steps of 20 words. Here is the result and code. Smaller words are harder to comprehend. Let me know if you have any better ideas.

```mathematica
(* Lowercase the text, drop stopwords, split into words,
   then remove a few frequent but uninformative words *)
alice = DeleteCases[
   TextWords[DeleteStopwords[
     ToLowerCase[ExampleData[{"Text", "AliceInWonderland"}]]]],
   "alice" | "said" | "little" | "heard"];

lngth = alice // Length
(* 3580 *)

(* One word cloud per window; k ;; k + 499 spans exactly 500 words,
   and k advances in steps of 20 words *)
frms = ParallelTable[
   WordCloud[alice[[k ;; k + 499]], ImageSize -> 400],
   {k, 1, lngth - 500, 20}];

(* Export the frames as an animated GIF, 0.25 s per frame *)
Export["alice.gif", frms, "DisplayDurations" -> {.25}]
(* "alice.gif" *)
```
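As a side note, the same sliding window can probably be expressed more compactly with Partition, whose third argument is the offset between consecutive sublists. This is only a minimal sketch: the names `toy` and `windows` are illustrative and not part of the original code, and `alice` is assumed to be the cleaned word list built in the post.

```mathematica
(* Toy example: windows of 4 elements, shifted by 2 *)
toy = Range[10];
Partition[toy, 4, 2]
(* {{1, 2, 3, 4}, {3, 4, 5, 6}, {5, 6, 7, 8}, {7, 8, 9, 10}} *)

(* Applied to the word list from the post:
   500-word windows shifted in steps of 20 words *)
windows = Partition[alice, 500, 20];
```

Each element of `windows` could then be passed to WordCloud in place of the explicit Part span inside the table.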
Posted 6 years ago
Neat project! I think for this application, it would be nice if WordCloud took an argument/option for the max number of words to display in the cloud. Maybe it would be easier to understand if you take just the top, say, 20 or 25 words in each window to make a cloud with. I might also try a longer DisplayDurations value, while perhaps increasing the window step size from 20, if needed, to maintain dynamics. Maybe I'll try it myself when I have some time :).
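One way the "top 20 or 25 words per window" idea might be tried without a dedicated option is to count the words in each window and keep only the most frequent ones before building the cloud. The sketch below is an assumption-laden illustration: `topWordCloud` is a hypothetical helper name, the step of 40 and the 0.5 s duration are arbitrary choices, and `alice` and `lngth` are taken from the original post.

```mathematica
(* Hypothetical helper: word cloud of the n most frequent words in a list.
   WordCloud accepts an association of word -> weight. *)
topWordCloud[words_List, n_Integer?Positive] :=
  Module[{counts = Counts[words]},
   WordCloud[TakeLargest[counts, Min[n, Length[counts]]],
    ImageSize -> 400]];

(* Frames with at most 25 words each; a larger step (40, an arbitrary
   choice) compensates for a longer display duration *)
frms25 = Table[
   topWordCloud[alice[[k ;; k + 499]], 25],
   {k, 1, lngth - 500, 40}];

Export["alice-top25.gif", frms25, "DisplayDurations" -> {.5}]
```

Whether fewer, larger words actually read better in the animation would need to be judged by eye.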
Posted 6 years ago
Thank you for your suggestions. Indeed, a max number of words could be an interesting option to take advantage of. I already played with various DisplayDurations values, and slowing the animation down makes it more boring to my taste. I was thinking that maybe there is a method of packing words that keeps the same words mostly at the same positions. Most of the jumping is not due to the disappearance of words but to their sudden relocation. But this sounds like a tough problem.
Posted 6 years ago
This is really great, Vitaliy! Seeing the way the representation changes as the window moves through time is very interesting. It made me think of another representation, different but quite similar. Picture a network map which includes all the words ever used, with their spatial proximities determined by how close in time their utterances were, on average, to each other. But now represent their usage densities, within each time window, as a magnitude, and represent that as something like a color or a value (like saturation or value in the colorimetric sense), or even a height of the characters, as in a 3D histogram. Now we see a connectedness representation, as well as a time evolution. It would be interesting to follow a political campaign, or a debate, or even the conversation at a lunch table, as the discussion went from subject to subject.
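The "usage density within each time window" part of this idea could be sketched as a word-by-window count matrix. This does not attempt the network layout, only the densities. The names `vocab` and `density` are illustrative, the choice of the 30 most frequent words is arbitrary, and `alice` is assumed to be the cleaned word list from the original post.

```mathematica
(* 500-word windows shifted by 20 words, as in the original post *)
windows = Partition[alice, 500, 20];

(* Fixed vocabulary: the 30 most frequent words overall (arbitrary choice) *)
vocab = Keys[TakeLargest[Counts[alice], 30]];

(* density[[i, j]] = number of times vocab[[j]] occurs in window i;
   words absent from a window get 0 *)
density = Table[Lookup[Counts[w], vocab, 0], {w, windows}];

(* Rows are words, columns are window positions; darker means more frequent *)
ArrayPlot[Transpose[density], AspectRatio -> 1/3]
```

The same matrix could in principle drive color, saturation, or character height in a fixed layout of the words.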