
Grammar reduction and other cleanup in WordCloud

POSTED BY: Vitaliy Kaurov
2 Replies

Thanks for sharing your tips, Vitaliy! These "cleaning methods" seem to work very nicely on the whole. I have nothing new to add (I haven't played with WordCloud yet); I just wanted to say that there's obviously an error in WordData["species","BaseForm"]. Yes, "specie" is a word, but it has to do with coins and other forms of commodity money (i.e. no cats involved, as far as humans know). WordData doesn't even return "species" as a base form of "species"; I looked at the complete results. If we did get both base forms, though, we'd need a good technique to decide which one is more likely to be relevant to the topic, but that might actually be doable... (Sorry to be a downer; the "specie" error is actually quite funny...)

POSTED BY: Bianca Eifert

Thanks for the feedback, @Bianca Eifert! Indeed, there are cases where the filtering needs to be more sophisticated. We will take a look at what causes the WordData["species","BaseForm"] result.

POSTED BY: Vitaliy Kaurov