If you are building a word cloud, there are a few tricks. First of all, use DeleteStopwords. Compare:
cat = Import["https://en.wikipedia.org/wiki/Cat"];
{WordCloud[cat], WordCloud[DeleteStopwords[cat]]}
Yes, the right one is much better, but now we see the next problem: "cat" and "cats" are counted as separate words. I usually reduce words to their base form:
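Clear@base;
(* return the first base form WordData lists for w, or w itself if none is found *)
base[w_] := With[
   {tmp = WordData[w, "BaseForm", "List"]},
   If[(Head[tmp] === Missing) || tmp === {}, w, tmp[[1]]]];
(* let base thread over lists of words *)
SetAttributes[base, Listable];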
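A quick way to sanity-check the helper, as a sketch: because base is Listable, it can be applied directly to a list of words. The sample words are made up for illustration, and what comes back depends on what WordData knows about each word, so no particular output is claimed here.

```wolfram
(* assumes base is defined as above; the word list is only an illustration *)
sampleWords = {"cats", "claws", "hunting"};

(* Listable makes this equivalent to Map[base, sampleWords] *)
base[sampleWords]
```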
I would also consider removing numbers (not a must, though) and blacklisting noise words. For example, Wikipedia pages are often noisy with:
blackLIST = {"doi", "ed", "isbn", "pmid"};
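Here is a minimal sketch of both clean-up steps on a tiny made-up string, so the effect is easy to see before running it on a whole page. The sample text is hypothetical; only blackLIST comes from the post.

```wolfram
blackLIST = {"doi", "ed", "isbn", "pmid"};

(* made-up sample text, for illustration only *)
sample = "cat 2019 doi felid isbn 42 kitten";

(* strip every run of digits, then split into words *)
words = TextWords[StringDelete[sample, DigitCharacter ..]];

(* drop every word that appears in the blacklist *)
DeleteCases[words, Alternatives @@ blackLIST]
(* {"cat", "felid", "kitten"} *)
```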
I would also use ScalingFunctions -> (#^s &)
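where $0<s<5$ is an exponent that emphasizes or de-emphasizes the link between word frequency and word size: values below 1 flatten the size differences, values above 1 exaggerate them (5 is a good number, could be more though). Here we go, with $s = 0.3$: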
WordCloud[
 DeleteCases[
  base[TextWords[StringDelete[DeleteStopwords[ToLowerCase[cat]],
     DigitCharacter ..]]],
  Alternatives @@ blackLIST],
 ScalingFunctions -> (#^.3 &)]
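To get a feel for what the exponent does, it can help to apply the same power function to a few plain numbers. This is only a sketch with made-up counts; how WordCloud maps the scaled weights to font sizes is up to WordCloud itself.

```wolfram
(* made-up word counts, for illustration only *)
weights = {1, 10, 100};

(* exponent below 1: the 100x gap shrinks to roughly 4x *)
weights^0.3
(* approximately {1., 2., 4.} *)

(* exponent above 1: the gap grows instead *)
weights^2
(* {1, 100, 10000} *)
```

With a small exponent, the most frequent word no longer dwarfs everything else, which usually makes the cloud easier to read.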
I imagine there is more to it, such as statistically finding less meaningful words and removing them.
Do you have your own tips? Please, share!