# [CALL] For Curious Cases of Word Histories

Posted 8 months ago | 3920 Views | 27 Replies | 109 Total Likes

NOTE: This is a long page with many images. Scroll through to find some gems.

WordFrequencyData is a nifty instrument for mining oceans of texts and discovering wonderful historical semantic curiosities. This post is a call for you to share your discoveries of interesting word histories. Rules are very simple.

• Post your discovery as a comment on this thread.

• Your discovery should be a curious history of one or more words that can be seen in their WordFrequencyData.

• Start your comment with a title clearly indicating the meaning of your discovery (use # as the first character to make a title).

• Your comment must contain a plot of the WordFrequencyData of your terms. You can use the function I provide below, or your own way of visualizing WordFrequencyData.

• Your comment must contain the Wolfram Language code you used to make the plot.

• Your comment must contain some text explaining why you think the words you found are curious and interesting.

• If you want to comment on someone's work, please click REPLY on their specific post so that it is clear what you refer to and the nested structure of comments is preserved.

Please see the comments below for good examples.

### FUNCTION for PLOTs

Feel free to use this function for your visualizations and to change or improve it if you wish. Note the options you can pass to this plot. I tried to limit them to only the most important ones, fixing the other settings to make a nice plot.

ClearAll@WordFrequencyPlot;

(* "Case" is passed straight to IgnoreCase, so True means case-insensitive matching *)
Options[WordFrequencyPlot]=
{"YearStart"->1800,"YearEnd"->Now,"Case"->True,
"Smooth"->3,"Scaling"->None,"Style"->Automatic};

WordFrequencyPlot[words_,OptionsPattern[]]:=
With[{
(* for a list of words this is an association word -> yearly TimeSeries *)
$data=WordFrequencyData[words,"TimeSeries",
{OptionValue["YearStart"],OptionValue["YearEnd"]},
IgnoreCase->OptionValue["Case"]]},
DateListPlot[
(* smooth each series with a mean filter and label each curve with its word *)
MapThread[Callout,
{MeanFilter[#,Quantity[OptionValue["Smooth"],"Years"]]&/@Values[$data],words}],
ScalingFunctions->OptionValue["Scaling"],
PlotRange->All,
PlotTheme->"Detailed",
PlotStyle->OptionValue["Style"],
FrameTicks->{Automatic,None},
ImageSize->Large,
FrameLabel->{"YEAR","FREQUENCY in TEXT"}]
]
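
As a usage sketch, the options declared in `Options[WordFrequencyPlot]` above can be overridden per call. The word pair and the option values here are arbitrary choices for illustration, not taken from the thread:

```wolfram
(* Assumes WordFrequencyPlot has been defined as above. *)
(* Illustrative words; plot 1850-2000 on a log axis with a wider smoothing window. *)
WordFrequencyPlot[{"telegraph", "telephone"},
 "YearStart" -> 1850, "YearEnd" -> 2000,
 "Smooth" -> 5, "Scaling" -> "Log"]
```

Any option left out keeps its default, so `WordFrequencyPlot[{"telegraph", "telephone"}]` alone plots from 1800 to the present on a linear axis.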

27 Replies
Posted 8 months ago

# Benford's Law

I wanted to start with something spectacular in its simplicity: a demonstration of Benford's Law. From MathWorld:

Benford's Law is a phenomenological law also called the first digit law, first digit phenomenon, or leading digit phenomenon. Benford's law states that in listings, tables of statistics, etc., the digit 1 tends to occur with probability ∼30%, much greater than the expected 11.1% (i.e., one digit out of 9). Benford's law can be observed, for instance, by examining tables of logarithms and noting that the first pages are much more worn and smudged than later pages.
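
For reference, the first-digit probabilities behind the quoted 30% figure are usually written as follows (this is the standard statement of the law, not something measured from WordFrequencyData):

$$P(d) = \log_{10}\!\left(1 + \frac{1}{d}\right), \qquad d = 1, 2, \ldots, 9$$

This gives $P(1) = \log_{10} 2 \approx 0.301$ and $P(9) = \log_{10}(10/9) \approx 0.046$, so the predicted frequencies fall steadily from 1 to 9. The law says nothing about 0, which is never a leading digit.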

Surprisingly, this law holds not only for the usage of digits in texts but also for the word names of the digits; see the plots below. It would be nice to hear any explanations of this.

WordFrequencyPlot[ToString /@ Range[0, 9]]


## Log scaling of vertical axis

WordFrequencyPlot[ToString /@ Range[0, 9], "Scaling" -> "Log"]


## Direct plot of digit "names"

WordFrequencyPlot[IntegerName[Range[0, 9]]]


## Log scaling of vertical axis for digit "names"

WordFrequencyPlot[IntegerName[Range[0, 9]], "Scaling" -> "Log"]


Posted 8 months ago
> Surprisingly, this law holds not only for the digits usage in texts, but also for the word-names of the digits, - see plots below. It would be nice to hear any explanations of this.

"One" is also an indefinite pronoun ("no one", "one of the group", "if one wishes"), so it appears a lot more often in English than just as a spelled-out number. For the cases of actual numeric representation, nearly any numbered list will include one (such as this brief statement, for one); those that go to two will also include one (but not three), those that go to three will also include two (but not four), and so on.
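
The numbered-list argument above can be made slightly more precise. As a sketch, assume a list has a random length $N$ (an assumption made for illustration, not data from the thread). The number $k$ appears in the list exactly when $N \ge k$, so

$$P(k \text{ appears}) = P(N \ge k) \;\ge\; P(N \ge k+1) = P(k+1 \text{ appears})$$

Whatever the distribution of list lengths, smaller numbers can never be rarer than larger ones under this model.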
Posted 8 months ago

# The Five Basic Senses

There are five basic senses: touch, sight, hearing, smell and taste. I remember reading somewhere that most of the information humans normally take in comes through vision. The plot below reflects that (note the logarithmic scale): the word "see" is much more frequent than the others, although it has other meanings besides the direct act of vision, such as "understand". But so do the other words. Note the curious fall of "taste" below "hear" and "touch".

WordFrequencyPlot[{"see", "hear", "touch", "taste", "smell"}, "Scaling" -> "Log"]


Posted 8 months ago

# The Five Ws and H

WordFrequencyPlot[{"how", "why", "what", "where", "when", "who"}]


According to Wikipedia, The Five Ws and H are questions whose answers are considered basic in information gathering or problem solving. They are often mentioned in journalism, research, and police investigations. They constitute a formula for getting the complete story on a subject. According to the principle of the Six Ws, a report can only be considered complete if it answers these questions starting with an interrogative word:

• Who was involved?
• What happened?
• Where did it take place?
• When did it take place?
• Why did that happen?
• How did it happen?

Rudyard Kipling in his "Just So Stories" (1902) writes:

I keep six honest serving-men

(They taught me all I knew);

Their names are What and Why and When

And How and Where and Who.

It is quite remarkable that these questions have different degrees of usage, perhaps reflecting their relative importance. "When" is the key question nowadays, which was not always the case. "Who" led in the past but has become less prominent.

Posted 8 months ago
There is one more with wh:

and the popular names show the importance of doing:
Posted 8 months ago

# Freedom, Liberty, Justice, Equality

WordFrequencyPlot[{"freedom", "liberty", "justice", "equality"}]


Political theorists dating back to the Hellenic period have examined the tensions between these concepts, especially between equality and freedom. It's interesting to see how "equality" stays fairly flat while "justice" and "liberty" have decreased (though "justice" seems to be on a relatively recent uptick). Interestingly, "freedom" has increased over time, especially around the time of World War II and the following decades.

Posted 8 months ago

# Religion and Politics

WordFrequencyPlot[{"religion", "politics"}]


It's a common adage that discussing politics and religion won't make you any friends, especially at social gatherings such as dinner parties. But which of these terms has been mentioned more frequently over time? Not terribly surprising that the term "religion" has decreased over time, with general trends of people becoming more secular, but the fact the terms "politics" and "religion" seem to meet in our current period is somewhat telling. Karl Marx once quipped religion is the opium of the people. Perhaps politics and political discourse are shaping up to take its place, for better or worse.

Posted 8 months ago

# Trigonometry and Calculator

WordFrequencyPlot[{"trigonometry", "calculator"}]


It's a pretty well-known and self-evident fact that calculators have ruined trigonometry.

# Smartphone, iPhone, Apple, Samsung, Orange

WordFrequencyPlot[{"iPhone", "Smartphone", "Samsung", "Apple",
"Orange"}, "Scaling" -> "Log"]


People back in 1900 didn't think iPhone was the best phone ever. But if you think Apple is the best company ever, have you ever tried Orange? Way more stable and reliable.

I couldn't resist a humorous take in this thread.

Posted 8 months ago
Somewhat related to health:

WordFrequencyPlot[{"homeopathy", "acupuncture", "chemotherapy", "fasting", "antibiotic"}]
Posted 8 months ago

## Four letter words from f to k

These are more than one and the one does not come up (OMG):

flak was important during WW II - even here it throws its shadows ....

Posted 8 months ago

## Western Thinking

Posted 8 months ago

## Social Occupation

The King is still the king, unbelievable.

Maybe the King of Rock 'n' Roll and the King of Pop are included. Without the king:

The servant declines and the employee rises; not much gain in that, is there? The abolition of slavery seems to be reflected.

Posted 8 months ago

## Political Systems

WordFrequencyPlot[{"capitalism", "nationalism", "socialism",
"communism", "fascism", "populism"}]


Posted 8 months ago

## World Powers

Posted 8 months ago

## Recession word frequency vs actual recessions

Let's plot the frequency of the word "recession".

WordFrequencyPlot[{"recession"}, "Scaling" -> "Log"]


The Federal Reserve Bank of St. Louis keeps a set of indicators that includes the recession periods since 1854.

fred = ServiceConnect["FederalReserveEconomicData"];
usrec = fred["SeriesData", "ID" -> "USREC"];
recWord = WordFrequencyData["recession", "TimeSeries", {1854, Now}];
recLogWord = TimeSeriesMap[Log, recWord];
{min, max} = {Min[#], Max[#]} &@recLogWord["Values"];
recScaled =
MovingAverage[TimeSeriesMap[Rescale[#, {min, max}] &, recLogWord], 3];
DateListPlot[{recScaled, usrec}, Filling -> Axis, ImageSize -> Large,
FrameTicks -> {Automatic, None},
PlotLegends -> {"recession word freq(Log)", "Recessions"},
PlotRange -> {{DateObject[{1854}], DateObject[{2010}]}, {0, 1}}]


Posted 8 months ago

# Colors and the "rise" of blue

I think colors are quite interesting. In the plots below, note the "rise of blue" in texts. There is a research field that relates language to the perception of color; see for example "Russian blues reveal effects of language on color discrimination". The color blue holds a special place: by some accounts it is less frequent in ancient literature. Also, many languages do not distinguish between what English calls "blue" and "green" and instead use a single cover term spanning both; this might have an effect on English translations. Please respond to this comment if you have any thoughts about this phenomenon.

color={"white","black","red","yellow","green","blue",
"orange","purple","gray","indigo","pink"};
Interpreter["Color"][color]


Show[WordFrequencyPlot[color[[;; 6]], "Case" -> False, "Scaling" -> "Log",
"Style" -> Interpreter["Color"][color[[;; 6]]]], Background -> GrayLevel[.8]]


A bit more of colors in regular non-Log scaling:

Show[WordFrequencyPlot[color, "Case" -> False,
"Style" -> Interpreter["Color"][color]], Background -> GrayLevel[.8]]


Posted 8 months ago

# Is money the root of all evil?

I just wondered if money was the root of all evil. I'm not great with math, so I wasn't sure how to get the square or cube root of the values for money to see if they aligned with the value for "evil" at a certain point in history. However, "evil" is not mentioned as frequently as "money", so I don't think "money" is any root of "evil". It might be the other way around though.

WordFrequencyPlot[{"money", "evil"}]
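
The root idea can be tried directly. As a sketch, assuming `TimeSeriesMap` is applied to the raw yearly series (no smoothing, unlike `WordFrequencyPlot`), this overlays "money", its square and cube roots, and "evil":

```wolfram
(* Raw yearly frequencies for each word since 1800 *)
money = WordFrequencyData["money", "TimeSeries", {1800, Now}];
evil = WordFrequencyData["evil", "TimeSeries", {1800, Now}];

(* Apply the roots value by value and compare on a log axis *)
DateListPlot[
 {money, TimeSeriesMap[Sqrt, money], TimeSeriesMap[CubeRoot, money], evil},
 ScalingFunctions -> "Log",
 PlotLegends -> {"money", "Sqrt[money]", "CubeRoot[money]", "evil"},
 PlotRange -> All]
```

Since the frequencies are fractions below 1, taking a root makes them larger, so the roots of "money" sit even further above "evil".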


Posted 8 months ago
Navigator's tools. Notice the blip during WWII:

WordFrequencyPlot[{"sextant", "chronometer", "compass", "pelorus", "almanac"}, "Scaling" -> "Log"]
Posted 8 months ago

# MRB

Here is the occurrence of my initials (MRB) since I was born. (I discovered the MRB constant in 1999. Is there any connection between that and the MRB uptick in the graph after 2000?)

ClearAll@WordFrequencyPlot;

Options[WordFrequencyPlot] = {"YearStart" -> 1995, "YearEnd" -> Now,
"Case" -> True, "Smooth" -> 3, "Scaling" -> None,
"Style" -> Automatic};

WordFrequencyPlot[words_, OptionsPattern[]] :=
With[{$data = WordFrequencyData[words, "TimeSeries",
{OptionValue["YearStart"], OptionValue["YearEnd"]},
IgnoreCase -> OptionValue["Case"]]},
DateListPlot[
MapThread[Callout,
{MeanFilter[#, Quantity[OptionValue["Smooth"], "Years"]] & /@ Values[$data], words}],
ScalingFunctions -> OptionValue["Scaling"],
PlotRange -> All, PlotTheme -> "Detailed",
PlotStyle -> OptionValue["Style"], FrameTicks -> {Automatic, None},
ImageSize -> Large, FrameLabel -> {"YEAR", "FREQUENCY in TEXT"}]]


Posted 8 months ago

# War and Peace

Apparently "Peace" is not as talked about (or written about) as "War"...

WordFrequencyPlot[{"war", "peace"}]


# Earth and Space

Or perhaps running out of options for peace on earth, space becomes the next frontier...

 WordFrequencyPlot[{"earth", "space"}]


Posted 8 months ago
I have looked at a number of examples, some similar to the ones above. Some word histories tell nice stories, for example about medicine. Here are three words for malaria:

Options[WordFrequencyPlot] = {"YearStart" -> 1800, "YearEnd" -> Now, "Case" -> True, "Smooth" -> 3, "Scaling" -> None, "Style" -> Automatic};

WordFrequencyPlot[words_, OptionsPattern[]] :=
With[{$data = WordFrequencyData[words, "TimeSeries",
{OptionValue["YearStart"], OptionValue["YearEnd"]},
IgnoreCase -> OptionValue["Case"]]},
DateListPlot[
MapThread[Callout,
{MeanFilter[#, Quantity[OptionValue["Smooth"], "Years"]] & /@ Values[$data], words}],
ScalingFunctions -> OptionValue["Scaling"],
PlotRange -> All, PlotTheme -> "Detailed",
PlotStyle -> OptionValue["Style"], FrameTicks -> {Automatic, None},
ImageSize -> Large, FrameLabel -> {"YEAR", "FREQUENCY in TEXT"}]]

WordFrequencyPlot[{"ague", "malaria", "paludism"}]

Until 1880, when Laveran first discovered the parasite, "ague" and "malaria" have basically the same frequency. "Malaria" is derived from the Italian mala aria, "bad air", whereas "ague" comes from the Latin febris acuta, "acute fever".

Sometimes we can also observe a shift in frequency between words that mean the same thing:

Options[WordFrequencyPlot] = {"YearStart" -> 1800, "YearEnd" -> Now, "Case" -> True, "Smooth" -> 3, "Scaling" -> None, "Style" -> Automatic};

WordFrequencyPlot[{"Moslem", "Muslim"}]

In those cases a relative frequency plot, i.e. one displaying each word's share of the total, could be interesting:

StackedDateListPlot[
MapThread[Callout,
{Values[WordFrequencyData[{"Moslem", "Muslim"}, "TimeSeries"]], {"Moslem", "Muslim"}}],
PlotRange -> All, PlotLayout -> "Percentile", ImageSize -> Large,
PlotStyle -> {Red, Green}, LabelStyle -> Directive[Bold, 16],
PlotTheme -> "Detailed"]

Such a plot is also useful to compare opposites like the words "peace" and "war", which are also studied in an earlier post:

StackedDateListPlot[
MapThread[Callout,
{Values[WordFrequencyData[{"war", "peace"}, "TimeSeries"]], {"war", "peace"}}],
PlotRange -> All, PlotLayout -> "Percentile", ImageSize -> Large,
PlotStyle -> {Red, Green}, LabelStyle -> Directive[Bold, 16],
PlotTheme -> "Detailed"]

It is interesting to see that since about 1850 the word "war" is more frequent than "peace".

These plots also reflect the use of words such as "bike" and "bicycle":

StackedDateListPlot[
MapThread[Callout,
{Values[WordFrequencyData[{"bicycle", "bike"}, "TimeSeries"]], {"bicycle", "bike"}}],
PlotRange -> All, PlotLayout -> "Percentile", ImageSize -> Large,
PlotStyle -> {Red, Green}, LabelStyle -> Directive[Bold, 16],
PlotTheme -> "Detailed"]

I would have expected "bike" to become more prominent during the 20th century. Between 1800 and 1880 it is also surprisingly common. I am not sure why, but this could be due to the other meaning of "bike", which is something like "nest or swarm of bees".

It would be interesting to consider the change of meaning of words. I tried to look at the word "gay", which has changed meaning over the years from lighthearted (13th century) to bright and showy (14th century) and happy. It could also carry moral overtones, as in gay woman (prostitute), gay man (womaniser) and gay house (brothel). Around 1900 it meant something like "cheerful"; in the 1980s young users would use it to mean "lame, stupid"; around 1990 it came to mean homosexual. I tried to use Google n-grams to figure that out, but it didn't really work well. Here are words that are used close to "gay" over the years:

Table[{k,
StringSplit[
StringSplit[
StringSplit[
StringSplit[
URLExecute["https://books.google.com/ngrams/graph?content=gay+*_ADJ&year_start=" <> ToString[k] <> "&year_end=" <> ToString[k + 20] <> "&corpus=15&smoothing=3"],
"direct_url="][[2]],
" width"][[1]],
"gay%20"][[3 ;;]],
"_"][[All, 1]]}, {k, 1800, 2000, 20}]

which gives the resulting word lists for each 20-year window.

The frequency plot is:

Options[WordFrequencyPlot] = {"YearStart" -> 1800, "YearEnd" -> Now, "Case" -> True, "Smooth" -> 3, "Scaling" -> None, "Style" -> Automatic};

WordFrequencyPlot[{"gay"}]

or over longer times:

Options[WordFrequencyPlot] = {"YearStart" -> 1500, "YearEnd" -> Now, "Case" -> True, "Smooth" -> 3, "Scaling" -> None, "Style" -> Automatic};

WordFrequencyPlot[{"gay"}]

Also "plastic" has changed meaning, from the characteristic of being plastic to the material plastic:

WordFrequencyPlot[{"plastic"}]

In general we can see when different products have been developed:

WordFrequencyPlot[{"radio", "telephone", "computer", "car", "watch", "electricity"}, "YearStart" -> 1700, "YearEnd" -> Now]

Of course, words can go out of fashion, too. For example:

WordFrequencyPlot[{"Pence", "Dollar", "Shilling", "Euro", "Sterling", "Farthing", "Florin", "Dime", "Yen", "Yuan"}, "YearStart" -> 1700, "YearEnd" -> Now]

In fact we can study this more systematically, by looking at the correlations between frequency curves:

words = {"war", "peace", "communism", "capitalism", "socialism", "democracy", "unemployment", "conflict", "crisis", "terrorism", "military", "welfare", "bomb", "weapons", "combat"}

worddata = (WordFrequencyData[#, "TimeSeries"])["Values"] & /@ words;

(* truncate every series to the length of the shortest one so the columns line up *)
cm = Correlation[Transpose@worddata[[All, 1 ;; Min[Length /@ worddata]]]];

Column[{GraphicsRow[words[[1 ;;]], ImageSize -> 1000, Frame -> All],
Row[{GraphicsColumn[words[[1 ;;]], ImageSize -> 67, Frame -> All],
Overlay[{ArrayPlot[cm,
ColorFunction -> (ColorData["TemperatureMap"][(1 + #)/2] &),
Frame -> None, Mesh -> True, PlotRangePadding -> 0,
ImageSize -> 1000, ColorFunctionScaling -> False],
GraphicsGrid[Map[NumberForm[#, 2] &, cm, {2}], ImageSize -> 1000]}]}]},
Alignment -> Right, Spacings -> 0]

We can use a bandwidth ordering:

Needs["GraphUtilities`"]

{r, c} = MinimumBandwidthOrdering[cm, Method -> "RCMD"]

cm2 = Correlation[Transpose@worddata[[r]][[All, 1 ;; Min[Length /@ worddata]]]];

Column[{GraphicsRow[words[[r]], ImageSize -> 1000, Frame -> All],
Row[{GraphicsColumn[words[[r]], ImageSize -> 67, Frame -> All],
Overlay[{ArrayPlot[cm2,
ColorFunction -> (ColorData["TemperatureMap"][(1 + #)/2] &),
Frame -> None, Mesh -> True, PlotRangePadding -> 0,
ImageSize -> 1000, ColorFunctionScaling -> False],
GraphicsGrid[Map[NumberForm[#, 2] &, cm2, {2}], ImageSize -> 1000]}]}]},
Alignment -> Right, Spacings -> 0]

Using that we can try to find words with a similar behaviour:

StackedDateListPlot[
MapThread[Callout,
Log /@ {Values[WordFrequencyData[{"Democracy", "War", "Peace"}, "TimeSeries"]], {"Democracy", "War", "Peace"}}],
PlotRange -> All, PlotLayout -> "Percentile", ImageSize -> Large,
PlotStyle -> {Red, Green, Blue}, LabelStyle -> Directive[Bold, 16],
PlotTheme -> "Detailed"]

which indicates near constant ratios over a long time. This is not that easy to see in the frequency plot:

WordFrequencyPlot[{"Democracy", "War", "Peace"}, "YearStart" -> 1900, "YearEnd" -> Now, "Scaling" -> {None, "Log"}]

Finally, it is interesting to look at other languages, such as tu vs usted in Spanish (plotted here through the plural forms vosotros and ustedes):

Options[WordFrequencyPlotSpanish] = {"YearStart" -> 1800, "YearEnd" -> Now, "Case" -> True, "Smooth" -> 3, "Scaling" -> None, "Style" -> Automatic};

WordFrequencyPlotSpanish[words_, OptionsPattern[]] :=
With[{$data = WordFrequencyData[words, "TimeSeries",
{OptionValue["YearStart"], OptionValue["YearEnd"]},
IgnoreCase -> OptionValue["Case"], Language -> "Spanish"]},
DateListPlot[
MapThread[Callout,
{MeanFilter[#, Quantity[OptionValue["Smooth"], "Years"]] & /@ Values[$data], words}],
ScalingFunctions -> OptionValue["Scaling"],
PlotRange -> All, PlotTheme -> "Detailed",
PlotStyle -> OptionValue["Style"], FrameTicks -> {Automatic, None},
ImageSize -> Large, FrameLabel -> {"YEAR", "FREQUENCY in TEXT"}]]

WordFrequencyPlotSpanish[{"vosotros", "ustedes"}]

or Du and Sie in German:

Options[WordFrequencyPlotGerman] = {"YearStart" -> 1800, "YearEnd" -> Now, "Case" -> True, "Smooth" -> 3, "Scaling" -> None, "Style" -> Automatic};

WordFrequencyPlotGerman[words_, OptionsPattern[]] :=
With[{$data = WordFrequencyData[words, "TimeSeries",
{OptionValue["YearStart"], OptionValue["YearEnd"]},
(*IgnoreCase -> OptionValue["Case"],*)
IgnoreCase -> False, Language -> "German"]},
DateListPlot[
MapThread[Callout,
{MeanFilter[#, Quantity[OptionValue["Smooth"], "Years"]] & /@ Values[$data], words}],
ScalingFunctions -> OptionValue["Scaling"],
PlotRange -> All, PlotTheme -> "Detailed",
PlotStyle -> OptionValue["Style"], FrameTicks -> {Automatic, None},
ImageSize -> Large, FrameLabel -> {"YEAR", "FREQUENCY in TEXT"}]]

WordFrequencyPlotGerman[{"Du", "Sie"}]

I suppose that there are interesting mechanisms working here. It definitely feels that "Du" becomes more prevalent as opposed to the more formal "Sie". But there might be an effect due to (social?) media etc. in the opposite direction.

Here are a couple of pronouns in English:

WordFrequencyPlot[{"you", "thou", "ye", "thee", "thy"}, "YearStart" -> 1200, "YearEnd" -> Now]

which might look better on a percentile plot:

StackedDateListPlot[
MapThread[Callout,
{Values[WordFrequencyData[{"you", "thou", "ye", "thee", "thy"}, "TimeSeries"]], {"you", "thou", "ye", "thee", "thy"}}],
PlotRange -> All, PlotLayout -> "Percentile", ImageSize -> Large,
PlotStyle -> RandomColor[5], LabelStyle -> Directive[Bold, 16],
PlotTheme -> "Detailed"]

Logarithmically, this becomes:

StackedDateListPlot[
MapThread[Callout,
Log@{Values[WordFrequencyData[{"you", "thou", "ye", "thee", "thy"}, "TimeSeries"]], {"you", "thou", "ye", "thee", "thy"}}],
PlotRange -> All, PlotLayout -> "Percentile", ImageSize -> Large,
PlotStyle -> RandomColor[5], LabelStyle -> Directive[Bold, 16],
PlotTheme -> "Detailed"]

Cheers,

Marco
Posted 8 months ago
Curiosity killed the cat:

WordFrequencyPlot[{"curiosity", "cat"}]
Posted 8 months ago

# Transportation

WordFrequencyPlot[{"airplane", "car", "train", "speed"}]


Posted 8 months ago

# Programming Languages

WordFrequencyPlot[{"Wolfram", "Mathematica", "Fortran", "Cobol",
"HTML", "CSS", "Ruby", "JavaScript", "PHP", "Matlab", "LabVIEW",
"Python", "Java", "Swift"}, "Scaling" -> "Log"]


Posted 8 months ago
- A man who wrote "Philosophiæ Naturalis Principia Mathematica".

WordFrequencyPlot[{"Pythagoras", "Archimedes", "Euclid", "Fibonacci", "Descartes", "Newton", "Leibniz", "Gauss", "Euler", "Fermat", "Turing"}, "YearStart" -> 1800]

WordFrequencyPlot[{"Newton", "Copernicus", "Gauss", "Einstein", "Hawking"}, "YearStart" -> 18
Posted 7 months ago

# North, South, East, West

WordFrequencyPlot[{"north", "south", "east", "west"}, "Scaling" -> "Log"]

I was a bit surprised to see that south dominates north, considering that the latter is the standard and "the fundamental direction" in various geography, cartography and GIS applications and conventions. South surpassed north around 1900.