
[CALL] For Curious Cases of Words' Histories

NOTE: This is a long page with many images. Scroll through to find some gems.


enter image description here

WordFrequencyData is a nifty instrument for mining oceans of texts and discovering wonderful historical semantic curiosities. This post is a call for you to share your discoveries of interesting word histories. Rules are very simple.

  • Post your discovery as a comment on this thread

  • Your discovery should be a curious history of some words that can be seen in their WordFrequencyData

  • Start your comment with a title clearly indicating the meaning of your discovery (use # as the first character to make a title)

  • Your comment must contain a WordFrequencyData plot of your terms. You can use the function I provide below. Alternatively, you can use your own way to visualize WordFrequencyData.

  • Your comment must contain the Wolfram Language code you used to make the plot

  • Your comment must contain some text explaining why you think the words you found are curious and interesting

  • If you want to comment on someone's work, please click REPLY on his/her specific post so it is clear what you refer to and the nested structure of comments is preserved.

Please see the comments below for good examples.


FUNCTION for PLOTs


Feel free to use this function for your visualizations and change or improve it if you wish. Note the options you can pass to this plot. I tried to limit them to only the most important ones, fixing the other options to make a nice plot.

ClearAll@WordFrequencyPlot;

Options[WordFrequencyPlot]=
{"YearStart"->1800,"YearEnd"->Now,"Case"->True,
"Smooth"->3,"Scaling"->None,"Style"->Automatic};

WordFrequencyPlot[words_,OptionsPattern[]]:=
With[{
    $data=WordFrequencyData[words,"TimeSeries",
       {OptionValue["YearStart"],OptionValue["YearEnd"]},
       IgnoreCase->OptionValue["Case"]]},
    DateListPlot[
       MapThread[Callout,
         {MeanFilter[#,Quantity[OptionValue["Smooth"],"Years"]]&/@
         Values[$data],words}],
       ScalingFunctions->OptionValue["Scaling"],
       PlotRange->All,
       PlotTheme->"Detailed",
       PlotStyle->OptionValue["Style"],
       FrameTicks->{Automatic,None},
       ImageSize->Large,
       FrameLabel->{"YEAR","FREQUENCY in TEXT"}]
]
POSTED BY: Vitaliy Kaurov
33 Replies

A few key sciences

Medicine leads by a wide margin, well above even mathematics.

WordFrequencyPlot[{"chemistry","geology","physics","mathematics","astronomy","biology",
"botany","zoology","genetics","medicine","ecology","anthropology"},"YearStart"->1950]

enter image description here

POSTED BY: Vitaliy Kaurov

Communist political ideologies and systems

https://en.wikipedia.org/wiki/Communism

The rise and peak in mentions of communism occur during the Cold War and the existence of the Soviet Union (USSR). The emergence of the Soviet Union as the world's first nominally communist state led to communism's widespread association with Marxism–Leninism and the Soviet economic model. Almost all communist governments in the 20th century espoused Marxism–Leninism or a variation of it.

WordFrequencyPlot[{"Leninism", "Marxism", "Stalinism", "Trotskyism"}]

enter image description here

POSTED BY: Vitaliy Kaurov

enter image description here

POSTED BY: Martijn Froeling

Which temple?

The fall of religion to philosophy, the rise of information, data and technology: the drama is evolving ;-) Note the usage of "technology"|"tech", which adds the frequencies of both. Anyone who has read "American Gods" by Neil Gaiman might relate.

WordFrequencyPlot[{"science","technology"|"tech","religion","philosophy",
"sense","wisdom","reason","wit","logic","truth","information","data","knowledge"}]

enter image description here
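
As a side note on the "technology"|"tech" pattern used above: as far as I understand the documentation, WordFrequencyData treats alternatives joined with | as one combined term and returns their total frequency. The sketch below compares that with summing the two words looked up separately. The variable names are only for illustration, the call needs a network connection to fetch the data, and the two numbers should agree only up to rounding.

```wolfram
(* Combined frequency of both spellings, requested as a single term *)
combined = WordFrequencyData["technology" | "tech"];

(* The same two words looked up separately; the result is an association *)
separate = WordFrequencyData[{"technology", "tech"}];

(* Compare the combined value with the sum of the individual values *)
{combined, Total[separate]}
```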

POSTED BY: Vitaliy Kaurov

When something is wrong with a human

I was curious to see the vocabulary used to describe problems with human health. I excluded words like "disorder", which carry too much weight from non-medical domains.

WordFrequencyPlot[{
"disease","illness","malady","sickness","syndrome","ailment",
"affliction","flu","fever","epidemic","plague","infection"}]

enter image description here

POSTED BY: Vitaliy Kaurov
Posted 6 years ago

Added "virus"....curious how that might correlate with "computer virus."

WordFrequencyPlot[{"disease", "illness", "malady", "sickness", 
  "syndrome", "ailment", "affliction", "flu", "fever", "epidemic", 
  "plague", "infection", "virus"}]

enter image description here

POSTED BY: Null Null

North, South, East, West

WordFrequencyPlot[{"north", "south", "east", "west"}, "Scaling" -> "Log"]

enter image description here

I was a bit surprised to see that south dominates north, considering that the latter is the standard "fundamental direction" in various geography, cartography, and GIS applications and conventions. South surpassed north around 1900.

POSTED BY: Vitaliy Kaurov

I guess this basically has to do with the "Heroic Age of Antarctic Exploration", when "the pole" was just a synonym for south pole:

enter image description here

POSTED BY: Henrik Schachner
Posted 6 years ago

The man who wrote "Philosophiæ Naturalis Principia Mathematica".

 WordFrequencyPlot[{"Pythagoras", "Archimedes", "Euclid", "Fibonacci", 
      "Descartes", "Newton", "Leibniz", "Gauss", "Euler", "Fermat", 
      "Turing"}, "YearStart" -> 1800]

enter image description here

WordFrequencyPlot[{"Newton", "Copernicus", "Gauss", "Einstein", 
  "Hawking"}, "YearStart" -> 18

enter image description here

POSTED BY: Frederick Wu

Programming Languages

WordFrequencyPlot[{"Wolfram", "Mathematica", "Fortran", "Cobol", 
  "HTML", "CSS", "Ruby", "JavaScript", "PHP", "Matlab", "LabVIEW", 
  "Python", "Java", "Swift"}, "Scaling" -> "Log"]

enter image description here

Transportation

WordFrequencyPlot[{"airplane", "car", "train", "speed"}]

enter image description here

POSTED BY: l van Veen

Curiosity killed the cat:

WordFrequencyPlot[{"curiosity", "cat"}]

enter image description here

enter image description here

POSTED BY: Jan Brugard

I have looked at a number of examples, some similar to the ones above. Some word histories tell nice stories, for example about medicine. Here are three words for malaria:

Options[WordFrequencyPlot] = {"YearStart" -> 1800, "YearEnd" -> Now, 
   "Case" -> True, "Smooth" -> 3, "Scaling" -> None, 
   "Style" -> Automatic};

WordFrequencyPlot[words_, OptionsPattern[]] := 
 With[{$data = 
    WordFrequencyData[words, 
     "TimeSeries", {OptionValue["YearStart"], OptionValue["YearEnd"]},
      IgnoreCase -> OptionValue["Case"]]}, 
  DateListPlot[
   MapThread[
    Callout, {MeanFilter[#, 
        Quantity[OptionValue["Smooth"], "Years"]] & /@ Values[$data], 
     words}], ScalingFunctions -> OptionValue["Scaling"], 
   PlotRange -> All, PlotTheme -> "Detailed", 
   PlotStyle -> OptionValue["Style"], FrameTicks -> {Automatic, None},
    ImageSize -> Large, FrameLabel -> {"YEAR", "FREQUENCY in TEXT"}]]

WordFrequencyPlot[{"ague", "malaria", "paludism"}]

enter image description here

Until 1880, when Laveran first discovered the parasite, ague and malaria had basically the same frequency. Malaria is derived from the Italian mala aria, "bad air", whereas ague comes from the Latin febris acuta, "acute fever".

Sometimes we can also observe a shift in frequency between words that mean the same thing:

Options[WordFrequencyPlot] = {"YearStart" -> 1800, "YearEnd" -> Now, "Case" -> True, "Smooth" -> 3, "Scaling" -> None, "Style" -> Automatic};

WordFrequencyPlot[{"Moslem", "Muslim"}]

enter image description here

In those cases a relative frequency plot, i.e. one displaying each word's percentage share, could be interesting:

StackedDateListPlot[
 MapThread[
  Callout, {Values[
    WordFrequencyData[{"Moslem", "Muslim"}, "TimeSeries"]], {"Moslem",
     "Muslim"}}], PlotRange -> All, PlotLayout -> "Percentile", 
 ImageSize -> Large, PlotStyle -> {Red, Green}, 
 LabelStyle -> Directive[Bold, 16], PlotTheme -> "Detailed"]

enter image description here
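
To make clear what such a relative plot displays, here is a minimal sketch with made-up numbers (not real WordFrequencyData output). It assumes that in a percentile layout each word's layer is its share of the total at every time step. The helper name percentShares is hypothetical and is not part of the Wolfram Language.

```wolfram
(* Hypothetical helper: each row of `values` holds one word's frequencies, *)
(* each column is one time step; the result is the percentage share per step *)
percentShares[values_?MatrixQ] := 
  Transpose[100 #/Total[#] & /@ Transpose[values]]

(* Made-up frequencies for two words over three time steps *)
percentShares[{{3, 1, 1}, {1, 1, 3}}]
```

In this toy example the first word falls from 75% to 25% of the total while the second rises from 25% to 75%, which is the kind of crossover the Moslem/Muslim plot shows.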

Such a plot is also useful to compare opposites like the words peace and war, which are also studied in an earlier post:

StackedDateListPlot[
 MapThread[
  Callout, {Values[
    WordFrequencyData[{"war", "peace"}, "TimeSeries"]], {"war", 
    "peace"}}], PlotRange -> All, PlotLayout -> "Percentile", 
 ImageSize -> Large, PlotStyle -> {Red, Green}, 
 LabelStyle -> Directive[Bold, 16], PlotTheme -> "Detailed"]

enter image description here

It is interesting to see that since about 1850 the word war is more frequent than peace.

These plots also reflect the use of words such as bike and bicycle:

StackedDateListPlot[
 MapThread[
  Callout, {Values[
    WordFrequencyData[{"bicycle", "bike"}, "TimeSeries"]], {"bicycle",
     "bike"}}], PlotRange -> All, PlotLayout -> "Percentile", 
 ImageSize -> Large, PlotStyle -> {Red, Green}, 
 LabelStyle -> Directive[Bold, 16], PlotTheme -> "Detailed"]

enter image description here

I would have expected bike to become more prominent during the 20th century. Between 1800 and 1880 it is also surprisingly common. I am not sure why, but this could be due to the other meaning of bike, which is something like "nest or swarm of bees".

It would be interesting to consider how the meanings of words change. I tried to look at the word "gay", which has changed meaning over the years, from lighthearted (13th century) to bright and showy (14th century) and happy. It could also carry a moral connotation, as in gay woman (prostitute), gay man (womaniser), and gay house (brothel). Around 1900 it meant something like "cheerful"; in the 1980s young speakers would use it to mean "lame, stupid"; around 1990 it came to mean homosexual. I tried to use Google Ngrams to figure that out, but it didn't really work well. Here are the words that are used close to "gay" over the years:

Table[{k, 
  StringSplit[
    StringSplit[
      StringSplit[
        StringSplit[
          URLExecute[
           "https://books.google.com/ngrams/graph?content=gay+*_ADJ&\
year_start=" <> ToString[k] <> "&year_end=" <> ToString[k + 20] <> 
            "&corpus=15&smoothing=3"], "direct_url="][[2]], 
        " width"][[1]], "gay%20"][[3 ;;]], "_"][[All, 1]]}, {k, 1800, 
  2000, 20}]

which gives

enter image description here

The frequency plot is:

Options[WordFrequencyPlot] = {"YearStart" -> 1800, "YearEnd" -> Now, 
   "Case" -> True, "Smooth" -> 3, "Scaling" -> None, 
   "Style" -> Automatic};

WordFrequencyPlot[{"gay"}]

enter image description here

or over longer times:

Options[WordFrequencyPlot] = {"YearStart" -> 1500, "YearEnd" -> Now, 
   "Case" -> True, "Smooth" -> 3, "Scaling" -> None, 
   "Style" -> Automatic};

WordFrequencyPlot[{"gay"}]

enter image description here

Plastic has also changed meaning, from the characteristic of being plastic to the material plastic:

WordFrequencyPlot[{"plastic"}]

enter image description here

In general we can see when different products have been developed:

WordFrequencyPlot[{"radio", "telephone", "computer", "car", "watch", 
  "electricity"}, "YearStart" -> 1700, "YearEnd" -> Now]

enter image description here

Of course, words can go out of fashion, too. For example:

WordFrequencyPlot[{"Pence", "Dollar", "Shilling", "Euro", "Sterling", 
  "Farthing", "Florin", "Dime", "Yen", "Yuan"}, "YearStart" -> 1700, 
 "YearEnd" -> Now]

enter image description here

In fact we can study this more systematically, by looking at the correlations between frequency curves:

words = {"war", "peace", "communism", "capitalism", "socialism", 
  "democracy", "unemployment", "conflict", "crisis", "terrorism", 
  "military", "welfare", "bomb", "weapons", "combat"}

worddata = (WordFrequencyData[#, "TimeSeries"])["Values"] & /@ words;

cm = Correlation[
   Transpose@
    worddata[[All, 
     1 ;; Min[
       Table[Length[worddata[[i, ;;]]], {i, 1, 
         Length[words]}]]]]];

Column[{GraphicsRow[words[[1 ;;]], ImageSize -> 1000, Frame -> All], 
  Row[{GraphicsColumn[words[[1 ;;]], ImageSize -> 67, Frame -> All], 
    Overlay[{ArrayPlot[cm, 
       ColorFunction -> (ColorData["TemperatureMap"][(1 + #)/2] &), 
       Frame -> None, Mesh -> True, PlotRangePadding -> 0, 
       ImageSize -> 1000, ColorFunctionScaling -> False], 
      GraphicsGrid[Map[NumberForm[#, 2] &, cm, {2}], 
       ImageSize -> 1000]}]}]}, Alignment -> Right, Spacings -> 0]

enter image description here
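
The idea behind the correlation matrix can be sketched on synthetic data, without downloading anything: trim every series to the length of the shortest one, then correlate the trimmed series pairwise. The helper name commonLengthCorrelation is hypothetical, and the numbers are made up. Trimming by position also assumes that all series start in the same year, which may not hold for real word data.

```wolfram
(* Hypothetical helper: trim all series to the shortest length, then correlate *)
commonLengthCorrelation[series_List] := 
  With[{n = Min[Length /@ series]}, 
   Correlation[Transpose[Take[#, n] & /@ series]]]

(* Synthetic series of unequal length, for illustration only *)
commonLengthCorrelation[{{1, 2, 3, 4}, {2, 4, 6}, {3, 2, 1, 0, 5}}]
```

Here the first two series rise together and the third falls over the common range, so the matrix contains only perfect correlations and anti-correlations.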

We can use MinimumBandwidthOrdering to reorder the words so that strongly correlated ones sit next to each other:

Needs["GraphUtilities`"]
{r, c} = MinimumBandwidthOrdering[cm, Method -> "RCMD"]

cm2 = Correlation[
   Transpose@
    worddata[[r]][[All, 
      1 ;; Min[
        Table[Length[worddata[[r]][[i, ;;]]], {i, 1, 
          Length[words]}]]]]];

Column[{GraphicsRow[words[[r]], ImageSize -> 1000, Frame -> All], 
  Row[{GraphicsColumn[words[[r]], ImageSize -> 67, Frame -> All], 
    Overlay[{ArrayPlot[cm2, 
       ColorFunction -> (ColorData["TemperatureMap"][(1 + #)/2] &), 
       Frame -> None, Mesh -> True, PlotRangePadding -> 0, 
       ImageSize -> 1000, ColorFunctionScaling -> False], 
      GraphicsGrid[Map[NumberForm[#, 2] &, cm2, {2}], 
       ImageSize -> 1000]}]}]}, Alignment -> Right, Spacings -> 0]

enter image description here

Using that we can try to find words with a similar behaviour:

StackedDateListPlot[
 MapThread[Callout, 
  {TimeSeriesMap[Log, #] & /@ Values[
      WordFrequencyData[{"Democracy", "War", "Peace"}, 
       "TimeSeries"]], {"Democracy", "War", "Peace"}}], 
 PlotRange -> All, PlotLayout -> "Percentile", ImageSize -> Large, 
 PlotStyle -> {Red, Green, Blue}, LabelStyle -> Directive[Bold, 16], 
 PlotTheme -> "Detailed"]

enter image description here

which indicates nearly constant ratios over a long time. This is not that easy to see in the WordFrequencyPlot:

WordFrequencyPlot[{"Democracy", "War", "Peace"}, "YearStart" -> 1900, 
 "YearEnd" -> Now, "Scaling" -> {None, "Log"}]

enter image description here.

Finally, it is interesting to look at other languages, such as vosotros vs. ustedes in Spanish:

Options[WordFrequencyPlotSpanish] = {"YearStart" -> 1800, 
"YearEnd" -> Now, "Case" -> True, "Smooth" -> 3, "Scaling" -> None,
"Style" -> Automatic};

WordFrequencyPlotSpanish[words_, OptionsPattern[]] := 
With[{$data = 
WordFrequencyData[words, 
"TimeSeries", {OptionValue["YearStart"], OptionValue["YearEnd"]},
IgnoreCase -> OptionValue["Case"], Language -> "Spanish"]}, 
DateListPlot[
MapThread[
Callout, {MeanFilter[#, 
Quantity[OptionValue["Smooth"], "Years"]] & /@ Values[$data], 
words}], ScalingFunctions -> OptionValue["Scaling"], 
PlotRange -> All, PlotTheme -> "Detailed", 
PlotStyle -> OptionValue["Style"], FrameTicks -> {Automatic, None},
ImageSize -> Large, FrameLabel -> {"YEAR", "FREQUENCY in TEXT"}]]

WordFrequencyPlotSpanish[{"vosotros", "ustedes"}]

enter image description here

or Du and Sie in German

Options[WordFrequencyPlotGerman] = {"YearStart" -> 1800, 
"YearEnd" -> Now, "Case" -> True, "Smooth" -> 3, "Scaling" -> None,
"Style" -> Automatic};

WordFrequencyPlotGerman[words_, OptionsPattern[]] := 
With[{$data = 
WordFrequencyData[words, 
"TimeSeries", {OptionValue["YearStart"], 
OptionValue["YearEnd"]},(*IgnoreCase\[Rule]OptionValue["Case"],*)
IgnoreCase -> False, Language -> "German"]}, 
DateListPlot[
MapThread[
Callout, {MeanFilter[#, 
Quantity[OptionValue["Smooth"], "Years"]] & /@ Values[$data], 
words}], ScalingFunctions -> OptionValue["Scaling"], 
PlotRange -> All, PlotTheme -> "Detailed", 
PlotStyle -> OptionValue["Style"], FrameTicks -> {Automatic, None},
ImageSize -> Large, FrameLabel -> {"YEAR", "FREQUENCY in TEXT"}]]

WordFrequencyPlotGerman[{"Du", "Sie"}]

I suppose that there are interesting mechanisms at work here. It definitely feels as if "Du" is becoming more prevalent than the more formal "Sie". But there might be an effect due to (social?) media etc. in the opposite direction.

Here are a few pronouns in English:

WordFrequencyPlot[{"you", "thou", "ye", "thee", "thy"}, 
 "YearStart" -> 1200, "YearEnd" -> Now]

enter image description here

which might look better on a percentile plot:

StackedDateListPlot[
 MapThread[
  Callout, {Values[
    WordFrequencyData[{"you", "thou", "ye", "thee", "thy"}, 
     "TimeSeries"]], {"you", "thou", "ye", "thee", "thy"}}], 
 PlotRange -> All, PlotLayout -> "Percentile", ImageSize -> Large, 
 PlotStyle -> RandomColor[5], LabelStyle -> Directive[Bold, 16], 
 PlotTheme -> "Detailed"]

enter image description here

Logarithmically, this becomes:

StackedDateListPlot[
 MapThread[Callout, 
  {TimeSeriesMap[Log, #] & /@ Values[
      WordFrequencyData[{"you", "thou", "ye", "thee", "thy"}, 
       "TimeSeries"]], {"you", "thou", "ye", "thee", "thy"}}], 
 PlotRange -> All, PlotLayout -> "Percentile", ImageSize -> Large, 
 PlotStyle -> RandomColor[5], LabelStyle -> Directive[Bold, 16], 
 PlotTheme -> "Detailed"]

enter image description here

Cheers,

Marco

POSTED BY: Marco Thiel

War and Peace

Apparently "Peace" is not as talked about (or written about) as "War"...

WordFrequencyPlot[{"war", "peace"}]

enter image description here

Earth and Space

Or perhaps running out of options for peace on earth, space becomes the next frontier...

 WordFrequencyPlot[{"earth", "space"}]

enter image description here

MRB

Here is the occurrence of my initials (MRB) since I was born. (I discovered the MRB constant in 1999. Is there any connection between that and the MRB uptick in the graph after 2000?)

ClearAll@WordFrequencyPlot;

Options[WordFrequencyPlot] = {"YearStart" -> 1995, "YearEnd" -> Now, 
   "Case" -> True, "Smooth" -> 3, "Scaling" -> None, 
   "Style" -> Automatic};

WordFrequencyPlot[words_, OptionsPattern[]] := 
 With[{$data = 
    WordFrequencyData[words, 
     "TimeSeries", {OptionValue["YearStart"], OptionValue["YearEnd"]},
      IgnoreCase -> OptionValue["Case"]]}, 
  DateListPlot[
   MapThread[
    Callout, {MeanFilter[#, 
        Quantity[OptionValue["Smooth"], "Years"]] & /@ Values[$data], 
     words}], ScalingFunctions -> OptionValue["Scaling"], 
   PlotRange -> All, PlotTheme -> "Detailed", 
   PlotStyle -> OptionValue["Style"], FrameTicks -> {Automatic, None},
    ImageSize -> Large, FrameLabel -> {"YEAR", "FREQUENCY in TEXT"}]]

WordFrequencyPlot[{"MRB"}]

enter image description here

See https://en.wikipedia.org/wiki/MRB

Navigator's tools. Notice the blip during WWII:

WordFrequencyPlot[{"sextant", "chronometer", "compass", "pelorus", 
  "almanac"}, "Scaling" -> "Log"]

enter image description here

POSTED BY: John Doty

Is money the root of all evil?

I just wondered if money was the root of all evil. I'm not great with math, so I wasn't sure how to get the square or cube root of the values for money to see if they aligned with the value for "evil" at a certain point in history. However, "evil" is not mentioned as frequently as "money", so I don't think "money" is any root of "evil". It might be the other way around though.

WordFrequencyPlot[{"money", "evil"}]

enter image description here

POSTED BY: Dorothy Evans

Colors and the "rise" of blue

I think colors are quite interesting. In the plots below, note the "rise of blue" in texts. There is a research field that relates language and the perception of color. See for example "Russian blues reveal effects of language on color discrimination". Blue holds a special place: by some accounts it is mentioned less frequently in ancient literature. Also, many languages do not distinguish between what English describes as "blue" and "green" and instead use a cover term spanning both; this might have an effect on English translations. Please respond to this comment if you have any thoughts about this phenomenon.

color={"white","black","red","yellow","green","blue",
            "orange","purple","gray","indigo","pink"};
Interpreter["Color"][color]

enter image description here

Show [WordFrequencyPlot[color[[;; 6]], "Case" -> False, "Scaling" -> "Log", 
"Style" -> Interpreter["Color"][color[[;; 6]]]],Background -> GrayLevel[.8]]

enter image description here

A bit more of colors in regular non-Log scaling:

Show [WordFrequencyPlot[color, "Case" -> False, 
"Style" -> Interpreter["Color"][color]], Background -> GrayLevel[.8]]

enter image description here

POSTED BY: Vitaliy Kaurov
Posted 6 years ago

Recession word frequency vs actual recessions

Let's plot the frequency of the word "recession".

WordFrequencyPlot[{"recession"}, "Scaling" -> "Log"]

enter image description here

The Federal Reserve Bank of St. Louis keeps a set of indicators that includes the recession periods since 1854.

fred = ServiceConnect["FederalReserveEconomicData"];
usrec = fred["SeriesData", "ID" -> "USREC"];
recWord = WordFrequencyData["recession", "TimeSeries", {1854, Now}];
recLogWord = TimeSeriesMap[Log, recWord];
{min, max} = {Min[#], Max[#]} &@recLogWord["Values"];
recScaled = 
 MovingAverage[TimeSeriesMap[Rescale[#, {min, max}] &, recLogWord], 3];
DateListPlot[{recScaled, usrec}, Filling -> Axis, ImageSize -> Large, 
 FrameTicks -> {Automatic, None}, 
 PlotLegends -> {"recession word freq(Log)", "Recessions"}, 
 PlotRange -> {{DateObject[{1854}], DateObject[{2010}]}, {0, 1}}]

enter image description here
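
The rescaling step in the code above is what lets the word frequency share an axis with the 0/1 recession indicator: the log frequencies are mapped linearly so that the smallest becomes 0 and the largest becomes 1. A minimal sketch on made-up numbers (not real frequencies), leaving out the moving average:

```wolfram
(* Made-up frequencies spanning several orders of magnitude; not real data *)
freq = {2.*^-7, 5.*^-7, 4.*^-6, 1.*^-5};

(* Take logs, then map the smallest log to 0 and the largest to 1 *)
logFreq = Log[freq];
scaled = Rescale[logFreq, MinMax[logFreq]]
```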

POSTED BY: Diego Zviovich
Posted 6 years ago

World Powers

enter image description here

POSTED BY: Diego Zviovich
Posted 6 years ago

Political Systems

WordFrequencyPlot[{"capitalism", "nationalism", "socialism", 
"communism", "fascism", "populism"}]

enter image description here

POSTED BY: Diego Zviovich

Social Occupation

The King is still the king, unbelievable

enter image description here

Maybe the King of Rock 'n' Roll and the King of Pop are included. Without the king:

enter image description here

The servant declines and the employee rises; not much gain in that, is there? The abolition of slavery seems to be reflected.

POSTED BY: Dent de Lion

Western Thinking

enter image description here

POSTED BY: Dent de Lion

Four letter words from f to k

These are more than one and the one does not come up (OMG):

enter image description here

Flak was important during WW II; even here it casts its shadow.

POSTED BY: Dent de Lion

Somewhat related to health:

WordFrequencyPlot[{"homeopathy", "acupuncture", "chemotherapy", "fasting", "antibiotic"}]

health related

POSTED BY: Gustavo Delfino

Trigonometry and Calculator

WordFrequencyPlot[{"trigonometry", "calculator"}]

enter image description here

It's a pretty well-known and self-evident fact that calculators have ruined trigonometry.

Smartphone, iPhone, Apple, Samsung, Orange

WordFrequencyPlot[{"iPhone", "Smartphone", "Samsung", "Apple", 
  "Orange"}, "Scaling" -> "Log"]

enter image description here

People back in 1900 didn't think iPhone was the best phone ever. But if you think Apple is the best company ever, have you ever tried Orange? Way more stable and reliable.

I couldn't resist a humorous take in this thread.

POSTED BY: Thales Fernandes
Posted 6 years ago

Religion and Politics

WordFrequencyPlot[{"religion", "politics"}]

enter image description here

It's a common adage that discussing politics and religion won't make you any friends, especially at social gatherings such as dinner parties. But which of these terms has been mentioned more frequently over time? Not terribly surprising that the term "religion" has decreased over time, with general trends of people becoming more secular, but the fact the terms "politics" and "religion" seem to meet in our current period is somewhat telling. Karl Marx once quipped religion is the opium of the people. Perhaps politics and political discourse are shaping up to take its place, for better or worse.

POSTED BY: Null Null
Posted 6 years ago

Freedom, Liberty, Justice, Equality

WordFrequencyPlot[{"freedom", "liberty", "justice", "equality"}]

enter image description here

Political theorists dating back to the Hellenic period have examined the tensions between these concepts, especially between equality and freedom. It's interesting to see how "equality" stays fairly flat while "justice" and "liberty" have decreased (though "justice" seems to be on a relatively recent uptick). Interestingly, "freedom" has increased over time, especially around the time of World War II and the following decades.

POSTED BY: Null Null

The Five Ws and H

WordFrequencyPlot[{"how", "why", "what", "where", "when", "who"}]

enter image description here

According to Wikipedia, The Five Ws and H are questions whose answers are considered basic in information gathering or problem solving. They are often mentioned in journalism, research, and police investigations. They constitute a formula for getting the complete story on a subject. According to the principle of the Six Ws, a report can only be considered complete if it answers these questions starting with an interrogative word:

  • Who was involved?
  • What happened?
  • Where did it take place?
  • When did it take place?
  • Why did that happen?
  • How did it happen?

Rudyard Kipling in his "Just So Stories" (1902) writes:

I keep six honest serving-men

(They taught me all I knew);

Their names are What and Why and When

And How and Where and Who.

It is quite remarkable to observe that these questions have various degrees of usage, perhaps reflecting their relative importance, with "when" being the key question nowadays, which was not always the case. "Who" led in the past but became less prominent.

POSTED BY: Vitaliy Kaurov

There is one more with wh:

enter image description here

and the popular names show the importance of doing:

enter image description here

POSTED BY: Dent de Lion

The Five Basic Senses

There are five basic senses: touch, sight, hearing, smell and taste. I remember reading somewhere that the largest flow of information a human normally experiences comes through vision. The diagram below reflects that (note the logarithmic scale): the word "see" is much more frequent than the others, although it can have other meanings besides the direct act of vision, such as "understand". But so can the other words. Note the curious fall of "taste" below "hear" and "touch".

WordFrequencyPlot[{"see", "hear", "touch", "taste", "smell"}, "Scaling" -> "Log"]

enter image description here

POSTED BY: Vitaliy Kaurov

Benford's Law

I wanted to start from something spectacular in its simplicity - demonstration of Benford's Law. From MathWorld:

Benford's Law is a phenomenological law also called the first digit law, first digit phenomenon, or leading digit phenomenon. Benford's law states that in listings, tables of statistics, etc., the digit 1 tends to occur with probability ~30%, much greater than the expected 11.1% (i.e., one digit out of 9). Benford's law can be observed, for instance, by examining tables of logarithms and noting that the first pages are much more worn and smudged than later pages.
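
For reference, the roughly 30% figure comes from the standard form of the law, in which the leading digit d (from 1 to 9) occurs with probability Log10[1 + 1/d]. The short sketch below just tabulates those values; the variable name benford is only for illustration. Note that the law concerns leading digits 1 to 9, so 0 has no Benford probability.

```wolfram
(* Benford's first-digit probabilities for d = 1, ..., 9 *)
benford = Table[Log10[1 + 1/d], {d, 1, 9}];

(* Numerical values; the first entry is about 0.301 *)
N[benford]

(* The nine probabilities sum to 1 *)
N[Total[benford]]
```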

Surprisingly, this law holds not only for the digits usage in texts, but also for the word-names of the digits, - see plots below. It would be nice to hear any explanations of this.

WordFrequencyPlot[ToString /@ Range[0, 9]]

enter image description here

Log scaling of vertical axis

WordFrequencyPlot[ToString /@ Range[0, 9], "Scaling" -> "Log"]

enter image description here

Direct plot of digit "names"

WordFrequencyPlot[IntegerName[Range[0, 9]]]

enter image description here

Log scaling of vertical axis for digit "names"

WordFrequencyPlot[IntegerName[Range[0, 9]], "Scaling" -> "Log"]

enter image description here

POSTED BY: Vitaliy Kaurov

Surprisingly, this law holds not only for the digits usage in texts, but also for the word-names of the digits, - see plots below. It would be nice to hear any explanations of this.

"One" is also an indefinite pronoun ("no one", "one of the group", "if one wishes"), so appears a lot more often in English than just as a spelled-out number. For the cases of actual numeric representation, nearly any numbered list will include one (such as this brief statement, for one); those that go to two will also include one (but not three), those that go to three will also include two (but not four), and so on.
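
The numbered-list argument can be made concrete with a toy corpus. This is a minimal sketch under the assumption that the corpus consists of exactly one list of each length from 1 to 5, which is of course made up.

```wolfram
(* Hypothetical corpus: numbered lists running from 1 up to n, for n = 1, ..., 5 *)
lists = Range /@ Range[5];

(* Count how often each number occurs across all lists *)
Counts[Flatten[lists]]
```

Every list contains 1, all but the shortest contain 2, and so on, so the counts fall steadily from 1 to 5.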

POSTED BY: Lynda Sherman