The importance of letter 'R'

Posted 3 years ago

My elementary-school-age child clearly has rhotacism (a speech impediment defined by an inability, or difficulty, pronouncing the sound R). I recently had a discussion with her, trying to explain that we should start working on fixing this problem. But it was a bit difficult to illustrate that the letter 'R' was that important. So we opened a Mathematica notebook and typed in the following code:

BarChart[
 ((StringCases[WordList[], ___ ~~ # ~~ ___] // Flatten // 
      Length)/(WordList[] // Length)*100 // N) & /@ Alphabet[],
 ChartLabels -> ToUpperCase@Alphabet[], PlotTheme -> "Business"
 ]

English alphabet
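
Each bar is the percentage of words that contain the given letter at least once. As a rough sketch of that calculation on a list small enough to check by hand (the names toyWords and letterShare are made up for this illustration and are not part of the original code):

```wolfram
(* Hypothetical toy word list, standing in for WordList[] *)
toyWords = {"rat", "water", "cat", "dog", "tree"};

(* Percentage of words that contain the letter at least once *)
letterShare[words_List, letter_String] :=
  100. Count[StringContainsQ[words, letter], True]/Length[words];

(* "rat", "water" and "tree" contain "r": 3 of 5 words *)
letterShare[toyWords, "r"]
(* 60. *)
```

Mapping letterShare over Alphabet[] would give the same kind of list that the BarChart above plots.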

I was surprised to see that the letter 'R' occurs at least once in more than 50% of English words (well, those provided by WL). The kid was excited too! She asked me to show the same 'proof' for another language. 'R' hit the 60% mark in the Italian case. Excellent!

BarChart[
 ((StringCases[WordList[Language -> "Italian"], ___ ~~ # ~~ ___] // 
          Flatten // Length)/(WordList[Language -> "Italian"] // 
         Length)*100 // N) & /@ Alphabet[Language -> "Italian"],
 ChartLabels -> ToUpperCase@Alphabet[Language -> "Italian"], 
 PlotTheme -> "Business"
 ]

Italian alphabet

That worked out very well and my kid wanted to learn more about Wolfram Language, and the letter 'R' :)

POSTED BY: Alan Parson
3 Replies

You have earned the Featured Contributor Badge. Your exceptional post has been selected for our editorial column Staff Picks http://wolfr.am/StaffPicks and your profile is now distinguished by a Featured Contributor Badge and is displayed on the Featured Contributor Board. Thank you!

POSTED BY: Moderation Team

Interesting idea! Another thing to try is to look at the phonetic transcription. The reason is that some words do not emphasize the r sound strongly, even though the letter is present in the spelling. For instance, compare the strong r sound at the beginning of a syllable

In[]:= WordData["rat","PhoneticForm"]
Out[]= rˈæt

to the weak r sound at the end of a syllable, which is reflected in the absence of r in the phonetic transcription

In[]:= WordData["water","PhoneticForm"]
Out[]= wˈɔtɝ
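
As a small sketch of that difference, the two transcriptions quoted above can be tested directly for the symbol r. The strings are copied from the outputs shown, so this does not call WordData, and the variable names are only for illustration:

```wolfram
(* Transcriptions copied from the two outputs above *)
ratPhonetic = "rˈæt";
waterPhonetic = "wˈɔtɝ";

(* True: the transcription of "rat" contains the symbol r *)
StringContainsQ[ratPhonetic, "r"]

(* False: the transcription of "water" ends in ɝ and has no r symbol *)
StringContainsQ[waterPhonetic, "r"]
```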

In some other languages, Russian for example, it would be a bit different, in the sense that a spelled letter r always maps onto a phonetic r sound. But focusing on English, we can explore the stats for only the words with a strong phonetic r. We can obtain phonetic transcriptions for 29795 words:

phonetics=DeleteMissing[WordData[#,"PhoneticForm"]&/@WordData[]]
Length[phonetics]

Then we can get the symbols that make up the phonetic alphabet, removing the 2 stress marks that do not actually represent any sound:

symbols=DeleteCases[Characters@phonetics//Flatten//Union,"ˈ"|"ˌ"]

{ŋ,ɒ,ɔ,ə,ɛ,ɝ,ɡ,ɪ,ʃ,ʊ,ʌ,ʒ,a,æ,b,d,ð,e,f,h,i,j,k,l,m,n,o,p,r,s,t,u,v,w,z,θ}
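
To see what this step does on a small scale, here is a sketch that applies the same expression to just the two transcriptions quoted earlier (toyPhonetics and toySymbols are illustrative names, not from the original code):

```wolfram
(* Two transcriptions quoted earlier in this reply, used as a toy sample *)
toyPhonetics = {"rˈæt", "wˈɔtɝ"};

(* Split into characters, take the distinct ones, and drop the two stress marks *)
toySymbols = DeleteCases[Characters@toyPhonetics // Flatten // Union, "ˈ" | "ˌ"];

(* Seven distinct sound symbols remain: r, æ, t, w, ɔ and ɝ, with t counted once *)
Length[toySymbols]
(* 6 *)
```

The symbol t appears in both words but Union keeps it once, so the toy alphabet has 6 entries and no stress marks.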

Then we can build your chart based on the phonetic symbols:

BarChart[
Length[Flatten[StringCases[phonetics, ___ ~~ # ~~ ___]]]/
Length[phonetics]100.&/@symbols,
ChartLabels -> symbols, PlotTheme -> "Detailed"]

While the stats are a bit different, the share of the r sound is still very high. BTW, here is another sample dictionary and slightly different code that closely reproduces your original idea:

BarChart[
Length[DictionaryLookup[___ ~~ # ~~ ___]]&/@
Alphabet[]/Length[DictionaryLookup[]] 100.,
ChartLabels -> Alphabet[], PlotTheme -> "Detailed"]

POSTED BY: Vitaliy Kaurov
Posted 3 years ago

Thank you, that provides a finer exploration of the matter! It also reminded me of the words file on *nix machines, just in case someone wants to use it one day.

In[1]:= words = Import["/usr/share/dict/words"] // Flatten;
wordsLength = Length[words]

Out[1]= 235886

It has some lengthy words, which I won't try pronouncing:

In[2]:= Select[words, StringLength[#] > 23 &]

Out[2]= {"formaldehydesulphoxylate", "pathologicopsychological", \
"scientificophilosophical", "tetraiodophenolphthalein", \
"thyroparathyroidectomize"}

BarChart[((StringCases[words, ___ ~~ # ~~ ___] // Flatten // Length)/
      wordsLength*100.) & /@ Alphabet[],
 ChartLabels -> ToUpperCase@Alphabet[], PlotTheme -> "Detailed"]

English

POSTED BY: Alan Parson