# The importance of letter 'R'

Posted 17 days ago
My elementary-school-age child clearly has rhotacism (a speech impediment defined by an inability, or difficulty, pronouncing the sound R). I recently had a discussion with her, trying to explain that we should start working on fixing this problem. But it was a bit difficult to illustrate that the letter 'R' was that important. So we opened a Mathematica notebook and typed in the following code:

    BarChart[
     ((StringCases[WordList[], ___ ~~ # ~~ ___] // Flatten // Length)/
          (WordList[] // Length)*100 // N) & /@ Alphabet[],
     ChartLabels -> ToUpperCase@Alphabet[],
     PlotTheme -> "Business"
    ]

Even I was surprised to see that the letter 'R' appears at least once in more than 50% of English words (well, those provided by WL). The kid was excited too! She asked me to show the same 'proof' for another language. 'R' hit the 60% mark in the Italian case. Excellent!

    BarChart[
     ((StringCases[WordList[Language -> "Italian"], ___ ~~ # ~~ ___] // Flatten // Length)/
          (WordList[Language -> "Italian"] // Length)*100 // N) & /@
      Alphabet[Language -> "Italian"],
     ChartLabels -> ToUpperCase@Alphabet[Language -> "Italian"],
     PlotTheme -> "Business"
    ]

That worked out very well: my kid wanted to learn more about Wolfram Language, and the letter 'R' :)
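Since both charts repeat the same computation, it can be factored into a small helper. This is only a sketch and not the code used above: the name `letterShare` is hypothetical, and it swaps the `StringCases` pattern for `StringContainsQ`, which should count the same words for single-letter tests while evaluating the word list once instead of once per letter.

```wolfram
(* Hypothetical helper (not from the original post): for each letter,
   the percentage of words that contain it at least once.
   Like the original code, the test is case-sensitive. *)
letterShare[words_List, letters_List] :=
  Table[
    100.*Count[StringContainsQ[words, l], True]/Length[words],
    {l, letters}]

(* Usage: the same chart as above for the built-in English word list. *)
BarChart[letterShare[WordList[], Alphabet[]],
  ChartLabels -> ToUpperCase@Alphabet[],
  PlotTheme -> "Business"]
```

For another language, the same call should work with `WordList[Language -> "Italian"]` and `Alphabet[Language -> "Italian"]` as the two arguments.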
3 Replies
Posted 16 days ago
Interesting idea! Another thing to try is to look at the phonetic transcription. The reason is that in some words the r sound is weak, even though the letter is present in the spelling. For instance, compare the strong r sound at the beginning of a syllable

    In[]:= WordData["rat", "PhoneticForm"]

    Out[]= rˈæt

to the weak r sound at the end of a syllable, which is reflected in the absence of r in the phonetic transcription:

    In[]:= WordData["water", "PhoneticForm"]

    Out[]= wˈɔtɝ

In some other languages, Russian for example, it would be a bit different, in the sense that a spelled letter r always maps onto a phonetic r sound. But focusing on English, we can explore the stats for only the words with a strong phonetic r. We can obtain a transcription for 29795 words:

    phonetics = DeleteMissing[WordData[#, "PhoneticForm"] & /@ WordData[]]
    Length[phonetics]

Then we can get the symbols that make up the phonetic alphabet, removing the two stress marks (ˈ and ˌ), which do not represent any sound:

    symbols = DeleteCases[Characters@phonetics // Flatten // Union, "ˈ" | "ˌ"]

    {ŋ,ɒ,ɔ,ə,ɛ,ɝ,ɡ,ɪ,ʃ,ʊ,ʌ,ʒ,a,æ,b,d,ð,e,f,h,i,j,k,l,m,n,o,p,r,s,t,u,v,w,z,θ}

Then we can build your chart based on the phonetic symbols:

    BarChart[
     100.*Length[Flatten[StringCases[phonetics, ___ ~~ # ~~ ___]]]/
         Length[phonetics] & /@ symbols,
     ChartLabels -> symbols,
     PlotTheme -> "Detailed"]

While the stats are a bit different, the value for the r sound is still very high. BTW, here is another sample dictionary and slightly different code that closely reproduces your original idea:

    BarChart[
     100.*(Length[DictionaryLookup[___ ~~ # ~~ ___]] & /@ Alphabet[])/
       Length[DictionaryLookup[]],
     ChartLabels -> Alphabet[],
     PlotTheme -> "Detailed"]
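To see which words drive the difference between the letter chart and the sound chart, one could list the words that are spelled with r but have no r symbol in their transcription. The sketch below is an illustration only: the helper name `spelledButNotSounded` is hypothetical, and it assumes the transcriptions are held in an association from word to phonetic form rather than in the plain `phonetics` list used above.

```wolfram
(* Hypothetical helper: words whose spelling contains `letter`
   but whose phonetic form does not contain that same character. *)
spelledButNotSounded[phon_Association, letter_String] :=
  Select[Keys[phon],
    StringContainsQ[#, letter] && ! StringContainsQ[phon[#], letter] &]

(* Tiny example using only the two transcriptions quoted above. *)
sample = <|"rat" -> "rˈæt", "water" -> "wˈɔtɝ"|>;
spelledButNotSounded[sample, "r"]
(* {"water"} *)
```

For the full dictionary, something like `DeleteMissing[AssociationMap[WordData[#, "PhoneticForm"] &, WordData[]]]` should build the association, though it queries every word and may take a while.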
Thank you, that provides a finer exploration of the matter! It also reminded me of the words file on *nix machines, in case someone wants to use it one day.

    In[1]:= words = Import["/usr/share/dict/words"] // Flatten;
            wordsLength = Length[words]

    Out[1]= 235886

It has some lengthy words, which I won't try pronouncing:

    In[2]:= Select[words, StringLength[#] > 23 &]

    Out[2]= {"formaldehydesulphoxylate", "pathologicopsychological",
             "scientificophilosophical", "tetraiodophenolphthalein",
             "thyroparathyroidectomize"}

And the same chart for this word list:

    BarChart[
     ((StringCases[words, ___ ~~ # ~~ ___] // Flatten // Length)/
          wordsLength*100.) & /@ Alphabet[],
     ChartLabels -> ToUpperCase@Alphabet[],
     PlotTheme -> "Detailed"]