Description
The aim of this project is to calculate phonetic distances between words and to generate a list of rhyming words for a given word.
Let's first define rhyme, and then turn to phonetics and phoneme distances. A rhyme is a repetition of similar sounds in two or more words, most often in the final syllables of lines in poems and songs. Given that definition, how do we detect and generate rhymes? The short answer is: by using the phonetic forms of the words. Phonetics is the branch of linguistics that deals with the sounds of speech and their production, combination, description, and representation by written symbols; the phonetic form of a word is its pronunciation written in such symbols.
Project Content
- Finding Homophones
- Creating data for more general rhymes
- Generating rhymes with various phoneme distances
- Some Examples
Finding Homophones
Homophones are words that are pronounced alike but differ in meaning. Here we use the term loosely: we compare the suffixes of the phonetic representations of words, that is, the part that follows the stress mark, and group the words whose suffixes are identical.
In the code below, the phonetic form of each word is taken from WordData (a built-in Mathematica function), and the homophones, or perfect rhymes, of the user's input are then looked up. In this case the input is "sheep" and the output is the list of words that rhyme perfectly with it.
(* Keep only the part of each phonetic form that follows the primary stress mark *)
associationOfWordsAndPhoneticForm=
StringReplace[#,___~~"ˈ" ~~ any___ ~~EndOfString :> any]&/@
DeleteMissing[AssociationMap[WordData[#,"PhoneticForm"]&,WordData[]]];
vals=Select[Reverse@SortBy[Tally[Values[associationOfWordsAndPhoneticForm]],
#[[2]]&],#[[2]]>1&][[All,1]];
phoneticAssocKeys=Position[associationOfWordsAndPhoneticForm,#]&/@vals;
homophones = Map[
Association @ Thread[(First /@ phoneticAssocKeys[[All, All, 1]][[#]]) -> vals[[#]]]&,
Range[Length[phoneticAssocKeys[[All, All, 1]]]]
]
f[word_,homophones_]:=Keys @ First @ Select[homophones, MemberQ[Keys[#], word]&]
rhymingWords= f["sheep",homophones]
{"asleep", "beep", "bleep", "bopeep", "cheap", "creep", "deep",
"heap", "jeep", "keep", "leap", "peep", "reap", "seep", "sheep",
"sleep", "steep", "sweep", "weep"}
As you can see, the output words are phonetically very similar; in fact, their suffixes are identical. This is an example of perfect rhyme: more technically, the distance between the phonemes of the suffixes is 0. We will not concentrate on perfect rhymes, because they have limited use in real-life applications. In the rest of this project we consider the more complex cases in which the distance between phonemes is larger than 0. For that we need proper phonetic data.
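The grouping step can also be sketched on a small scale. The following is a minimal illustration, not the code used in the project: it assumes a handful of hand-typed transcriptions (they are not read from WordData) and uses GroupBy rather than Tally and Position. The names sample, suffixAfterStress, suffixes and rhymeGroups are introduced here only for the example.

```wolfram
(* Hand-typed transcriptions, assumed for illustration only *)
sample = <|"sheep" -> "ʃˈip", "asleep" -> "əslˈip", "cheap" -> "tʃˈip",
   "ship" -> "ʃˈɪp", "tip" -> "tˈɪp", "dog" -> "dˈɔɡ"|>;

(* Keep whatever follows the primary stress mark *)
suffixAfterStress[p_String] :=
  StringReplace[p, ___ ~~ "ˈ" ~~ rest___ ~~ EndOfString :> rest];

(* Association: word -> suffix *)
suffixes = suffixAfterStress /@ sample;

(* Words sharing a suffix form one group; groups of one word are dropped *)
rhymeGroups =
 Select[Values[GroupBy[Keys[sample], suffixes[#] &]], Length[#] > 1 &]
(* {{"sheep", "asleep", "cheap"}, {"ship", "tip"}} *)
```

Looking up a word's perfect rhymes then amounts to selecting the group that contains it.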
Creating data for more general rhymes
Working with words whose phonetic distances are larger than 0 is more complex than working with perfect rhymes. For this we need data on vowel distances, consonant distances, and the distances between consonants and vowels. As we could not find suitable data for phoneme distances, we decided to take vowel and consonant charts and measure the vowel-to-vowel and consonant-to-consonant distances on them, that is, the Euclidean distances between the symbols' positions on the chart images. We used these charts because they are like a periodic table of sounds: sounds that are close on the charts are produced in similar ways, either in the same part of the mouth or with the same technique.
Here is the code in which we import a chart image and use the LocatorPane function to place a draggable label for each phoneme on it; the Euclidean distances between phonemes are measured from the resulting positions.
vowels = {"a", "æ", "?", "?", "?", "?", "e", "o", "?", "i", "u", "?",
"?", "?"};
phonemes = {"?", "?", "?", "?", "?", "?", "?", "?", "?", "?", "?",
"?", "?", "a", "æ", "b", "d", "ð", "e", "f", "h", "i", "j", "k",
"l", "m", "n", "o", "p", "r", "s", "t", "u", "v", "w", "z",
"\[Theta]"}
With[{image =
Import["https://upload.wikimedia.org/wikipedia/en/5/5a/IPA_vowel_\
chart_2005.png"]},
pos = Transpose[{RandomReal[ImageDimensions[image][[1]],
Length[vowels]],
RandomReal[ImageDimensions[image][[2]], Length[vowels]]}];
LocatorPane[Dynamic[pos], image,
Appearance -> Function[ Style[#, Red, 20]] /@ vowels]]
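Once the locators have been dragged onto their symbols, the positions can be turned into a distance table. The snippet below is only a sketch of that step under stated assumptions: the four labels and their pixel coordinates are made up, and the names labels, positions, normalized and chartDistance do not come from the project. Dividing by the largest distance is likewise just one possible way to put different charts on a common scale.

```wolfram
(* Hypothetical labels and pixel coordinates, standing in for the final value of pos *)
labels = {"i", "e", "a", "u"};
positions = {{40., 300.}, {90., 220.}, {180., 60.}, {330., 300.}};

(* Pairwise Euclidean distances between all chart positions *)
dist = DistanceMatrix[positions, DistanceFunction -> EuclideanDistance];

(* Rescale so that the two most distant symbols are at distance 1 *)
normalized = dist/Max[dist];

(* Look up the distance between two labelled symbols *)
chartDistance[a_String, b_String] := normalized[[
   Position[labels, a][[1, 1]],
   Position[labels, b][[1, 1]]
   ]]

chartDistance["i", "e"]
```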
Generating rhymes with various phoneme distances
Having the dataset obtained above, we can find rhyming words at different phoneme distances, most importantly at distances larger than zero. In the code below, we use the dataset to compute the phonetic distance between words. We also find a wider range of rhyming words at different phoneme distances by using the vowel, consonant, and phoneme matrices from the dataset.
Clear[distance]
distance[a_, b_] /; StringLength[a] == StringLength[b] == 1 := Part[
$allDistances,
Position[allphonemes, a][[1, 1]],
Position[allphonemes,b][[1, 1]]
]
distance[a_, b_] /; StringLength[a] == StringLength[b] := Total[MapThread[
distance,
{Characters[a], Characters[b]}
]]
distance[a_, b_] /; StringLength[a] < StringLength[b] := distance[b, a]
distance[a_, ""] := StringLength[a] * deletionPenalty
distance[a_, b_] /; StringLength[a] > StringLength[b] := With[
{chara = Characters[a], charb = Characters[b]},
Min[Map[
Total[MapThread[
distance,
{chara, #}
]]&,
PadAll[charb, "", Length[chara]]
]]
]
OrderedTuples[n_, k_] := With[{syms = Table[Unique[],{n}]},
With[{iter = Sequence @@ Map[Append[k], Reverse /@ Partition[Prepend[syms, 1], 2, 1]]},
Flatten[Table[syms, iter], n - 1]
]
]
PadAll[list_, elem_, length_] := Map[
Insert[list, elem, Transpose[{#}]]&,
OrderedTuples[length - Length[list], Length[list] + 1]
]
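To make the padding step easier to follow, here is a small independent sketch of the same idea: list every way of stretching a shorter list of characters to a given length by filling the gaps with a padding element. It uses Subsets and ReplacePart instead of Insert and OrderedTuples, so it is an alternative formulation rather than the definition above, and padAllSketch is a name introduced only for this example. It should enumerate the same paddings, possibly in a different order.

```wolfram
(* Choose which slots of the padded list hold the original elements (in order);
   every other slot keeps the padding element *)
padAllSketch[list_List, elem_, length_Integer] /; length >= Length[list] :=
 Map[
  Function[slots,
   ReplacePart[ConstantArray[elem, length], Thread[slots -> list]]],
  Subsets[Range[length], {Length[list]}]
  ]

padAllSketch[{"a", "b"}, "", 3]
(* {{"a", "b", ""}, {"a", "", "b"}, {"", "a", "b"}} *)
```

Each padded list can then be compared position by position with the longer word, and the smallest total is kept, with every padding element charged as a deletion.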
Clear[phonemeDistance, vowelDistanceFunction, consonantDistanceFunction, phonemeDistanceFunction, getSuffix,
rhymingDistance,consonantsDistance, vowelsDistance, phonemesDistance]
phonemeDistance[x_, y_] /; AnyTrue[{x, y}, MissingQ] := Missing["NotAvailable"]
phonemeDistance[x_, y_] := Total[Replace[
SequenceAlignment[x, y],
{
{a_, b_} :> distance[a, b],
a_ -> 0
},
{1}
]]
vowelDistanceFunction[x_, y_] /; AnyTrue[{x, y}, MissingQ] := Missing["NotAvailable"]
vowelDistanceFunction[x_, y_] := Total[Replace[
SequenceAlignment[x, y],
{
{a_, b_} :> distance[a, b],
a_ -> 0
},
{1}
]]
consonantDistanceFunction[x_, y_] /; AnyTrue[{x, y}, MissingQ] := Missing["NotAvailable"]
consonantDistanceFunction[x_, y_] := Total[Replace[
SequenceAlignment[x, y],
{
{a_, b_} :> distance[a, b],
a_ -> 0
},
{1}
]]
phonemeDistanceFunction[x_, y_] /; AnyTrue[{x, y}, MissingQ] := Missing["NotAvailable"]
phonemeDistanceFunction[x_, y_] := Total[Replace[
SequenceAlignment[x, y],
{
{a_, b_} :> distance[a, b],
a_ -> 0
},
{1}
]]
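The four functions above share one pattern: align the two strings, charge each mismatched pair its phoneme distance, and charge nothing for matching stretches. A toy version of that pattern is sketched below. The three phonemes and the numbers in toyDistances are invented for the example, and toyDistance and toyTotal are hypothetical names. The sketch only handles single-character mismatches, whereas the distance function above also covers longer and unequal segments.

```wolfram
(* Invented phoneme inventory and distance table, for illustration only *)
toyPhonemes = {"p", "b", "t"};
toyDistances = {{0, 1, 2}, {1, 0, 3}, {2, 3, 0}};

toyDistance[a_String, b_String] := toyDistances[[
   Position[toyPhonemes, a][[1, 1]],
   Position[toyPhonemes, b][[1, 1]]
   ]]

(* Matching stretches come back as strings, mismatches as {x, y} pairs *)
alignment = SequenceAlignment["pat", "bat"]
(* {{"p", "b"}, "at"} *)

(* Sum the distances of the mismatched pairs; matched stretches cost 0 *)
toyTotal = Total[Replace[
   alignment,
   {{a_, b_} :> toyDistance[a, b], _ -> 0},
   {1}
   ]]
(* 1 *)
```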
Clear[getPhoneme, getSuffix, getConsonants, getVowels]
getSuffix[word_] := getSuffix[word] = Replace[
WordData[word, "PhoneticForm"],
{
s_?StringQ :> StringReplace[StringReplace[
s,
___ ~~ "ˈ" ~~ any___ ~~ EndOfString :> any
], "ˌ" -> "" ],
any_ :> Missing["NotAvailable"]
}
]
rhymingDistance[x_, y_] := phonemeDistance[getSuffix[x], getSuffix[y]]
getPhoneme[word_] := getPhoneme[word] = Replace[
WordData[word, "PhoneticForm"],
{
s_?StringQ :> StringReplace[s, {"ˈ"|"ˌ" -> ""}],
any_ :> Missing["NotAvailable"]
}
]
phonemesDistance[x_, y_] := phonemeDistanceFunction[getPhoneme[x], getPhoneme[y]]
getVowels[word_] := Replace[
WordData[word, "PhoneticForm"],
{
s_?StringQ :> StringReplace[s, (Alternatives @@ consonants)|"ˈ"|"ˌ" -> ""],
any_ :> Missing["NotAvailable"]
}
]
vowelsDistance[x_, y_] := vowelDistanceFunction[getVowels[x], getVowels[y]]
getConsonants[word_] := Replace[
WordData[word, "PhoneticForm"],
{
s_?StringQ :> StringReplace[s, (Alternatives @@ vowels)|"ˈ"|"ˌ" -> ""],
any_ :> Missing["NotAvailable"]
}
]
consonantsDistance[x_,y_] := consonantDistanceFunction[getConsonants[x], getConsonants[y]]
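The vowel and consonant extraction can be tried without WordData on a single transcription. This is a minimal sketch under assumptions: the transcription of "tomato" is typed by hand, toyVowels lists only the vowel symbols that occur in it (it is not the full vowels list of the project), and stripStress, vowelsOnly and consonantsOnly are names made up for the example.

```wolfram
(* Only the vowel symbols needed for this one word; an assumption, not the full list *)
toyVowels = {"ə", "e", "ɪ", "o", "ʊ"};

(* Remove primary and secondary stress marks *)
stripStress[s_String] := StringReplace[s, "ˈ" | "ˌ" -> ""];

(* Keep the characters that are vowels, or those that are not *)
vowelsOnly[s_String] :=
  StringJoin[Select[Characters[stripStress[s]], MemberQ[toyVowels, #] &]];
consonantsOnly[s_String] :=
  StringJoin[Select[Characters[stripStress[s]], ! MemberQ[toyVowels, #] &]];

(* Hand-typed transcription of "tomato" *)
vowelsOnly["təmˈeɪtoʊ"]
(* "əeɪoʊ" *)
consonantsOnly["təmˈeɪtoʊ"]
(* "tmt" *)
```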
Some Examples
Let's look at an example to see a word's consonants, vowels, and phonetic distances, and then hear how the rhyming words sound.
getConsonants /@ {"tomato", "potato"}
{"tmt", "ptt"}
getVowels /@ {"tomato", "potato"}
{"əeɪoʊ", "əeɪoʊ"}
getPhoneme /@ {"tomato", "potato"}
{"təmeɪtoʊ", "pəteɪtoʊ"}
getSuffix /@ {"tomato", "potato"}
{"eɪtoʊ", "eɪtoʊ"}
consonantsDistance["tomato", "potato"]
8.62618
vowelsDistance["tomato", "potato"]
0
phonemesDistance["tomato", "potato"]
8.62618
rhymingDistance["tomato", "potato"]
0
rhymingWords =
Nearest[$wordsWithPhoneticForm, "banana", 20,
DistanceFunction -> rhymingDistance]
{"anna", "banana", "bandana", "cabana", "campana",
"dulciana", "gitana", "lantana", "manna", "nanna", "savanna",
"savannah", "tramontana", "banner", "manner", "manor", "planner",
"scanner", "spanner", "tanner"}
Click the button below to listen to the words that rhyme with banana
Button["Rhyme", Speak[StringJoin[Riffle[rhymingWords, " "]]]]
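The Nearest call used for "banana" works with any function that returns a number for a pair of words. The sketch below shows the mechanism on a tiny word list, using a deliberately crude stand-in distance based on spelling (the edit distance between the last two letters) rather than on phonemes. toyWords and lastTwoDistance are hypothetical names, and the stand-in assumes every word has at least two letters.

```wolfram
(* Small word list, for illustration only *)
toyWords = {"banner", "manor", "sheep", "deep", "cabana"};

(* Crude spelling-based stand-in for rhymingDistance *)
lastTwoDistance[a_String, b_String] :=
  EditDistance[StringTake[a, -2], StringTake[b, -2]];

(* The two words whose endings are closest to that of "keep" *)
nearestToKeep =
 Nearest[toyWords, "keep", 2, DistanceFunction -> lastTwoDistance]
```

Swapping lastTwoDistance for a phonetic distance gives the behavior shown above for "banana".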
random = RandomSample[$wordsWithPhoneticForm, 25];
tsp = FindShortestTour[random, DistanceFunction -> rhymingDistance];
Part[random, tsp[[2]]]
{"sweetened", "either", "blur", "materiel", "skill",
"imperial", "marginal", "destructive", "balking", "probing",
"floppy", "dropsy", "firefly", "divine", "days", "blaze", "pix",
"interleukin", "monsoon", "cor", "unformed", "versed", "stent",
"emirate", "standpoint", "sweetened"}
Button["Rhyme Random",
Speak[StringJoin[Riffle[Part[random, tsp[[2]]], " "]]]]
Click the button above to hear the 25 random words read in an order that attempts to minimize the total rhyming distance between neighboring words
Conclusion and further development
Summing up, we have constructed functions for finding phonetic distances and generating rhymes, using the calculated dataset and WordData. This is a good starting point for developing a more powerful tool: the functions could be extended to generate poems or to do interesting things with song lyrics. Stay tuned and keep the great mood :)