Message Boards Message Boards

Computational mnemonics

Yesterday I was reading this on Stephen's blog and something caught my attention.

swblogquote

So I decided to do some coding to figure out how many other sentences like that existed. The strategy is as follows, we're going to first find all the possible partitions of 14 (with permutations), then split the formula string and extract words that match the sub-string.

Partitions

We need to construct all possible partitions of 14 which is the length of the formula: IntegerPartition returns sorted partitions, so we need to call Permutations.

Join @@ Permutations /@ IntegerPartitions[14]

returns 8192 permuted partitions of 14. It is left as an exercise to the reader that there are always 2^(n-1) permuted partitions of the integer n.

From these we can compute all the possible spans.

spans[n_] := Map[
    MapAt[
        Function[a, a + 1], 
        Partition[Prepend[Accumulate[#], 0], 2, 1],
        {All, 1}
    ] &, 
     Join @@ (Permutations /@ IntegerPartitions[n])
]

Finding words

Now we need to find all the words that match the right piece of the pattern:

Options[filteredWords] = {IgnoreCase -> True};
filteredWords[{a_, b_, c_}, opts: OptionsPattern[]][pos_] := filteredWords[{a, b, c}, opts][pos] = 
    DictionaryLookup[
       StringReplace[StringTake["DDDpqrDpDDprpr", pos], {"D" -> _,"p" -> a, "q" -> b , "r" -> c}],
       IgnoreCase -> OptionValue[filteredWords, {opts}, IgnoreCase]
    ]

filteredWords takes three letters a, b, c and two indices and constructs a StringExpression to be fed to DictionaryLookup, by StringTake-ing the part of the formula we want to construct mnemonics from. For example: given input {"u", "r", "e"} and indices {10, 14} it constructs the pattern _ ~~ "ueue", and it extracts {"queue"}. Note that we've used the fact that StringReplace returns a StringExpression when some of the rhs are not strings themselves.

As an intermediate step we can construct all the words that will make up our mnemonics:

words = DeleteCases[
    Map[filteredWords[{"u", "r", "e"}], spans[14], {2}],
    {___, {}, ___}
]

And just from these we can say that there are 192 945 possible sentences:

Total[Times @@@ Map[Length, words, {2}]]

We can easily construct them:

sentences = Flatten[Apply[Outer[StringRiffle[{##}, " "] &, ##] &, words, {1}]]

Let's look at them:

RandomSample[sentences, 10]

returns

{"Oz cur Ed um queue", "ID Wu re I US queue", "kW surer UK queue", 
"gap Urey US queue", "Ms nu re duo queue", "XL cur Ed uh queue", "a I Wu re gut queue",
"AM nu re cub queue", "I you re jun queue", "a slur eh up queue"}

Do they all end in queue?

Select[sentences, Not @* StringMatchQ[___ ~~ "queue"]]

unfortunately yes.

Now we can try to do some further exploration. A few facts:

  • There are no sentences with {"p", "q", "r"}
  • There are a staggering 14 690 288 sentences with {"a", "b", "c"}
  • If we restrict the search to case sensitive there are 0 sentences with {"a", "b", "c"}, which makes me suspect that there are far too many acronyms with abc in our dictionary.
  • There are no three consecutive letters that have case-sensitive valid sentences.

What about case-insensitive?

assoc = AssociationMap[
    With[
        {words = DeleteCases[
            Map[filteredWords[#, IgnoreCase -> True],  spans[14], {2}],
            {___, {}, ___}
        ]},
        Total[Times @@@ Map[Length, words, {2}]]
   ] &,
     Partition[Alphabet[], 3, 1]
 ]
DeleteCases[assoc, 0]

returns

<|{"a", "b", "c"} -> 14690288, {"g", "h", "i"} -> 672097, {"t", "u", "v"} -> 76164|>

For {"t", "u", "v"}, they all end in "TV TV", so that's not very interesting.

{"g", "h", "i"} on the other hand seems to contain quite a few I, which is definitely interesting, if it wasn't for the fact that they all end with "I", which is probably not going to work well with English grammar.

I egg HIV go a gig I

I big hip gum gig I

go ugh is God gig I

I gig his gig GI GI

Maybe with some punctuation...

I have to stop writing here, but I will come back with some statistics of the 26!/23! (15600) 3-letter combinations.

POSTED BY: Carlo Barbieri

enter image description here - Congratulations! This post is now a Staff Pick as distinguished by a badge on your profile! Thank you, keep it coming, and consider contributing your work to the The Notebook Archive!

POSTED BY: Moderation Team
Reply to this discussion
Community posts can be styled and formatted using the Markdown syntax.
Reply Preview
Attachments
Remove
or Discard

Group Abstract Group Abstract