Group Abstract

Message Boards

WOLFRAM COMMUNITY

41.1K Views

10 Replies

40 Total Likes

View groups...

Follow this post

Share this post:

GROUPS:

Staff Picks Recreation Social Science Curated Data Wolfram Language Computational Linguistics Natural Language Processing Know-How Computational Humanities

Joke generator: pool of words applied to text formula

Jesse Friedman

Jesse Friedman, Wolfram Research

Posted 10 years ago

While most humor has continually eluded the grasp of computers, some jokes can be reduced to simple linguistic formula. I focus on two in this post: Joke Type 1: The Comparison I like my `<noun 1>` like I like my `<noun 2>`: `<adjective or other word that can describe both nouns>` Example: I like my immunity like I like my resistance: passive. The Comparison is probably the easier of the two jokes to generate. In other languages it would be difficult even with a dictionary dataset, but we can take advantage of a sometimes-annoying feature of the Wolfram Language: the fact that WordData contains compound nouns, nouns with two or more component words. First we define a subset of WordData[], all compound words with exactly two component words: twoword = Select[WordData[], Length[StringSplit[#]] == 2 &] Then we group all of the compound words by the first component word, and delete any groups with only one element. We then truncate each group to two elements only. This discards some possible jokes, but it makes the list more manageable. pairs = Values[ Select[GroupBy[twoword, StringSplit[#][[1]] &], Length[#] > 1 &]][[All, ;; 2]] If you wanted to create all possible pairs, you could find all the 2-subsets of each group: `Subsets[#,{2}] &/@ groups` Next we define a StringTemplate for this joke formula to make generation easier: comparisontemp = StringTemplate["I like my `1` like I like my `2`: `3`."]; Now the fun part! We choose a random sample of 5 word groups, and apply the string template to each: comparisontemp[StringSplit[#[[1]]][[2]], StringSplit[#[[2]]][[2]], StringSplit[#[[1]]][[1]]] & /@ RandomSample[pairs, 5] All those StringSplits are there just to get the component words for insertion into the StringTemplate. This gives us something like: {"I like my Ashe like I like my Compton: Arthur.", "I like my method like I like my section: rhythm.", "I like my lithography like I like my printing: offset.", "I like my californica like I like my cinerea: Juglans.", "I like my cynthia like I like my walkeri: Samia."} Most of the jokes this method generates are nonsensical or not funny like those above, but occasionally one elicits a small chuckle. Here are a couple of my favorites: I like my curve like I like my decay: exponential. I like my bathing like I like my blocker: sun. I like my bag like I like my benefit: sick. I like my dew like I like my double: daily. Joke Type 2: The Rhyme What do you call a `<noun or adjective 1>` `<noun or adjective 2>`? A(n) `<synonym for noun/adjective 1 that rhymes with synonym 2>` `<synonym for noun/adjective 2 that rhymes with synonym 1>`. Example: What do you call a ramshackle conservative? A broken-down button-down. The Rhyme is a fair bit more complicated to create, but it usually yields much funnier jokes than The Comparison, plus it has the advantage that you can generate a big list of jokes at once. NOTE: While the two joke types shouldn't have overlapping symbol namespaces, just to be on the safe side you should quit the kernel with Quit[] before starting this section of the code. We start out by defining the number of syllables that have to match at the end of two words for those words to rhyme. You can set this to anything, but I've found that a value of 2 generally produces the highest-quality results. Bear in mind that a higher value will produce less jokes. syllables = 2; Then we define a StringTemplate for this joke formula to make generation easier: rhymetemp = StringTemplate["What do you call a `1` `2`? A `3` `4`."]; Next we making a big list of nouns and adjectives that WordData has hyphenation and synonym data for. words = Flatten[ Table[# -> {WordData[#, "Hyphenation"], Flatten[WordData[#, "Synonyms"][[All, 2]]]} & /@ WordData[All, part], {part, {"Noun", "Adjective"}}]] This list is a bunch of rules in this form: "word"->{{"syllables","in","word"},{"synonyms","for","word"}} Now we select only the words in the list that have no space, have at least one synonym, have hyphenation (syllable) data, and have at least `syllables` syllables: filteredwords = Select[words, Length[StringSplit[#[[1]], " "]] == 1 && Length[#[[2, 2]]] > 0 && Length[#[[2, 1]]] >= syllables && ! MissingQ[#[[2, 1]]] &]; Next we group `filteredwords` by the last `syllables` syllables of each word, so all the words in each group rhyme: rhymes = Select[GroupBy[filteredwords, StringJoin @@ #[[2, 1, -syllables ;;]] &], Length[#] > 1 &] The next part creates a list of four-word lists that contain the words applied to the joke formula. It uses a large piece of code, but it's mostly just a bunch of tests in a Select statement ensuring that all four words are unique and that no two words are too similar. Short[jokequarts = Select[Table[{RandomChoice[suffix[[1, 2, 2]]], RandomChoice[suffix[[2, 2, 2]]], suffix[[1, 1]], suffix[[2, 1]]}, {suffix, rhymes}], (#[[1]] != #[[2]]) && (#[[3]] != #[[4]]) && Quiet[StringTake[#[[1]], -3] != StringTake[#[[2]], -3]] && (Quiet[StringTake[#[[1]], 3] != StringTake[#[[3]], 3]] && Quiet[StringTake[#[[2]], 3] != StringTake[#[[4]], 3]]) && (Length[StringPosition[#[[3]], #[[4]]]] == 0 && Length[StringPosition[#[[4]], #[[3]]]] == 0) &]] Like The Comparison, we're only generating one joke for each suffix to keep the list size down, but you could again use `Subsets` to extend the "joke space" to all possible jokes. Now we apply `rhymetemp` to all of the four-word lists: jokestrings = rhymetemp @@@ jokequarts; And we're done! Evaluate this line repeatedly to generate as many jokes as you want: Style[RandomChoice[jokestrings], "Subtitle"] This method still creates many bad jokes, but a surprising amount are pretty good: What do you call a mediate intermediary? A in-between go-between. What do you call a lucid disciple? A coherent adherent. What do you call a topsy-turvy senior? A disorderly elderly. What do you call a ophthalmologist democrat? A oculist populist. This code still has its fair share of problems. For example, the second sentence always starts with "A", even if the next word starts with a vowel sound (although this should be pretty easy to fix.) Also, the conditional tests could probably be improved to weed out "jokes" like this: What do you call a Eastern Roman Empire Byzantium? A Byzantine Empire Eastern Roman Empire. However, it's still quite impressive that my computer can make me laugh given only a dictionary and a formula! Can you improve this code? Did it produce a particularly good joke for you? Leave a response below!

POSTED BY: Jesse Friedman

10 Replies

Sort By:

Kathryn Cramer

Kathryn Cramer, Wolfram Research

Posted 10 years ago

Here's another way to get rhyming words: WolframAlpha["word", {{"Rhyme:WordData", 1}, "ComputableData"}] You can get lexically similar words with: WolframAlpha["word", {{"LexicallyCloseWords:WordData", 1}, "ComputableData"}] And, for puns, may want a supply of homophones: WolframAlpha["where", {{"Homophones:WordData", 1}, "ComputableData"}] With some words, to get this to work, you need to specify that the input is a word: WolframAlpha["see", {{"Homophones:WordData", 1}, "ComputableData"}, InputAssumptions -> {"C.see-_Word-"}]

Here's another way to get rhyming words:

WolframAlpha["word", {{"Rhyme:WordData", 1}, "ComputableData"}]

You can get lexically similar words with:

WolframAlpha["word", {{"LexicallyCloseWords:WordData", 1}, 
  "ComputableData"}]

And, for puns, may want a supply of homophones:

WolframAlpha["where", {{"Homophones:WordData", 1}, "ComputableData"}]

With some words, to get this to work, you need to specify that the input is a word:

WolframAlpha["see", {{"Homophones:WordData", 1}, "ComputableData"}, 
 InputAssumptions -> {"*C.see-_*Word-"}]

POSTED BY: Kathryn Cramer

Kathryn Cramer

Kathryn Cramer, Wolfram Research

Posted 10 years ago

I like this one for its sheer surreality: What do you call a flame nettle hammer? A coleus malleus.

POSTED BY: Kathryn Cramer

Kathryn Cramer

Kathryn Cramer, Wolfram Research

Posted 10 years ago

Q. What do you call a gator bowdlerizer? A. An alligator expurgator. (Joke Type 2 works now, with the correction.)

POSTED BY: Kathryn Cramer

Vitaliy Kaurov

Vitaliy Kaurov, WOLFRAM Research

Posted 10 years ago

By the way @Jesse Friedman your post collected 6K views in a single day with most of the traffic coming from Reddit-programming channel where it still occupies the front page with about a 100 votes already. Congratulations!

POSTED BY: Vitaliy Kaurov

Vitaliy Kaurov

Vitaliy Kaurov, WOLFRAM Research

Posted 10 years ago

@Jesse Friedman we were playing with friends with your code and I think in this piece: words = # -> {WordData[#, "Hyphenation"], Flatten[WordData[#, "Synonyms"][[All, 2]]]} & /@ WordData[All, part]; the symbol `part` is undefined. Could you add the missing `part` - forgive the pun ;) ?

POSTED BY: Vitaliy Kaurov

Jesse Friedman

Jesse Friedman, Wolfram Research

Posted 10 years ago

Thanks for pointing that out! It looks like I missed some of the line when I copied it. It should be: words = Flatten[ Table[# -> {WordData[#, "Hyphenation"], Flatten[WordData[#, "Synonyms"][[All, 2]]]} & /@ WordData[All, part], {part, {"Noun", "Adjective"}}]] I fixed it in the original post too.

POSTED BY: Jesse Friedman

Kathryn Cramer

Kathryn Cramer, Wolfram Research

Posted 10 years ago

A poem derived from the first. (Arranging and sequencing in a certain way changes the emotional valence from funny to something else.) My Conviction I like my birth like I like my medication: giving. I like my breakfast like I like my divide: continental. I like my coffee like I like my tea: iced. I like my cat like I like my twin: Siamese. I like my map like I like my pitcher: relief. I like my machine like I like my region: infernal. I like my life like I like my world: phantasy. I like my correlation like I like my distortion: nonlinear. I like my fever like I like my stroke: haemorrhagic. I like my disorder like I like my psychosis: schizophrenic. I like my intelligence like I like my schizophrenia: borderline. I like my pronoun like I like my verb: reflexive. I like my balance like I like my cubism: analytical. I like my error like I like my language: programming. I like my file like I like my trauma: blunt. I like my charge like I like my conviction: murder.

POSTED BY: Kathryn Cramer

EDITORIAL BOARD

EDITORIAL BOARD, WOLFRAM

Posted 10 years ago

- you earned "Featured Contributor" badge, congratulations ! Dear @Jesse Friedman, this is a great post and it has been selected for the curated Staff Picks group. Many people will receive a monthly digest featuring you post. Your profile is now distinguished by a "Featured Contributor" badge and displayed on the "Featured Contributor" board.