Generating Name Generators

Posted 4 years ago
1 Reply
9 Total Likes

After reading the umpteenth online article describing how someone trained a neural net to make up band names, or write bizarre recipes, or generate Pokemon, I asked whether any of the ML functionality in the Wolfram Language could easily do this sort of thing. I was told to look at SequencePredict. It turns out that, with next to no knowledge of machine learning and using some documentation examples as a springboard, I could get pretty decent results with very little code...

First, a short function to de-camelcase words, since in practice I noticed that the output strings would often be multiple words mashed together:

decamel[str_] := 
  (* the head of this definition was not preserved; a bare StringReplace is assumed *)
  StringReplace[
    str, {RegularExpression["([a-z])([A-Z])"] -> "$1 $2", 
     RegularExpression["([0-9])([A-Z])"] -> "$1 $2", 
     RegularExpression["([a-z])([0-9])"] -> "$1 $2"}]
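As a quick check of what those three rules do, here is a minimal sketch on hand-written inputs. `decamelSketch` is a hypothetical standalone name used only for this check, and it assumes the rules are applied through a single `StringReplace` call.

```wolfram
(* decamelSketch is a hypothetical standalone name for this check; it assumes
   the three rules are applied in a single StringReplace call. *)
decamelSketch[str_String] := 
  StringReplace[str, {
    RegularExpression["([a-z])([A-Z])"] -> "$1 $2",  (* lowercase then uppercase *)
    RegularExpression["([0-9])([A-Z])"] -> "$1 $2",  (* digit then uppercase *)
    RegularExpression["([a-z])([0-9])"] -> "$1 $2"   (* lowercase then digit *)
  }]

decamelSketch["MistyOrleansDancePlug"]  (* "Misty Orleans Dance Plug" *)
decamelSketch["XP-F27Raytheon"]         (* "XP-F27 Raytheon" *)
decamelSketch["Mitsubishi"]             (* unchanged: "Mitsubishi" *)
```

Note that an uppercase-to-digit boundary such as the "F27" in the second input is left alone, since none of the three rules covers it.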

Next, a function to produce a list of predictions of varying lengths, with the option of de-camelcasing output strings if needed:

predictionList[func_, num_, min_, max_, decam_: True] := 
 With[{raw = 
    Table[func["", "RandomNextElement" -> RandomInteger[{min, max}]], num]},
  If[TrueQ[decam], decamel /@ raw, raw]]
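To see how the varying lengths come about without training anything, here is a small sketch that uses a stand-in for the predictor. `stubPredictor` is hypothetical (it is not part of the Wolfram Language or of the code above); it only mimics the call shape `func["", "RandomNextElement" -> n]`.

```wolfram
(* stubPredictor is a hypothetical stand-in for a trained predictor: it
   ignores the seed string and joins n randomly chosen words. *)
stubPredictor[_String, "RandomNextElement" -> n_Integer] := 
  StringJoin[RandomChoice[{"Alpha", "Beta", "Gamma"}, n]]

(* Table re-evaluates its body on every iteration, so each entry
   draws its own length between 2 and 4 words. *)
samples = Table[
   stubPredictor["", "RandomNextElement" -> RandomInteger[{2, 4}]], 5]

(* Count the words per entry by counting capital letters. *)
StringCount[#, RegularExpression["[A-Z]"]] & /@ samples
```

The stub returns its words mashed together with no spaces, which is the same kind of output that makes the de-camelcasing step useful.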

And then the code to actually produce SequencePredictorFunctions, working from a) the name of a built-in Wolfram Language entity type, b) a list of Entities, or c) a list of names (strings).

nameGenerator[domain_String, extractor_: "SegmentedWords"] :=
 Module[{rand},
  (* sample 500 random entities of the given type and train on their names *)
  rand = CommonName[DeleteMissing[RandomEntity[domain, 500]]];
  SequencePredict[rand, FeatureExtractor -> extractor]]

nameGenerator[entOrString_List, extractor_: "SegmentedWords"] :=
 Module[{names},
  With[{heads = DeleteDuplicates[Head /@ entOrString]},
   Which[
    (* a list of Entities: train on their common names *)
    heads === {Entity},
    names = CommonName[DeleteMissing[entOrString]];
    SequencePredict[names, FeatureExtractor -> extractor],
    (* a list of strings: train on the trimmed strings *)
    heads === {String}, 
    names = StringTrim /@ DeleteMissing[entOrString];
    SequencePredict[names, FeatureExtractor -> extractor]]]]
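For case c), the underlying call is just SequencePredict on a list of strings. Here is a minimal sketch with a toy list; the names in `toyNames` are invented for this example, and a list this small is far too short to expect good results. It only shows the shape of the call.

```wolfram
(* toyNames is a hypothetical training set made up for this sketch *)
toyNames = {"Alderon", "Belmara", "Corvina", "Dellamar", "Elvaron", 
   "Farindel", "Galmora", "Helvara"};

(* Character-level segmentation suits single-word names like these *)
toySP = SequencePredict[toyNames, FeatureExtractor -> "SegmentedCharacters"];

(* Generate five strings of 5 to 8 characters each, starting from an empty seed *)
toyOut = Table[toySP["", "RandomNextElement" -> RandomInteger[{5, 8}]], 5]
```

The results are random, so they will differ from run to run.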

And then...

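In[50]:= bandSP = 
   nameGenerator[
    (* the entity type was not preserved in this copy; "MusicAct" is assumed *)
    EntityClass["MusicAct", 
      "Country" -> Entity["Country", "UnitedStates"]] // EntityList];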

In[59]:= predictionList[bandSP, 10, 2, 6]

Out[59]= {"Spears Lou Miley", "Show  Danity", "K\[Hyphen]Ci Morgan \
Reese Jobe", "Misty Orleans Dance Plug", "Widespread Whitey Eddy", \
"Yankovic G", "Nash Gyra", "", "Robert", "Spree Samantha Gene"}

Or aircraft...

In[72]:= planeSP = nameGenerator["Aircraft"];

In[73]:= predictionList[planeSP, 10, 3, 7]

Out[73]= {"Student R XP", "Miles Whitworth", "XP-F27 Raytheon", \
"Mitsubishi", "Robin -", "Ambrosini Eye C-XP", "Tupolev Chelidon", "-- \
 Ju", "Apuzzo Ro.22 Savoia", ".VI -12"}

Or people...

In[60]:= frSP = 
   nameGenerator[
    (* "Person" is assumed from context *)
    EntityClass["Person", 
      "BirthPlace" -> 
       Entity["City", {"Paris", "IleDeFrance", "France"}]] // 
     EntityList];

In[62]:= predictionList[frSP, 10, 3, 4]

Out[62]= {"Langelaan Armand", "Pascal George Jean\[Hyphen]Baptiste", \
"Enfant", "Vreeland Melissa M", "Hugh Kamara", "de Dux Barencey \
Joseph", "Paul Dufay", "Léon Roland", "Schiffman \
Saint\[Hyphen]Hilaire Alize", "Perec Louis"}

In[61]:= jpSP = 
   nameGenerator[
    (* "Person" is assumed from context *)
    EntityClass["Person", 
      "BirthPlace" -> Entity["City", {"Tokyo", "Tokyo", "Japan"}]] // 
     EntityList];

In[63]:= predictionList[jpSP, 10, 3, 4]

Out[63]= {"Yohji Ikeda", "Fukuda", "Yasuda", "Shioda", "Sicheng \
Yukawa Kibayashi", ".Mitsuru", "Eri Mokomichi", "Michiko Hijiri Mc \
Donough Mizumaki", "Ikuo Kenji Oyama", "Shirahama Juhn"}

Or Pokemon names:

In[64]:= pokeSP = 
   nameGenerator[
    StringDelete[EntityValue["Pokemon", "Name"], 
     RegularExpression[" \\(.+\\)"]], "SegmentedCharacters"];

In[67]:= Capitalize /@ predictionList[pokeSP, 10, 5, 10]

Out[67]= {"Arper", "Chummotark", "Chimedeowa", "Lex CT", "Enundude", \
"Tikip", "Uckitit", "Eirteaz", "Batenomogo", "Maryuffull"}

Suggestions for improvement are welcome...

Congratulations! This post is now a Staff Pick! Thank you for your wonderful contributions. Please, keep them coming!
