
Generating Name Generators

Posted 8 years ago

After reading the umpteenth online article describing how someone trained a neural net to make up band names, or write bizarre recipes, or generate Pokemon, I asked whether any of the ML functionality in the Wolfram Language could easily do this sort of thing. I was told to look at SequencePredict — and it turns out, with next to no knowledge of machine learning, and using some documentation examples as a springboard, I could get pretty decent results with very minimal code...

First, a short function to de-camelcase words, since in practice I noticed that the output strings would often be multiple words mashed together:

decamel[str_] := 
 StringTrim[
  StringJoin[
   StringSplit[
    str, {RegularExpression["([a-z])([A-Z])"] -> "$1 $2", 
     RegularExpression["([0-9])([A-Z])"] -> "$1 $2", 
     RegularExpression["([a-z])([0-9])"] -> "$1 $2"}]]]
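As a quick sanity check, decamel can be tried on a few mashed-together strings before wiring it into anything else. The inputs here are made up for illustration; the intent is that a space appears at each lowercase-to-uppercase, digit-to-uppercase, and lowercase-to-digit boundary.

```wolfram
(* Illustrative inputs only; these strings are not taken from any dataset *)
decamel /@ {"MistyOrleansDance", "McDonough", "Route66Band"}
```

Note that this also splits names that are legitimately camel-cased, which is why "McDonough" turns up as "Mc Donough" in one of the outputs further down.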

Next, a function to produce a list of predictions of varying lengths, with the option of de-camelcasing output strings if needed:

predictionList[func_, num_, min_, max_, decam_: True] := 
 (* generate num predictions, each with a random length between min and max *)
 With[{preds = 
    Table[func["", "RandomNextElement" -> RandomInteger[{min, max}]], num]},
  (* TrueQ treats anything other than an explicit True as False *)
  If[TrueQ[decam], decamel /@ preds, preds]]

And then the code to actually produce SequencePredictorFunctions, working from a) the name of a built-in Wolfram Language entity type, b) a list of Entities, or c) a list of names (strings).

nameGenerator[domain_String, extractor_: "SegmentedWords"] :=
 Block[{rand},
  rand = CommonName[DeleteMissing[RandomEntity[domain, 500]]];
  SequencePredict[rand, FeatureExtractor -> extractor]]

nameGenerator[entOrString_List, extractor_: "SegmentedWords"] :=
 Block[{names},
  With[{heads = DeleteDuplicates[Head /@ entOrString]},
   Which[heads === {Entity},
    names = CommonName[DeleteMissing[entOrString]];
    SequencePredict[names, FeatureExtractor -> extractor],
    heads === {String}, 
    names = StringTrim /@ DeleteMissing[entOrString];
    SequencePredict[names, FeatureExtractor -> extractor]]]]
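Case (c), a plain list of strings, can be tried without any entity lookups. The following is a minimal sketch that assumes the nameGenerator and predictionList definitions above have been evaluated; toyNames and toySP are hypothetical names, and the training list is invented toy data, far too small to give interesting results.

```wolfram
(* Toy training data, invented purely for illustration *)
toyNames = {"Green River Band", "Blue River Boys", "Green Mountain Boys",
   "Blue Mountain Band", "River City Band"};

(* Uses the default "SegmentedWords" feature extractor *)
toySP = nameGenerator[toyNames];

(* Five predictions of two to three elements each; results vary from run to run *)
predictionList[toySP, 5, 2, 3]
```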

And then...

In[50]:= bandSP = 
  nameGenerator[
   EntityClass["MusicAct", 
     "Country" -> Entity["Country", "UnitedStates"]] // EntityList];

In[59]:= predictionList[bandSP, 10, 2, 6]

Out[59]= {"Spears Lou Miley", "Show  Danity", "K\[Hyphen]Ci Morgan \
Reese Jobe", "Misty Orleans Dance Plug", "Widespread Whitey Eddy", \
"Yankovic G", "Nash Gyra", "", "Robert", "Spree Samantha Gene"}
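The empty string and the doubled space in the output above could probably be tidied with a small post-processing step. This is only a sketch; cleanPredictions is a hypothetical helper name, not part of the original code.

```wolfram
(* Hypothetical helper: collapse runs of whitespace, trim, drop empties and duplicates *)
cleanPredictions[preds_List] := 
 DeleteDuplicates[
  DeleteCases[
   StringTrim[StringReplace[#, Whitespace -> " "]] & /@ preds, ""]]

cleanPredictions[{"Show  Danity", "", "Robert", " Robert"}]
(* {"Show Danity", "Robert"} *)
```

Applied as cleanPredictions[predictionList[bandSP, 10, 2, 6]], this may return fewer than ten names, since empty and repeated entries are removed.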

Or aircraft...

In[72]:= planeSP = nameGenerator["Aircraft"];

In[73]:= predictionList[planeSP, 10, 3, 7]

Out[73]= {"Student R XP", "Miles Whitworth", "XP-F27 Raytheon", \
"Mitsubishi", "Robin -", "Ambrosini Eye C-XP", "Tupolev Chelidon", "-- \
 Ju", "Apuzzo Ro.22 Savoia", ".VI -12"}

Or people...

In[60]:= frSP = 
  nameGenerator[
   EntityClass["Person", 
     "BirthPlace" -> 
      Entity["City", {"Paris", "IleDeFrance", "France"}]] // 
    EntityList];

In[62]:= predictionList[frSP, 10, 3, 4]

Out[62]= {"Langelaan Armand", "Pascal George Jean\[Hyphen]Baptiste", \
"Enfant", "Vreeland Melissa M", "Hugh Kamara", "de Dux Barencey \
Joseph", "Paul Dufay", "Léon Roland", "Schiffman \
Saint\[Hyphen]Hilaire Alize", "Perec Louis"}

In[61]:= jpSP = 
  nameGenerator[
   EntityClass["Person", 
     "BirthPlace" -> Entity["City", {"Tokyo", "Tokyo", "Japan"}]] // 
    EntityList];

In[63]:= predictionList[jpSP, 10, 3, 4]

Out[63]= {"Yohji Ikeda", "Fukuda", "Yasuda", "Shioda", "Sicheng \
Yukawa Kibayashi", ".Mitsuru", "Eri Mokomichi", "Michiko Hijiri Mc \
Donough Mizumaki", "Ikuo Kenji Oyama", "Shirahama Juhn"}

Or Pokemon names:

In[64]:= pokeSP = 
  nameGenerator[
   StringDelete[EntityValue["Pokemon", "Name"], 
    RegularExpression[" \\(.+\\)"]], "SegmentedCharacters"];

In[67]:= Capitalize /@ predictionList[pokeSP, 10, 5, 10]

Out[67]= {"Arper", "Chummotark", "Chimedeowa", "Lex CT", "Enundude", \
"Tikip", "Uckitit", "Eirteaz", "Batenomogo", "Maryuffull"}

Suggestions for improvement are welcome...

POSTED BY: Alan Joyce

Congratulations! This post is now a Staff Pick! Thank you for your wonderful contributions. Please, keep them coming!

POSTED BY: EDITORIAL BOARD