# GPT-2 NetModel encoder issue?

Posted 2 months ago
It seems that the GPT-2 net model does not encode input words correctly. Trying the official Wolfram examples for word generation in the help section mostly gives me random words. After spending some time on this, I now believe the issue is that the encoder for this model does not encode the words correctly.

For example, I looked at the encoder vocabulary and made sure that the word "Hitman" is in there. I then gave the encoder the word "Hitman". Interestingly, word vectors are generated for "Hit" and "man" separately:

```
lm = NetModel[{"GPT2 Transformer Trained on WebText Data",
   "Task" -> "LanguageModeling", "Size" -> "774M"}]
NetExtract[lm, "Input"]["Hitman"]
```

Output: `{17634, 550}`

According to the decoder, these two indices are associated with the tokens "Hit" and "man":

```
NetExtract[lm, {"Output", "Labels"}][[{17634, 550}]]
```

Output: `{"Hit", "man"}`

Try other words, and you will see the same type of behavior; for example, "Gorgeous" splits into {"G", "orge", "ous"}.

I am relatively new to Mathematica... Am I doing something wrong, or is there really something wrong with the encoder?

-Ethan
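In case it helps anyone reproduce this, the steps above can be collected into one snippet (the variable names `indices` and `tokens` are just illustrative; the model download is large):

```
(* load the 774M GPT-2 language model from the Wolfram Neural Net Repository *)
lm = NetModel[{"GPT2 Transformer Trained on WebText Data",
    "Task" -> "LanguageModeling", "Size" -> "774M"}];

(* encode a word into token indices using the model's input encoder *)
indices = NetExtract[lm, "Input"]["Gorgeous"];

(* map the indices back to vocabulary strings via the output labels *)
tokens = NetExtract[lm, {"Output", "Labels"}][[indices]];

(* joining the decoded pieces gives back the original word,
   even though it was split into several sub-word tokens *)
StringJoin[tokens]
```

So the pieces always reassemble into the original word; my question is whether splitting a single word into several tokens like this is expected encoder behavior.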