Fine-tuning GPT-2 in Mathematica

Posted 3 years ago

Hello everyone,

I'm trying to fine-tune GPT-2 (https://resources.wolframcloud.com/NeuralNetRepository/resources/GPT-2-Transformer-Trained-on-WebText-Data).

I tried training it like this:

gpt = NetModel[{"GPT-2 Transformer Trained on WebText Data", 
   "Task" -> "LanguageModeling", "Size" -> "345M"}]

gpt = NetTrain[gpt, {"This is an example"}]

But it didn't work. Can someone explain how to train transformers in Mathematica?

POSTED BY: Mike Bark
Posted 3 years ago

Unfortunately, training transformers is slightly more complicated. NetTrain cannot train this model as-is: you would need to add code for the loss function, and supply input/target pairs rather than bare strings.
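To give a feel for what that involves, here is a minimal sketch of the usual pattern: wrap the model and an explicit CrossEntropyLossLayer into one NetGraph and train against a "Target" port. The port names, the "Index" loss form, and the trainingData variable are assumptions for illustration; check the repository model's actual input and output shapes (e.g. with NetInformation) before relying on this:

lm = NetModel[{"GPT-2 Transformer Trained on WebText Data",
   "Task" -> "LanguageModeling", "Size" -> "345M"}];

(* Wrap the model together with an explicit cross-entropy loss. *)
trainNet = NetGraph[
  <|"model" -> lm, "loss" -> CrossEntropyLossLayer["Index"]|>,
  {"model" -> NetPort["loss", "Input"],
   NetPort["Target"] -> NetPort["loss", "Target"]}];

(* trainingData is hypothetical: for language modeling, each example pairs
   a token sequence with the same sequence shifted left by one position,
   so the net learns to predict the next token at every position. *)
trained = NetTrain[trainNet, trainingData, LossFunction -> "Loss"]

Preparing trainingData (tokenizing text with the model's encoder and building the shifted targets) is the part you have to write yourself.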

You could instead train a simple (non-transformer) text classifier in Mathematica using the Classify function:

classifier = Classify[{"This is a happy example" -> 1, "This is a negative or bad example" -> 0}]
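The resulting ClassifierFunction can then be applied to new strings directly (with only two training examples the predictions are rough, so treat these outputs as illustrative):

classifier["This is a wonderful example"]  (* most likely 1 *)
classifier["This is an awful example", "Probabilities"]  (* class probabilities *)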

If you really wanted to train transformers (and do so without implementing a lot of code yourself), you could check out the Hugging Face Transformers GitHub repo (https://github.com/huggingface/transformers), a Python library that implements training of transformers.

POSTED BY: Alec Graves