
Using NetModel to "fine tune" models with new final layers?

Posted 9 years ago

Hi -- First, congrats on 11.1. Support for the 1080 GPU is enough for me to get excited. I also love that I can load pre-trained models using NetModel, as that is an increasingly obvious strategy for problem solving. However, I'm not clear on how I would go about keeping the weights from the lower (feature) layers while re-training the upper layers. Knowing how thorough you are, I'm sure it's possible; I'm just not sure how. NetChain lets me build layers, but I don't think it lets me operate on them. NetExtract lets me pull out layers, but I don't want just the model; I'd like to keep the pre-trained weights. Thanks! -- David
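
For concreteness, loading a pre-trained model is a one-liner (here using one entry from the Neural Net Repository as an arbitrary example):

net = NetModel["Wolfram ImageIdentify Net V1"]  (* returns the net with its pre-trained weights *)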

POSTED BY: David Cardinal
9 Replies

Ah, I think I see what you're saying.

If you just want to remove the final layers and replace them with your own, you can do something like this:

NetChain[{
    (* the parts of the net you want to keep, with their trained weights *)
    Take[net, 7],
    (* the replacement layers; a bare integer n is shorthand for LinearLayer[n] *)
    500,
    Ramp,
    10,
    SoftmaxLayer[]
}]

Is that what you mean?
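
One quick way to convince yourself that the pre-trained weights survive the Take is to compare a layer's weight array before and after (a sketch, assuming layer 1 of net is a layer that carries weights, e.g. a ConvolutionLayer):

(* the same NumericArray in both cases, so this should return True *)
NetExtract[net, {1, "Weights"}] === NetExtract[Take[net, 7], {1, "Weights"}]

When you train the combined chain, NetTrain initializes only the new, uninitialized layers; the retained weights are used as the starting point.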

POSTED BY: David Cardinal

David P. -- I'm running on Windows 10. I did a NetTrain with TargetDevice -> "GPU" and it ran just fine on my EVGA 1080. Just to double-check after seeing your post, I ran the code again, and indeed the GPU lights up with activity, so it is really in use.
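
For anyone else checking their setup, the call is just this, with net and trainingData standing in for your own network and data:

trained = NetTrain[net, trainingData, TargetDevice -> "GPU"]  (* runs on a supported NVIDIA GPU *)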

POSTED BY: David Cardinal
Posted 9 years ago
POSTED BY: David Proffer

Christopher -- Yes, changing the LearningRateMultipliers for the layers I want fixed is perfect. That, combined with Sebastian's reminder about Drop, will do what I want. And Matteo's alternate suggestion of saving out the features is an interesting one, as I think it also does what I'd like, but in a different way. Thanks all. -- David
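
For the record, here is a minimal sketch of the combined recipe; the Drop count, the 2-class head, the class labels, and trainingData are all illustrative placeholders:

net = NetModel["Wolfram ImageIdentify Net V1"];   (* any pre-trained NetModel *)
newNet = NetChain[{
        Drop[net, -2],     (* keep everything but the old classification head *)
        LinearLayer[2],    (* fresh 2-class head *)
        SoftmaxLayer[]},
    "Output" -> NetDecoder[{"Class", {"cat", "dog"}}]  (* hypothetical classes *)
];
(* a multiplier of 0 freezes layer 1, the pre-trained part; the new head still trains *)
trained = NetTrain[newNet, trainingData, LearningRateMultipliers -> {1 -> 0}];

(* Matteo's alternative: use the truncated net as a fixed feature extractor *)
features = Drop[net, -2][inputs];   (* batch-applies the net to a list of inputs *)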

POSTED BY: David Cardinal

Christopher -- Thanks! That's definitely helpful, although I'm not sure it's the complete solution. It sounds like we can retrain the existing upper layers (which is helpful), but in many cases the need is to replace them (as when we want to use a pre-trained classifier to generate 2 classes instead of many). In Keras I'd .pop the relevant layers and add back the ones I want, but I can't find an equivalent in Mathematica.
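
Edit: Drop on a NetChain turns out to be the direct analogue (a minimal sketch, with net standing in for the pre-trained chain):

headless = Drop[net, -1]   (* like model.pop() in Keras; remaining layers keep their weights *)

The result can then go into a new NetChain along with replacement layers, as in the example above.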

POSTED BY: David Cardinal