I am trying to use a pre-trained VGG-16 as a perceptual loss function. This is usually done by taking only the initial part of a VGG-16 network, up to a given pooling layer. To do so, I did the following:
vgg = NetModel["VGG-16 Trained on ImageNet Competition Data"];
FeatureExtractor = NetTake[vgg, {NetPort["Input"], "relu4_3"}];
Since VGG-16 was trained on inputs of size 3x224x224, the new network FeatureExtractor also expects this input size. However, since FeatureExtractor consists only of convolution, ReLU, and pooling layers, it should be able to handle any input size with 3 channels, e.g. inputs of size 3x64x64.
Putting FeatureExtractor in a new NetGraph and specifying a new input size gives an error, i.e.
NetGraph[{NetTake[vgg, {NetPort["Input"], "relu4_3"}]}, {1 -> NetPort["Output"]}, "Input" -> {3, 64, 64}]
does not work.
So my question is: how can I create a version of my network FeatureExtractor that accepts a different (but fixed) input size? I don't want to scale smaller inputs up to the required 224x224 using some kind of interpolation.
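For completeness, here is the direction I considered next: a minimal sketch assuming NetReplacePart accepts a port-to-encoder rule that swaps the "Input" specification for a new image NetEncoder of the desired size. I'm not certain this is the intended approach or that it preserves the trained weights:

```mathematica
(* Assumption: NetReplacePart can replace the "Input" port with a
   new NetEncoder, keeping the convolutional weights unchanged *)
vgg = NetModel["VGG-16 Trained on ImageNet Competition Data"];
FeatureExtractor = NetTake[vgg, {NetPort["Input"], "relu4_3"}];
resized = NetReplacePart[FeatureExtractor,
  "Input" -> NetEncoder[{"Image", {64, 64}}]]
```

If something like this is the right way, I would also like to know whether the encoder's color space and mean-image settings from the original VGG-16 input need to be carried over explicitly.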