How to control the step size of the following conv net as it slides over a larger image?
See also: https://mathematica.stackexchange.com/questions/144060/sliding-fullyconvolutional-net-over-larger-images/148033
As a toy example, I'd like to slide a digit classifier trained on 28x28 images over a larger image, classifying each 28x28 neighborhood. The network below is LeNet with its linear layers replaced by convolutional layers (a 4x4 convolution followed by a 1x1 convolution), making it fully convolutional.
trainingData = ResourceData["MNIST", "TrainingData"];
testData = ResourceData["MNIST", "TestData"];
lenetModel = NetModel["LeNet Trained on MNIST Data", "UninitializedEvaluationNet"];

newlenet = NetExtract[lenetModel, All];
(* replace the flatten + linear head of LeNet with convolutional layers;
   the PartLayer squeezes out the trailing 1x1 spatial dimensions *)
newlenet[[7]] = ConvolutionLayer[500, {4, 4}];
newlenet[[8]] = ElementwiseLayer[Ramp];
newlenet[[9]] = ConvolutionLayer[10, 1];
newlenet[[10]] = SoftmaxLayer[1];
newlenet[[11]] = PartLayer[{All, 1, 1}];
newlenet = NetChain[newlenet,
  "Input" -> NetEncoder[{"Image", {28, 28}, ColorSpace -> "Grayscale"}]]
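As a quick sanity check, the modified chain should still map a single 28x28 image to a length-10 probability vector. Something like the following can be used to verify the shapes (NetInitialize just fills in random weights here, so the probabilities themselves are meaningless):

NetInitialize[newlenet][First@First@trainingData] // Dimensions
(* should give {10} *)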
Now train it:
newtd = First@# -> UnitVector[10, Last@# + 1] & /@ trainingData;
newvd = First@# -> UnitVector[10, Last@# + 1] & /@ testData;
ng = NetGraph[
<|"inference" -> newlenet,
"loss" -> CrossEntropyLossLayer["Probabilities", "Input" -> 10]
|>,
{
"inference" -> NetPort["loss", "Input"],
NetPort["Target"] -> NetPort["loss", "Target"]
}
]
tnew = NetTrain[ng, newtd, ValidationSet -> newvd,
TargetDevice -> "GPU"]
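To check that training worked before making the net fully convolutional, the classifier can be pulled back out of the graph and evaluated on a few test digits, for example (Last@Ordering just picks the index of the largest probability):

trained = NetExtract[tnew, "inference"];
(* rough accuracy on the first 1000 test images; predicted digit = argmax index - 1 *)
N@Mean@Table[
  Boole[Last@Ordering[trained[testData[[i, 1]]]] - 1 == testData[[i, 2]]],
  {i, 1000}]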
Now remove the fixed input-dimension information from each layer, so that the net accepts inputs of any size (the definition of removeInputInformation below comes from the Stack Exchange answer linked above):
removeInputInformation[layer_ConvolutionLayer] :=
With[{k = NetExtract[layer, "OutputChannels"],
kernelSize = NetExtract[layer, "KernelSize"],
weights = NetExtract[layer, "Weights"],
biases = NetExtract[layer, "Biases"],
padding = NetExtract[layer, "PaddingSize"],
stride = NetExtract[layer, "Stride"],
dilation = NetExtract[layer, "Dilation"]},
ConvolutionLayer[k, kernelSize, "Weights" -> weights,
"Biases" -> biases, "PaddingSize" -> padding, "Stride" -> stride,
"Dilation" -> dilation]]
removeInputInformation[layer_PoolingLayer] :=
With[{f = NetExtract[layer, "Function"],
kernelSize = NetExtract[layer, "KernelSize"],
padding = NetExtract[layer, "PaddingSize"],
stride = NetExtract[layer, "Stride"]},
PoolingLayer[kernelSize, stride, "PaddingSize" -> padding,
"Function" -> f]]
removeInputInformation[layer_ElementwiseLayer] :=
With[{f = NetExtract[layer, "Function"]}, ElementwiseLayer[f]]
removeInputInformation[x_] := x
tmp = NetExtract[NetExtract[tnew, "inference"], All];
(* strip the size information from every layer, drop the trailing SoftmaxLayer and
   PartLayer, and re-append a SoftmaxLayer that normalizes over the channel dimension *)
n3 = removeInputInformation /@ tmp[[1 ;; -3]];
AppendTo[n3, SoftmaxLayer[1]];
n3 = NetChain@n3;
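Re-attaching a 28x28 image encoder (NetReplacePart accepts "Input" -> NetEncoder[...]) gives a way to confirm that n3 still behaves like the trained classifier on a single digit, e.g.:

n3small = NetReplacePart[n3,
   "Input" -> NetEncoder[{"Image", {28, 28}, ColorSpace -> "Grayscale"}]];
n3small[First@First@testData] // Dimensions
(* should give {10, 1, 1}: one 10-class prediction for the single 28x28 neighborhood *)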
The network n3 now slides over any larger input. However, note that it seems to slide with steps of 4. How could I make it take steps of 1 instead?
In[358]:= n3[RandomReal[1, {1, 28*10, 28}]] // Dimensions
Out[358]= {10, 64, 1}
In[359]:= BlockMap[Length, Range[28*10], 28, 4] // Length
Out[359]= 64
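Presumably the step of 4 comes from the two stride-2 pooling layers (2*2 = 4); that would also be consistent with the output length above:

Floor[(28*10 - 28)/4] + 1
(* 64 *)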