Very nice fine tuning!
One small code simplification would be to use
FindFaces[image, "Image", PaddingSize -> Scaled[0.1]]
instead of trimming in a separate step. This way you can also add arbitrary padding taken from the original image.
You can even take advantage of the built-in listability of the NN framework to write
trainedNet[FindFaces[image, "Image"]]
which evaluates the net in batch mode and is faster when the image contains multiple faces.
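
Putting the two suggestions together, a minimal sketch could look like the following. It assumes image and trainedNet are defined as in your post and that the net takes images directly through its input encoder; faces and predictions are just illustrative names.

```wolfram
(* Crop every detected face, with 10% padding taken from the original image *)
faces = FindFaces[image, "Image", PaddingSize -> Scaled[0.1]];

(* Apply the net to the whole list at once, so it is evaluated as a batch *)
predictions = trainedNet[faces];

(* Pair each cropped face with its prediction for inspection *)
Thread[faces -> predictions]
```

The padding fraction 0.1 is only the example value from above and may need adjusting to match how the training images were cropped.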