I really liked your example. I tried the same thing with AlexNet (https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf). Since AlexNet is easier to implement, I thought I would give it a try.

You can also use the pretrained NetModels (VGGNet-16 etc.). I modified the tensor dimensions of AlexNet to reduce the number of parameters and, in turn, the memory requirements; in particular, shrinking the LinearLayers made little difference to accuracy. The lower memory use allows a larger BatchSize and faster training. AlexNet trained in approximately 6 min, compared with 30 min for LeNet, and far fewer Cancer samples were misclassified as Normal or Benign. So the Recall for the Cancer class is much higher with AlexNet (0.64) than with LeNet (0.28) on my machine.
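For anyone who wants to check the Recall numbers on their own run, here is a minimal sketch in Python of how per-class recall falls out of a confusion matrix. The function name `per_class_recall` and the counts in `toy_confusion` are made up for illustration; they are not the results from my training runs.

```python
def per_class_recall(confusion, labels):
    """Return {label: recall} for a square confusion matrix.

    confusion[i][j] is the number of samples whose true class is
    labels[i] and whose predicted class is labels[j].
    Recall for class i = correct predictions for i / all true samples of i.
    """
    recalls = {}
    for i, label in enumerate(labels):
        row_total = sum(confusion[i])
        # A class with no true samples has no defined recall.
        recalls[label] = confusion[i][i] / row_total if row_total else float("nan")
    return recalls


labels = ["Normal", "Benign", "Cancer"]

# Invented counts, for illustration only (rows: true class, columns: predicted).
toy_confusion = [
    [40, 5, 5],   # true Normal
    [6, 10, 4],   # true Benign
    [2, 3, 15],   # true Cancer
]

recalls = per_class_recall(toy_confusion, labels)
for label in labels:
    print(label, round(recalls[label], 2))
```

A Cancer sample predicted as Normal or Benign lands off the diagonal in the Cancer row, so every such mistake lowers the Cancer recall directly.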

==================================================================================================
I revisited the problem. If we only care about Recall (i.e. getting the cancer cases classified in the best possible manner), then we can simply use data augmentation for the Cancer and Benign classes.
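As a rough sketch of the idea in Python (not the code I actually ran), the smaller classes can be topped up with flipped copies of their own images until every class is as large as the biggest one. The helper names `hflip`, `vflip` and `augment_to_balance` are hypothetical, images are assumed to be plain lists of pixel rows, and flips are assumed to be label-preserving for this kind of data.

```python
import random
from collections import Counter


def hflip(image):
    """Mirror an image (a list of pixel rows) left to right."""
    return [row[::-1] for row in image]


def vflip(image):
    """Mirror an image (a list of pixel rows) top to bottom."""
    return image[::-1]


def augment_to_balance(samples, seed=0):
    """Return samples plus flipped copies so all classes have equal size.

    samples is a list of (image, label) pairs. Each class smaller than the
    largest one receives randomly flipped copies of its own images until
    it reaches the size of the largest class.
    """
    rng = random.Random(seed)
    by_label = {}
    for image, label in samples:
        by_label.setdefault(label, []).append(image)

    target = max(len(images) for images in by_label.values())
    transforms = [hflip, vflip, lambda im: hflip(vflip(im))]

    balanced = list(samples)
    for label, images in by_label.items():
        for _ in range(target - len(images)):
            source = rng.choice(images)
            transform = rng.choice(transforms)
            balanced.append((transform(source), label))
    return balanced


# Toy data, invented for illustration: 6 Normal, 2 Benign, 1 Cancer image.
toy_image = [[0, 1], [2, 3]]
toy_samples = (
    [(toy_image, "Normal")] * 6
    + [(toy_image, "Benign")] * 2
    + [(toy_image, "Cancer")]
)

balanced = augment_to_balance(toy_samples)
counts = Counter(label for _, label in balanced)
print(counts)
```

Augmented copies should only be added to the training split; the validation data should stay untouched so that the Recall is still measured on real images.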
This way I get a better distribution of data across the three classes (Normal, Cancer and Benign). As a result, I can get faster and better results even with LeNet:
