Hi Steve,
I've just had another look at the training methodologies for the different neural nets. They do differ in their exact matching strategies, i.e. how objects are assigned to the anchor boxes (just one or several), and also in how they define their loss functions.
I do think it is possible to build a generic framework where you can make these choices independently of the network you are training. Some of the loss functions used can be quite complex, and you could choose exactly how closely you want to replicate each author's training strategy. I am sure it will work; exactly how closely you have to replicate their training strategy to make it work really well is an open question.
Regarding your question about the single responsibility assignment, my approach has been as follows.
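To make the "just one or several" distinction concrete, here is a minimal sketch in plain Python (not Wolfram Language, and not taken from any of the papers). It assumes boxes are (x1, y1, x2, y2) tuples and that matching is scored by intersection over union; the function names and the 0.5 threshold are hypothetical choices for illustration only.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_single(obj, anchors):
    """Assign the object to the one anchor with the highest IoU."""
    scores = [iou(obj, a) for a in anchors]
    return [max(range(len(anchors)), key=scores.__getitem__)]


def match_several(obj, anchors, threshold=0.5):
    """Assign the object to every anchor whose IoU reaches the threshold."""
    return [i for i, a in enumerate(anchors) if iou(obj, a) >= threshold]


# Hypothetical anchors and one labelled object.
anchors = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
obj = (0, 0, 10, 10)

single = match_single(obj, anchors)    # indices of the assigned anchor
several = match_several(obj, anchors)  # indices of all assigned anchors
```

Either list of indices can then be turned into a 0/1 mask over the anchor boxes.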
You need to create a custom loss layer. It is just a net layer like any other; it has input ports and output ports. I pass in as one input an array of the losses for every anchor box. I then pass in a mask array which selects the anchor boxes we have assigned. I simply multiply the input losses by the mask, so that only the losses at the masked positions contribute to the loss function. I then pass the result out to a port called "Loss". When running NetTrain I specify that this is the loss to be minimised (using NetTrain's LossFunction option).
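The arithmetic of that masking step can be sketched in plain Python. This is only an illustration of the idea, not the net layer itself and not the code in my repository; the names masked_loss, anchor_losses and assignment_mask are hypothetical, and summing the masked values to a single number is an assumption about the final reduction.

```python
def masked_loss(losses, mask):
    """Multiply per-anchor losses by a 0/1 mask and sum the result.

    Anchors with mask 0 contribute nothing, so only assigned anchors
    influence the value being minimised.
    """
    if len(losses) != len(mask):
        raise ValueError("losses and mask must have the same length")
    return sum(l * m for l, m in zip(losses, mask))


# Hypothetical per-anchor losses and an assignment mask for four anchors.
anchor_losses = [1.0, 0.25, 2.0, 0.5]
assignment_mask = [0, 1, 0, 1]

total = masked_loss(anchor_losses, assignment_mask)  # 0.25 + 0.5
```

Because the mask is an ordinary input, the same loss layer can be reused with any matching strategy; only the mask fed in changes.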
See MaskLossLayer in ./Experimental/Training/FocusLoss.m.
Apologies for the code in the Experimental folder being a bit scruffy; it's mostly random ideas I have been thinking about.
On a slightly cautious note, I am not sure whether a few hundred images is going to be enough to train these sorts of nets. I think the Tiny Yolo net used about 20,000 labelled images, and COCO (for RetinaNet) has about 300,000 images. I am pretty sure both nets are preinitialised on ImageNet (a few million images), so that's your transfer learning. I think Tiny Yolo used aggressive data augmentation (not sure about the others). I do not know this for certain, but I suspect you may need a lot more labelled images.
If you would like to take this off forum, please do feel free to email me at julian.w.francis@gmail.com
Kind regards,
Julian.