I have added an implementation of the Single Shot Detector Object detection system to my CognitoZoo project.
Please see the following link for details (it's a GitHub project):
CognitoZoo Wiki link
It can recognise the 20 Pascal object categories. It's really quite fun.
I've also got an implementation of YOLO in that project (the Tiny YOLO v2 version).
From playing with both of them, in my experience the SSD VGG 300 detector has better recognition performance than the Tiny YOLO version. There are other versions of YOLO which I hope to implement later.
I hope this is of interest.
Thank you for these insights and showing the power of Mathematica.
I have successfully run the YOLO version of your Mathematica implementation and am keen to try out the SSD implementation. However, it seems that the SSDVGG300.wlnet (or possibly the .hdf) file is missing from the file repository (as linked in your GitHub post).
Do you have this trained network file available?
Yes, you are right: the weight file is currently missing. I had hoped to submit the neural network model and code to the Wolfram Neural Network Repository. However, the actual numerical weight values come from another developer's TensorFlow implementation, and unfortunately that developer hasn't attached clear licensing conditions to the project. I wrote to the developer asking for express permission to use those weight values (with proper credit, of course), but I haven't heard back. So, reluctantly, I felt obliged to refrain from distributing the neural network file which contains those weights.
Sorry about this.
Meanwhile, the options seem to be to either:
1) Actually train against a dataset. The easy part is downloading a suitable dataset. Slightly more work would be to implement the target decoder, i.e. the logic that maps the target bounding boxes back into the desired neural net outputs. Additionally, I have heard that hard negative mining may be needed to train this net, but I am a bit vague on the details.
2) Use someone else's weights that have a liberal license attached. There is a Caffe implementation:
which could be used. I haven't much experience with Caffe, so I haven't gone down that route. Basically, what is required is the ability to dig the weights out of the Caffe file and save them into a format that Mathematica can read. I have found that JSON works well for fairly small files, but HDF works better for larger neural nets, hence I generally prefer HDF.
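For anyone attempting option 1), the box coding step is usually parameterised as centre offsets scaled by the default (anchor) box size, plus log size ratios, with "variance" scaling, as in the original SSD setup. Here is a rough Python sketch; the function names and variance values are illustrative, and the exact convention used by CognitoZoo's decoder may differ:

```python
import math

def encode_box(gt, anchor, variances=(0.1, 0.2)):
    """Encode a ground-truth box against a default (anchor) box.
    Boxes are (cx, cy, w, h): centre offsets are scaled by the anchor
    size, width/height become log ratios. Variance values are the
    commonly used SSD defaults, but treat them as an assumption."""
    gcx, gcy, gw, gh = gt
    acx, acy, aw, ah = anchor
    return (
        (gcx - acx) / aw / variances[0],
        (gcy - acy) / ah / variances[0],
        math.log(gw / aw) / variances[1],
        math.log(gh / ah) / variances[1],
    )

def decode_box(offsets, anchor, variances=(0.1, 0.2)):
    """Inverse of encode_box: map network outputs back to a box."""
    tx, ty, tw, th = offsets
    acx, acy, aw, ah = anchor
    return (
        acx + tx * variances[0] * aw,
        acy + ty * variances[0] * ah,
        aw * math.exp(tw * variances[1]),
        ah * math.exp(th * variances[1]),
    )
```

Encoding a box and decoding the result should round-trip back to the original box, which is a handy sanity check while implementing this.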
In either case the ModelConverters/SSDConverter.m file is helpful as that builds the network structure. It just needs the numerical weights from an HDF file.
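As a concrete illustration of the weight-transfer step, here is a minimal Python sketch of dumping a weights dictionary to a file and reading it back. The function names are my own, and I'm assuming the weights have already been pulled out of the source framework (e.g. via Caffe's Python API) as nested lists; for a net of this size you'd want HDF5 (e.g. via h5py) rather than JSON, but the round-trip idea is the same:

```python
import json

def export_weights_json(layers, path):
    """Write a {layer name -> nested list of weights} dict to JSON.
    Fine for fairly small nets; for larger nets an HDF5 file
    (e.g. written with h5py) is more practical."""
    with open(path, "w") as f:
        json.dump(layers, f)

def import_weights_json(path):
    """Read the weights dict back (Mathematica can equally read the
    JSON or HDF file directly on its side)."""
    with open(path) as f:
        return json.load(f)
```

The point is simply to get the raw numbers into a neutral on-disk format; SSDConverter.m then rebuilds the network structure and attaches them.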
Sorry there's not an easier fix. I'll have a look at going down option 2), but I am quite busy at the moment, so it may not be for a few weeks.
I have good news: I have got another SSD TensorFlow implementation trained up and have converted the weights into Mathematica.
There were just a couple of minor changes required.
Please feel free to go ahead and use.
You should re-pull or download a zip of the GitHub repository, and off you go.
I am trialling something new: using the cloud to store the neural network. There is no need to download neural net files; it should happen automatically and be cached on your system (although it does require an internet connection). It may take a little while on the first go, as it's a 105 MB file (for SSD).
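For anyone curious about the mechanics, the download-once-and-cache pattern looks roughly like this in Python (the helper name and cache directory are my own inventions; the actual Wolfram-side mechanism may well differ):

```python
import os
import urllib.request

def cached_fetch(url, cache_dir="~/.cognitozoo-cache"):
    """Return a local path for `url`, downloading it only on first use.
    Hypothetical helper -- the real CognitoZoo caching may differ."""
    cache_dir = os.path.expanduser(cache_dir)
    os.makedirs(cache_dir, exist_ok=True)
    local = os.path.join(cache_dir, os.path.basename(url))
    if not os.path.exists(local):
        # First run only: fetch the net file (~105 MB for SSD)
        # over the network; later runs reuse the cached copy.
        urllib.request.urlretrieve(url, local)
    return local
```

Subsequent calls find the file on disk and skip the network entirely, which is why only the first run is slow.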
Please do let me know if this works well; if so, I shall roll it out to the other nets.
Many thanks for this help - especially for beginners like me.
I will give it a go as soon as I can and let you know about my experience.