I am a developer on the machine learning team, mostly involved with the neural net repository. A discussion with my group raised these questions in my mind. They might be stupid questions, but here they are:
Question 1: Would our users rather have more models in their respective fields of application that are correct, functioning and tested, or fewer models that are correct, functioning, tested and properly structured? In other words, imported models are often not structured properly, and we frequently spend a great deal of time restructuring them (for lack of a better word, prettifying them).
If I were a user, I would rather have more models that are correct and ugly than fewer models that are correct and pretty, because a great deal of developer time is spent on "prettifying" them. So the question to the users is: does a pretty model have any benefit over an ugly one that could justify the many developer hours spent on making it pretty? I may be missing some details or uses of pretty, structured models, which is why I am reaching out to our community to see whether there are advantages that justify the tradition of making the published models pretty. If there are none, dropping that step would free up time to convert more models for various application areas.
Please import the models to compare the pretty and ugly versions (there are other differences as well). The .onnx file has only the backbone and the FPN (you can verify that the outputs are the 4 probabilities and box locations). A first-level restructuring, where the relevant sections are marked as backbone, FPN and the rest, would not take much time, but prettifying the model fully would require a lot of restructuring. For practical purposes, one would simply want to extract the relevant sections, such as the backbone and/or BiFPN, for further use.
Question 2: That brings me to the next topic: how important is each of the sections on the main page you see for a model in the WNNR? https://resources.wolframcloud.com/NeuralNetRepository/resources/EfficientNet-Trained-on-ImageNet-with-AdvProp-and-AutoAugment/
For example, it would be great if someone could rank the usefulness of each section of the above shingle page, from "Resource Retrieval" to "Export to MXNet".
Question 3: Would you prefer to see more sections on such shingle pages, for example sections explaining the architecture, as the original research papers do? If so, any ideas for how they should look or be formatted? How much detail should each shingle page have?
P.S. All the groups tagged here are relevant application areas of the WNNR.
Thank you for taking the time to pose these questions to the users!
Question 1: Importance of prettified structures
I am much more interested in fine-tuning models than simply running them. Having some structure to models does make it easier to properly ‘cut-off’ the final layers to replace with my own for training, especially when the network graph is complex with lots of branching. It is particularly nice when residual blocks or ‘bottleneck’ blocks are grouped together.
That being said, I do not really care if names are human-readable. 1, 2, 3, … are fine layer/block names for my purposes. Also, I do not really care if there is any cleanup below the ‘block’ level.
Finally, if it were a choice between painstakingly developer-structured models and having more models, I would much prefer having more models (with usage and fine-tuning examples).
Question 2: WNNR section importance
I would consider weight visualization to be a fun ‘toy’ example, but largely unnecessary since it is covered in an article on the website (and even in a few talks Stephen Wolfram has given):
MXNet/ONNX Export/Import is covered in the documentation for “MXNet” Import/Export, so I do not know how important this section is (unless the process is specifically different or complicated for a particular network). I suppose having this section could help those who are new to the Neural Network functionality learn that they can export models in stand-alone MXNet/ONNX format.
I largely agree with the answer on StackExchange. I think transfer learning is one of the most important sections and having more clean examples of how to do it is important. I personally have never used the ‘construction notebooks’ because I did not know they existed, but they seem like a really good resource to learn how to build your own complicated neural networks using WL. I think having better examples (maybe GitHub repositories or a dedicated documentation section with example NN usage of building common neural networks like Transformer networks, LSTMs, etc.) would be even more valuable to users than having construction notebooks for every network in WNNR.
Question 3: Additional Sections
It is hard to think of anything super critical. More documentation of transfer learning in complicated architectures would be nice.
Standard benchmarks for runtime performance on CPU/GPU would be nice, and it would be a great intern project to set up an automated test suite using wolframscript on a Linux GPU docker container that downloads and runs the ‘example usage’ section of each network in WNNR and records average CPU/GPU runtime speed on standardized example data. Results could automatically be uploaded to the Data Repository and visualized on the WNNR website. This test could also measure performance between WL-NN releases and GPU driver updates and hardware versions. Such a detailed reference does not exist as far as I know, and it would be super helpful for fully understanding the speed and accuracy tradeoffs of different networks. It would also be insightful to track baseline performance on real-world hardware over time. The size of WNNR makes it well-suited to this kind of analysis and comparison of networks.
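To make the benchmark idea a bit more concrete, here is a rough sketch of just the timing part, written in Python purely for illustration. Everything in it is an assumption on my part: the function names `benchmark` and `benchmark_all` are hypothetical, and each "example" is a plain zero-argument callable standing in for whatever would actually run a network's example usage (a real suite would presumably invoke wolframscript inside the container instead, which is not shown here).

```python
import statistics
import time
from typing import Callable, Dict, List


def benchmark(run_example: Callable[[], object],
              repeats: int = 5,
              warmup: int = 1) -> Dict[str, float]:
    """Time a zero-argument callable and summarize wall-clock runtimes in seconds.

    `run_example` is a hypothetical stand-in for running one network's
    example usage on standardized input data.
    """
    if repeats < 1:
        raise ValueError("repeats must be at least 1")

    # Untimed warm-up runs, so one-time costs (downloads, initialization)
    # do not distort the measured runs.
    for _ in range(warmup):
        run_example()

    timings: List[float] = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_example()
        timings.append(time.perf_counter() - start)

    return {
        "runs": repeats,
        "mean_s": statistics.mean(timings),
        "median_s": statistics.median(timings),
        "min_s": min(timings),
        "max_s": max(timings),
    }


def benchmark_all(examples: Dict[str, Callable[[], object]],
                  repeats: int = 5) -> Dict[str, Dict[str, float]]:
    """Benchmark every named example and return one summary per name."""
    return {name: benchmark(fn, repeats=repeats) for name, fn in examples.items()}
```

Calling `benchmark_all` with a dictionary that maps network names to callables would give one summary row per network, which could then be stored alongside the software version, driver version and hardware description to track changes over time.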
Also, RE: adding sections explaining the architectures,
I do not think this would be very helpful to me. It seems like it would take a lot of technical work to make sure these explanations are very accurate and suitable for use as a professional reference. Furthermore, as someone wise once said, “Truth can only be found in one place: the code.” I can gather a lot of 100% accurate detail about the architecture very quickly simply by looking at the architecture graph (a very nice part of the WL NN library).
I would like to see slightly more elaboration in the brief description at the top of the page of some architectures, but having long Shingle-page explanations of the architecture would provide little value to me personally.
Thanks for asking :)
Hopefully, asking the right questions leads to finding the right answers.