About MxNet as Mathematica backend choice

Posted 2 years ago

I recently noticed that Mathematica uses MxNet as the backend for neural networks. It seems to have been integrated in 2015. The blog post listing the rationale is here.

MXNet does not seem to have gained the momentum to become popular; you can see the trends on "Papers with Code". Lack of popularity means the framework may be slow to develop.

For instance, consider this question from Joshua Schrier about using neural networks to fit ODEs. It requires the underlying framework to support higher-order gradients. There is an issue to add support in MXNet, but progress has stalled. Meanwhile, PyTorch, TensorFlow, and JAX all support this feature.
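For concreteness, here is a minimal sketch (not from the original question, just illustrative) of the missing capability, written in PyTorch: an ODE-fitting loss needs to differentiate through a derivative, i.e. compute higher-order gradients.

```python
import torch

x = torch.tensor(1.5, requires_grad=True)
y = torch.sin(x)

# First derivative; create_graph=True keeps the graph so it can be differentiated again
dy_dx, = torch.autograd.grad(y, x, create_graph=True)

# Second derivative, e.g. for a u''(x) term in an ODE residual loss
d2y_dx2, = torch.autograd.grad(dy_dx, x)

print(dy_dx.item(), d2y_dx2.item())  # cos(1.5), -sin(1.5)
```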

POSTED BY: Yaroslav Bulatov
14 Replies
Posted 3 months ago

No offense, but I think that Wolfram should have released a new NN backend with 14.0 already. They are moving too slowly in this key field. And the ExternalEvaluate support for Python, Julia, etc. is limited enough to be nearly crippling.

POSTED BY: Jack Hu

I agree with the original poster. Offering MXNet alternatives needs to be a very high priority. Perhaps doing so would help with the significant problem that Target -> "GPU", at least as of version 13.3, did not work well on Macs when training neural nets.

POSTED BY: Seth Chandler

We are well aware of the MXNet situation and have been planning a backend switch. It is a very large project that is currently in its early research stage, and at the moment we don't yet have a time estimate for its completion.

What machine learning framework (or frameworks) are of interest to you (as a possible complete replacement for MXNet)?

POSTED BY: Arnoud Buzing

Mathematica should be integrated with PyTorch instead of MxNet. It has by far the most community support.

The other candidates could be TensorFlow and JAX, but I would not recommend their integration in 2024. (I have worked on both the TensorFlow and PyTorch development teams.)

POSTED BY: Yaroslav Bulatov

I personally find the JAX model compelling, but I have never seriously worked with it. I am curious: why would you not recommend its inclusion? Thanks!

JAX (and PyTorch) are already "integrated" with Mathematica through the Python ExternalFunction interface, so the question would be: which parts of JAX/PyTorch would you want to integrate more closely?

If the goal is to replace the MXNet parts implementing the NN interface, then it would be the modeling code.

For PyTorch, that would mean integrating the nn.Module abstraction, which gives you a building block that you wire together with other nn.Module blocks to build your neural net. This abstraction has been around for 8+ years, so it probably won't go away soon. For instance, models from the PyTorch model hub and most repos under Papers with Code use the nn.Module interface. So one interesting use case would be to provide a way to load model hub models into a Mathematica symbol which could be called as a function.
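As a rough illustration (a minimal sketch, not tied to any particular Mathematica design): an nn.Module is just a parameterized callable composed from other nn.Module blocks, and hub models expose the same interface.

```python
import torch
import torch.nn as nn

# An nn.Module is a building block wired together from other nn.Module blocks
class TwoLayerNet(nn.Module):
    def __init__(self, d_in, d_hidden, d_out):
        super().__init__()
        self.fc1 = nn.Linear(d_in, d_hidden)
        self.fc2 = nn.Linear(d_hidden, d_out)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))

net = TwoLayerNet(4, 16, 2)
y = net(torch.randn(8, 4))          # the module behaves like a plain function

# Models from the PyTorch hub follow the same nn.Module interface
# (downloads the torchvision repo on first use)
resnet = torch.hub.load("pytorch/vision", "resnet18")
```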

JAX started as "autodiff for NumPy", so neural network abstractions didn't come until later. There is Flax, created by Google, and Haiku, created by DeepMind. Integrating JAX modeling would mean deciding whether you want to integrate Flax's Dense or Haiku's Linear layers.
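To make the contrast concrete, here is a minimal, purely illustrative sketch of the same small network written once with Flax's Dense and once with Haiku's Linear:

```python
import jax
import jax.numpy as jnp
import flax.linen as fnn   # Flax (Google)
import haiku as hk         # Haiku (DeepMind)

x = jnp.ones((1, 4))
rng = jax.random.PRNGKey(0)

# Flax: layers are Module subclasses; parameters live in an explicit pytree
class FlaxMLP(fnn.Module):
    @fnn.compact
    def __call__(self, x):
        return fnn.Dense(2)(fnn.relu(fnn.Dense(16)(x)))

flax_model = FlaxMLP()
flax_params = flax_model.init(rng, x)
y1 = flax_model.apply(flax_params, x)

# Haiku: layers are plain functions turned into init/apply pairs by hk.transform
def haiku_mlp(x):
    return hk.Linear(2)(jax.nn.relu(hk.Linear(16)(x)))

haiku_model = hk.without_apply_rng(hk.transform(haiku_mlp))
haiku_params = haiku_model.init(rng, x)
y2 = haiku_model.apply(haiku_params, x)
```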

POSTED BY: Yaroslav Bulatov

Mathematica's flirting with Python and PyTorch, huh? I have to say, my past Python escapades felt like a wild goose chase through a maze of dependencies and deprecations. But hey, if Mathematica can tame that beast and keep the chaos at bay, I'm all for it! I'm so smitten with Mathematica that the thought of not having to play peek-a-boo with Python again fills me with glee. I hope this new project lets me cozy up even more in my Mathematica comfort zone – it’s like my favorite armchair that I never want to leave.

POSTED BY: James Linton
Posted 3 months ago

Just wanted to point out that, as of mid-2023, Google recommends Flax over Haiku for new projects.

POSTED BY: Asim Ansari

Btw, regarding JAX, I think that if Mathematica implemented tensor differentiation natively, this would obviate the need to rely on external autodiff.

The issue is that neural network training involves differentiating expressions like norm(W1 W2 W3) with respect to the matrix W2. Mathematica's differentiation is at its core scalar differentiation; it can't treat W2 as an atom. Autodiff frameworks like JAX can find these derivatives efficiently by treating matmuls as atomic operations.
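Here is a minimal sketch (with made-up matrices, purely for illustration) of that computation in JAX, where the gradient of a scalar with respect to an entire matrix comes back with the matrix's shape:

```python
import jax
import jax.numpy as jnp

key1, key2, key3 = jax.random.split(jax.random.PRNGKey(0), 3)
W1 = jax.random.normal(key1, (4, 4))
W2 = jax.random.normal(key2, (4, 4))
W3 = jax.random.normal(key3, (4, 4))

# Frobenius norm of the product, differentiated w.r.t. the middle factor W2
def loss(W2):
    return jnp.linalg.norm(W1 @ W2 @ W3)

grad_W2 = jax.grad(loss)(W2)   # W2 is treated as a single atom
print(grad_W2.shape)           # (4, 4)
```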

I have thought a lot about the proper design of such a system and have a couple of prototypes; getting it into Mathematica is on my to-do list: https://community.wolfram.com/groups/-/m/t/2437093

POSTED BY: Yaroslav Bulatov
Posted 3 months ago

Regarding JAX, you may be aware of Aesara ( https://github.com/aesara-devs/aesara ).

POSTED BY: Asim Ansari
Posted 3 months ago

I recently found that MXNet was retired as of September 2023 and is no longer actively developed, as indicated on its website and Wikipedia. Mathematica uses MXNet as the backend for neural networks.

POSTED BY: Sangdon Lee

Building on Yaroslav Bulatov's post:

Closely related is support for things like Neural ODEs (backpropagating through NDSolve). This is another increasingly important area in scientific ML, and is now built into Julia (https://julialang.org/blog/2019/01/fluxdiffeq/) and PyTorch (https://towardsdatascience.com/neural-odes-with-pytorch-lightning-and-torchdyn-87ca4a7c6ffd). One can kind of "hack it" by implementing ODE integrators as neural nets in Mathematica (I've done it, as have some others); a sketch of the idea appears below.
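For readers who haven't seen the "hack": the idea is to write a fixed-step ODE integrator out of ordinary differentiable operations, with the vector field given by a small network, and let backpropagation flow through the unrolled steps. The sketch below is written in PyTorch purely for illustration (the same unrolling trick is what the Mathematica version amounts to); the target and step count are made up.

```python
import torch
import torch.nn as nn

# Learnable vector field f(t, y) for dy/dt = f(t, y)
class VectorField(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 32), nn.Tanh(), nn.Linear(32, dim))

    def forward(self, t, y):
        return self.net(y)

def euler_integrate(f, y0, t0, t1, steps):
    """Fixed-step explicit Euler. Every step is an ordinary tensor op,
    so the final state is differentiable w.r.t. the parameters of f."""
    y, t = y0, t0
    dt = (t1 - t0) / steps
    for _ in range(steps):
        y = y + dt * f(t, y)
        t = t + dt
    return y

f = VectorField(2)
y0 = torch.tensor([[1.0, 0.0]])
target = torch.tensor([[0.0, 1.0]])        # made-up training target
yT = euler_integrate(f, y0, 0.0, 1.0, steps=20)
loss = ((yT - target) ** 2).mean()
loss.backward()                            # gradients reach the vector-field weights
```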

POSTED BY: Joshua Schrier