
About MxNet as Mathematica backend choice

Posted 3 years ago
POSTED BY: Yaroslav Bulatov
15 Replies
Posted 10 months ago

deleted

POSTED BY: Josh H
Posted 1 year ago

No offense, but I think that Wolfram should have released a new NN backend with 14.0 already; they are moving too slowly in this key field. And the ExternalEvaluate support for Python, Julia, etc. is about as good as crippled.

POSTED BY: Jack Hu

I agree with the original poster. Offering MXNet alternatives needs to be a very high priority. Perhaps doing so would help with the significant problem that Target->"GPU", at least as of version 13.3, did not work well on Macs when training neural nets.

POSTED BY: Seth Chandler
Posted 1 year ago

Regarding JAX, you may be aware of Aesara ( https://github.com/aesara-devs/aesara ).

POSTED BY: Asim Ansari

Btw, regarding Jax, I think if Mathematica implemented tensor differentiation natively, this would obviate the need to rely on external autodiff.

The issue is that neural network training involves differentiating expressions like norm(W1 W2 W3) with respect to the matrix W2. Mathematica's differentiation is at its core scalar differentiation; it can't treat W2 as an atom. Autodiff frameworks like Jax can find these derivatives efficiently by treating matmuls as atomic operations.
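
As a hypothetical illustration (the shapes and the loss function here are made up for the example), this is roughly what that looks like in Jax:

```python
import jax
import jax.numpy as jnp

# Differentiate norm(W1 @ W2 @ W3) with respect to the middle matrix W2.
def loss(W2, W1, W3):
    return jnp.linalg.norm(W1 @ W2 @ W3)  # Frobenius norm for matrices

k1, k2, k3 = jax.random.split(jax.random.PRNGKey(0), 3)
W1 = jax.random.normal(k1, (4, 5))
W2 = jax.random.normal(k2, (5, 6))
W3 = jax.random.normal(k3, (6, 7))

# jax.grad differentiates with respect to the first argument (W2),
# treating it as a single matrix-valued atom; the matmuls stay atomic
# instead of being expanded into scalar derivatives.
dW2 = jax.grad(loss)(W2, W1, W3)
print(dW2.shape)  # (5, 6), same shape as W2
```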

I have thought a lot about the proper design of such a system and have a couple of prototypes; getting it into Mathematica is on my to-do list: https://community.wolfram.com/groups/-/m/t/2437093

POSTED BY: Yaroslav Bulatov

Mathematica's flirting with Python and PyTorch, huh? I have to say, my past Python escapades felt like a wild goose chase through a maze of dependencies and deprecations. But hey, if Mathematica can tame that beast and keep the chaos at bay, I'm all for it! I'm so smitten with Mathematica that the thought of not having to play peek-a-boo with Python again fills me with glee. I hope this new project lets me cozy up even more in my Mathematica comfort zone – it’s like my favorite armchair that I never want to leave.

POSTED BY: James Linton
Posted 1 year ago

Just wanted to note that, as of mid-2023, Google is recommending Flax for new projects instead of Haiku.

POSTED BY: Asim Ansari

Jax (and PyTorch) are already "integrated" with Mathematica via the Python ExternalFunction interface, so the question would be: which parts of Jax/PyTorch would you want to integrate more closely?

If the goal is to replace the MxNet parts implementing the NN interface, then it would be the modeling code.

For PyTorch, that would mean integrating the nn.Module abstraction, which gives you a building block that you wire together with other nn.Module blocks to build your neural net. This abstraction has been around for 8+ years, so it probably won't go away soon. For instance, models from the PyTorch model hub and most repos under Papers with Code use the nn.Module interface. So one interesting use-case would be to provide a way to load model hub models into a Mathematica symbol which could be called as a function.
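
For a sense of what that abstraction looks like, here's a minimal sketch (my own toy example, not any particular hub model):

```python
import torch
import torch.nn as nn

# A module is wired together from other modules and called like a function.
class TinyNet(nn.Module):
    def __init__(self, d_in=4, d_hidden=8, d_out=2):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(d_in, d_hidden),
            nn.ReLU(),
            nn.Linear(d_hidden, d_out),
        )

    def forward(self, x):
        return self.body(x)

net = TinyNet()
print(net(torch.randn(3, 4)).shape)  # torch.Size([3, 2])

# Model hub models expose the same callable interface, e.g.:
# resnet = torch.hub.load("pytorch/vision", "resnet18", weights="DEFAULT")
```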

Jax started as "autodiff for numpy", so neural network abstractions didn't come until later. There's Flax, created by Google, and Haiku, created by DeepMind. Integrating Jax modeling would mean deciding whether you want to integrate Flax's Dense or Haiku's Linear layers.
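
To make that choice concrete, here is a minimal Flax sketch (a toy layer of my own; Haiku's hk.Linear plays the analogous role):

```python
import jax
import jax.numpy as jnp
import flax.linen as nn

# In Flax, parameters live in a separate pytree rather than inside the module.
class MLP(nn.Module):
    features: int

    @nn.compact
    def __call__(self, x):
        return jax.nn.relu(nn.Dense(self.features)(x))

model = MLP(features=8)
x = jnp.ones((3, 4))
params = model.init(jax.random.PRNGKey(0), x)  # initialize the parameter pytree
y = model.apply(params, x)                     # apply is a pure function of params
print(y.shape)  # (3, 8)
```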

POSTED BY: Yaroslav Bulatov

I personally find the JAX model compelling, but I have never seriously worked with it. I am curious: why would you not recommend its inclusion? Thanks!

We are well aware of the MXNet situation and have been planning a backend switch. It is a very large project that's currently in its early research stage, and at the moment we don't yet have a time estimate for its completion.

Mathematica should be integrated with PyTorch instead of MxNet. PyTorch has by far the most community support.

The other candidates could be TensorFlow and Jax, but I would not recommend their integration in 2024. (I worked on both the TensorFlow and PyTorch development teams.)

POSTED BY: Yaroslav Bulatov

What machine learning framework (or frameworks) are of interest to you (as a possible complete replacement for MXNet)?

POSTED BY: Arnoud Buzing
Posted 1 year ago
POSTED BY: Sangdon Lee

Building on Yaroslav Bulatov's post:

Closely related is support for things like Neural ODEs (backpropagating through NDSolve). This is another increasingly important area in scientific ML, and it is now built into Julia (https://julialang.org/blog/2019/01/fluxdiffeq/) and PyTorch (https://towardsdatascience.com/neural-odes-with-pytorch-lightning-and-torchdyn-87ca4a7c6ffd). One can kind of "hack it" by implementing ODE integrators as neural nets in Mathematica (I've done it, as have some others), as sketched below.
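
The "hack" amounts to something like this sketch (a made-up fixed-step Euler unroll in PyTorch, just to show how gradients flow through the solve):

```python
import torch
import torch.nn as nn

# A fixed-step Euler integrator whose vector field is a small neural net;
# because each step is ordinary tensor arithmetic, autograd backpropagates
# through the entire solve.
class NeuralODE(nn.Module):
    def __init__(self, dim=2, steps=20, t1=1.0):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(dim, 32), nn.Tanh(), nn.Linear(32, dim))
        self.steps, self.dt = steps, t1 / steps

    def forward(self, x):
        for _ in range(self.steps):      # unrolled Euler steps: x <- x + dt * f(x)
            x = x + self.dt * self.f(x)
        return x

ode = NeuralODE()
x0 = torch.randn(5, 2, requires_grad=True)
ode(x0).pow(2).sum().backward()          # gradient of a loss through the ODE solve
print(x0.grad.shape)                     # torch.Size([5, 2])
```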

POSTED BY: Joshua Schrier