
Will Mathematica make use of Apple Silicon's GPU in the near future?

Posted 3 years ago

With even the new Mac mini being equipped with a GPU for neural computing, to use Apple's marketing spiel, and able to run TensorFlow models, what is the likelihood of Mathematica allowing access to the M1's GPU any time soon?

After a good bit of pulling my hair out, I've just managed to configure a Mac Pro 4,1 with an NVIDIA CUDA-capable GPU to run Mathematica deep learning code, and I'm well impressed. It easily outperforms my much newer Mac mini. So I'm gagging to see what performance the M1 will provide.

POSTED BY: Nicholas Walton
13 Replies

"What I would like to see is that any function in Wolfram language that has an option "UseGPU" will actually use the GPU on my Mac in the not too distant future. I don't care that I could get faster results by investing in different hardware (and spending the time and money to make it work). For almost everything else that I need my computer to do, Mathematica and Wolfram Language give me the tools I need to do a much better job than I could do myself. It is only the lack of Wolfram's use of the GPU and related hardware technologies (Neural engine, etc.) that is an issue."

Exactly.

POSTED BY: Seth Chandler

I must apologize for getting this thread off on a tangent. Ultimately, the issue is not whether Apple's neural net/GPU engine is better than NVIDIA, but whether Mathematica will make full use of the hardware available to macOS (and probably iPadOS) users.

After what in my opinion is a slow start, Mathematica is making use of Metal (rather than relying on OpenGL), and the results are very good. This has permitted expansion into new areas, such as the very experimental ray tracing in the 12.2 beta.

All I am looking for as a long-time Mathematica on Macintosh user is that the software make full use of the available hardware. I don't care if NVIDIA is faster on different hardware, as long as I have decent hardware acceleration with my hardware.

Mathematica has been characterized as a Swiss Army knife. The analogy is not exact, since software does not impose the same constraints as physical design. However, it is apt in the following sense: Mathematica meets the needs of the vast majority of users who need some mathematical technique or computer-science magic without having to deal with the messy details. I look at the developers of Mathematica's functions as collaborators in a very real sense. Back in the dark ages, I filled that role when I worked in a research lab, but those days are long gone.

In the case of machine learning, I have no doubt that there are dedicated tools that do a better job, just as there are better tools for audio and image processing -- although there are some things that Mathematica can do that dedicated programs cannot, simply because of access to mathematical functionality created for another purpose.

I am not concerned with the availability of optimized libraries similar to those for Intel (and NVIDIA). Worst case, the Rosetta 2 translation layer should be able to handle this functionality and still be faster than running on Intel. However, Apple has the resources to do what you suggest -- and they have already done so with previous transitions. Hardware evolves all the time. It can be painful, but that is what it is, living on the frontier. Perhaps having coded for nearly 50 years gives me a different context.

Bottom line

What I would like to see is that any function in the Wolfram Language that has a "UseGPU" option will actually use the GPU on my Mac in the not-too-distant future. I don't care that I could get faster results by investing in different hardware (and spending the time and money to make it work). For almost everything else that I need my computer to do, Mathematica and the Wolfram Language give me the tools I need to do a much better job than I could do myself. It is only the lack of Wolfram's use of the GPU and related hardware technologies (Neural Engine, etc.) that is an issue.
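For concreteness, here is a minimal sketch of the kind of call I mean. Today the existing TargetDevice -> "GPU" option assumes an NVIDIA/CUDA card; the hope is that the very same option would simply dispatch to the Apple Silicon GPU. The tiny network and random data below are just for illustration:

net = NetChain[{LinearLayer[32], Ramp, LinearLayer[2], SoftmaxLayer[]},
  "Input" -> 10, "Output" -> NetDecoder[{"Class", {"a", "b"}}]];
data = Table[RandomReal[1, 10] -> RandomChoice[{"a", "b"}], 1000];
NetTrain[net, data, TargetDevice -> "GPU"] (* currently requires an NVIDIA/CUDA card; "CPU" is the fallback *)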

I have spoken with several Mathematica users who use macOS, both within Wolfram Research and elsewhere, and this is something that all of them would like.

Are you sure? Benchmarks I have seen for the new chips would indicate otherwise. Granted, these numbers are not direct comparisons of neural net computations, but they probably indicate what is possible.

As I see it, the main sticking point is that the open-source software Wolfram uses for GPU acceleration of neural net computations is dominated by NVIDIA. The trick, which I think could be done by a small group of people, would be to make a function-call-compatible library that uses the Apple technology.

This should be within the capabilities of Wolfram, or even of people in this community. At the very least, macOS and iOS (iPadOS) users would be able to do computations significantly faster than they can now. Even if it can't match NVIDIA, which is debatable, it would be orders of magnitude faster than just using the CPU.
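To illustrate what I mean, the Wolfram Language already has LibraryLink for loading compiled code, so a Metal-backed matrix multiply could in principle be exposed like this. This is only a sketch: the library name "libMetalBLAS" and the function "metal_gemm" are hypothetical and do not exist.

metalDot = LibraryFunctionLoad["libMetalBLAS", "metal_gemm", {{Real, 2}, {Real, 2}}, {Real, 2}]; (* hypothetical Metal-backed GEMM exposed through LibraryLink *)
a = RandomReal[1, {2000, 2000}]; b = RandomReal[1, {2000, 2000}];
metalDot[a, b] (* the library would run the multiply on the Apple GPU via Metal *)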

Note that the M1 chip does not support eGPUs. It is possible that the chips that Apple ends up putting in the Mac Pro or iMac Pro will support these add-ons. Most of us don't have the deep pockets (or ample grant money) to invest in this solution, though.

I fooled around with this stuff back when Apple used NVIDIA. What was frustrating was that NVIDIA would change which cards were supported, so you could never depend on the resource being available. This is certainly still the case for people using Windows or Linux systems. The nice thing about Apple is that their APIs are abstracted from the hardware, so if the hardware changes (which it just did in a big way), the APIs do not.

I can understand the desire on the part of Wolfram to use cross-platform solutions whenever possible. However, Wolfram has, in the past, made use of specific hardware and OS functionality to fully exploit any advantages. Remember that the notebook paradigm was available on the Mac (and NeXT) a long time before Windows, because versions of Windows before Win 95 were not very capable. In addition, Mathematica users on Macs could make full use of the 32-bit (and 64-bit) architecture while Windows computers were still hobbled by 16-bit CPUs. Now that we have a glimpse of what Apple Silicon can do, I hope that Wolfram Research will once again take advantage of hardware and OS opportunities as it has in the past.

I have not done this type of coding for some time, or I would be tempted to take on the task myself.

Thanks. I should have led with that.

I hope someone from Wolfram management is looking at this thread.

Not a chance in the near future (the next few years?). There is no other technology comparable to NVIDIA CUDA.

First of all, the current Mathematica support for CUDA is still very limited. Other tools (e.g. MATLAB) provide significantly better GPU computing capabilities.
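For reference, the built-in CUDALink layer exposes calls like the following (a minimal sketch, assuming a supported NVIDIA card and driver are installed):

Needs["CUDALink`"]
CUDAQ[] (* True only if a supported NVIDIA GPU and driver are present *)
m = RandomReal[1, {3000, 3000}];
AbsoluteTiming[CUDADot[m, m];] (* matrix product on the GPU *)
AbsoluteTiming[m.m;] (* the same product on the CPU, for comparison *)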

NVIDIA CUDA is not only a machine-learning engine; it is a highly optimized ecosystem of libraries that covers many domains of applied mathematics with excellent performance. The main goal of Wolfram Research should be to cover as many application domains as possible, not only machine learning, which happens to be very popular right now.

Moreover, for general high-performance computing in particular, NVIDIA CUDA provides a truly unique programming tool set with excellent performance.

As I said before, there is no comparable solution so far. From my point of view, Apple Silicon closes the door on GPU high-performance computing for its customers.

Another question is how WRI will solve the lack of highly optimized Intel math libraries (MKL, IPP, ...) for ARM64 CPUs (Apple Silicon).

Finally, Apple chose the complicated path of creating a brand-new computer without native x86-64 compatibility and without NVIDIA CUDA compatibility. I am afraid that, especially for Mathematica users on Apple Silicon, the next few years will be very frustrating. But maybe I am wrong and there is some secret solution...?!

Posted 3 years ago

Stephen noted that he uses a Mac Pro in his February 2019 entry on his blog. At that time, "Mac Pro" referred to the 2013 "trash can" model.

I could see him working quite well with a 16GB M1 Mac mini with 2TB of storage. You can drive two 4K displays with that machine. It appears that Apple's swapping performance is improved in this version of the OS. Coupled with the faster SSD access, that can compensate for the smaller RAM. As a bonus, you can get exactly the same mobile processor/RAM/storage in the M1 MacBook Air.

The thermals of this processor are astonishing. Many of the reviewers noted they went for days without recharging the battery. The MBA is built without a fan, but it takes many minutes of full load before it gets thermally throttled. Anyone who uses the current Intel processors to keep their lap warm will have to get a cat.

I hope the M1 becomes the de facto platform for Wolfram's technical staff. I'd love to see Wolfram's engine take advantage of the M1's GPU and Neural Engine for outstanding performance on the Mac. That same code should also run on the iPhone and iPad.

POSTED BY: Phil Earnhardt
Posted 2 years ago

I would love to see this as well. I'm so amazed at what my mid-spec M1 can do that my daughter is probably getting it soon, as I'm going to upgrade to an M1 Max. Support for the GPU and/or Neural Engine would be awesome. Having said that, for most of the ML stuff I'm doing, the support for remote job execution in AWS is sufficient. I just hate not being able to use all the power on my local machine.

Also, I believe a lot of the ML support is delegated to MXNet, so presumably once it supports Metal, etc., Mathematica's ML functions will as well. That would be a step towards more general support.

POSTED BY: erich.oliphant

Responding to Yaroslav: I think that this is bad advice. First, even though the headquarters of Wolfram Research is, indeed, in Illinois (hardly a backwater), the company itself is about as global as one can get.

Second, for the most part, Wolfram Research has been a leader in new technologies, from support of the GUI on NeXT and the Macintosh to the development of Wolfram|Alpha.

Third, while Apple sometimes tries out new technology (OpenDoc, the Touch Bar, etc.) that it later abandons, Metal has been around since 2014 and is under active development. Further, it would be in Wolfram's interest to make use of Apple's APIs, since they isolate the code from the hardware.

Fourth (and this relates to #3), NVIDIA may be the leader, but the technology is hard to work with and breaks frequently. (I used the technology when Apple used NVIDIA).

Now, I have something at stake here: all my computers use Apple Silicon, and I would really like to be able to use the option "UseGPU" and have it accelerate my computations. Based on what I have seen of the current implementation in Mathematica, it would certainly be worth the effort.

I am sure that there are a lot of people who would like to take full advantage of things like machine learning and neural nets, and who would benefit from using the Apple Silicon GPUs, but who would rather not learn the low-level nuts and bolts, or who perhaps do not have the funding to invest in a lot of extra hardware.

Recent comments from Stephen seem to indicate that taking full advantage of the new Apple hardware is under active development at Wolfram, so I am hopeful that some version of Mathematica (maybe even some version of 13.x) will do what we want.

As advice to Wolfram Research: I would keep track of what other computation frameworks do and follow their lead. The accelerator landscape keeps changing, and being in Illinois, it's probably hard to keep track of the latest trends. An API may look good until you meet someone for a beer and find that leadership has pulled the plug on maintenance (cough MXNet cough).

I found a relevant issue on the JAX GitHub; they aren't planning to support M1 GPUs yet: https://github.com/google/jax/issues/8074

Other places to check are the Julia numerical-computing community and TensorFlow; someone is eventually going to get seduced by the unused FLOPs in their laptop and figure out a good solution.

POSTED BY: Yaroslav Bulatov

It would indeed be nice to have a "use GPU" option and have things run fast automatically.

The reason people use NVIDIA accelerators is that NVIDIA invested so much in the software layer. Your accelerator may have the FLOPs, but you also need the software to move the correct bits to the correct transistors! In particular, it's libraries like cuDNN/cuTENSOR.

AMD doesn't really have an equivalent, so people either have to make their own version, which is an option for someone like Google, or pay a large markup for NVIDIA's offering.

Computational scientists went from doing GPGPU by

  1. programming shader units, to
  2. writing CUDA, to
  3. reusing high quality primitives like GEMM.

Incidentally, the reason we have CUDA is a bet NVIDIA made in the early 2000s on virtual-reality worlds. Jensen pushed to sacrifice some rendering performance to enable physics simulations inside the GPU. This decision almost bankrupted the company at the time, but years later it allowed the pivot from gaming into high-performance computing. You can see this dark period reflected in NVIDIA's stock price:

DateListPlot@FinancialData["NVDA", {"Jan. 1, 2000", "Jan. 1, 2008"}]

[Plot: NVDA stock price, 2000-2008]

Now the M1 Max is out and gaming performance is great; how do you make use of those 57 billion transistors for general-purpose computing?

You could either:

  1. Go back to programming shaders or
  2. Wait for someone else to program these shaders. This is an active area of development; there is a good discussion here.


POSTED BY: Yaroslav Bulatov

Actually, after looking at MPS a bit, it seems these are not your graphics shaders from 2005; they have higher-level primitives built in.

POSTED BY: Yaroslav Bulatov
Posted 2 years ago

MPS covers a broad spectrum of tasks, but they are predefined tasks, not generic GPU operations.

Thinking about this, there is a Khronos SPIR-V to Metal converter that works well for compute shaders; one would have to convert from GLSL or some other language to SPIR-V, within the limits of Metal, to do this.

I have written a GLSL to Metal converter using this path; it's quite simple and wouldn't be too hard to expose to Mathematica as an FFI.
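A minimal sketch of how that pipeline could be driven from the Wolfram Language, assuming the Khronos tools glslangValidator and spirv-cross are installed and on the PATH (the tool choice, file names, and the glslToMetal helper are illustrative, not the converter described above):

(* compile a GLSL compute shader to SPIR-V, then cross-compile to Metal Shading Language *)
glslToMetal[glslFile_String, metalFile_String] := Module[{spv = FileNameJoin[{$TemporaryDirectory, "kernel.spv"}]},
  RunProcess[{"glslangValidator", "-V", glslFile, "-o", spv}]; (* GLSL -> SPIR-V *)
  RunProcess[{"spirv-cross", spv, "--msl", "--output", metalFile}]; (* SPIR-V -> MSL *)
  metalFile]

glslToMetal["kernel.comp", "kernel.metal"]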

POSTED BY: Guy Madison