In today's (Thursday) session, I raised the issue of very long wait times for an LLM response - on the order of 20-30 seconds.
Arben correctly surmised that I was using GPT-5 (I was actually using GPT-5 Mini) and suggested that I switch to GPT-4o. This worked: response times dropped to 2-3 seconds.
So I decided to test most of the modern OpenAI models.
All GPT-5 models (full, Mini, and Nano) took 20-30 seconds. All of the GPT-4o and GPT-4.1 models took 2-3 seconds. This is VERY fishy.
When using GPT-5 through ANY chatbot (ChatGPT, Perplexity, Microsoft Copilot, GitHub Copilot, etc.), I get instant responses. When using GPT-5 directly with the Python API, I get instant responses.
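For anyone who wants to reproduce the comparison, a minimal timing harness along these lines can be used to measure per-response latency. The `time_call` helper is generic; the commented-out OpenAI usage is a sketch only (it assumes the `openai` package, an `OPENAI_API_KEY` in the environment, and model names as I saw them - adjust to taste):

```python
import time

def time_call(fn, n=3):
    """Run fn() n times and return the average wall-clock latency in seconds."""
    elapsed = []
    for _ in range(n):
        start = time.perf_counter()
        fn()
        elapsed.append(time.perf_counter() - start)
    return sum(elapsed) / len(elapsed)

# Hypothetical usage against the OpenAI Python API (assumed setup, not run here):
# from openai import OpenAI
# client = OpenAI()  # reads OPENAI_API_KEY from the environment
# for model in ["gpt-5-mini", "gpt-4o"]:
#     avg = time_call(lambda: client.chat.completions.create(
#         model=model,
#         messages=[{"role": "user", "content": "Say hi."}]))
#     print(f"{model}: {avg:.1f}s per response")
```

Timing the same prompt against the same models from Python and from Mathematica is what isolates the delay to Mathematica's code path rather than the OpenAI service itself.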
My hypothesis is that there is a broken GPT-5-specific code path within Mathematica that is causing these delays. Please escalate this.
The massive improvement in GPT-5's (and shortly GPT-5.1's) reasoning and instruction-following robustness is essential for many tasks. It has to work.
Note: I am using the Windows 11 version of Mathematica 14.3 (installed yesterday from a fresh download, with the paclet update applied) on a high-end Dell laptop.