Is Wolfram Siri-ous about AI?

You would have to have been living under a rock not to have noticed the hoopla over the launch of ChatGPT, OpenAI's large language model, which signed up over 1 million users within a few days of launch. Tech pundits rushed out dozens of articles describing in breathless tones the wondrous capabilities of this new AI technology, in applications ranging from musical composition and resume writing to programming.

In a December 2022 opinion piece, economist Paul Krugman wrote that ChatGPT would affect the demand for knowledge workers (Krugman, Paul (December 6, 2022). "Does ChatGPT Mean Robots Are Coming For the Skilled Jobs?". The New York Times. Retrieved December 6, 2022).

James Vincent saw the viral success of ChatGPT as evidence that artificial intelligence had gone mainstream (Vincent, James (December 8, 2022). "ChatGPT proves AI is finally mainstream – and things are only going to get weirder").

Only now, in the cold light of a grey January morning, are users beginning to wise up to the fact that a lot of what ChatGPT produces is gibberish, dressed up to look like a valid response to the user's prompts and delivered in a highly confident tone. But as for the content itself, well, that's another matter entirely.

Dr Darren Hudson Hick, for instance, described a case of ChatGPT-inspired plagiarism in which the essayist

confidently and thoroughly described Hume's views of the paradox of horror in a way that were [sic] thoroughly wrong

In his article Testing ChatGPT in Mathematics — Can ChatGPT do maths?, Hari Prasad found that the system:

failed to admit its mistake, challenged right premises and processed inappropriate requests that it couldn’t or not supposed to handle.

The area where ChatGPT has attracted the most positive reviews is code generation, with many programmers claiming that the system has multiplied their own productivity many-fold. But a recent post in the community by I van Veen (Unexpected answer from ChatGPT for Wolfram language query) illustrates the problem. As I wrote in my response:

the answer given has some elements that appear correct and clearly suggest some grasp of the WL syntax. Consequently, the answer seems plausible, to someone with limited knowledge of WL.

Secondly, however, the answer contains several basic errors.

So what has any of this to do with Wolfram?

I believe that the code-generation capabilities of platforms like ChatGPT and GitHub's Copilot represent a serious threat, as well as an opportunity, for Wolfram.

Firstly, the threat: one of the arguments made in favor of using WL compared to alternatives is that, despite being proprietary and (by comparison) expensive, it compensates for that in programming productivity, thanks to the power of the language. However, if I can learn to generate large blocks of code in Python or C++ automatically using an NLP interface, my programming productivity is eventually going to match or exceed anything Wolfram has to offer.

Which brings me to the opportunity, one that I have been banging the drum about for a very long time: automated code generation in the WL. What I am suggesting is not a competitor to ChatGPT, which can generate responses to almost any kind of question, but a more limited NLP interface to rival Copilot and similar products. The Wolfram Language is built for this, not only because of its computational power, but also because the language itself is computable, making it an ideal choice for automated expression composition.

So the idea is that you would interact with the interface like this:

  1. How do I speed up this code using parallelization?

  2. Show me how to write a compiled function to do the following...

  3. Write a program to play tic-tac-toe

  4. Find and fix the bug in this code

  5. Provide 3 examples of widely used ketones, show me their chemical structures and describe their properties and applications

  6. Give me a step-by-step solution for the following integral

  7. Identify all the actors appearing in this video clip

  8. Write a WL API interface from Mathematica to this platform, covering the following functionality...

And so on.

My thesis is that if WR develops a game-changing capability like this it will add rocket-fuel to the productivity of WL developers and attract a cohort of new users who currently have neither the time nor programming skills to navigate the steep terrain of the Wolfram Language.

On the other hand, if WR continues on its current trajectory of incremental releases, it is only a matter of time before AI-assisted programming platforms overtake and surpass the WL in terms of programming productivity. In that case the WL will likely become yet another could-have/should-have story in the history of defunct computer languages.

POSTED BY: Jonathan Kinlay
6 Replies

Hi Alec,

Thank you for this very thoughtful reply.

I 100% agree with you that

Programming languages are fundamentally about representing an abstract computer that developers know how to talk to and which is ideally easier to talk to than the underlying hardware. Language models have demonstrated an ability to... well... model language...

I would add that computer languages are (or are supposed to be) much more highly structured than natural languages and therefore easier to model. Plus, as I keep repeating, in the case of WL the language itself is computable and is architected in a tree-like structure that can be traversed programmatically.

From what I have read, I also think this is likely to be true:

I would bet over 95% of accepted results required further modification from an experienced developer to actually solve the problem at hand.

However, I think the capabilities of ChatGPT/Copilot may be greater than you suggest. Here is one brief anecdote from a source I trust, who is a highly experienced quant. He was playing with ChatGPT over the holidays and asked it to build a Monte Carlo simulation of a Geometric Brownian Motion process. He was highly impressed with the Python code that ChatGPT produced in a couple of minutes which, he told me, would have taken him a couple of days to program himself. I didn't have the heart to tell him I could probably have done the same in WL in under a minute! The point is: he had the domain knowledge to assess that the output was correct and it saved him a bunch of time. The fact that it was a couple of hundred lines of Python code vs 1-2 lines of WL is neither here nor there, if they both work and take about the same time to produce. That's the "threat" that NL interface platforms like ChatGPT represent.
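For readers unfamiliar with the task, a bare-bones Monte Carlo simulation of Geometric Brownian Motion might look something like the sketch below. This is not the code ChatGPT produced in the anecdote (that code is not reproduced in this thread); the function name `simulate_gbm_paths`, its parameters, and the example values are all illustrative assumptions. It uses the standard exact log-normal update and only the Python standard library.

```python
import math
import random


def simulate_gbm_paths(s0, mu, sigma, horizon, n_steps, n_paths, seed=0):
    """Simulate GBM price paths (hypothetical helper, for illustration only).

    Uses the exact update
    S(t + dt) = S(t) * exp((mu - sigma^2 / 2) * dt + sigma * sqrt(dt) * Z),
    where Z is a standard normal draw.
    """
    rng = random.Random(seed)              # fixed seed so runs are repeatable
    dt = horizon / n_steps
    drift = (mu - 0.5 * sigma ** 2) * dt   # deterministic part of the log-return
    vol = sigma * math.sqrt(dt)            # scale of the random part
    paths = []
    for _ in range(n_paths):
        path = [s0]
        for _ in range(n_steps):
            shock = rng.gauss(0.0, 1.0)
            path.append(path[-1] * math.exp(drift + vol * shock))
        paths.append(path)
    return paths


# Example values are assumptions chosen for illustration.
paths = simulate_gbm_paths(
    s0=100.0, mu=0.05, sigma=0.2, horizon=1.0, n_steps=50, n_paths=4000
)
mean_terminal = sum(p[-1] for p in paths) / len(paths)
expected_terminal = 100.0 * math.exp(0.05 * 1.0)  # analytic E[S_T] = S0 * exp(mu * T)
print(f"simulated mean: {mean_terminal:.2f}, analytic mean: {expected_terminal:.2f}")
```

Comparing the simulated terminal mean with the analytic value is the kind of sanity check that domain knowledge makes possible, which is the point of the anecdote: the generated code only saves time if someone can tell whether it is right.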

Your next point is well-made and I agree with much of it:

Beyond that, the difficult part of programming is not typing the syntax. I think it is instead figuring out what problem you want to solve and how to integrate that solution into the larger system in a coherent way.

The real issue often is the part about figuring out the problem and how to integrate the solution, and you illustrate that very well with your Wolfram Alpha examples. But here, too, NLP interfaces can help the user iterate towards a correct formulation of the problem, which is indeed what happens with your WA examples.

But my complaint is that we are well past the point where Wolfram Alpha should be able to figure out the answer to the instruction:

  • generate a list of random ketones

just as easily as it does for the instruction:

  • generate 5 random ketones

It's frustrating that in the first case WA isn't capable of generating any kind of response. It should be able to (I am quite sure that ChatGPT would). So the issue is that the NL interface for Wolfram Alpha currently just isn't good enough (and there really isn't one for Mathematica/WL in general).

Secondly, I do not agree that programming syntax is not an issue. I believe that for the WL it is very much an issue, especially for beginners. Furthermore, even for experienced users like myself, there are scores of functions I almost never use, or am not even aware of - other users have even begun to complain about the bloating of the language with duplicative functionality. It would be great to have an NLP interface that is able to deploy any and all WL functions and programming concepts and show me their syntax and utility - I would learn a huge amount that way.

I believe that, in general, WR is currently underperforming in terms of producing ML-oriented functionality. This applies not only to the issue discussed here, which I would say is a key missing ingredient from the Wolfram tech stack, but also to more mundane but critically important ML architectural features such as Transformers and frameworks for Reinforcement Learning, which Matlab, for example, has offered for at least the last four releases. It's past time for WR to step up to this challenge.

Wolfram started out with a considerable competitive advantage in terms of its thoughtful application of a NL interface to a curated set of high quality computable data in Wolfram Alpha. But that was years ago and the lead has been squandered. Competitors are going to eat WR's lunch if it doesn't respond soon.

If it were up to me I would be prepared to divert some large percentage (say 20%) of dev resources from other projects to get this done - it's far more important than ongoing projects such as the Wolfram compiler, for instance.

Release 14 should be about Machine Learning and a new NL interface for the WL. Everything else can take a back seat, for now.

POSTED BY: Jonathan Kinlay
Posted 2 years ago

I agree that more should be done with NLP and code completion, as this is a path towards widespread end-user programmability and language adoption.

Programming languages are fundamentally about representing an abstract computer that developers know how to talk to and which is ideally easier to talk to than the underlying hardware. Language models have demonstrated an ability to... well... model language, and thereby provide code that appears correct in the context of an input prompt. I think we are still far away from a system that can generate programs that solve real-world problems given natural language inputs from inexperienced end users (and honestly, Wolfram Alpha is probably closer to that than Copilot for many problems). Even the results from Copilot are only accepted about 25% of the time according to GitHub's Copilot FAQ - and I would bet over 95% of accepted results required further modification from an experienced developer to actually solve the problem at hand.

I believe the value of tools like Copilot or Wolfram|Alpha-mode notebooks rests largely in their ability to teach elements of syntax and style of a new programming language to unfamiliar developers. This will likely be quite similar to how artists use reference images or how developers in the distant, pre-AI past used Stack Overflow, tutorials, and documentation examples.

Beyond that, the difficult part of programming is not typing the syntax. I think it is instead figuring out what problem you want to solve and how to integrate that solution into the larger system in a coherent way. I believe doing this will always take a lot of effort, though I am hopeful that machines might eventually do some more heavy lifting. I think better starting points generated by language models would have a chance to improve developer productivity, just as an artist starting by tracing an image will be able to finish much faster than one starting from a blank canvas. However, I do worry that over-reliance on that starting point will result in an increase in needless complexity (and associated errors) and a stagnation in style.

Anyway, I think Wolfram|Alpha notebooks could use some serious machine learning NLP upgrades. Copilot-style code generation and output filtering could pave the way to greatly increasing the number of inputs that can be turned into valid code. Also, while Wolfram Language does not have the largest dataset around, it certainly has one of the highest-quality and cleanest datasets: look at the documentation and Mathematica Stack Exchange answers!

POSTED BY: Alec Graves
Posted 2 years ago

Jonathan, thank you for the insightful discussion.

I agree completely that the WA Notebook interface should have been able to generate at least something for each query I provided.

I believe that for the WL it is very much an issue, especially for beginners. Furthermore, even for experienced users like myself, there are scores of functions I almost never use, or am not even aware of - other users have even begun to complain about the bloating of the language with duplicative functionality.

I also agree completely with this. For new users, the WL syntax is especially tricky, being not only a different language but often a completely different programming paradigm (functional + pattern matching with reflective rewrite semantics) from that of many popular languages today. And even for experienced users, it can be a lot of work to figure out the exact syntax for the growing number of built-in functions. WL has notably fallen behind competitors like Python (with JetBrains/VS Code) when it comes to development experience, even without fancy autocomplete like Copilot.

For example, modern users of Python libraries have become accustomed to beautiful docstrings with markdown-formatted documentation and examples appearing in their editors any time they hover over a function or class. Ctrl+click, and you are staring at the actual implementation. The WL documentation center cannot compete with the seamlessness and speed of this integration. These well-implemented classical text processing and presentation methods have the ability to improve the experience of writing WL 100x. Having a solid code-completion engine behind WL (like the one available through Halirutan's WL plugin for JetBrains) makes the experience 10x easier than notebooks for real program development. Rainbow brackets make real-world, bracket-heavy WL code 5x easier to read just by themselves. Even a simple streamlined interface in which you can highlight code and have it pull up relevant documentation examples would be incredibly nice. That does not require GPT-3 levels of machine learning, either.

There is no excuse for these things to be missing from notebooks, and I was really happy to see all of the work that Brenton Bostick did for codeparser, codeinspector, and LSP Server integration. I really hope that work continues to be a high priority for WRI, as that effort has in my opinion begun the rescue of WL's notebooks.

WR is currently underperforming in terms of producing ML-oriented functionality.

Agreed, but I think that is mostly due to a lack of real-world use cases. I had to deal with a pretty bad memory bug for several releases that required restarting kernels to free GPU memory. It seems like nobody used the framework for anything beyond toy problems, so many rough edges were left in. NeuralNetworks' functionality is in a much better place today, and has been continually improving over the last few releases.

Transformers

Yeah, I wanted to use attention layers for a project in the past and they were technically supported, but the documentation had no examples for a year. It looks like it has example usage now, so that is pretty cool.

But that was years ago and the lead has been squandered. Competitors are going to eat WR's lunch if it doesn't respond soon.

Agreed, there is still some catching up to do.

If it were up to me I would be prepared to divert some large percentage (say 20%) of dev resources from other projects to get this done - it's far more important than ongoing projects such as the Wolfram compiler, for instance.

Yes, but the compiler is the other thing that I am really excited about (and it directly helps machine learning, e.g. FunctionLayer and general CPU efficiency for data-intensive applications). I really want to be able to compile my finished WL application to a nice 100MB download and ship it with my software.

Release 14 should be about Machine Learning and a new NL interface for the WL. Everything else can take a back seat, for now.

Giving the ML functionality and the Neural Network repository the attention they deserve would make me super happy (speaking of the network repository, have you seen HuggingFace?). An ML-NLP-super-charged Wolfram|Alpha Notebook mode would be epic!

( but please do not stop working on the compiler :) )

POSTED BY: Alec Graves
Posted 2 years ago

Also there is this beautiful work that was uploaded to YouTube yesterday:

"Natural Language Processing Template Engine" by Anton Antonov (YouTube)

I agree with many of the points made regarding ML and NLP. A super smart template engine is a very powerful and repeatable method for code generation. Add some ML template selection and improve the output filtering, and you have something that could rival Copilot's 25% acceptance rate in many instances.
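To make the idea concrete, here is a toy sketch of the three pieces mentioned above: template selection, slot filling, and output filtering. It is not Anton Antonov's implementation and does not use any ML; keyword overlap stands in for a learned template selector, a regular expression stands in for slot extraction, and a bracket-balance check stands in for output filtering. The template table, the function names, and the request phrasing ("... from A to B") are all assumptions made for illustration.

```python
import re
from string import Template

# Hypothetical template table: trigger keywords paired with a WL code skeleton.
TEMPLATES = {
    "plot": ({"plot", "graph", "draw"}, Template("Plot[$expr, {x, $lo, $hi}]")),
    "integrate": ({"integrate", "integral"}, Template("Integrate[$expr, {x, $lo, $hi}]")),
}

# Assumed request shape: "<expression> from <lo> to <hi>" with integer bounds.
SLOTS = re.compile(r"(?P<expr>\S+)\s+from\s+(?P<lo>-?\d+)\s+to\s+(?P<hi>-?\d+)", re.IGNORECASE)


def brackets_balanced(code):
    """Output filter: reject code whose brackets do not pair up."""
    pairs = {"]": "[", "}": "{", ")": "("}
    stack = []
    for ch in code:
        if ch in "[{(":
            stack.append(ch)
        elif ch in pairs:
            if not stack or stack.pop() != pairs[ch]:
                return False
    return not stack


def generate_code(request):
    """Return a WL code string for the request, or None if nothing fits."""
    words = set(re.findall(r"[a-z]+", request.lower()))
    # Template selection: largest keyword overlap (a stand-in for an ML classifier).
    best = max(TEMPLATES, key=lambda name: len(TEMPLATES[name][0] & words))
    keywords, template = TEMPLATES[best]
    if not keywords & words:
        return None
    # Slot filling: pull the expression and bounds out of the request.
    match = SLOTS.search(request)
    if match is None:
        return None
    code = template.substitute(match.groupdict())
    return code if brackets_balanced(code) else None


print(generate_code("Plot Sin[x] from 0 to 10"))
print(generate_code("Compute the integral of x^2 from 0 to 3"))
print(generate_code("Tell me a joke"))
```

Even this crude version shows why the approach is repeatable: every accepted output is an instance of a vetted skeleton, so the failure mode is "no answer" rather than plausible-looking but wrong code.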

POSTED BY: Alec Graves

From Stephen Wolfram's latest writings:

What about ChatGPT directly learning Wolfram Language? Well, yes, it could do that, and in fact it’s already started. And in the end I fully expect that something like ChatGPT will be able to operate directly in Wolfram Language, and be very powerful in doing so. It’s an interesting and unique situation, made possible by the character of the Wolfram Language as a full-scale computational language that can talk broadly about things in the world and elsewhere in computational terms.

Q.E.D.

Nice to find oneself talking to something other than a brick wall, for once.

POSTED BY: Jonathan Kinlay

Stephen Wolfram just published a new article:

Wolfram|Alpha as the Way to Bring Computational Knowledge Superpowers to ChatGPT

https://writings.stephenwolfram.com/2023/01/wolframalpha-as-the-way-to-bring-computational-knowledge-superpowers-to-chatgpt

We started a new dedicated discussion here:

Stephen Wolfram on ChatGPT, Wolfram|Alpha & Computational Knowledge
https://community.wolfram.com/groups/-/m/t/2763581

POSTED BY: EDITORIAL BOARD