Notebook Assistant Wishlist

I have been using the Notebook Assistant for a while, and as my previous review indicates, results have been less than stunning.

I think it would be useful to have user input on what the Notebook Assistant should do, and this may help the software engineers craft a tool that is useful in the real world.

This discussion is not concerned with how these goals are attained. I may be skeptical about the general utility of LLMs, but that is really not the point. Wolfram Language and its ecosystem have grown large and complicated enough that even the most experienced users could use help.

Some of the items on this wish list have been suggested by demos of the current iteration, plus some of my own exploration, so I think that they can be realized.

The main issue in any of this is what 'reliable' really means in context. For example, LLMs may produce usable output 90% of the time for general use, but 90% is not sufficient for mission-critical tasks, and I count producing code as mission-critical.

So here is my first stab at a list:

  1. Discovery. The assistant should be able to find useful functions in both the built-in language and the function repositories. That is, if I ask the assistant to make a function or answer a question, it will (always) make use of built-in functions or vetted functions in the function repository. (Stephen has commented that the language has grown so large that even he cannot remember all the functions.)

  2. Validation. This wish has several parts. Any code the assistant presents should have been validated before the user sees it, and validation comes at several levels. The lowest level, and one that the code should always meet, is syntax validation. (The code will not create any red or orange boxes, for example.) The next level is that the code should pass some basic unit tests. This, I realize, is a non-trivial task, but I think it is doable with the current level of technology, and it probably does not require machine learning of any kind. Finally, the code should be checked to see if it actually answers the user's question. I think that this is the intent of the Notebook Assistant, and it may be difficult to achieve without technology beyond LLMs. Again, there are degrees of compliance, and getting 'close', for a suitable definition of 'close', may be good enough. I understand that some iteration will probably be needed to get a satisfactory solution, so the object is to get close enough that the amount of back-and-forth is minimized.

  3. Debugging. This wish is related to the second one, except that there the object is to validate code that the assistant creates, whereas here the object is to fix or improve the user's code. Clearly, if the assistant can do this task with the user's code, it should be able to do the task with its own code. [In my experience, this is the most frustrating aspect of the current Notebook Assistant: it can't fix simple errors in its own code, let alone more subtle problems with my code.]

  4. Boilerplate operations. Stephen has hinted at this task. Once I have some function working, I should be able to ask the assistant to prepare it for the function repository, for example. There are a whole lot of other tasks of a similar nature.
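
To make the validation levels in item 2 concrete, here is a rough sketch of the first two tiers (syntax check, then basic unit tests). It is written in Python purely for illustration and is not how the Notebook Assistant works; the function name `validate` and the tier labels are my own invention. In Wolfram Language the analogous building blocks would be `SyntaxQ` and `VerificationTest`. The third level, checking that the code actually answers the user's question, is not covered here.

```python
import ast


def validate(source, func_name, cases):
    """Return the highest validation tier the candidate code reaches.

    Hypothetical tier labels, in increasing order of success:
    "syntax_error", "runtime_error", "tests_failed", "tests_passed".
    """
    # Tier 1: syntax check only; nothing is executed at this point.
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return "syntax_error"

    # Define the function in a separate namespace.
    # Note: exec is NOT a sandbox; a real tool would need proper isolation.
    namespace = {}
    try:
        exec(compile(tree, "<candidate>", "exec"), namespace)
        func = namespace[func_name]
    except Exception:
        return "runtime_error"

    # Tier 2: basic unit tests, given as (arguments, expected) pairs.
    for args, expected in cases:
        try:
            if func(*args) != expected:
                return "tests_failed"
        except Exception:
            return "tests_failed"
    return "tests_passed"


# Example: a candidate that parses and passes its single test.
print(validate("def f(x): return x + 1", "f", [((1,), 2)]))  # tests_passed
```

The point of reporting a tier rather than a plain pass/fail is that it tells the assistant (or the user) how far the code got, which is the information needed to keep the back-and-forth short.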


That's it for now. I would really appreciate comments and additions.

Wolfram Language (indeed, any advanced software system) needs this level of technology to deal with increasingly complicated tasks. Beginners as well as advanced users need help, although of different types. I think that the community can help with the design as well as being beta testers for new systems.
