Wolfram's Notebook Assistant not included in LLM benchmarking

Posted 3 days ago

Why is Wolfram's own Notebook Assistant not included on the LLM benchmarking site? I'm curious to see how it compares.

POSTED BY: Dinesh Rao
2 Replies

Hi Dinesh,

The benchmark measures the performance of raw LLMs on the task of writing WL code. The Notebook Assistant uses RAG based on the language documentation to boost the relevance of its answers, so including it would be like cheating!

While the assistant won't have its own row, in the future we could add a new column to measure how each model uses the same extra knowledge.

Cheers

Posted 2 days ago

Thanks!

POSTED BY: Dinesh Rao