Hi Dinesh,
The benchmark measures the performance of raw LLMs on the task of writing WL code.
The notebook assistant uses RAG over the language documentation to boost the relevance of its answers, so including it would be like cheating!
While the assistant won't have its own row, in the future we could add a new column measuring how each model performs with the same extra knowledge.
Cheers