Direct API access to new features of GPT-4 (including vision, DALL-E, and TTS)

11 Replies

Hi Everyone,

There is an updated version of the functions, along with additional functionality, available via my newer posts, such as "Google's Gemini - API access from the Wolfram Language" and "Mistral AI via the API in the Wolfram Language".

Please have a look there for more on LLMs in MMA.

Cheers, Marco

POSTED BY: Marco Thiel

Great post!

FWIW, I was experimenting a bit over the weekend, and I suspect that you may be able to access DALL-E 3 via the OpenAI service function:

ServiceExecute["OpenAI", "ImageCreate",
 {"Prompt" -> "A cartoon of a capybara riding a motorcycle and wearing a bowtie",
  "Model" -> "dall-e-3"}]
POSTED BY: Joshua Schrier

Useful, thanks for posting the code!

POSTED BY: Anton Antonov

Dear Joshua,

Thank you for your comment. You are absolutely right. There is a lot of functionality built into the Wolfram Language and its service connections. This post is meant to be one of a number of posts on different LLMs. By now I have implemented functions for Llama, Mistral and Gemini as well.

I really like the functionality within the Wolfram Language, but sometimes the "manual" method offers some advantages. When new features roll out, direct calls usually let us use them right away, including new endpoints and parameters: for example, a seed so that we get consistent answers, or TTS on the day it came out. It is also quite easy for me to switch the API keys I use, for example between private and work-related ones.
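A minimal sketch of what such a direct call might look like, assuming the OpenAI chat completions endpoint and its `seed` parameter. The function names `buildChatRequest` and `askGPT`, the model name, and the default seed are illustrative assumptions, not the functions from the original post.

```wolfram
(* Sketch of a direct API call; buildChatRequest and askGPT are
   hypothetical names chosen for this example. *)

(* Build the HTTP request without sending it, so it can be inspected first. *)
buildChatRequest[prompt_String, apiKey_String, seed_Integer] :=
 HTTPRequest[
  "https://api.openai.com/v1/chat/completions",
  <|
   "Method" -> "POST",
   "ContentType" -> "application/json",
   "Headers" -> {"Authorization" -> "Bearer " <> apiKey},
   "Body" -> ExportString[
     <|
      "model" -> "gpt-4-1106-preview", (* assumed model name *)
      "seed" -> seed, (* same seed should give more reproducible answers *)
      "messages" -> {<|"role" -> "user", "content" -> prompt|>}
      |>, "RawJSON"]
   |>]

(* Send the request and pull the reply text out of the JSON response.
   If the response has no "choices" key (e.g. an API error),
   the parsed JSON is returned unchanged for inspection. *)
askGPT[prompt_String, apiKey_String, seed_Integer : 42] :=
 Module[{response, json},
  response = URLRead[buildChatRequest[prompt, apiKey, seed]];
  json = ImportByteArray[response["BodyByteArray"], "RawJSON"];
  If[AssociationQ[json] && KeyExistsQ[json, "choices"],
   json["choices"][[1]]["message"]["content"],
   json]]
```

Because the key is an ordinary argument, switching between a private and a work key only means passing a different string, for instance `askGPT["Name three prime numbers.", SystemCredential["OPENAI_API_KEY_WORK"]]`, where the credential name is again just an assumption.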

Also, it appears to me that the available LLMs and their features currently change so quickly that it is basically impossible to add all the functionality immediately to the canonical Wolfram Language. For my part, I often cannot wait to play around with the new functionality.

I found that this is all really easy to do now as GPT writes the functions if you show it the API documentation of a new LLM; I wanted to test out how much of that can be automated by simply talking to GPT.

I do agree, though, that if you want to use GPT only, and for most users, the built-in functions will be more than enough.

Cheers,

Marco

POSTED BY: Marco Thiel

Thanks for making this post / notebook!

It simplified my newest post "AI vision via Wolfram Language". And, of course, it is referenced there. (A few times...)

POSTED BY: Anton Antonov

Dear Anton,

I read your post and really like it (as always). It is very useful indeed.

I have just posted a couple of posts on other LLMs such as Gemini and Mistral.

Cheers,

Marco

POSTED BY: Marco Thiel

This is so useful. Thank you very much!

POSTED BY: James Choi

If you enjoyed this post you might be interested in another post of mine where I introduce a function to interact with the API of Llama (another LLM) and compare it with GPT.

Cheers, Marco

POSTED BY: Marco Thiel

There is a related post I just made available:

https://community.wolfram.com/groups/-/m/t/3062832

It uses some of the functionality described in this post to produce images.

POSTED BY: Marco Thiel

You have earned the Featured Contributor Badge. Your exceptional post has been selected for our editorial column Staff Picks http://wolfr.am/StaffPicks and your profile is now distinguished by a Featured Contributor Badge and is displayed on the Featured Contributor Board. Thank you!

POSTED BY: Moderation Team