In today's (Thursday) session, I raised the issue of very long wait times for an LLM response - on the order of 20-30 seconds.
Arben correctly surmised that I was using GPT-5 (I was actually using GPT-5 Mini) and suggested that I switch to GPT-4o. This worked: response times dropped to 2-3 seconds.
So I decided to test most of the modern OpenAI models.
All GPT-5 models (full, Mini, and Nano) took 20-30 seconds. All of the GPT-4o and GPT-4.1 models took 2-3 seconds. This is VERY fishy.
When using GPT-5 through ANY chatbot (ChatGPT, Perplexity, Microsoft Copilot, GitHub Copilot, etc.), I get instant responses. When using GPT-5 directly with the Python API, I get instant responses.
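For anyone who wants to reproduce the comparison, a minimal timing harness along these lines can be used to measure per-response latency. The `time_call` helper is generic; the commented-out OpenAI usage is a sketch only (it assumes the `openai` package, an `OPENAI_API_KEY` in the environment, and model names as I saw them - adjust to taste):

```python
import time

def time_call(fn, n=3):
    """Run fn() n times and return the average wall-clock latency in seconds."""
    elapsed = []
    for _ in range(n):
        start = time.perf_counter()
        fn()
        elapsed.append(time.perf_counter() - start)
    return sum(elapsed) / len(elapsed)

# Hypothetical usage against the OpenAI Python API (assumed setup, not run here):
# from openai import OpenAI
# client = OpenAI()  # reads OPENAI_API_KEY from the environment
# for model in ["gpt-5-mini", "gpt-4o"]:
#     avg = time_call(lambda: client.chat.completions.create(
#         model=model,
#         messages=[{"role": "user", "content": "Say hi."}]))
#     print(f"{model}: {avg:.1f}s per response")
```

Timing the same prompt against the same models from Python and from Mathematica is what isolates the delay to Mathematica's code path rather than the OpenAI service itself.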
My hypothesis is that there is a broken GPT-5-specific code path within Mathematica that is causing these delays. Please escalate this.
The massive improvement in GPT-5's (and shortly GPT-5.1's) reasoning and instruction-following robustness is essential for many tasks. It has to work.
Note: I am using the Windows 11 version of Mathematica 14.3 (installed yesterday from a fresh download, with the paclet update applied) on a high-end Dell laptop.