
[WSG25] Daily Study Group: What is ChatGPT Doing... and Why Does It Work?

Posted 8 months ago

A one-week Wolfram U Daily Study Group covering Stephen Wolfram's best-selling book What is ChatGPT Doing... and Why Does It Work? begins on Monday, March 3, 2025.


Join a cohort of fellow learners to discover the principles that underlie ChatGPT and other LLMs. I have adapted the material from the aforementioned book into a series of four notebooks covering topics ranging from probabilistic text generation to neural nets, machine learning, embeddings and even transformer models.

On the final day of the study group, I will go through a bunch of interesting examples using our new Notebook Assistant, which closely integrates powerful LLMs with the Wolfram documentation system and notebook interface.

Stephen Wolfram's book is aimed at anybody who is curious about these ideas, and this study group follows the book's lead. Therefore, no prior Wolfram Language, machine learning, or even coding experience is necessary to attend this study group.

Please feel free to post any questions, ideas and/or useful links in this thread between sessions—we always love to continue the discussion here on Community! If you'd like to read the discussion from the last time we ran this study group, you can find that here.

This is a one-week study group that will run from March 3 through March 7, 2025, at 11:00am Central US Time each day.

REGISTER HERE


POSTED BY: Arben Kalziqi
18 Replies

Thanks, Laurence! One of the big reasons that LLMs don't get stuck is that they live in such a high-dimensional space. Imagine you're on some 2D surface: if you're in a well, it's hard to get out because you only have two directions to move—but imagine you had tens of thousands of directions to try to get out. It's much easier!
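Here is a quick numerical sketch of that intuition in Python (a toy model of my own, not anything from the study group notebooks): treat the Hessian at a random critical point of the loss as a random symmetric matrix, and estimate how often every direction curves upward, i.e. how often the point is a genuine local minimum rather than a saddle with an escape direction.

```python
# Toy illustration (assumption: a random symmetric matrix as a stand-in for the
# Hessian at a random critical point of the loss).
import numpy as np

rng = np.random.default_rng(0)

def fraction_true_minima(dim, trials=2000):
    """Estimate how often a random symmetric 'Hessian' has only positive eigenvalues."""
    count = 0
    for _ in range(trials):
        a = rng.standard_normal((dim, dim))
        hessian = (a + a.T) / 2                      # symmetrize to get a Hessian-like matrix
        if np.all(np.linalg.eigvalsh(hessian) > 0):  # do all directions curve upward?
            count += 1
    return count / trials

for dim in (1, 2, 5, 10, 20):
    print(dim, fraction_true_minima(dim))
```

The fraction of "true wells" collapses as the dimension grows, which is one way to see why an optimizer in a space with millions of parameters almost always has somewhere downhill left to go.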

POSTED BY: Arben Kalziqi
POSTED BY: Arben Kalziqi

Hmm... annoying!! I absolutely dropped it in there yesterday. I'll make sure it works this time.

As far as DeepSeek goes, as best I can tell there are indeed some architectural improvements, but the overall idea and structure remain the same and most of the improvements are in the training process. This article from MIT Technology Review is insightful! https://www.technologyreview.com/2025/01/31/1110740/how-deepseek-ripped-up-the-ai-playbook-and-why-everyones-going-to-follow-it/

POSTED BY: Arben Kalziqi

I don't! If you're referring to the Q&A digests, that's because those are just logs of the live chats from the sessions rather than email or otherwise "official"/formal communications. I think you'll find that the number of people who capitalized the first word of their messages back on AIM in 1997 was also quite small—though to your point, I do imagine that that number is shrinking over time as people have more access to instant back-and-forth communication. Language does change over time, and while I am largely a stickler for rules in a visceral sense I'm certainly not a prescriptivist. (If I were, I might point out that you use a hyphen rather than an em-dash in your last sentence, and add a novel space before the ellipses :). Language—written and spoken—always changes, particularly when exacerbated by the movement to new mediums and entry tools like keyboards where it's easier to type a hyphen than an en- or em-dash.)

POSTED BY: Arben Kalziqi
Posted 8 months ago

Also, converging to a global minimizer of the loss function would not even be ideal. We do not want the model to overfit the outputs of the training set, as can happen in the supervised learning scenario.

Moreover, the "descent" direction that an optimizer computes should not be purely a descent direction. Why? Because the more we explore the parameter space, the better our chances of finding suitable "contextual" fits to the output space. By visiting as many neighborhoods of local minimizers as possible (while still converging to a local minimum), we could improve training by starting the search from a local minimizer that worked well for a certain context. However, the optimizer would need to be smart enough to escape from that local minimizer again.
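A toy Python sketch of that trade-off (my own illustration, using a made-up one-dimensional double-well loss): pure gradient descent settles into the first minimum it reaches, while adding gradually decaying noise to each step, roughly what stochastic gradient descent does, lets the optimizer explore, hop the barrier, and typically finish in the deeper well.

```python
# Toy comparison of pure descent vs. noisy ("exploring") descent on a double-well loss.
# The loss function and all numbers here are made up for illustration.
import numpy as np

def loss(x):
    return x**4 / 4 - x**2 / 2 + 0.3 * x   # deep minimum near x ≈ -1.1, shallow one near x ≈ 0.8

def grad(x):
    return x**3 - x + 0.3                  # derivative of the loss above

rng = np.random.default_rng(1)
lr, steps, x0 = 0.05, 3000, 1.0            # start right next to the shallow minimum

x_plain = x_noisy = x0
for t in range(steps):
    x_plain -= lr * grad(x_plain)                          # pure descent: stays in the shallow well
    noise = 0.3 * (1 - t / steps)                          # exploration noise, annealed to zero
    x_noisy -= lr * grad(x_noisy) + noise * rng.normal()   # noisy descent: can hop the barrier

print("plain descent :", round(x_plain, 3), "loss", round(loss(x_plain), 3))
print("noisy descent :", round(x_noisy, 3), "loss", round(loss(x_noisy), 3))
```

The noisy run typically ends near x ≈ -1.1 with a much lower loss, while the pure-descent run stops at the first well it falls into. Real LLM training is far richer than this, but it captures why some stochasticity in the "descent" direction helps.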

POSTED BY: Angel Rojas

Gradient descent optimization procedures can get trapped at false (local) minima. How does ChatGPT avoid or correct false minima? Aside: incredible class, many thanks Arben.

POSTED BY: Laurence Bloxham
Posted 8 months ago

Okay, I found it (posted two days ago) using a link from the latest email received two hours ago (3/12 at 11:59AM).

POSTED BY: Gerald Oberg

Added!! Sorry about that!

POSTED BY: Arben Kalziqi
Posted 8 months ago

Arben, Digest_Day5 is still not showing up in the Daily Q&A Digests. Did you by any chance put it in the Series 58 ChatGPT rather than the Series 62 ChatGPT? Thanks for the link. The three that I put in the other post are more about the geopolitical implications of AI and LLMs rather than their technology.

POSTED BY: Gerald Oberg
Posted 8 months ago

Arben, On Wednesday I had asked, "Is DeepSeek doing something fundamentally different, or did they find a way to do it more efficiently?" You responded, "I have to say I'm not sure on that front—I'll try to look into this tomorrow and post about it in the Community thread." Have you had a chance to look into this yet?

Also, I am not seeing a Digest_Day5 in the Daily Q&A Digests.

POSTED BY: Gerald Oberg

Ah, yes! The digest can be added asap, which at this hour probably means Monday morning :). As for an updated notebook—if you mean the transcript of the chats, I've added that already and it should be visible. Let me know if not! (If you mean the questions people asked in the Thursday survey, I'll try to review those more thoroughly and provide answers where reasonable.)

POSTED BY: Arben Kalziqi

Arben, Thanks 10^6 for a wonderful course! Question: Will you be posting an updated Q&A Digest along with an updated notebook for Day 5? We would appreciate it very much.

POSTED BY: Zbigniew Kabala
Posted 8 months ago

Arben, Can you please explain your decision in your communications to abrogate the standard rule of capitalizing the first letter in a sentence? Is that just a strategy to minimize the time to post a response to a question/comment? Is it a standard style followed by certain tech people? Would you do the same in more formal settings, such as published articles? Are you in the vanguard of a movement to change the English language? No disrespect or criticism intended - I am just curious …

POSTED BY: Gerald Oberg
Posted 8 months ago

In response to the survey question, "How will you use what you learned at this Daily Study Group?" I wrote: I will have a better appreciation and understanding of what is being discussed in the numerous newscasts, podcasts, articles, and interviews that one encounters about ChatGPT or other LLMs. Something the course did not address: How could the technology overviewed lead to the catastrophic results people are predicting about AGI? One would not think that even monstrously large matrices could produce malicious consciousness. (I am sometimes reminded of the book I read as a teenager, "Colossus: The Forbin Project", a 1966 science fiction novel by D. F. Jones about super-computers taking control of mankind.) The things Arben demoed are really impressive, but there is no "mind" (with potential intentions) producing them. I would like to hear Arben's thoughts about these issues. Even more, can you point us somewhere that Stephen Wolfram has discussed these issues (potential threats of AI or AGI)?

Here is a good expert discussion: https://drive.google.com/file/d/1JVPc3ObMP1L2a53T5LA1xxKXM6DAwEiC/view

Also: https://www.google.com/books/edition/The_Age_of_AI/Y2QwEAAAQBAJ?hl=en&gbpv=1

This could be considered a post-DeepSeek update to the reference above: https://www.csis.org/analysis/deepseek-huawei-export-controls-and-future-us-china-ai-race

POSTED BY: Gerald Oberg

Thanks, Arben. I did not see this aspect of the problem. Very interesting.

POSTED BY: Laurence Bloxham