[WSG24] Daily Study Group: What is ChatGPT Doing... and Why Does It Work?

Posted 2 months ago

A one-week Wolfram U Daily Study Group covering Stephen Wolfram's best-selling book What is ChatGPT Doing... and Why Does It Work? begins on Monday, September 9, 2024.


Join a cohort of fellow learners to discover the principles that underlie ChatGPT and other LLMs. I have adapted the material from the aforementioned book into a series of four notebooks covering topics ranging from probabilistic text generation to neural nets, machine learning, embeddings and even transformer models.

Giulio Alessandrini, our manager of machine learning at Wolfram, will join to answer your questions on the day we cover neural networks. On the final day of the study group, Alan Joyce—who directs content development at Wolfram|Alpha—will join us to explain how the computational powers of Wolfram Language can be integrated with ChatGPT, and why this integration is much more than the sum of its parts.

Stephen Wolfram's book is aimed at anybody who is curious about these ideas, and this study group follows the book's lead. Therefore, no prior Wolfram Language, machine learning, or even coding experience is necessary to attend this study group.

Please feel free to post any questions, ideas and/or useful links in this thread between sessions—we always love to continue the discussion here on Community!

This is a one-week study group that will run from September 9 through September 13, 2024 at 11:00am Central US Time each day.

REGISTER HERE


POSTED BY: Arben Kalziqi
30 Replies

Thank you very much! I'm glad I was able to properly convey the ideas outlined in Stephen's book.

POSTED BY: Arben Kalziqi
Posted 2 months ago

Reinforcement learning, maybe?

POSTED BY: Héctor Galaz

Thanks for the clarifications! And congrats, your presentations in the study group were excellent.

Honestly, I'm not aware of any. That doesn't mean that they don't exist, but if they do, I don't think I've heard of them. Now, ChatGPT can write code in a bunch of different languages, and you can train models to learn different programming languages, but as far as "providing information about the world from strictly factual sources and enabling more general computation", I don't think there's any other one.

POSTED BY: Arben Kalziqi

With models like these, the training process is something that costs O(billions) of dollars, so whenever you tell the model something in your prompt, it does not learn it or store it—it just takes it in as a token and passes it through the layers, and based on that alone it's able to provide a relevant answer. So the answer to your question is: only the current session that your student has open will be aware of the new info, and that's because a "session" stores all input and output and uses the whole dang thing as a big input every time.
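To make that concrete, here is a purely schematic Wolfram Language sketch (not how ChatGPT is actually implemented) of why a session appears to "remember" things: every new turn just re-sends the entire conversation so far as the prompt, and nothing is ever written back into the model's weights.

    history = {};
    ask[userText_] := Module[{fullPrompt},
      AppendTo[history, "User: " <> userText];
      fullPrompt = StringRiffle[history, "\n"];  (* the whole conversation so far *)
      (* a real call might pass fullPrompt to something like LLMSynthesize;
         here we just return it to show what the model actually "sees" *)
      AppendTo[history, "Assistant: ..."];
      fullPrompt]

After ask["The melting point of compound X is 450 K."], a second call like ask["What did I just tell you?"] sends both turns again, and that re-sent history is the only sense in which anything was "learned."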

That said—I can't guarantee that the things you type into a ChatGPT session are not being stored and used for further training. I'd be kind of surprised if they weren't, but that's just a technological judgment and not a moral or probabilistic one.

POSTED BY: Arben Kalziqi

What other software language does something similar to what Wolfram Language does when synthesized with ChatGPT?

POSTED BY: Taiboo Song

Awesome class, and now I understand ChatGPT under the hood. It is really amazing that Stephen Wolfram saw where the future was heading and started using neural networks early. How did Stephen get into using neural networks? Is there any good course you would recommend for learning about neural networks?

POSTED BY: Taiboo Song

Thank you. Another question, in another context: I teach thermodynamics, and I was thinking about ways to use ChatGPT in my class. My students are using ChatGPT to solve their homework, but they have problems detecting some of the mistakes in the answers the AI gives them. So I would like to assign them homework where they get a wrong answer from the chat but have to interact with it to "train" it and get the right answer. What would happen in this scenario? Will the whole AI learn from the information my students give the chat, or is it just their chat (account) that will learn from this interaction?

Of course! My background is in physics, so while this stuff is maybe a little closer to home, I was really impressed when I learned a bit about how it works and how "simple" everything is on the inside, mechanically speaking.

POSTED BY: Arben Kalziqi

Thanks for the answer and for sharing your point of view. For me, a chemical engineer, it is very enlightening.

I think it could, but my understanding is that memorization is generally not too serious a problem in modern neural net development. The ideal would be that the net builds a good model of what traffic flow looks like, and as long as there's a good amount of data, it should be able to account for the fact that traffic has periods of extreme irregularity (e.g. crashes or construction). If the designers realize that it's doing less modeling and more memorizing, they can change the training to account for that (and typically, it's pretty easy to tell during the training process whether a net is "overfitting", i.e. memorizing).
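If it helps, here is a toy Wolfram Language sketch (with made-up synthetic data, not real traffic data) of how that check typically works in practice: hold out a validation set and watch whether its loss diverges from the training loss.

    (* synthetic "traffic-like" regression data: 3 inputs -> 1 noisy output *)
    data = Table[
       With[{x = RandomReal[{0, 24}, 3]},
        x -> {Total[Sin[x]] + RandomReal[{-0.1, 0.1}]}], {500}];
    net = NetChain[{LinearLayer[32], Ramp, LinearLayer[1]}, "Input" -> 3];
    results = NetTrain[net, data, All,
       ValidationSet -> Scaled[0.2], MaxTrainingRounds -> 50];
    results  (* the results summary shows training vs. validation loss; training loss
                falling while validation loss rises is the classic sign of memorization *)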

POSTED BY: Arben Kalziqi

Thank you, Arben. In this scenario, do you think the neural net could start to memorize the traffic information? If that happened, I am starting to see the problems it would trigger.

Hi Ana! I could see that being a possibility. I think the most likely thing is that traffic cameras could recognize how many cars there are along each direction and what sorts of speeds they're seeing, then change the lights dependent on those inputs. There are traditional ways to do this measurement, but with neural nets you can just plug in any camera and a cheap circuit board and very quickly get reasonable estimates for these parameters (as well as, say, bikes and pedestrians). You could then use this data to determine when it's time to change the lights, in something like the following fashion:

  • Camera uses neural nets to detect the number of cars, people, and cyclists, as well as their speeds
  • This data is fed into a neural net which has been trained on data that looks like "if you have this many cars and people and cyclists and these are their speeds, switch the light"
  • This decision is fed into some kind of traditional failsafe, as you couldn't guarantee with a neural net alone that the intersection wouldn't wind up with green lights in two perpendicular directions, which would be very bad!

So, you're right that you would need some kind of non-neural network "final check" for a case like this, and also right that this isn't something you'd need (or want) to use the newer generative-style AI for.
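For what it's worth, here is a rough Wolfram Language sketch of that pipeline; the camera frames (northFrame, eastFrame) are hypothetical stand-ins, and a real deployment would of course be far more careful than this.

    (* neural-net object detection: count the cars visible in each direction *)
    countCars[frame_Image] := Length[ImageCases[frame, "car"]];

    (* toy decision rule standing in for the trained "when to switch" net *)
    decideLight[northFrame_, eastFrame_] :=
      With[{n = countCars[northFrame], e = countCars[eastFrame]},
       Which[n > 2 e, "GreenNorthSouth", e > 2 n, "GreenEastWest", True, "KeepCurrent"]];

    (* the non-neural failsafe sits downstream of decideLight, e.g. refusing any
       state in which two perpendicular directions would both be green *)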

POSTED BY: Arben Kalziqi

Hi! I have a question regarding the implementation of AI. In my city in Guatemala (in Central America), there is a project to use AI to control the traffic lights. I haven't seen examples of this kind of implementation anywhere else. And since today's session mentioned the problems that neural networks can have with algorithms, do you think this implementation could be effective? I believe it is not generative AI that is going to be used, but I wonder what the results would be.

Hey Carl—there shouldn't be any issues with either way of doing things, and they should indeed be equivalent. (Now, Day 3's notebook is so big that it might cause an issue here or there, but even that one ought to be fine.)

POSTED BY: Arben Kalziqi

Hey Arben, I had an open session of 14.0 running on my Mac and I double-clicked the Machine Learning notebook after downloading it; my Mathematica session went off to la-la land. It wouldn't even obey the interrupt Cmd+Option+. or Ctrl+C. I had to force quit from macOS. I started a new session and opened the notebook from within Mathematica, and it opened as I expected. But I have not "enabled dynamics"... It should have opened the same way by double-clicking on the file, I think, yes?

POSTED BY: Carl Hahn

They're all a little bit different:

  • "Machine learning" is generally used to describe the process of training neural nets
  • "Deep learning" is a subset of machine learning referring to the training of "deep" neural networks, i.e. those with many layers
  • "AI" is super general and encompasses the above and more—it's so general that it can lead to miscommunications and mystification, though
  • "Generative AI" refers to "AI" (usually based on neural nets which have been trained via machine learning) which can "generate" novel text, images, audio, and so on

As for what we're doing at Wolfram, we've supported neural nets and machine learning for a long time now, and our fully symbolic approach is really well-suited to the construction and understanding of neural nets. Our new LLM (one example of generative AI) functionality takes advantage of the benefits of this new technology to do things like create helpful personas, write code, check or rephrase writing, and so on inside notebooks. Alan will address this in more detail on Friday!
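As a small taste of that notebook LLM functionality (hedged sketch: you need LLM access, e.g. an API key, configured for these to evaluate, and the prompt text here is just an example; "Yoda" is a persona from the Wolfram Prompt Repository):

    (* a reusable "rephrase my writing" function built from a prompt template *)
    rephrase = LLMFunction["Rephrase the following more formally: ``"];
    rephrase["gotta fix this code real quick"]

    (* one-off generation, optionally with a persona *)
    LLMSynthesize["Write one sentence explaining what an n-gram model is."]
    LLMSynthesize[{LLMPrompt["Yoda"], "Explain what a word embedding is."}]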

POSTED BY: Arben Kalziqi


AI is described in so many ways: AI, Machine Learning, Deep Learning, and Generative AI. Which term best describes Wolfram's approach? And what is Wolfram's purpose in getting into the AI world?

POSTED BY: Taiboo Song

I have been hearing about AI for more than 40 years, and it always seemed like just smart design. Are there any major differences now?

POSTED BY: Taiboo Song

Hi everybody—I've just uploaded to the materials folder a slightly updated version of today's notebook with fixes for a few issues we encountered during today's session.

POSTED BY: Arben Kalziqi

Hey Carl! Thanks for the kind words about our materials :). While I'm not a machine learning pro, I think that your example use case is quite doable (and depending on the image resolution, you may need fewer images than you imagine: high-resolution images by definition have a lot of information stored in them, and in my experience even training on the order of tens of examples can be sufficient to do binary categorization with high accuracy in those cases).

As for perspective transformations, that could be a little trickier. I'm not sure what the extent of the different angles and viewpoints might be, but your training data could include an image plus various transformations of it created with something like ImagePerspectiveTransformation or similar.

For spotting differences, you could theoretically train a neural network to do that by providing training data consisting of two input images with the output being a selected region where differences occur, but I'm not sure that's the direction I'd take. If you can use something like FindGeometricTransform to get your two "perspectives" roughly aligned, you might be able to explicitly compute image differences and find the region over which they're maximized. This is kind of on the cusp of "better with traditional methods" vs. "better with machine learning", to my (admittedly somewhat ignorant!) mind.
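To sketch that align-then-compare idea (hedged: imgA and imgB are your two hypothetical views of the same scene, and the 0.2 threshold is arbitrary):

    (* estimate the perspective transform that maps imgB onto imgA *)
    {err, tf} = FindGeometricTransform[imgA, imgB, TransformationClass -> "Perspective"];
    aligned = ImagePerspectiveTransformation[imgB, tf, DataRange -> Full];

    (* where do the aligned views disagree? *)
    diff = ImageDifference[imgA, aligned];
    HighlightImage[imgA, Binarize[diff, 0.2]]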

POSTED BY: Arben Kalziqi

That may well be true. For the words -> sentences analogy, sentences can have extremely variable lengths, so it might not just be the mean word length but the tightness of that distribution!
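If anyone wants to poke at that distribution directly, here's a quick way to look at it using a sample text (word lengths in running text, which is what the ~4.7-letter figure refers to, rather than in a dictionary):

    text = ExampleData[{"Text", "AliceInWonderland"}];
    lens = StringLength /@ TextWords[ToLowerCase[text]];
    N[Mean[lens]]          (* roughly 4-5 letters for running English text *)
    Histogram[lens, {1}]   (* the spread matters, not just the mean *)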

POSTED BY: Arben Kalziqi

Hey Arben, I have already learned more than I expected to from Wolfram's perspective on AI, Machine Learning and ChatGPT. Looking forward to your approach to teaching.

But could I really use Wolfram Language alone, or WL combined with (?), to do industrial grade things like spot defects in images of circuit boards or components soldered down on circuit boards? Like cracks, malformed parts, missing traces, debris, etc? Or would that still require developing a specialized tool? Or look at images of objects from different angles and determine if they are images of the same thing? Like an aerial photo of a city or a farm?

Or better yet, after recognizing whether they are images of the same thing, highlight what has changed in the thing after correcting for the different look angles and lighting? Or would that be too specialized an application?

(I am certain that if I wanted to do that I would at least look at WL as my experimental laboratory for learning how to do it)

POSTED BY: Carl Hahn
Posted 2 months ago

My mistake, the 2-gram generated 'wore' and 'hi' and the 3-gram generated 'was'. Though I imagine the increase in the proportion of valid words in the sample is related to the distribution of word lengths in English?
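In case it's useful to anyone else following along, here's roughly how I'd reconstruct a character 2-gram generator (a rough sketch, not the study-group notebook's exact code):

    (* build letter-pair statistics from a sample text *)
    text = ToLowerCase[ExampleData[{"Text", "AliceInWonderland"}]];
    chars = Characters[StringReplace[text, Except[LetterCharacter | " "] -> ""]];
    followers = GroupBy[Partition[chars, 2, 1], First -> Last];

    (* sample the next character according to how often it follows the current one *)
    nextChar[c_] := RandomChoice[followers[c]];
    generate[n_] := StringJoin[NestList[nextChar, RandomChoice[chars], n]];
    generate[200]  (* mostly gibberish, with occasional short real words *)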

POSTED BY: Henry Ward
Posted 2 months ago

I was curious to see that the sample from the 4-gram model in today's session/notebook was the smallest n-gram model to generate actual English words. Is that likely related to the fact that 4.7 letters is the average word length in English? (Source)

POSTED BY: Henry Ward
Posted 2 months ago

I tried 10

POSTED BY: Tingting Zhao
Posted 2 months ago

Oh hey, you mean Arten, ok, np!

Just kidding, my bad for the typo Arben, pls don't hurt me :D

Your code worked!

POSTED BY: Tingting Zhao

Try ImageIdentify[pic,All,3], perhaps :). (Also, it's "Arben" with a "b"!)

POSTED BY: Arben Kalziqi
Posted 2 months ago

Arden! I'm the first to plant my flag! Yay!!!!

I got the dark theme working! Look, Mitch didn't register, lol :D

POSTED BY: Tingting Zhao