Introduction
At the recent Wolfram Technology Conference (WTC), I gave a talk about Eliza and Chat-bots. The talk was well received. There is a lot of work that remains to be done, and so I am making the materials available to the community in the hopes that collaboration will help.
The key feature of the talk is code that will implement a simple chat-bot interface. It was my intention to use this interface as a means of communicating with a user in an educational setting, so the chat-bot window might be implemented in a palette.
I implemented the Eliza program from the 1960s following the model of the original paper. However, I think that this is a dead-end, at least until the Wolfram Knowledge Base is sophisticated enough to use and understand metaphor. Of more use will be chat-bots set up to make use of a specific (narrow) domain. Examples of this are all over the Internet.
My next steps will be to work the problem from the other end -- to build a model (such as the instruction of species counterpoint) that will make use of a chat-bot when the learner gets stuck.
The attachments are the WTC presentation notebook, a package that has the UI code for the chat-bot and the original Weizenbaum paper.
Background
Eliza is a program first developed in the 1960s by Joseph Weizenbaum at the MIT Artificial Intelligence Laboratory. It simulated a Rogerian psychotherapist. I first came across Eliza in a BASIC implementation in the late 1970s. I expanded on the code that appeared in Creative Computing in 1977, and it was effective enough that it 'fooled' some graduate students into thinking it was 'intelligent'. It was, arguably, the first chat-bot. In its format, it resembled any number of messaging applications, and at times, it is hard to tell whether the 'person' at the other end is real or a computer. Today, chat-bots are everywhere. I found one that helps you beat parking tickets in New York City and other large cities. I am sure that at least some of the time, when you are chatting with customer service, you are actually interacting with a bot.
Revisiting Eliza
Wolfram Language has extensive language processing capabilities that make it possible to greatly extend the artificial intelligence that was in earlier implementations. This is reason enough to implement Eliza in Wolfram Language.
However, the main reason I embarked on this project was an experience I had talking with some maths professors about doing some work in Mathematica. I was dissatisfied by the fact that they wanted to use true-false and multiple-guess questions for quizzes. Students deserve better.
A chat-bot would be a very useful interface for educational software, where the user can explore a topic, and then ask questions about something that is not understood. There is a limit to what can be discovered using sliders and other controls. It would be useful if a student could say something like "I don't understand what is happening in step 2."
As Connor Flood points out in his Wolfram Blog post about his Wolfram|Alpha induction proof generator, there are lots of things in mathematics that are not computation-based, and the same holds, by extension, in a lot of other fields. As with the proofs by induction, the assistance provided by a chat-bot in an application does not need to cover 100% of the cases, only the 80% that make up the most common problems.
So, for this project, there are two stages:
Implement Eliza using Wolfram Language
Apply chat-bot technology to a specific educational app.
For the latter case, I chose a program that will teach species counterpoint. Unfortunately, it took longer than expected to complete stage 1.
Implementing Eliza in Wolfram Language (Overall Design considerations)
Eliza was originally written as a command-line program. I found an implementation in the Wolfram Library from 1992 by Matthew Markert that only really worked as a command-line program. However, I wanted to use a modern UI --- to make it resemble a modern chat program like Messages --- so a command line was out of the question.
The program has two basic parts: the UI and the natural language parser. The UI part is relatively easy to implement in Wolfram Language:
Framed[Column[{
makeConversationRowFromText["How can I help you?", False, 500] ,
makeConversationRowFromText["I'm very depressed", True, 500] ,
makeConversationRowFromText["Sorry to hear that. Can you tell me more?", False, 500] ,
makeConversationRowFromText[
"My research is going nowhere. No one understands what I am doing, and I am in danger of losing my funding!", True, 500] ,
makeConversationRowFromText["That's too bad.", False, 500] ,
makeConversationRowFromText["Is that the best you can do?", True, 500] }]]
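The helper makeConversationRowFromText is not reproduced in this post. Judging only from the calls above, its arguments appear to be the message text, a flag that is True for the user's side of the conversation, and a row width in pixels. A minimal sketch under those assumptions might look like the following; the name sketchConversationRow, the colors, and the 70% bubble width are illustrative choices, not the package's actual code.

```wolfram
(* Hypothetical stand-in for makeConversationRowFromText; not the package's code. *)
(* text: the message; fromUser: True for the user's side; width: row width in pixels *)
sketchConversationRow[text_String, fromUser : (True | False), width_Integer] :=
 Pane[
  Framed[
   Pane[Style[text, 12], Round[0.7 width]],       (* bubble is 70% of the row *)
   Background -> If[fromUser, LightBlue, LightGray],
   RoundingRadius -> 8,
   FrameStyle -> GrayLevel[0.7]],
  ImageSize -> width,
  Alignment -> If[fromUser, Right, Left]]          (* user on the right, bot on the left *)

(* Usage: a two-row conversation *)
Column[{
  sketchConversationRow["How can I help you?", False, 500],
  sketchConversationRow["I'm very depressed", True, 500]}]
```

Keeping the layout in one function means that the input-handling experiments that follow only need to append rows to a list.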
However, one key part of the UI is the section where the user enters text. This part of the user interface proved to be much more difficult to implement. In a command-line program, the program presents a place for the user to enter text (e.g. a field with a prompt), waits until the user is done (presses the ENTER key or equivalent), and then processes the entered text. The program then returns to the prompt. We need to maintain this structure, while also allowing the entire chat-bot to be implemented as a panel in a larger program.
Three options for interactive text input
InputString[]
InputString[] is the most primitive of the three functions, introduced in version 1.0 (revised in version 10). This is the proper choice for a command-line program. When used with a GUI, the input field appears in a separate dialog window. However, with a little work, a passable UI could be implemented:
Module[{myString = "", conversations, nPods, imDone = False},
(* the transcript starts with Eliza's greeting *)
conversations = {makeConversationRowFromText["How can I help you", False, 500]};
While[!imDone,
(* show at most the six most recent rows as the dialog prompt *)
nPods = Min[6, Length[conversations]];
myString = InputString[Column[Take[conversations, -nPods]], WindowSize->550, WindowTitle->"Eliza]["];
(* a cancelled dialog returns $Canceled rather than a string, so test with StringQ and =!= *)
If[StringQ[myString] && myString =!= "bye",
(
AppendTo[conversations, makeConversationRowFromText[myString, True, 500]];
AppendTo[conversations, makeConversationRowFromText["Clever response from Eliza here", False, 500]];
),
(
(* "bye" or a cancelled dialog ends the loop; these rows are stored but never shown *)
imDone = True;
If[StringQ[myString], AppendTo[conversations, makeConversationRowFromText[myString, True, 500]]];
AppendTo[conversations, makeConversationRowFromText["It was a pleasure chatting with you.", False, 500]];
)];
]
]
The problem with this is that the dialog closes and re-opens with every input, and the final input never appears.
FormFunction[]
This function was introduced in Version 10, and is intended to be used primarily with web-forms. However, it does meet the formal requirements for program flow.
getMyString = FormFunction[{"entry" -> "String"}, #[[1]] & ]
testString = getMyString[];
testString
"test of string entry.?"
The only problem with this solution is the fact that it can't be simplified to eliminate the "submit" button in favor of just using return or tab. It probably would not do for commercial code, but for research purposes, it is sufficient. We can make a prototype application:
Module[{myString="", conversationPod = makeConversationRowFromText["How can I help you", False, 500],imDone = False},
Print[conversationPod];
While[!imDone,
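(* getInputFancy[] is not defined in this post; getMyString[] from above can be substituted *)
myString = getInputFancy[];
(* StringQ guards against a cancelled form, which does not return a string *)
If[StringQ[myString] && myString =!= "bye",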
(
conversationPod = makeConversationRowFromText[myString, True, 500];
Print[conversationPod];
conversationPod = makeConversationRowFromText["Clever response from Eliza here", False, 500];
Print[conversationPod];
),
imDone = True];
]
]
InputField[]
InputField was initially almost impossible to implement successfully. However, due to some excellent spelunking by Kyle Martin of the Wolfram Technology group, we have the code needed to implement a chat-bot using Wolfram Language. The key functionality is to use EventHandler[] to capture the return-key-pressed event to trigger the action. When the return key is pressed, the code takes the contents of the field and processes it, then resets the contents of the field. Unlike the implementations using InputString or FormFunction, there is no While loop, and you can see that the cell bracket is not continuously active.
Here is the basic code:
(* no semicolon inside the braces: {expr;} would evaluate to {Null} and lose the greeting *)
demoConversations = {makeConversationRowFromText["How can I help you", False, 500]};
DynamicModule[{in = Null},
Panel[
Column[{Text[Style["Your Message:", Medium]],
EventHandler[InputField[Dynamic[in], String],
{"ReturnKeyDown" :> (AppendTo[demoConversations,makeConversationRowFromText[in, True, 500]];in = Null)}],
Dynamic[Column@demoConversations]
}]
]
]
Working Chat-bot Code
The code for this example is given at the end of this notebook. For a real implementation, the code would be stored in a package. The package is provided with the materials for this talk.
chatBot[mySimpleProcessor, "Hello \[HappySmiley]", 350]
Eliza ][
Now that we have a user interface, we can add the artificial intelligence, replacing the "Clever response from Eliza" pods of the UI prototypes. The implementation is in the notes. It was non-trivial.
chatBot[elizResponse]
AI design in original paper by Weizenbaum
I found the original paper on the Internet. While it does not have any code, it has enough information to write an implementation. This paper is available with the materials for this talk, along with a BASIC implementation.
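In outline, Weizenbaum's paper describes a script of keywords, each with a rank, a set of decomposition rules that match the input around the keyword, and reassembly rules that build the reply from the matched pieces, with first- and second-person words swapped along the way. The following is only a minimal sketch of that idea, not the implementation from the talk; the names sketchReflect, sketchRules and sketchElizaResponse and the three rules themselves are invented for illustration.

```wolfram
(* Illustrative sketch only; not the script or code from the talk. *)

(* Swap first and second person in a captured fragment, one word at a time. *)
sketchReflect[s_String] := StringRiffle[
  Replace[StringSplit[s],
   {"i" -> "you", "me" -> "you", "my" -> "your", "am" -> "are",
    "you" -> "me", "your" -> "my"}, {1}]]

(* Each entry: {rank, decomposition pattern :> reassembled reply}. *)
sketchRules = {
   {3, ___ ~~ WordBoundary ~~ "i am " ~~ rest__ :>
     "How long have you been " <> sketchReflect[rest] <> "?"},
   {2, ___ ~~ WordBoundary ~~ "my " ~~ rest__ :>
     "Why do you say your " <> sketchReflect[rest] <> "?"},
   {0, __ :> "Please go on."}};

sketchElizaResponse[input_String] := Module[{text, ordered, hits},
  (* normalize: strip surrounding spaces and punctuation, then lowercase *)
  text = ToLowerCase[
    StringTrim[input, (WhitespaceCharacter | PunctuationCharacter) ..]];
  (* try the rules from highest rank to lowest *)
  ordered = Last /@ SortBy[sketchRules, -First[#] &];
  hits = Flatten[StringCases[text, #, 1] & /@ ordered];
  If[hits === {}, "Please go on.", First[hits]]]

sketchElizaResponse["I am sad."]
(* "How long have you been sad?" *)
```

A real script needs many more keywords, several reassembly choices per rule, and some memory of earlier inputs, which is where most of the non-trivial work lies.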
Modern Ideas
Wolfram Language has its own sophisticated language processing functionality that permits us to go beyond the original intent of the program.
TextStructure[] is very powerful, but it is experimental and not yet ready for prime time.
LanguageIdentify[] is used to make sure that the user is typing English. It defaults to English too easily, and is confused by short inputs.
TextWords[] works better than StringSplit[] to separate a string into English words. It can be confused by punctuation.
TextSentences[] will be used in the next revision to see if the user typed in more than one sentence. Each sentence will be processed in turn.
Associations will be used in the next iteration. The current script structure is hard to maintain or extend.
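As a rough sketch of how the sentence-by-sentence processing mentioned above might look, the helper below splits the input with TextSentences, breaks each sentence into lowercase words with TextWords, and hands the words to a response function. The name sketchRespondToEach and the placeholder responder are assumptions for illustration, not code from the talk.

```wolfram
(* Hypothetical helper: apply a response function to each sentence in turn. *)
(* responder is any function from a list of lowercase words to a reply string. *)
sketchRespondToEach[responder_, input_String] :=
 Map[responder[TextWords[ToLowerCase[#]]] &, TextSentences[input]]

(* Example with a placeholder responder that only counts the words *)
sketchRespondToEach[
 "You said " <> ToString[Length[#]] <> " words." &,
 "I am sad. My research is going nowhere!"]
```

The result is a list with one reply per sentence, which the UI could show as separate rows or join into a single response.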
Applications: Design Considerations
Use Associations in the Script and write a function to create and maintain scripts
This is what the script looks like, formatted to show the structure:
Panel[Framed[Pane[Grid[Table[{
Column[workingScript[[i,1]]], workingScript[[i,2]],
Column[Table[{Grid[{{workingScript[[i,3,j,1]],
Column[workingScript[[i,3,j,2]], Dividers->All]}}, Dividers->Center]},
{j,1,Length[workingScript[[i,3]]]}],Dividers->Center]},
{i, 1, Length[workingScript]}] ,
Frame->All, Alignment->Left], {540,300}]],ImageSize->{580, 330}]
As it is, it is difficult to edit or maintain. Since the goal of the project is to make it easy for a non-expert to create a chat-bot to use as an interface to another, more important, program, it is critical that the steps involved in making the script be as easy as possible.
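One possible shape for an Association-based script is sketched below. The display code above suggests that each entry of the current script holds keywords, a second field, and a list of pattern and response-list pairs; the sketch assumes the second field is a rank. The key names "Rank", "Rules", "Pattern" and "Responses", the two sample entries, and the helper names are all hypothetical.

```wolfram
(* Hypothetical Association-based script; field names are assumptions. *)
(* The "Pattern" strings are placeholders and are not interpreted here. *)
sketchScript = <|
   "mother" -> <|"Rank" -> 3,
     "Rules" -> {
       <|"Pattern" -> "* my mother *",
        "Responses" -> {"Tell me more about your family."}|>}|>,
   "sorry" -> <|"Rank" -> 1,
     "Rules" -> {
       <|"Pattern" -> "*",
        "Responses" -> {"Please don't apologize."}|>}|>
   |>;

(* Return a new script with one keyword added or replaced. *)
sketchAddKeyword[script_Association, keyword_String, rank_Integer, rules_List] :=
 Append[script, keyword -> <|"Rank" -> rank, "Rules" -> rules|>]

(* Keywords found among the words of an input, highest rank first. *)
sketchKeywordsIn[script_Association, words_List] :=
 SortBy[Select[words, KeyExistsQ[script, #] &], -script[#]["Rank"] &]

sketchKeywordsIn[sketchScript, {"sorry", "about", "my", "mother"}]
(* {"mother", "sorry"} *)
```

With named keys, a script author can add or edit one keyword without counting list positions, and a small editing function such as sketchAddKeyword can validate entries as they are added.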
Generalize the script for other types of interaction
Eliza is a special case of the chat-bot. It allows free-form input and does not know anything about the 'real world'. By contrast, the chat-bots that may be used in the classroom need to know a great deal about the real world. They need to know the 'rules' of the system, and the current 'state' of the program. For example, in a counterpoint program, the chat-bot needs to know about the rules of counterpoint as well as the notes of the exercise being done by the student.
The script will not contain this knowledge at all, but it needs to have patterns that are useful. For example, it will need a pattern "what is the note you used in measure 6", or "why do you think that the c# in measure 4 is a good choice". These scripts can become very involved without some sort of meta-script elements.
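To illustrate the split between script and state, the sketch below keeps the exercise's notes in a separate Association and lets the script pattern only extract the measure number from the question. The state layout, the note names, and the function name sketchAnswer are assumptions made up for this example.

```wolfram
(* Hypothetical exercise state: measure number -> note entered by the student *)
sketchState = <|4 -> "C#4", 6 -> "E4"|>;

(* The pattern knows the shape of the question; the answer comes from the state. *)
sketchAnswer[question_String] := Module[{m},
  m = StringCases[ToLowerCase[question],
    "what is the note" ~~ ___ ~~ "measure " ~~ n : (DigitCharacter ..) :>
     FromDigits[n], 1];
  Which[
   m === {}, "I did not understand the question.",
   KeyExistsQ[sketchState, First[m]],
   "The note in measure " <> ToString[First[m]] <> " is " <>
    sketchState[First[m]] <> ".",
   True, "There is no note in measure " <> ToString[First[m]] <> " yet."]]

sketchAnswer["What is the note you used in measure 6?"]
(* "The note in measure 6 is E4." *)
```

Because the pattern carries a slot rather than a fixed measure number, one script entry covers every measure, which is one simple form a meta-script element could take.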
Another feature of the system is the need to do assessment. For the Eliza program, the computer simply responded to a user's input. For an educational program, the computer will need to ask the user questions. There is also the need to determine what constitutes mastery. We do not want to make a score based on Q&A or a series of responses that would generate a percentage score. While it would be possible to design such a system, my philosophy is that the aim of the system is mastery by the student. How long it takes to get there is relatively unimportant.
Implement scripts in other languages
One application of the chat-bot could be to learn a second language. Other than the 'stunt' code that tries to restrict Eliza to English, the scripts are constructed in such a way that they could be written in any language that can be typed into the computer.
I can see an application where the student and a computer chat about a picture or video. The model for the 'state' would know various facts about the picture. It would be a feat of programming to use ImageIdentify to detect objects in a scene and construct relationships between them. ("La plume de ma tante est sur le bureau de mon oncle.")
Speaking and listening
Depending on the hardware, it would be easy enough to have the computer speak its side of the conversation. Whether this would be of more than novelty value remains to be seen. However, if it were possible to synthesize speech of a quality sufficient for language instruction (without specialized coding or hardware), it would be very valuable indeed.
On some systems, there is built-in dictation functionality, and on other systems, add-on software is available. This might be useful even if the quality were only marginal, but the real test would be to see if the speech recognition software could understand a language student.
What next?
Collaboration
I think that this project is potentially very useful for a lot of disciplines. I can't do all the work myself.
I will continue the project on the Wolfram Community website.
Feel free to contact me directly to work on this project: georgevw3@mac.com
Cloud Deployment
I think it would be easy enough to deploy Eliza to the cloud. Implementing a chat-bot to provide instruction for an area where a more sophisticated model is required will be a challenge.
New Technology
TextStructure[] --- needs to work for complex sentences and in different languages
Neural Nets --- abandon context-free bots
Implementation
The details of implementation can be found in the attached files.