
[WSG20] New Machine Learning Basics Study Group begins Monday, November 16

A new study group covering the basics of machine learning begins Monday! This group will be led by Wolfram certified instructor @Abrita Chakravarty and meets daily, Monday to Friday for one week. Abrita will share the excellent short lesson videos created by @Jon McLoone for the Wolfram U course Zero to AI in 60 Minutes. Study group sessions include time for additional examples, exercises, discussion and Q&A. Certification of program completion is available. Sign up: https://wolfr.am/R2s6PERk

POSTED BY: Jamie Peterson
35 Replies

The issue of precision is principally one of performance. While the Wolfram Language can handle exact symbolic arithmetic and arbitrary levels of high-precision arithmetic, both are much slower than machine-precision floating-point arithmetic, which is performed in hardware on the CPU while the others are done in software. Since machine learning methods applied to big data are already very demanding, and the difference between 10^-10 and 0 rarely makes a difference, the priority goes to speed. Indeed, some GPUs are designed to work at even lower "single" precision to enable more calculations per second.
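For instance, a quick illustrative comparison of machine-precision and 50-digit software arithmetic (the sizes and precision are arbitrary):

AbsoluteTiming[Total[RandomReal[1, 10^6]]]
AbsoluteTiming[Total[RandomReal[1, 10^6, WorkingPrecision -> 50]]]

The second sum typically runs orders of magnitude slower, because the arithmetic happens in software rather than in the CPU's floating-point hardware.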

There is also a second issue. Some of the training process is probabilistic, so you will achieve different results each time you run the training. For reproducibility one should control the random state with functions like SeedRandom (and some new tools coming in Wolfram Language 12.2), but since I didn't do this when I recorded the videos, we will likely never know how to reproduce the exact answer that I got.
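A minimal sketch of controlling the random state (the seed and the toy data are arbitrary):

SeedRandom[1234];
c = Classify[{1 -> "A", 2 -> "A", 3 -> "B", 4 -> "B"}];

Re-evaluating both lines together should give reproducible training results, up to any nondeterminism in parallel hardware.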

POSTED BY: Jon McLoone

For the deployment aspect, I have tried two more independent product approaches; the first one is deployed at

http://vfvalue.com/ai/image.html

This page requires login authentication.

It features Chinese input. In the following example the input is "dog" in Chinese, and the result returned is "True".

Starting from this, I also tried APIFunction[] and embedded the API call in Node.js on the server side. That way I get full control over the front-end look and feel, which paves a path for independent commercial projects using a web interface.
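As an illustration, deploying a built-in function behind a cloud API might look like this (the API name and parameter name are hypothetical):

api = CloudDeploy[
  APIFunction[{"image" -> "Image"},
   CommonName[ImageIdentify[#image]] &, "JSON"],
  "myImageAPI", Permissions -> "Public"]

The resulting URL can then be called from Node.js (or any HTTP client) to drive a custom front end.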

[Images: the form interface and the returned result]

POSTED BY: vincent feng
Posted 4 years ago

thanks

POSTED BY: tsukisan

Right. The neural net model is being used to extract features from the words here and those features are used to place the words in the 2-dimensional feature space.

Posted 4 years ago

Detecting the oldest and youngest in a picture.

https://www.wolframcloud.com/obj/tsukisan/LookingForExtremes

POSTED BY: tsukisan
Posted 4 years ago

Great. If I am not mistaken, the net model is used to create some kind of distance measurement for the FeatureSpacePlot? Is that correct? Thanks.

POSTED BY: tsukisan

A final challenge

Create and deploy a simple machine learning app of your choice. Feel free to use one of our built-in classifiers or predictors.


To get you started, here is the example we looked at in the study group session:

CloudDeploy[
 FormPage[
  {{"myPicture", "Your picture"} -> "Image"},
  Module[{entity},
    entity = Classify["NotablePerson", #myPicture];
    Row[{#myPicture, Spacer[10], entity, Spacer[10], 
      entity["Image"]}]
    ] &,
  AppearanceRules -> <|
    "Title" -> "Which famous person do you look like?", 
    "Description" -> 
     "Enter your image and we'll show you your famous twin!!",
    "SubmitLabel" -> "Show me who"|>,
  PageTheme -> "Black"
  ],
 "FamousLookAlike",
 Permissions -> "Public"
 ]

Here's the simpler version for testing: https://wolfr.am/RddbHZR4

Predict is working. I think the conclusion is that the date and text of the address are probably not very good at predicting the age, at least while we use the features as-is.

Thanks!!!! I totally didn't even look at the emails because I had the event saved in my calendar. My bad.

POSTED BY: Stephanie Meyer

@Stephanie Meyer, please check your Study Group email reminder that was sent earlier today. It includes a link to the quiz.

POSTED BY: Jamie Peterson

Where do we find the quiz for the course? I haven't seen an email with a link or a file for it. I want to try it before our final Q and A class tomorrow.

POSTED BY: Stephanie Meyer
Posted 4 years ago

Abrita, Values[Normal[...]] didn't work for me, but interchanging them worked fine (at least there were no errors flagged by Mathematica). I am somewhat confused by the results though. Based on my interpretation of the test results, the prediction is always between 57 and 59 years. I got these results by running Predict with Method set to "NeuralNetwork" and PerformanceGoal set to "Quality". I had 163 records for the training set and 70 for the test set. Is Predict working properly? Is my interpretation correct?

[Image: comparison of predicted and actual ages]

POSTED BY: rjehanathan
FeatureSpacePlot[
 DeleteDuplicates@
  TextWords[
   "The Wolfram Language includes a wide range of state-of-the-art \
integrated machine learning capabilities, from highly automated \
functions like Predict and Classify to functions based on specific \
methods and diagnostics, including the latest neural net approaches. \
The functions work on many types of data, including numerical, \
categorical, time series, textual, image and audio. 
   The Wolfram Language has state-of-the-art capabilities for the \
construction, training and deployment of neural network machine \
learning systems. Many standard layer types are available and are \
assembled symbolically into a network, which can then immediately be \
trained and deployed on available CPUs and GPUs. 
   Using a variety of state-of-the-art methods, the Wolfram Language \
provides immediate functions for detecting and extracting features in \
images and other arrays of data. The Wolfram Language supports \
specific geometrical features such as edges and corners, as well as \
general keypoints that can be used to register and compare images."], 
 FeatureExtractor -> 
  NetModel["GloVe 100-Dimensional Word Vectors Trained on Tweets"]]

[Image: FeatureSpacePlot of the word clusters]

Actually this seems to work for me, in spite of the red syntax coloring!

rdata = ResourceData["State of the Union Addresses"]
data = rdata[All, {"Date", "Text", "Age"}]
sample = IntegerPart[.9*Length[data]]

trainingDataset = RandomSample[data, sample];
testDataset = Complement[data, trainingDataset];

p = Predict[trainingDataset -> "Age"]
PredictorMeasurements[p, testDataset -> "Age", "ComparisonPlot"]

[Image: PredictorMeasurements comparison plot]

Or by transforming the Dataset:

tData = data[All, {#Date, #Text} -> #Age &] // Normal;
trainingData = RandomSample[tData, sample];
testData = Complement[tData, trainingData];

p = Predict[trainingData]
PredictorMeasurements[p, testData, "ComparisonPlot"]

From the same test dataset and predictor function as above, it gives the same result.

[Image: PredictorMeasurements comparison plot from the list of rules]

Posted 4 years ago

Thank you. I'm just now beginning to see how esoteric this topic is typically considered, yet how clearly evident it was while I was immersed in learning R just after your Summer Boot Camp sessions.

I believe yet another example was provided in today's lecture when we were evaluating predictive probabilities for Boolean operators (or something like that), with values of .99998 accompanied by a mean error of 1.00058 (or something like that). Note: it's been a while since I've taken a stats class.

So... perhaps a more precise version of my question for Friday is: if we accept that these tiny values are, in fact, measurements of computational error, with magnitude attributable to the specifications of the system that computed them, and we are willing to admit that these values of something times 10 to the negative 50th power are meaningless aside from being representations of processor error...

why not just round them to zero? Or use some symbolic marker of irrelevance? Continued use seems to create user confusion from inconsistent outputs, and a higher risk of error propagation than benefit.

Looking through the documentation I saw some functions that appear to effectuate this under the heading of Precision & Accuracy Control, but why not make such settings the default? Or the default for a particular user type (like "student/learner")? Then everyone in a class would receive similar results regardless of what type of computer they are using.
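For reference, one of those Precision & Accuracy Control functions, Chop, does exactly this kind of rounding:

Chop[1.*^-50] (* numbers close to zero are replaced by the exact integer 0 *)
Chop[{1.5, 2.*^-12, 3.}] (* -> {1.5, 0, 3.} *)

By default Chop removes numbers smaller than 10^-10; a second argument sets the tolerance.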

POSTED BY: C Ellis

Let's bring your question to Friday's Q&A session with the development team.

Unfortunately, it appears that PredictorMeasurements cannot deal with test data in the form of a Dataset. We will have to wrangle it into key-value pairs {feature1, feature2, ...} -> target.

For this specific dataset we can do the following:

modifiedTesting = Most[#] -> Last[#] & /@ Values[Normal[testing]]

which will set up the test data as {date, text of address} -> age and allow you to use the PredictorMeasurements function.
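For example, with the predictor p trained earlier in the thread:

pm = PredictorMeasurements[p, modifiedTesting];
pm["ComparisonPlot"]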

Daily Challenge: Day 3

Today's challenge will lead nicely into tomorrow's topic, "Neural Networks". Visit the GloVe 100-Dimensional Word Vectors Trained on Tweets model in the Wolfram Neural Net Repository. Look at the example offered under "Feature Visualization". Try FeatureSpacePlot with FeatureExtractor set to this particular model to visualize a clustering of the words in any text of your choice. Share any interesting results you find here.


POSTED BY: Kevin Hawekotte

I would attribute such results mainly to the training data. The performance of most classification algorithms is closely tied to the quality and availability of the training data.

This is done automatically by the Classify or Predict function. You can observe it in the training progress panel as the training happens: you will see it trying different algorithms.
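If you want to take the choice into your own hands, both functions accept a Method option; a minimal sketch, reusing the trainingData from earlier in the thread:

p1 = Predict[trainingData]; (* automatic method selection *)
p2 = Predict[trainingData, Method -> "RandomForest"]; (* force a specific method *)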

Posted 4 years ago

Kevin - While the algorithm in me wants to reply that "extremely" represents a 75% or greater level of emphasis and could typically be interchanged with the term "high degree," the human in me feels the need to break out the red felt-tip pen, circle the words "extremely low crime rate," and scribble "AWK."

The truth is that English-speaking humans rarely speak like this (i.e., a high degree of lowness in something that is unpleasant). Instead, one would much more likely focus on the positive (e.g., "my city is safe").

Given that numerous other tests haven't revealed the WL sentiment algorithm to be completely irrational (like, say, some Reddit automod gone awry), I consider it an intelligent resource. Perhaps a bit too eager to provide non-neutral responses, but intelligent nonetheless. Hope that helps.


Attachments:
POSTED BY: C Ellis
Posted 4 years ago

I ran into some trouble with the format of Dataset. The following runs without errors and trains fine:

addressData = ResourceData["State of the Union Addresses"];
addressEssentialData = addressData[All, {"Date", "Text", "Age"}];
{training, testing} = TakeDrop[addressEssentialData, 200];

Note that at this point:

In[16]:= training // Head
Out[16]= Dataset

Next, I can successfully use the variable training in the following way:

p = Predict[training -> "Age"]

Further, I am OK with getting predictions and actual data "by hand":

predictionsOnTesting = p[testing]
actualsOnTesting = testing[[All, 3]] // Normal

And further, I can make their comparison plot (also "by hand"):

[Image: comparison plot made by hand]

However, if I try feeding testing into PredictorMeasurements[...]

pm = PredictorMeasurements[p, testing]

or similarly, in the form that worked earlier with Predict[...]

pm = PredictorMeasurements[p, testing->"Age"]

it fails with the following error:

PredictorMeasurements::bdfmt: Argument Dataset [<<33>>] should be a rule or a list of rules.

And Mathematica usually crashes if I try doing anything further. Though one time I somehow managed:

[Image: comparison plot]

I am a bit at a loss as to how to properly use PredictorMeasurements[...] with a Dataset variable for the test set... (I also tried turning Datasets into Lists, but this seems wrong and I didn't have much success with it either.)

Attachments:
POSTED BY: r t

Just to give some context: my approach could just be naïve (I'm a beginner here, so pardon me in advance), but using the Classify function to analyze sentiment is throwing some weird results my way. I wanted to use just a few sentences before approaching it with a larger text file.

Here is what I have:

[Image: Classify "Sentiment" results for the test sentences]

I was wondering if the word 'extremely' had some weight to it that made the output different? Thanks for the help!

POSTED BY: Kevin Hawekotte
Posted 4 years ago

Thank you! That worked perfectly!

I'm wondering about slide 7 of the presentation where it mentions automating the method choice by testing different approaches on subsets of data - do you have any examples of this?

Thanks!

POSTED BY: Ivan G

Daily Challenge: Day 2

The dataset at https://datarepository.wolframcloud.com/resources/State-of-the-Union-Addresses contains the complete text of State of the Union addresses from 1790 to 2019. Test the performance of a Predictor trained to predict the age of the president from the date and text of the address.

Here's my attempt at analyzing the sentiment of "Pride and Prejudice":

prideAndPrejudice = 
  TextCases[ResourceData["Pride and Prejudice"], "Sentence"];

sentiment = 
  Classify["Sentiment", prideAndPrejudice, 
   "Probability" -> "Positive"];

MovingAverage[sentiment, 500] // ListLinePlot

[Image: moving-average sentiment plot for Pride and Prejudice]

@ivang1 Perhaps the example in the attached notebook will help.

Attachments:
Posted 4 years ago

Hi,

I'm confused about how to import Excel files into the "{parameters} -> label" format. My Excel file looks like the picture below, with some of the data being blank.

[Image: input data format in Excel]
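A minimal sketch of one way to do this (the file name is hypothetical, and the layout assumptions are that the features sit in the leading columns with the label in the last column):

raw = Import["data.xlsx", {"Data", 1}]; (* first sheet as a list of rows *)
rows = DeleteCases[Rest[raw], row_ /; MemberQ[row, ""]]; (* drop the header row and any row with blank cells *)
examples = Most[#] -> Last[#] & /@ rows

The resulting examples list can then be fed directly to Classify or Predict.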

POSTED BY: Ivan G
Posted 4 years ago

Ok. So I have one of those pesky theory questions... In yesterday's session, we saw several examples of "very small numbers" returned for (admittedly unlikely) probabilities within the classification function exercises. Returning to the concept of how Mathematica/WL symbolically evaluates the expression Sqrt[2]^2 - 1 from a previous seminar series... would it be safe to say that these "very small numbers" are themselves measurements of computational error?

I've attached screen captures with the evaluation of the "Supervised Learning: Classification - Basic Example" from my own computer (a 64-bit PC) and from the lecture video (Mr. McLoone's obviously superior computing cluster). So what result does everyone else get when evaluating this expression?

And if my hypothesis is correct (more processing cores = reduced computational error), is there a method for "forcing" WL to eschew floating-point arithmetic? Or does this even matter, provided we eschew "mission critical" uses as mentioned in the video and remain cognizant of error propagation?

Or... is there some form of symbolic approach to this issue? (For which I admittedly have absolutely no idea what such methods might even look like.) Just curious. It seems like an emerging issue as we continue to measure more things more often with ever-increasing levels of precision.
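One way to see the effect directly (a minimal sketch contrasting exact, machine-precision, and software high-precision arithmetic):

Sqrt[2]^2 - 2 (* exact arithmetic: 0 *)
Sqrt[2.]^2 - 2 (* machine precision: ~4.44*10^-16 *)
N[Sqrt[2], 50]^2 - 2 (* 50-digit software arithmetic: 0.*10^-50 *)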


Attachments:
POSTED BY: C Ellis
Posted 4 years ago

Pretty fun. I used The Picture of Dorian Gray. After taking out sentences of fewer than 3 words (an arbitrary decision...), I used LowpassFilter with a very low cutoff. Funny part: the graphs showed a high probability of a neutral ending (??). After reviewing the text I realized that the end of the file was the license agreement. After removing that, the end of the book could be positive too... In the images, green is positive, red is negative and orange is neutral.

[Images: sentiment plots with the license agreement at the end, and with it removed]
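A minimal sketch of that pipeline (the Project Gutenberg URL, the word-count threshold, and the cutoff value are all assumptions):

text = Import["https://www.gutenberg.org/files/174/174-0.txt"]; (* The Picture of Dorian Gray *)
sentences = Select[TextSentences[text], WordCount[#] >= 3 &];
probs = Classify["Sentiment", sentences, "Probability" -> "Positive"];
ListLinePlot[LowpassFilter[probs, 0.01]]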

POSTED BY: tsukisan

Here are a few stories to which I applied the sentiment classifier and generated matrix plots. SCFs 1, 2, and 3 are the various stories. P1, P2, and P3 are the resulting probabilities extracted from the associations returned by the SCFs.

Attachments:
POSTED BY: Brad Button
Posted 4 years ago
ListLinePlot[
 Classify["Sentiment", 
  TextSentences[ExampleData[{"Text", "AliceInWonderland"}]], 
  "Probability" -> "Positive"]]
POSTED BY: rjehanathan

Daily Challenge: Day 1

Here is a challenge to celebrate NaNoWriMo. Write a short story (or select an extract from your favorite story - perhaps one for which the text is available in the public domain). Use the "Sentiment" classifier in the Wolfram Language to detect the sentiment of each sentence. Plot the probability of the sentiment of each sentence in the story being "positive", as the plot progresses.

If it's your own story, have fun tweaking the plot to see how the changes are reflected in the sentiment plot.

Looking forward to reviewing these topics along with you all. Please feel free to post any questions you may have, from the course or from the daily study group sessions, here. Also look out for the fun daily challenges to showcase your new machine learning skills.

See you on Monday.
