
Random rounding in classifier function

Posted 10 years ago

I noticed this while viewing the documentation for "ClassifierMeasurements". I executed each line of the example code and saw that Out[4] was 0.5 instead of the expected 0.75. When I returned to the same page and executed the code again, the result was 0.75. The flip can be reproduced by executing the code, doing something else, and then trying again. It also occurs in a notebook copy of the code. The problem seems to come from 4.5 being halfway between "A" and "B". A notebook of the code is attached.

The picture shows the first execution after a pristine load of M10, going directly to the documentation.

first execution

POSTED BY: Douglas Kubler
4 Replies

As noted in my later post on this same non-determinism in Classify on small data sets, the behavior appears even when the Method is fixed, e.g., to "NaiveBayes". Example attached.

POSTED BY: Mark Tuttle

Hi Mark, as you suspected in the notebook, this behavior is observed when there is a tie. When the most likely classes have the same probability (or, more generally, the same utility), a RandomChoice among these classes is made. To avoid this behavior you can use the undocumented option "TieBreakerFunction" and supply any function you like. For example, "TieBreakerFunction" -> First will give a deterministic result.
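To make the tie-breaking idea concrete, here is a minimal sketch that is independent of Classify. The helper `pickClass` is hypothetical, written only for illustration, and is not how Classify is implemented internally. It takes an association of class probabilities, collects the classes that share the maximum, and hands them to a tie-breaker function.

```wolfram
(* Hypothetical helper, for illustration only (not part of Classify):
   return the most probable class, resolving ties with tieBreaker *)
pickClass[probs_Association, tieBreaker_ : RandomChoice] :=
 With[{best = Max[Values[probs]]},
  tieBreaker[Keys[Select[probs, # == best &]]]]

(* Two classes with exactly equal probability, i.e. a tie *)
probs = <|"A" -> 0.5, "B" -> 0.5|>;

(* Default tie-breaker: each call picks "A" or "B" at random *)
Table[pickClass[probs], {10}]

(* Deterministic tie-breaker: always the first tied class, here "A" *)
Table[pickClass[probs, First], {10}]
```

With RandomChoice the ten results can differ from one evaluation to the next, while with First they are always the same, which is the effect described for "TieBreakerFunction" -> First.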

POSTED BY: Etienne Bernard

This behaviour is frequent when training on a tiny dataset such as this one. Here, Classify cannot discriminate between the methods "LogisticRegression" and "NearestNeighbors", so it chooses one at random (and for this test set, logistic regression has the better accuracy).
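One way to take the random method selection out of the picture is to fix the Method option explicitly. The sketch below uses the small training set posted elsewhere in this thread and trains one classifier per method. The names `cLR` and `cNN` are just illustrative. Note that this only removes the choice between methods; as mentioned in another reply, ties between classes can still produce varying results with a fixed method.

```wolfram
(* Training set taken from another post in this thread *)
trainingset = {1 -> "A", 2 -> "A", 3.5 -> "B", 4 -> "A", 5 -> "B", 6 -> "B"};

(* Fix the method instead of letting Classify choose one *)
cLR = Classify[trainingset, Method -> "LogisticRegression"];
cNN = Classify[trainingset, Method -> "NearestNeighbors"];

(* Compare the class probabilities each method assigns to the same input *)
{cLR[3.9, "Probabilities"], cNN[3.9, "Probabilities"]}
```

If the two associations differ, then a random choice between the two methods would by itself account for different outputs across runs.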

POSTED BY: Etienne Bernard

Here is a bit more of the puzzle:

In[278]:= trainingset = {1 -> "A", 2 -> "A", 3.5 -> "B", 4 -> "A", 5 -> "B", 6 -> "B"};

In[279]:= Table[Classify[trainingset, 3.9, "Probabilities"], {15}]

Out[279]= {<|"A" -> 0.833333, "B" -> 0.166667|>, <|"A" -> 0.423239, 
  "B" -> 0.576761|>, <|"A" -> 0.423239, 
  "B" -> 0.576761|>, <|"A" -> 0.423239, 
  "B" -> 0.576761|>, <|"A" -> 0.423239, 
  "B" -> 0.576761|>, <|"A" -> 0.833333, 
  "B" -> 0.166667|>, <|"A" -> 0.423239, 
  "B" -> 0.576761|>, <|"A" -> 0.423239, 
  "B" -> 0.576761|>, <|"A" -> 0.833333, 
  "B" -> 0.166667|>, <|"A" -> 0.4, "B" -> 0.6|>, <|"A" -> 0.423239, 
  "B" -> 0.576761|>, <|"A" -> 0.423239, 
  "B" -> 0.576761|>, <|"A" -> 0.833333, 
  "B" -> 0.166667|>, <|"A" -> 0.423239, 
  "B" -> 0.576761|>, <|"A" -> 0.423239, "B" -> 0.576761|>}

So the question is why the class probabilities returned by Classify are non-deterministic.

Now, I am posting this "blind" in the sense that I do not know enough about how the classification works. And it may be that the algorithm is intrinsically stochastic. But I am showing my ignorance here and the net result is that I need to read a book on classification and machine learning... ;-)

It may be that a random initialization in a gradient-descent approach to optimization is finding different local minima.
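The 15 results in Out[279] contain only three distinct probability assignments, which is easier to see by counting them. The sketch below is a small diagnostic, not an explanation: it first retrains on every call, as in In[279], and then trains once and queries the same classifier repeatedly. The name `c` is just illustrative, and the actual counts will vary from run to run.

```wolfram
trainingset = {1 -> "A", 2 -> "A", 3.5 -> "B", 4 -> "A", 5 -> "B", 6 -> "B"};

(* Retrain on every call, as in In[279], and count each distinct result *)
Counts[Table[Classify[trainingset, 3.9, "Probabilities"], {15}]]

(* Train once, then query the same ClassifierFunction repeatedly *)
c = Classify[trainingset];
Counts[Table[c[3.9, "Probabilities"], {15}]]
```

If the second count shows a single entry while the first shows several, the variation arises during training rather than during evaluation of a trained classifier.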

POSTED BY: David Reiss