
What is the naive Bayes Classifier doing?

Posted 2 years ago

I can't reconcile Mathematica's (12.3.1.0) naive Bayes classification probabilities for simple {Boolean} -> {Boolean} datasets with what I'd expect. I expected Mathematica to use a Bernoulli model for this setup, but that's clearly not the case.

Example:

Classify[
{True, True, False, True, True, True} -> {False, True, False, True, False, False}, 
Method -> "NaiveBayes"][{True}, "Probability" -> True]

gives p = 0.63... (for True -> True) but even my dog sees that it should be < 0.5. What's going on here?
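The "< 0.5" expectation can be checked by counting. This is a minimal sketch that only tabulates the training data; it makes no claim about what Classify computes internally, and the variable names are illustrative.

```wolfram
inputs = {True, True, False, True, True, True};
outputs = {False, True, False, True, False, False};

(* Outputs of the training examples whose input is True *)
selected = Pick[outputs, inputs];

(* Empirical frequency of output True given input True *)
N[Count[selected, True]/Length[selected]]
(* 0.4 *)
```

Of the five examples with input True, only two have output True, hence 2/5.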

POSTED BY: Christoph S.

I believe this is due to an overzealous standardization step in the automated preprocessing pipeline (Boolean vectors are converted to numerical vectors before training). You can disable it by setting FeatureExtractor -> "Minimal":

data = {True, True, False, True, True, True} -> {False, True, False, True, False, False};
cf1 = Classify[data, Method -> "NaiveBayes"];
cf2 = Classify[data, Method -> "NaiveBayes", FeatureExtractor -> "Minimal"];

cf1[{True, False}, "Probabilities"]
(* {<|False -> 0.30733, True -> 0.69267|>, <|False -> 0.963594, True -> 0.0364056|>} *)

cf2[{True, False}, "Probabilities"]
(* {<|False -> 0.579921, True -> 0.420079|>, <|False -> 0.579921, True -> 0.420079|>} *)
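For comparison, here is what a textbook one-feature Bernoulli naive Bayes model gives on the same data. This is a sketch under stated assumptions, not a description of Classify's internals: bernoulliPosterior is a hypothetical helper, the additive smoothing parameter alpha is applied to the likelihoods only, and the class prior is the raw class frequency.

```wolfram
inputs = {True, True, False, True, True, True};
outputs = {False, True, False, True, False, False};

(* Hypothetical helper: P(y = True | x) for a one-feature Bernoulli
   naive Bayes model with additive smoothing alpha (an assumption) *)
bernoulliPosterior[x_, alpha_] := Module[{score},
  score[y_] := Module[{xs = Pick[inputs, outputs, y]},
    (* class prior times smoothed likelihood of x within class y *)
    (Length[xs]/Length[inputs])*
     (Count[xs, x] + alpha)/(Length[xs] + 2 alpha)];
  score[True]/(score[True] + score[False])]

bernoulliPosterior[True, 0]  (* 2/5, no smoothing *)
bernoulliPosterior[True, 1]  (* 9/25, Laplace smoothing *)
```

With or without smoothing, this model puts the probability of True given input True below 0.5.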