Finding the Classifier Boundary Function For Method->LogisticRegression

Posted 11 months ago

Here's a diagram we often see in machine learning. I want to construct something like it using the Wolfram Language and Classify, at least when the method is logistic regression. Here's how I go about it.

[Figure: Boundary Drawing]

Let's generate some simple data.

 SeedRandom[121217];
 trainx = RandomReal[{0, 1}, {20, 2}];
 trainy = Map[
    If[LogisticSigmoid[{-2.7, 1.4}.# + 
         RandomVariate[NormalDistribution[0.3, 0.5]]] > 0.5, 1, 0] &, 
    trainx];
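
As a quick check on the generated labels, the sketch below counts the two classes and illustrates why the threshold test amounts to a sign test on the noisy linear score. The sample values of z are arbitrary and chosen only for illustration.

```wolfram
(* How many points fell into each class? *)
Counts[trainy]

(* LogisticSigmoid[z] > 0.5 holds exactly when z > 0, so each label is 1 when
   -2.7 x + 1.4 y + noise is positive. The z values below are arbitrary. *)
Table[{z, LogisticSigmoid[z] > 0.5, z > 0}, {z, {-2., -0.5, 0.5, 2.}}]
```

In every row of the table the two tests agree, which is why the true boundary behind this data is a straight line blurred by the noise term.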

If our classification task does not require much complexity, we can use LogitModelFit and some algebra to find the boundary.

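(* Assumed fit: the post does not show how lomf was built; rows are {x, y, class} *)
lomf = LogitModelFit[MapThread[Append, {trainx, trainy}], {x, y}, {x, y}];
(* The boundary is where the fitted probability equals 1/2 *)
boundaryExpression[fm_FittedModel] := 
  y /. Quiet@First@Solve[fm[x, y] == 1/2, y] // Expand;
boundaryExpression[lomf]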

We find that the boundary is y = -3.69718 + 10.2671 x.
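
The same line can also be read off the fitted coefficients, since a logit model crosses probability 1/2 exactly where its linear predictor is zero. A minimal sketch, assuming lomf was fit with an intercept and the basis {x, y}, so that its "BestFitParameters" are ordered {b0, b1, b2}:

```wolfram
(* b0 + b1 x + b2 y == 0  implies  y == -(b0 + b1 x)/b2 *)
With[{b = lomf["BestFitParameters"]},
 Expand[-(b[[1]] + b[[2]] x)/b[[3]]]]
```

Under that assumption the result should agree with boundaryExpression[lomf] up to rounding, without calling Solve.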

We can now visualize the boundary and get a picture like the one above with the following code.

 With[{be = boundaryExpression[lomf]}, 
  Show[ListPlot[
    KeySort[GroupBy[
      MapThread[List, {trainx, trainy}], (ToString[Last[#]] &) -> 
       First]], PlotMarkers -> {{"-", 18}, {"+", 18}}, Axes -> False, 
    Frame -> True], 
   Plot[be, {x, 0, 1}, PlotStyle -> {Thick, Dashed, Black}]]]

All of that relies on LogitModelFit. Sometimes we need regularization or other features of Classify, so here's how to generate a similar picture using Classify.

 cl = Classify[trainx -> trainy, Method -> {"LogisticRegression", "L1Regularization" -> 0, 
 "L2Regularization" -> 0}, TrainingProgressReporting -> None];
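
Before extracting anything from the classifier, it can help to query it at a single point. The sketch below uses an arbitrarily chosen point; the coordinates are not from the original post.

```wolfram
(* Predicted class for a single, arbitrarily chosen point *)
cl[{0.2, 0.8}]

(* Class probabilities at the same point, returned as an Association *)
cl[{0.2, 0.8}, "Probabilities"]
```

The second call returns the same kind of Association, keyed by class, that the probabilities function produces for symbolic or numeric inputs.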

We get the probabilities for each class.

 ci = ClassifierInformation[cl, "ProbabilitiesFunction"]

This generates a function that produces an Association:

 Association[{0 -> (0.0307046 E^(8.58421 #1))/(
     0.0307046 E^(0. + 8.58421 #1) + 0.675585 E^(0.836087 #2)), 
    1 -> 1./(1. + 0.0454489 E^(8.58421 #1 - 0.836087 #2))}] &
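
The boundary can be worked out by hand from the numbers printed above, which is a useful check before automating it. This sketch copies the coefficients from the displayed output and only assumes that x and y have no assigned values.

```wolfram
(* Class 1 has probability 1/2 when 0.0454489 E^(8.58421 x - 0.836087 y) == 1,
   that is, when Log[0.0454489] + 8.58421 x - 0.836087 y == 0 *)
Solve[Log[0.0454489] + 8.58421 x - 0.836087 y == 0, y]
```

The solution is approximately y = -3.697 + 10.267 x, the same line found with LogitModelFit, as expected when both regularization terms are zero.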

We can again use a little algebra to find the boundary.

 boundaryExpression[a_Function] := 
   y /. Quiet@
     First@Simplify[Solve[Equal @@ a[x, y], y, Reals], 
       x \[Element] Reals];
 boundaryExpression[cl_ClassifierFunction] := 
  boundaryExpression[
   ClassifierInformation[cl, "ProbabilitiesFunction"]]
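
The key step in the definition above is Equal @@ a[x, y], which turns the Association of class probabilities into a single equation. A small sketch of that behavior, using the undefined placeholder symbols p0 and p1 (not from the original post):

```wolfram
(* Apply acts on the values of an Association, so the keys are dropped *)
Equal @@ <|0 -> p0, 1 -> p1|>
(* p0 == p1 *)

(* Usage: the boundary from the probabilities function obtained earlier *)
boundaryExpression[ci]
```

This approach yields a single boundary equation only in the two-class case, because with three or more classes Equal would require all the probabilities to coincide at once.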

We can now use this function to make a plot quite similar to the one above.

 With[{be = boundaryExpression[cl]}, 
  Show[ListPlot[
    KeySort[GroupBy[
      MapThread[List, {trainx, trainy}], (ToString[Last[#]] &) -> 
       First]], PlotMarkers -> {{"-", 18}, {"+", 18}}, Axes -> False, 
    Frame -> True], 
   Plot[be, {x, 0, 1}, PlotRange -> {{0, 1}, {0, 1}}, 
    PlotStyle -> {Dashed, Black}]
   ]
  ]
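
Since regularization was the motivation for switching to Classify, here is a hedged sketch of how one might compare boundaries with and without an L2 penalty. The name clReg and the penalty value 1 are arbitrary choices for illustration, not settings from the original post.

```wolfram
(* Same data and method, but with a nonzero (arbitrarily chosen) L2 penalty *)
clReg = Classify[trainx -> trainy, 
   Method -> {"LogisticRegression", "L1Regularization" -> 0, 
     "L2Regularization" -> 1}, TrainingProgressReporting -> None];

(* Compute both boundaries once, then plot them together *)
With[{b0 = boundaryExpression[cl], b1 = boundaryExpression[clReg]}, 
 Plot[{b0, b1}, {x, 0, 1}, PlotRange -> {{0, 1}, {0, 1}}, 
  PlotLegends -> {"no regularization", "L2 = 1"}]]
```

Computing the expressions inside With, as in the plots above, avoids re-running Solve at every sample point of the plot.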

Here's another way that does not require use of algebra. Instead we rely on RegionPlot and Ordering.

 Show[ListPlot[
   KeySort[GroupBy[
     MapThread[List, {trainx, trainy}], (ToString[Last[#]] &) -> 
      First]], PlotMarkers -> {{"-", 18}, {"+", 18}}, Axes -> False, 
   Frame -> True], 
  RegionPlot[
   Ordering[ci[x, y], -1][[1]] == 1, {x, -1, 2}, {y, -1, 2}, 
   PlotRange -> {{0, 1}, {0, 1}}, BoundaryStyle -> {Dashed, Black}, 
   PlotStyle -> {Opacity[0], White}]
  ]
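
The Ordering test picks out the position of the largest probability. The sketch below shows the idea on a made-up probability Association; the name probs and its numbers are hypothetical, and Values is applied first so that Ordering acts on a plain list.

```wolfram
(* Hypothetical output of the probabilities function at one point *)
probs = <|0 -> 0.2, 1 -> 0.8|>;

(* Position of the largest probability *)
Ordering[Values[probs], -1]
(* {2} *)

(* Map that position back to the class label *)
Keys[probs][[First[Ordering[Values[probs], -1]]]]
(* 1 *)
```

So the condition == 1 in the RegionPlot selects the region where the first class (here, class 0) is the most probable, and the edge of that region is the decision boundary.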

By using RegionPlot we can readily extend the production of boundary diagrams to situations involving more than two classes. Here trainy2 is a vector of three-class labels for the same points in trainx.

 cl3 = Classify[trainx -> trainy2, 
    Method -> {"LogisticRegression", "L1Regularization" -> 0, 
      "L2Regularization" -> 0}, TrainingProgressReporting -> None];
 ci3 = ClassifierInformation[cl3, "ProbabilitiesFunction"];
 c3plot = Show[
   ListPlot[KeySort[
     GroupBy[MapThread[List, {trainx, trainy2}], (ToString[Last[#]] &) -> 
       First]], PlotMarkers -> {{"-", 18}, {"+", 18}, {"2", 18}}, 
    Axes -> False, Frame -> True], 
   RegionPlot[{Ordering[ci3[x, y], -1][[1]] == 1, 
     Ordering[ci3[x, y], -1][[1]] == 2, Ordering[ci3[x, y], -1][[1]] == 3}, 
    {x, -1, 2}, {y, -1, 2}, PlotRange -> {{0, 1}, {0, 1}}, 
    BoundaryStyle -> {{Dotted, Black}}, PlotStyle -> {Opacity[0.05]}]
   ]

[Figure: Boundaries with three classes]
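
The three-class example uses trainy2, whose construction is not shown in the post. One hypothetical way to produce such labels, assuming classes 0, 1, and 2 are assigned from a noisy version of the first coordinate (the helper labelFromScore, the thresholds, and the noise level are all made up for illustration):

```wolfram
(* Hypothetical labeling rule: three bands along a noisy score *)
labelFromScore[s_] := Which[s < 0.35, 0, s < 0.7, 1, True, 2];

SeedRandom[121217];
trainy2 = labelFromScore[
     First[#] + RandomVariate[NormalDistribution[0, 0.15]]] & /@ trainx;

(* Check that all three classes actually occur before training *)
Counts[trainy2]
```

With only 20 points, it is worth confirming from Counts that every class is present. Otherwise the classifier has fewer than three classes and the three-region plot will not apply.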

I attach a notebook that recapitulates this post and adds an animation.

Congratulations! This post is now a Staff Pick as distinguished by a badge on your profile! Thank you, keep it coming!
