Hello fellow Mathematica users,
I place a lot of trust in Mathematica's machine learning framework, which automates many aspects of ML by default and gives the user an easy-to-use tool.
Lately, I realized that merely changing the order of the feature columns during training can significantly affect the resulting predictive function, as shown in the example below. Importantly, this happens without altering the names, types, values, or row order of the features; only the column order changes. I even used PerformanceGoal -> "DirectTraining" to prevent any model search.
Why does the order of the feature columns affect the creation of the PredictorFunction, when all columns have unique names and their values stay the same? The learning dataset used to create the function doesn't change, apart from its column-wise ordering.
Here is some simple code so you can test it yourself:
(* generate some data *)
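SeedRandom[1234]; (* optional: fix the seed so the runs below are reproducible; 1234 is an arbitrary choice *)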
x=RandomReal[100,100];
y=RandomReal[500,100];
z=x+y/2.+RandomVariate[NormalDistribution[],100]; (* per-point Gaussian noise; without the 100 a single noise value would be added to every point *)
(* learning set with x column as first feature *)
assoXAndY=MapThread[Association["x"->#1,"y"->#2]->#3&,{x,y,z}];
(* learning set with y column as first feature *)
assoYAndX=MapThread[Association["y"->#2,"x"->#1]->#3&,{x,y,z}];
(* p learns {x,y} and p2 learns {y,x} *)
p=Predict[assoXAndY, Method->"RandomForest",PerformanceGoal->"DirectTraining",FeatureTypes->{"Numerical","Numerical"}];
p2=Predict[assoYAndX,Method->"RandomForest",PerformanceGoal->"DirectTraining",FeatureTypes->{"Numerical","Numerical"}];
(* compute the differences between predictions for the same inputs; only the column order at training differed *)
p[First/@assoXAndY]-p2[First/@assoXAndY]
Differences of exactly zero between the two sets of predictions are not that common.
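To quantify this, one can count how many of the 100 training points receive exactly the same prediction from both models (the count varies from run to run):

diffs = p[First /@ assoXAndY] - p2[First /@ assoXAndY];
Count[diffs, 0 | 0.]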
Of course, playing with different Method settings yields different behavior on this dataset; for instance, "NearestNeighbors" gives the most zero differences between the two models. A quick way to compare several methods is sketched below.
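A minimal sketch of such a comparison, looping over a few built-in Method names and counting how many of the 100 predictions agree exactly (the counts will vary from run to run):

Table[
 Module[{q1, q2, diffs},
  q1 = Predict[assoXAndY, Method -> m, PerformanceGoal -> "DirectTraining"];
  q2 = Predict[assoYAndX, Method -> m, PerformanceGoal -> "DirectTraining"];
  diffs = q1[First /@ assoXAndY] - q2[First /@ assoXAndY];
  m -> Count[diffs, 0 | 0.]],
 {m, {"RandomForest", "NearestNeighbors", "LinearRegression"}}]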
Even calling both predictors on a single "key" -> value association will probably produce different values:
randX = RandomChoice[x]; (* pick a single random value, not a length-1 list *)
randY = RandomChoice[y];
{p[<|"x" -> randX, "y" -> randY|>], p2[<|"x" -> randX, "y" -> randY|>]}
To sum up: behind the ease of use, it is quite surprising, and easy to miss, that such a small change while building a Predictor can affect its output. I may explore the impact of feature order on Predictors more systematically using RandomSample or Permutations, at the cost of simplicity; a sketch of that idea is below.
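For instance, a minimal sketch that trains one predictor per key ordering (with two keys there are only two orderings; KeyTake is used to reorder the association keys) and counts, for each pair of orderings, how many of the 100 predictions agree exactly:

orderings = Permutations[{"x", "y"}];
preds = Table[
   Predict[
    MapThread[KeyTake[<|"x" -> #1, "y" -> #2|>, ord] -> #3 &, {x, y, z}],
    Method -> "RandomForest", PerformanceGoal -> "DirectTraining"],
   {ord, orderings}];
(* entry {i, j} counts exact agreements between ordering i and ordering j *)
Table[Count[preds[[i]][First /@ assoXAndY] - preds[[j]][First /@ assoXAndY], 0 | 0.],
 {i, Length[orderings]}, {j, Length[orderings]}]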
EDIT: Using FeatureExtractor -> "Minimal" seems to suppress the feature-order sensitivity of the learning.
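One way to check that observation on the same data (identical settings as above, only the FeatureExtractor option added to both calls; per the observation, the differences should now be all zero):

p3 = Predict[assoXAndY, Method -> "RandomForest", PerformanceGoal -> "DirectTraining", FeatureExtractor -> "Minimal"];
p4 = Predict[assoYAndX, Method -> "RandomForest", PerformanceGoal -> "DirectTraining", FeatureExtractor -> "Minimal"];
p3[First /@ assoXAndY] - p4[First /@ assoXAndY]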