Use FeatureExtraction before splitting data into training and testing sets?

Posted 11 months ago

I have a dataset on which I want to run the Classify[] function.

To extract features I will make use of the FeatureExtraction[] function.

My question is: should I extract the features before I split the data into the training set and testing set or should I run FeatureExtraction[] for each set separately?

So, should I do this:

fe = FeatureExtraction[dataset]
classifier = Classify[trainSet -> targetTrain, FeatureExtractor -> fe]

or should I do this:

feTrain = FeatureExtraction[trainSet]
classifier = Classify[trainSet -> targetTrain, FeatureExtractor -> feTrain]

I am inclined to think I should use the first approach, but I am not quite sure how FeatureExtraction[] works. If the first approach is the right one, should I expect the classifier to know how to extract features from the testing set?
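To make the question concrete, here is a minimal sketch of how the second approach could be checked end to end. The toy data, the 80/20 split, and the names data, labels, testSet and targetTest are hypothetical and only there for illustration. It assumes, as I understand the documentation, that a FeatureExtractorFunction can be applied to examples it was not fitted on, and that the ClassifierFunction applies the supplied extractor to raw inputs itself.

```wolfram
(* Hypothetical toy data, only for illustration *)
SeedRandom[1];
data = RandomReal[1, {100, 5}];
labels = RandomChoice[{"a", "b"}, 100];

(* Split first, so the test examples play no role in fitting the extractor *)
{trainSet, testSet} = TakeDrop[data, 80];
{targetTrain, targetTest} = TakeDrop[labels, 80];

(* Fit the extractor on the training examples only *)
feTrain = FeatureExtraction[trainSet];

(* Assumption: the fitted extractor can be applied to unseen examples *)
Dimensions[feTrain[testSet]]

(* Train with the extractor passed through the FeatureExtractor option *)
classifier = Classify[trainSet -> targetTrain, FeatureExtractor -> feTrain];

(* Assumption: raw test examples go in directly; the classifier runs the extractor itself *)
classifier[testSet]
ClassifierMeasurements[classifier, testSet -> targetTest, "Accuracy"]
```

If this runs as expected, it would suggest that the extractor does not need to be fitted on the testing set at all. Fitting it on the training set only also keeps the testing set from influencing the preprocessing, which is the usual argument against fitting on the full dataset.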
