I have a dataset on which I want to run the Classify function.
To extract features I will make use of the FeatureExtraction function.
My question is:
should I extract the features before I split the data into training and testing sets, or should I run FeatureExtraction on each set separately?
So, should I do this:
fe = FeatureExtraction[dataset]
classifier = Classify[trainSet -> targetTrain, FeatureExtractor -> fe]
or should I do this:
feTrain = FeatureExtraction[trainSet]
classifier = Classify[trainSet -> targetTrain, FeatureExtractor -> feTrain]
I am inclined to think I should use the first approach, but I am not quite sure how FeatureExtraction works.
And if the first approach is the right one, can I expect the classifier to know how to extract features from the testing set on its own?
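To make the question concrete, here is a minimal sketch of the second variant on made-up data. The toy data and the names data, train, test, testSet, targetTest, predictions and accuracy are only illustrative. I am also assuming that a classifier built with FeatureExtractor -> feTrain applies that extractor itself when it is given raw test inputs, which is part of what I would like confirmed.

```wolfram
(* Hypothetical toy data: two Gaussian clusters labelled "A" and "B" *)
SeedRandom[1];
data = RandomSample[Join[
    Thread[RandomVariate[NormalDistribution[0, 1], {50, 3}] -> "A"],
    Thread[RandomVariate[NormalDistribution[3, 1], {50, 3}] -> "B"]]];

(* Split first: 80 examples for training, the remaining 20 for testing *)
{train, test} = TakeDrop[data, 80];
trainSet = Keys[train]; targetTrain = Values[train];
testSet = Keys[test]; targetTest = Values[test];

(* Fit the feature extractor on the training inputs only *)
feTrain = FeatureExtraction[trainSet];

(* Train the classifier with that extractor *)
classifier = Classify[trainSet -> targetTrain, FeatureExtractor -> feTrain];

(* Assumption: the classifier accepts raw test inputs and extracts features itself *)
predictions = classifier[testSet];

(* Fraction of test examples predicted correctly *)
accuracy = N[Mean[Boole[MapThread[SameQ, {predictions, targetTest}]]]]
```

In this sketch the test inputs are never seen by FeatureExtraction, so nothing about the test set can leak into the learned features. The first variant would differ only in calling FeatureExtraction on all inputs before the split.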