Understanding how classify is fitting my data

Brady Hunt, Rice University

Posted 9 years ago

Hi all,

I'm pretty new to the Machine Learning package in Mathematica, and so far I like how easy it is to use. However, I wish it gave more information about how Mathematica treats high-dimensional datasets. In particular, I would like to understand which features are relatively more or less important in the classification model.

For example, I have a 34-dimensional dataset of clinical variables for patients who either did or did not respond to cancer treatment. The classification labels are 'CR' for complete response and 'RESISTANT' for resistant to treatment.

    trainSet = Import["TrainSetCR.mx"];
    validationSet = Import["ValidationSetCR.mx"];

I am using `Classify` to train both Logistic Regression and Random Forest classifiers on these data. I can get some high-level information about the resulting classifiers with the `ClassifierInformation` function, but I would like to understand how each classifier treats each feature.

For Logistic Regression, I can use the `Function` property to get the function the classifier is using, but it is hard to interpret.

    CRClassifier = Classify[trainSet, Method -> "LogisticRegression"];
    ClassifierInformation[CRClassifier]
    CRClassifierProperties = ClassifierInformation[CRClassifier, "Properties"]
    ClassifierInformation[CRClassifier, "Function"]
    ClassifierMeasurements[CRClassifier, validationSet] /@ {"Accuracy", "ConfusionMatrixPlot"}

For Random Forest, I cannot find any property that would let me understand how each feature is used to classify the data.

    CRClassifier = Classify[trainSet, Method -> "RandomForest"];
    ClassifierInformation[CRClassifier]
    CRClassifierProperties = ClassifierInformation[CRClassifier, "Properties"]
    ClassifierMeasurements[CRClassifier, validationSet] /@ {"Accuracy", "ConfusionMatrixPlot"}

I really do enjoy using the Machine Learning package in Mathematica because it makes it easy to configure and try various machine learning techniques. However, I think the package could do a better job of letting users see how these models treat individual features. I've attached my dataset and Mathematica notebook for anyone who would like to look at the data. Any suggestions on how I could understand these models in greater depth would be greatly appreciated.
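One model-agnostic way to probe feature importance using only `ClassifierMeasurements` is permutation importance: shuffle a single feature column in the validation set and see how much the accuracy drops. What follows is a minimal sketch under stated assumptions, not a built-in facility. `permutationImportance` is a hypothetical helper name, and the code assumes each example in `validationSet` has the form `{x1, ..., x34} -> label`, which the post does not confirm.

```wolfram
(* Hypothetical helper, not part of the Wolfram Language. *)
(* Assumption: data is a list of rules {x1, ..., xd} -> label. *)
permutationImportance[classifier_, data_List] :=
 Module[{features, labels, baseline, shuffled},
  features = Keys[data];   (* list of feature vectors, one per example *)
  labels = Values[data];   (* corresponding class labels *)
  baseline = ClassifierMeasurements[classifier, data, "Accuracy"];
  Table[
   shuffled = features;
   (* Shuffle only column j, breaking its link to the labels *)
   shuffled[[All, j]] = RandomSample[features[[All, j]]];
   (* Importance of feature j = accuracy lost after shuffling it *)
   baseline -
    ClassifierMeasurements[classifier, Thread[shuffled -> labels], "Accuracy"],
   {j, Length[First[features]]}]]

(* Usage with the names from the post *)
importances = permutationImportance[CRClassifier, validationSet];
BarChart[importances, ChartLabels -> Range[Length[importances]]]
```

A larger drop suggests the classifier leans more heavily on that feature. Because each shuffle is random, a single run is noisy on a small clinical dataset; averaging the result over several calls gives a steadier ranking. The same helper applies unchanged to both the Logistic Regression and the Random Forest classifier, since it only looks at predictions.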
