GROUPS:

# Introduction

So here is a simple approach that applies Wolfram Language machine learning functions to a classification problem for finding possible Higgs particles. It uses a labeled data set with 30 numerical physical attributes (things like measured spins, angles, energies, etc.) and with labels being either 'signal' (s) or 'background' (b). The attached notebook runs through a sample analysis in the Wolfram Language: importing the training data, cleaning it up for using it, setting up a neural network, training the network with the data, and finally checking how well the trained neural network does at making predictions. Here is the description of data from the source website at KAGGLE:

Discovery of the long awaited Higgs boson was announced July 4, 2012 and confirmed six months later. 2013 saw a number of prestigious awards, including a Nobel prize. But for physicists, the discovery of a new particle means the beginning of a long and difficult quest to measure its characteristics and determine if it fits the current model of nature.

A key property of any particle is how often it decays into other particles. ATLAS is a particle physics experiment taking place at the Large Hadron Collider at CERN that searches for new particles and processes using head-on collisions of protons of extraordinarily high energy. The ATLAS experiment has recently observed a signal of the Higgs boson decaying into two tau particles, but this decay is a small signal buried in background noise.

The goal of the Higgs Boson Machine Learning Challenge is to explore the potential of advanced machine learning methods to improve the discovery significance of the experiment. No knowledge of particle physics is required. Using simulated data with features characterizing events detected by ATLAS, your task is to classify events into "tau tau decay of a Higgs boson" versus "background."

The winning method may eventually be applied to real data and the winners may be invited to CERN to discuss their results with high energy physicists.

# Training data

Import the training data:

training = Import["D:\\machinelearning\\higgs\\training\\training.csv", "Data"];
Dimensions[training]


{250001, 33}

Look at the data fields (they are described in the pdf link above). "EventId" should not be used as part of the training, since it has no predictive value. The last column "Label" is the classification (s=signal, b=background)

training[[1]]


{"EventId", "DER_mass_MMC", "DER_mass_transverse_met_lep", \ "DER_mass_vis", "DER_pt_h", "DER_deltaeta_jet_jet", \ "DER_mass_jet_jet", "DER_prodeta_jet_jet", "DER_deltar_tau_lep", \ "DER_pt_tot", "DER_sum_pt", "DER_pt_ratio_lep_tau", \ "DER_met_phi_centrality", "DER_lep_eta_centrality", "PRI_tau_pt", \ "PRI_tau_eta", "PRI_tau_phi", "PRI_lep_pt", "PRI_lep_eta", \ "PRI_lep_phi", "PRI_met", "PRI_met_phi", "PRI_met_sumet", \ "PRI_jet_num", "PRI_jet_leading_pt", "PRI_jet_leading_eta", \ "PRI_jet_leading_phi", "PRI_jet_subleading_pt", \ "PRI_jet_subleading_eta", "PRI_jet_subleading_phi", "PRI_jet_all_pt", \ "Weight", "Label"}

Sample vector:

training[[2]]


{100000, 138.47, 51.655, 97.827, 27.98, 0.91, 124.711, 2.666, 3.064, \ 41.928, 197.76, 1.582, 1.396, 0.2, 32.638, 1.017, 0.381, 51.626, \ 2.273, -2.414, 16.824, -0.277, 258.733, 2, 67.435, 2.15, 0.444, \ 46.062, 1.24, -2.475, 113.497, 0.00265331, "s"}

Set up a simple neural network (this can be tinkered with to improve the results):

net=NetInitialize[
NetChain[{
LinearLayer[3000],Ramp,LinearLayer[3000],Ramp,LinearLayer[2],SoftmaxLayer[]
},
"Input"->{30},
"Output"->NetDecoder[{"Class",{"b","s"}}]
]]


Set up the training data:

data = Map[Take[#, {2, 31}] -> Last[#] &, Drop[training, 1]];


Numerical vectors that each point to a classification (s or b):

RandomSample[data, 3]


{{87.06, 23.069, 67.711, 162.488, -999., -999., -999., 0.903, 4.245, 318.43, 0.523, 0.839, -999., 105.019, -0.404, 1.612, 54.943, -0.898, 0.855, 13.541, 1.729, 409.232, 1, 158.469, -1.363, -1.762, -999., -999., -999., 158.469} -> "b", {163.658, 55.559, 116.84, 50.019, -999., -999., -999., 2.855, 38.623, 130.132, 0.906, 1.359, -999., 51.993, 0.36, 2.142, 47.112, -0.932, -1.596, 22.782, 2.663, 191.557, 1, 31.026, -2.333, 0.739, -999., -999., -999., 31.026} -> "b", {100.248, 27.109, 60.729, 132.094, -999., -999., -999., 1.405, 10.063, 218.474, 0.541, 1.414, -999., 62.519, -0.401, -1.974, 33.817, -0.893, -0.657, 54.857, -1.298, 396.228, 1, 122.137, -2.369, 1.689, -999., -999., -999., 122.137} -> "s"}

# Training

Length[data]


250000

{tdata,vdata}=TakeDrop[data,240000];
result=NetTrain[net,tdata,TargetDevice->"GPU",ValidationSet->Scaled[0.1],MaxTrainingRounds->1000]


DumpSave["D:\\machinelearning\\higgs\\higgs.mx", result];


# Testing

This is the test data (unlabeled):

test = Import["D:\\machinelearning\\higgs\\test\\test.csv", "Data"];


Extract the validation data:

validate = Map[Take[#, {2, 31}] &, Drop[test, 1]];


Predictions made on the unlabeled data:

result /@ RandomSample[validate, 5]


{"b", "b", "s", "b", "b"}

Sample from the labeled data and compute classifier statistics:

cm = ClassifierMeasurements[result, RandomSample[vdata, 1000]]


cm["Accuracy"]


0.846

Plot the confusion matrix:

cm["ConfusionMatrixPlot"]


Attachments:
4 months ago
7 Replies
 - Congratulations! This post is now Staff Pick! Thank you for your wonderful contributions. Please, keep them coming!
4 months ago
 Fantastic! Thanks for sharing! Is there also a video on this?
4 months ago
 Excellent post. I see you are using TargetDevice->"GPU". I am shopping for a new desktop and thinking about getting one with a GPU. Can you discuss the run-time improvement associated with this problem and the GPU, and give as much detail about your hardware and operating system environment as you care to share.
9 days ago
 Alan,I use an NVIDIA GeForce GTX 1080 card:Speedups differ a lot for different training scenarios. For this example I am seeing about 2x speedup. The LeNet example in the NetTrain reference documentation shows a 10x speed-up, and the CIFAR-10 example is about 60x for me.