Message Boards Message Boards

Higgs Boson Classification via Neural Network

Posted 8 years ago

enter image description here

Introduction

So here is a simple approach that applies Wolfram Language machine learning functions to a classification problem for finding possible Higgs particles. It uses a labeled data set with 30 numerical physical attributes (things like measured spins, angles, energies, etc.) and with labels being either 'signal' (s) or 'background' (b). The attached notebook runs through a sample analysis in the Wolfram Language: importing the training data, cleaning it up for using it, setting up a neural network, training the network with the data, and finally checking how well the trained neural network does at making predictions. Here is the description of data from the source website at KAGGLE:

Discovery of the long awaited Higgs boson was announced July 4, 2012 and confirmed six months later. 2013 saw a number of prestigious awards, including a Nobel prize. But for physicists, the discovery of a new particle means the beginning of a long and difficult quest to measure its characteristics and determine if it fits the current model of nature.

A key property of any particle is how often it decays into other particles. ATLAS is a particle physics experiment taking place at the Large Hadron Collider at CERN that searches for new particles and processes using head-on collisions of protons of extraordinarily high energy. The ATLAS experiment has recently observed a signal of the Higgs boson decaying into two tau particles, but this decay is a small signal buried in background noise.

The goal of the Higgs Boson Machine Learning Challenge is to explore the potential of advanced machine learning methods to improve the discovery significance of the experiment. No knowledge of particle physics is required. Using simulated data with features characterizing events detected by ATLAS, your task is to classify events into "tau tau decay of a Higgs boson" versus "background."

The winning method may eventually be applied to real data and the winners may be invited to CERN to discuss their results with high energy physicists.

References and sources

Training data

Import the training data:

training = Import["D:\\machinelearning\\higgs\\training\\training.csv", "Data"];
Dimensions[training]

{250001, 33}

Look at the data fields (they are described in the pdf link above). "EventId" should not be used as part of the training, since it has no predictive value. The last column "Label" is the classification (s=signal, b=background)

training[[1]]

{"EventId", "DER_mass_MMC", "DER_mass_transverse_met_lep", \ "DER_mass_vis", "DER_pt_h", "DER_deltaeta_jet_jet", \ "DER_mass_jet_jet", "DER_prodeta_jet_jet", "DER_deltar_tau_lep", \ "DER_pt_tot", "DER_sum_pt", "DER_pt_ratio_lep_tau", \ "DER_met_phi_centrality", "DER_lep_eta_centrality", "PRI_tau_pt", \ "PRI_tau_eta", "PRI_tau_phi", "PRI_lep_pt", "PRI_lep_eta", \ "PRI_lep_phi", "PRI_met", "PRI_met_phi", "PRI_met_sumet", \ "PRI_jet_num", "PRI_jet_leading_pt", "PRI_jet_leading_eta", \ "PRI_jet_leading_phi", "PRI_jet_subleading_pt", \ "PRI_jet_subleading_eta", "PRI_jet_subleading_phi", "PRI_jet_all_pt", \ "Weight", "Label"}

Sample vector:

training[[2]]

{100000, 138.47, 51.655, 97.827, 27.98, 0.91, 124.711, 2.666, 3.064, \ 41.928, 197.76, 1.582, 1.396, 0.2, 32.638, 1.017, 0.381, 51.626, \ 2.273, -2.414, 16.824, -0.277, 258.733, 2, 67.435, 2.15, 0.444, \ 46.062, 1.24, -2.475, 113.497, 0.00265331, "s"}

Set up a simple neural network (this can be tinkered with to improve the results):

net=NetInitialize[
NetChain[{
LinearLayer[3000],Ramp,LinearLayer[3000],Ramp,LinearLayer[2],SoftmaxLayer[]
},
"Input"->{30},
"Output"->NetDecoder[{"Class",{"b","s"}}]
]]

enter image description here

Set up the training data:

data = Map[Take[#, {2, 31}] -> Last[#] &, Drop[training, 1]];

Numerical vectors that each point to a classification (s or b):

RandomSample[data, 3]

{{87.06, 23.069, 67.711, 162.488, -999., -999., -999., 0.903, 4.245, 318.43, 0.523, 0.839, -999., 105.019, -0.404, 1.612, 54.943, -0.898, 0.855, 13.541, 1.729, 409.232, 1, 158.469, -1.363, -1.762, -999., -999., -999., 158.469} -> "b", {163.658, 55.559, 116.84, 50.019, -999., -999., -999., 2.855, 38.623, 130.132, 0.906, 1.359, -999., 51.993, 0.36, 2.142, 47.112, -0.932, -1.596, 22.782, 2.663, 191.557, 1, 31.026, -2.333, 0.739, -999., -999., -999., 31.026} -> "b", {100.248, 27.109, 60.729, 132.094, -999., -999., -999., 1.405, 10.063, 218.474, 0.541, 1.414, -999., 62.519, -0.401, -1.974, 33.817, -0.893, -0.657, 54.857, -1.298, 396.228, 1, 122.137, -2.369, 1.689, -999., -999., -999., 122.137} -> "s"}

Training

Length[data]

250000

{tdata,vdata}=TakeDrop[data,240000];
result=NetTrain[net,tdata,TargetDevice->"GPU",ValidationSet->Scaled[0.1],MaxTrainingRounds->1000]

enter image description here

DumpSave["D:\\machinelearning\\higgs\\higgs.mx", result];

Testing

This is the test data (unlabeled):

test = Import["D:\\machinelearning\\higgs\\test\\test.csv", "Data"];

Extract the validation data:

validate = Map[Take[#, {2, 31}] &, Drop[test, 1]];

Predictions made on the unlabeled data:

result /@ RandomSample[validate, 5]

{"b", "b", "s", "b", "b"}

Sample from the labeled data and compute classifier statistics:

cm = ClassifierMeasurements[result, RandomSample[vdata, 1000]]

enter image description here

cm["Accuracy"]

0.846

Plot the confusion matrix:

cm["ConfusionMatrixPlot"]

enter image description here

POSTED BY: Arnoud Buzing
7 Replies

enter image description here - Congratulations! This post is now Staff Pick! Thank you for your wonderful contributions. Please, keep them coming!

POSTED BY: EDITORIAL BOARD

Fantastic! Thanks for sharing! Is there also a video on this?

POSTED BY: Sander Huisman
Posted 8 years ago

Excellent post.

I see you are using TargetDevice->"GPU". I am shopping for a new desktop and thinking about getting one with a GPU. Can you discuss the run-time improvement associated with this problem and the GPU, and give as much detail about your hardware and operating system environment as you care to share.

POSTED BY: Alan Lewis

Alan,

I use an NVIDIA GeForce GTX 1080 card:

enter image description here

Speedups differ a lot for different training scenarios. For this example I am seeing about 2x speedup. The LeNet example in the NetTrain reference documentation shows a 10x speed-up, and the CIFAR-10 example is about 60x for me.

POSTED BY: Arnoud Buzing
Posted 8 years ago

Arnoud,

Thank you! I actually didn't realize I could use my existing NVIDIA video card for this. But now (after a driver update) I see I can.

POSTED BY: Alan Lewis

Good to hear! Keeping drivers up to date is always a good thing, especially with graphics drivers.

Just to satisfy my curiosity, what nVidia card do you have?

POSTED BY: Arnoud Buzing
Posted 8 years ago

It's quite old, as is my desktop system: a GeForce GTX 650 Ti. I am currently shopping for a new desktop system. If I actually go for a full-blown GPU co-processor card (in addition to one for the video display), any recommendations?

POSTED BY: Alan Lewis
Reply to this discussion
Community posts can be styled and formatted using the Markdown syntax.
Reply Preview
Attachments
Remove
or Discard

Group Abstract Group Abstract