Message Boards

Improving accuracy of neural network for determining qubit rotation angle

Posted 2 months ago

The physics example problem (to illustrate the use of a basic neural network in Mathematica) I am looking at is a qubit rotated about the y-axis, where the rotation angle is discretized as $\theta_j \in [0, \pi]$. The setup involves the y-rotated qubit measured in the z-basis (hence spin-up and spin-down projector measurements). This scenario involves first analytically determining the measurement outcome probabilities as a function of the rotation angle $\theta$, then generating measurement outcomes for training at specific fixed rotation angles $\theta_j$. After generating another set of test measurement data for some fixed rotation angle $\theta$, we use the neural network to infer the most probable rotation angle. My training data involves generating m = 1000 total measurements for each discrete rotation angle $\theta_j \in [0, \pi]$, then saving the measurement outcomes as tuples of spin-up and spin-down counts for each discrete angle. These outcomes are associated with the discrete $\theta_j$ values, encoded as one-hot vectors (hence training data of the form {1000,0} -> {1,0,0,0,0...} if for the first rotation angle we get all spin-up outcomes).
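For concreteness, here is a minimal sketch of this data-generation step, assuming the standard Born-rule probabilities for a qubit prepared in $|0\rangle$, rotated by $R_y(\theta)$, and measured in the z-basis, i.e. $P(\text{up}) = \cos^2(\theta/2)$; all names (and the choice of 50 angles) are illustrative rather than taken from the attached code:

    (* Simulate m z-basis measurements at each of nAngles discrete
       rotation angles; the targets are one-hot vectors. *)
    m = 1000;                                (* measurements per angle *)
    nAngles = 50;                            (* number of discrete angles *)
    angles = Subdivide[0., Pi, nAngles - 1]; (* theta_j in [0, Pi] *)
    trainingData = Table[
      With[{nUp = RandomVariate[BinomialDistribution[m, Cos[angles[[j]]/2]^2]]},
       {nUp, m - nUp} -> UnitVector[nAngles, j]],
      {j, nAngles}];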

The idea is that after training, setting some true rotation angle $\theta$ and generating a new set of test measurement outcomes, the trained neural network should output a probability distribution whose most likely discrete rotation angle is the true angle. The code below works, but I am having difficulty improving the accuracy without simply increasing the layers and MaxTrainingRounds (this seems to have its limits in improving accuracy). Can anyone advise on how to improve the accuracy of the code in determining the correct discrete rotation angle (I would like to maintain the general framework of the code)? I am very new to using Mathematica for machine learning applications, hence the query. Thanks for any assistance; this is the code in question:

POSTED BY: Byron Alexander
8 Replies
Posted 2 months ago
Attachments:
POSTED BY: Sangdon Lee
Posted 2 months ago
POSTED BY: Sangdon Lee
Posted 2 months ago

@SangdonLee Thanks, all the discussions are very helpful. The physics problem that I am considering, for learning purposes at this stage, is the problem of a rotating qubit. I consider a qubit rotated about its y-axis by an angle $\theta \in [0, \pi]$. For training we consider the z measurements of spin-up and spin-down for discretized angles in $[0, \pi]$. Hence the training result {9989, 11} -> "1" indicates that I obtain 9989 spin-up results and 11 spin-down results when measuring the qubit at the first discretized rotation angle. Since I discretize the range $[0, \pi]$ into 50 intervals, I obtain 50 such training data results. Have a look at my final code and let me know what you think of my attempt to model this example (not sure how familiar you are with quantum mechanics?). You can advise if anything is unclear in the code or example.
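As a rough illustration of this data layout (not the attached code itself), the following sketch generates the 50 labeled count pairs; the total of 10^4 measurements per angle is inferred from the example pair {9989, 11}:

    (* 50 discretized angles in [0, Pi]; spin-up/spin-down counts per
       angle, with string labels "1".."50" as class names. *)
    m = 10^4;
    angles = Subdivide[0., Pi, 49];
    trainingData = MapIndexed[
      With[{nUp = RandomVariate[BinomialDistribution[m, Cos[#1/2]^2]]},
        {nUp, m - nUp} -> ToString@First[#2]] &,
      angles];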

Attachments:
POSTED BY: Byron Alexander
Posted 2 months ago

The syntax looks correct, although the training and testing data sets are usually split 70%/30%, that is, ValidationSet -> Scaled[.3]. You can also split the input data into a training dataset and a validation dataset yourself, e.g., ValidationSet -> myValidationSet. This way, you can apply the NetMeasurements function to compute various measurements for your validation set.
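A minimal sketch of that explicit split, assuming the trainingData and net from the attached notebook and the 70/30 rule of thumb mentioned above:

    (* Hold out 30% of the data, train on the rest, then score the
       held-out set with NetMeasurements. *)
    {trainSet, validSet} =
      TakeDrop[RandomSample[trainingData], Round[0.7 Length[trainingData]]];
    trained = NetTrain[net, trainSet, ValidationSet -> validSet];
    NetMeasurements[trained, validSet, "Accuracy"]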

I noticed that you are encoding the one-hot vectors as numbers, not "strings"; think about whether strings would make more sense. If strings are used, then your problem becomes classification, and your net has to be modified, especially the last layer (see the sketch after the list below).

  • Number? e.g., {1,0,0,0,0.......} to 1, {0,1,0,0,0.......} to 2,
  • String? e.g., {1,0,0,0,0.......} to "1", {0,1,0,0,0.......} to "2",
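For instance, a rough sketch of the classification variant (trainingData is assumed from the notebook, layer sizes and the 50 classes are illustrative; the "Class" decoder makes the net return a label, with per-class probabilities available on request):

    (* Classification variant: string labels "1".."50" and a softmax
       output in place of a single numeric output. *)
    labels = ToString /@ Range[50];
    trainingDataC = Thread[trainingData[[All, 1]] -> labels];
    netC = NetChain[
      {LinearLayer[50], ElementwiseLayer["ReLU"], LinearLayer[50], SoftmaxLayer[]},
      "Input" -> 2,
      "Output" -> NetDecoder[{"Class", labels}]];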

Adding more hidden layers does not necessarily increase prediction accuracy, as demonstrated in Stephen Wolfram's blog post: https://writings.stephenwolfram.com/2024/03/can-ai-solve-science/.

POSTED BY: Sangdon Lee
Posted 2 months ago

@SangdonLee Many thanks for your response, just one query, when I try to evaluate NetMeasurements using the following code:

validationData = trainingData2b;  (* trainingData2b is assumed to be the training set itself *)
accuracy = NetMeasurements[trainedNet, validationData, "Accuracy"]
precision = NetMeasurements[trainedNet, validationData, "Precision"]

I obtain the following strange results, the first for accuracy and the second for precision:
1.
<|1 -> 1., 2 -> 1., 3 -> 1., 4 -> 1., 5 -> 1., 6 -> 1., 7 -> 1., 
 8 -> 1., 9 -> 1., 10 -> 1., 11 -> 1., 12 -> 1., 13 -> 1., 14 -> 1., 
 15 -> 1., 16 -> 1., 17 -> 1., 18 -> 1., 19 -> 1., 20 -> 1., 21 -> 1.,
  22 -> 1., 23 -> 1., 24 -> 1., 25 -> 1., 26 -> 1., 27 -> 1., 
 28 -> 1., 29 -> 1., 30 -> 1., 31 -> 1., 32 -> 1., 33 -> 1., 34 -> 1.,
  35 -> 1., 36 -> 1., 37 -> 1., 38 -> 1., 39 -> 1., 40 -> 1., 
 41 -> 1., 42 -> 1., 43 -> 1., 44 -> 1., 45 -> 1., 46 -> 1., 47 -> 1.,
  48 -> 1., 49 -> 1., 50 -> 1.|>

Firstly, I don't think the accuracy could be 1. Secondly, I would have expected some real number between 0 and 1 for the precision (instead I get this strange output). Do you have any idea what is going on here?

POSTED BY: Updating Name
Posted 2 months ago

The following are merely suggestions.

  • The inputs are a 100 x 2 matrix (i.e., only 2 input parameters), but the outputs are a 100 x 100 matrix (a vector of 100 values per sample due to the one-hot vectors), which is huge compared to the 2 predictors. It might be helpful to change the one-hot coding to a number or string, e.g., {1,0,0,0,0.......} to 1, {0,1,0,0,0.......} to 2, etc.

    (* Re-encode the one-hot targets as the numbers 1..100 (regression). *)
    trainingData2 = Thread[trainingData[[All, 1]] -> Range[100]]
    (* A small regression net: a single numeric output instead of 100. *)
    net = NetChain[{LinearLayer[50], ElementwiseLayer["ReLU"],
        LinearLayer[50], ElementwiseLayer["ReLU"], LinearLayer[1]}];
    
  • The "net" has only very few hidden layers and it might be helpful to increase the number of hidden layers. Adding more layers does not necessarily increase the accuracy thus apply other layers such as batch normalization layer (adding which layer may not be straightforward)

  • The last 4 samples of the "trainingData" have the same input values but different output values, which does not make sense.
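A hedged sketch of the batch-normalization suggestion above, assuming the regression setup from the earlier code (layer sizes are illustrative, not from the attached notebook):

    (* Regression net with BatchNormalizationLayer inserted between the
       linear layers and their activations; sizes are illustrative. *)
    netBN = NetChain[{
        LinearLayer[50], BatchNormalizationLayer[], ElementwiseLayer["ReLU"],
        LinearLayer[50], BatchNormalizationLayer[], ElementwiseLayer["ReLU"],
        LinearLayer[1]},
      "Input" -> 2];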
POSTED BY: Sangdon Lee
Posted 2 months ago

@SangdonLee Thanks for your response. Having implemented your suggestions, I do note an improvement. One query: do you maybe know how to incorporate the ValidationSet option of NetTrain into this type of neural network to increase the accuracy and prevent overfitting? I highlighted in purple my attempt at including ValidationSet. It does run, but I don't think it is set up in an optimal way. I attach the revised notebook.
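One plausible way to wire this in, as a sketch (the option values are illustrative, and the stopping criterion is optional but helps against overfitting):

    (* Hold out 30% of the training data for validation and stop
       training when the validation loss stops improving. net and
       trainingData2 are assumed from the notebook. *)
    trained = NetTrain[net, trainingData2,
      ValidationSet -> Scaled[0.3],
      TrainingStoppingCriterion -> <|"Criterion" -> "Loss", "Patience" -> 10|>];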

POSTED BY: Byron Alexander