
Train NN for regression of f(x,y;t_i(y))?

I have physics research that I'm working on. I have a function called the spectral function ($\sigma(s,P)$). It is a function of $s$ (invariant mass squared) and $P$ (center-of-mass momentum). The spectral function is the sum of an interacting part and a non-interacting part ($\sigma(s,P)=\sigma_{non}(s,P)+\sigma_{inter}(s,P)$). The spectral function has the same general shape for all momenta, but its parameters change as functions of momentum. For example: $$\sigma_{inter}(s,P)=\frac{A(P)\sqrt{s+P^2}\,\Gamma(P)}{\pi\left((s-M_{J/\Psi}^2(P))^2+(\sqrt{s+P^2}\,\Gamma(P))^2\right)}$$ You'll find in the notebook that the interacting part I'm actually trying to fit is more complicated than that, but it gives an idea of what I'm trying to do. I'm trying to use the NN to learn $A(P)$, $M_{J/\Psi}(P)$, $\Gamma(P)$, and the 8 other parameters as functions of momentum.
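
To make the shape concrete, here's a minimal Wolfram Language sketch of that Breit-Wigner-like interacting part at a single momentum; the function name and the numbers plugged in at the end are only illustrative, not the values from the notebook.

    sigmaInter[s_, p_, a_, gamma_, mJPsi_] :=
      (a Sqrt[s + p^2] gamma)/(Pi ((s - mJPsi^2)^2 + (Sqrt[s + p^2] gamma)^2))

    (* evaluate at s = 10 GeV^2, P = 1 GeV with made-up parameters *)
    sigmaInter[10., 1., 1., 0.1, 3.1]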

I'm doing this so that I can extend the table. Once the NN has learned the parameters as functions of $P$, I'm going to try to fit those functions and run the table out further. The next step for the spectral function is to calculate the spatial correlation function with it: $$G(z)=\int_{-P^2}^0 ds\int\frac{dP\cos(Pz)}{s+P^2}\sigma(s,P)$$ This correlation function converges in momentum as $1/P_{cutoff}^3$ and covers a range of 8 orders of magnitude, from $10^{-6}$ to $100$. This requires me to go out 200 TeV from the end of the full table to ensure the proper smoothness of $G(z)$.
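
For reference, here's a sketch of how I'd evaluate that correlation integral numerically once the table is extended; sigma below is a toy stand-in for an interpolation of the real table, and pMax is the momentum cutoff whose finiteness gives the $1/P_{cutoff}^3$ convergence.

    (* toy stand-in for the interpolated spectral-function table; the real one comes from the notebook *)
    sigma[s_, p_] := Sqrt[s + p^2] Exp[s/10.]/(1. + p^2);

    (* iterated NIntegrate with s-limits that depend on P, as in G(z) above *)
    corrG[z_?NumericQ, pMax_?NumericQ] :=
      NIntegrate[Cos[p z]/(s + p^2) sigma[s, p], {p, 0, pMax}, {s, -p^2, 0}]

    corrG[1., 50.]   (* example call with a 50 GeV cutoff *)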

I have all of the layers for the NN built up for the functional shape. Everything is working correctly except the training, which returns NetTrain::arrdiv as the error. It suggests I normalize my data to have a standard normal distribution. Given that this is a regression of a function (as opposed to messy population data), I don't understand how that's supposed to apply here (I do know what it's talking about, thanks to a Computerphile series). It makes some other suggestions, but I don't know how useful those might be: I've seen them before, and they weren't the solution then. That other time was for the simpler case that I started with. I was trying to get the training to converge by giving it other measures of the data (log(y), y', y''/y, and a few others) to highlight other features of the data, and the solution turned out to be reducing the redundancy of the NN.
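
In case it helps to reproduce the situation, here's a minimal sketch of the standardization that NetTrain::arrdiv is asking for, on toy 1-D data rather than the real spectral-function tables: rescale the targets to zero mean and unit variance before training, then undo the scaling on predictions. The little net and the variable names are made up for illustration.

    xs = Range[0., 10., 0.1];
    ys = 1.*^6 Exp[-xs];                          (* toy targets spanning many orders of magnitude *)
    m = Mean[ys]; sd = StandardDeviation[ys];
    data = Thread[(List /@ xs) -> (List /@ ((ys - m)/sd))];   (* standardized targets *)

    net = NetChain[{LinearLayer[32], Tanh, LinearLayer[1]}, "Input" -> 1];
    trained = NetTrain[net, data, MaxTrainingRounds -> 200];
    predict[x_] := sd First[trained[{x}]] + m     (* map back to physical units *)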

Here's the list of the things I've tried to do (a rough sketch of the first two cases follows the list):

  • Fit the sum of the non-interacting and interacting parts
  • Fit the list of the non-interacting and interacting parts
  • Fit the non-interacting and interacting parts independently
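
For anyone skimming the notebook, the difference between the first two cases is just where the two subnets get combined. Here's a rough Wolfram Language sketch with made-up stand-ins for the real non-interacting and interacting subnets (the real ones are in the attached notebook):

    (* illustrative stand-ins for the two subnets; each maps (s, P) to one value *)
    nonNet   = NetChain[{LinearLayer[16], Tanh, LinearLayer[1]}, "Input" -> 2];
    interNet = NetChain[{LinearLayer[16], Tanh, LinearLayer[1]}, "Input" -> 2];

    (* fit the sum of the two parts as a single target *)
    sumNet = NetGraph[{nonNet, interNet, TotalLayer[]},
      {NetPort["Input"] -> 1, NetPort["Input"] -> 2, {1, 2} -> 3}];

    (* fit the two parts as a length-2 list target *)
    listNet = NetGraph[{nonNet, interNet, CatenateLayer[]},
      {NetPort["Input"] -> 1, NetPort["Input"] -> 2, {1, 2} -> 3}];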

Things I could do but haven't because it is 2am:

  • Fit a function (such as the log or log(abs())) of the data under the 3 cases listed above
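
A one-line sketch of that transform, reusing the toy ys from the standardization sketch above (the offset is made up, just to keep Log away from zero):

    logTargets = Log[Abs[ys] + 1.*^-30];   (* fit these, then Exp the predictions to recover |y| *)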

The thing I think it is, and am slowly working through:

  • It appears that several of the subnets are generating the NetNaN error, which reports an underflow, overflow, or division by zero in a net evaluation. It should also add complex number generation (Log[x] for x < 0) to that list. A sketch of one possible guard is below.
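
Here's a rough sketch of the kind of guard I have in mind, with a made-up layer name and threshold: clamp the argument of Log away from zero and negative values inside an ElementwiseLayer so the subnet can't produce NaNs or complex numbers.

    safeLog = ElementwiseLayer[Log[Max[#, 1.*^-12]] &];
    safeLog[{1., 0., -2.}]    (* -> {0., -27.63, -27.63}, no NetNaN *)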

This is the culmination of a previous post about a simpler problem that I consider resolved at this point (https://community.wolfram.com/groups/-/m/t/1710236). I'm just looking for advice/solutions for getting the NN to train, as everything else appears to be in order.

Important note for following along: The original version of this notebook file expects tables that go out to 788 in j instead of the ones provided here, which stop at 288. This means the momentum is only valid up to 100 GeV. If it complains about being out of array bounds when generating data from the provided tables, this is why. I tried to fix that for you, but I might have missed something.

(Edited to add clarity that appears to not have been present when John Hendrickson read it the first time)

POSTED BY: Isaac Sarver

I'm doing this so I can extend the function a long way out. The full table ends at 600 GeV in momentum and 552.25 GeV^2 in s. I need it to go out to 200 TeV in momentum and maybe $400^2$ GeV^2 in s. I need to go this far because the next step is to stick it into an integral that doesn't like to converge (it converges as $1/P_{cutoff}^3$) and has a range of 8 orders of magnitude: $$\int \frac{ds\,dP\cos(Pz)}{s+P^2}\sigma(s,P)$$

The stuff about Context doesn't make sense to me, as I didn't ask about that. There's only one question here. I was just trying to explain what you're looking at when you open the notebook. (I found the math typeface stuff, so I'm about to clarify the original post.)

When I say it suggests a thing, that means it came from the error message, specifically NetTrain::arrdiv. This suggestion is included in the error message for divergent gradients during training. Sometimes error messages include hints about how to fix them. The last time I caused it, I was trying to get the training to converge by giving it other measures of the data (log(y), y', y''/y, and a few others) to fit, but the convergence problem was caused by excessive redundancy in the NN.

I did make a separate notebook to ensure that the idea was sound to start with and to figure out how it should be done (I'll attach it here for your convenience). That's the other thread that I point to. It is much simpler. The alternative is to do this trick by hand for each momentum. That's not too bad past 30 GeV, but it isn't so good below that.

Attachments:
POSTED BY: Isaac Sarver
