Writing and Training a NN with an arbitrary (NIntegrate) function layer

6/30: I have programmed the arbitrary layer that I need in terms of standard layers. It wasn't easy, but it's done. It requires 49 layers when it gets flattened out. I'll tell you, it calls for one of the prime rules of programming: if you do something more than once, or think of it in steps, encapsulate it in a function (sublayer). Dispersion Relation.nb is that layer. If you can find a better way to do it, I'd love to hear it. I noticed that ConstantArrayLayer is trainable. I suppose that means I need to embed the layer I just finished into my loss function in order to keep NetTrain from modifying it, don't I?
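A minimal sketch of an alternative to hiding the fixed layer in the loss function: freeze it with NetTrain's LearningRateMultipliers option. The toy net, layer names, and data below are purely illustrative (not the actual dispersion layer), but they show a ConstantArrayLayer staying fixed while the rest of the graph trains:

toy = NetGraph[<|
    "black" -> LinearLayer[3, "Input" -> 1],
    "const" -> ConstantArrayLayer[{1., 2., 3.}], (* trainable by default *)
    "mix" -> ThreadingLayer[#1*#2 &]|>,
  {"black" -> NetPort["mix", "1"], "const" -> NetPort["mix", "2"]}];
toyData = <|"Input" -> RandomReal[1, {100, 1}], "Output" -> RandomReal[1, {100, 3}]|>;
frozen = NetTrain[toy, toyData, MaxTrainingRounds -> 100,
  LearningRateMultipliers -> {"const" -> None}]; (* None (or 0) disables updates for "const" *)
NetExtract[frozen, {"const", "Array"}] (* still {1., 2., 3.} *)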

6/27: After much futzing, I found the easy way to make the arbitrary function with ThreadingLayer[] and how to make the whole thing work the way I needed. I found that, in this case, I just needed the data around the strongest feature and not to overcomplicate the NN. I went from NetChain[{150, Tanh, 150, Tanh, 3}] to NetChain[{20, Tanh, 20, Tanh, 3}]. In piecing this together, I think I went from 22,500 quadratics pieced together into 3 equations down to 400 quadratics. I wasn't having a problem with over-fitting (which, with an exact function as the starting point, is an absurd idea). I was having a problem with convergence because there was just way too much redundancy.

Now for the real trick: I need a layer that is defined by a numeric integral. Looking at the layers available, it doesn't appear that is a thing. But if anyone knows of a good way to do that without invoking many layers, that would be amazing.
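One fallback I can sketch (with the caveat that it assumes a fixed integration range and node spacing, which may not match the real problem) is to replace the NIntegrate with a fixed-grid quadrature built from standard layers: evaluate the integrand on a constant grid of s-values with a ThreadingLayer, sum with a SummationLayer, and scale by the step size. The grid, the integrand (the same functional form used below), and the layer names are all illustrative:

n = 201; grid = Range[0., 20., 0.1]; (* fixed s-nodes; range and spacing are assumptions *)
IntegralLayer = NetGraph[<|
    "A" -> NetChain[{PartLayer[1], ReplicateLayer[n]}],
    "\[Gamma]" -> NetChain[{PartLayer[2], ReplicateLayer[n]}],
    "M" -> NetChain[{PartLayer[3], ReplicateLayer[n]}],
    "s" -> ConstantArrayLayer[grid], (* trainable by default, so freeze it during NetTrain *)
    "integrand" -> ThreadingLayer[#2*#3/(\[Pi]*((#1 - #4^2)^2 + #3^2)) &],
    "sum" -> SummationLayer[],
    "\[CapitalDelta]s" -> ElementwiseLayer[0.1 # &]|>,
  {NetPort["Input"] -> "A", NetPort["Input"] -> "\[Gamma]", NetPort["Input"] -> "M",
   "s" -> NetPort["integrand", "1"], "A" -> NetPort["integrand", "2"],
   "\[Gamma]" -> NetPort["integrand", "3"], "M" -> NetPort["integrand", "4"],
   "integrand" -> "sum" -> "\[CapitalDelta]s"},
  "Input" -> 3]
IntegralLayer[{.5, .2, 3.1}] (* roughly the Riemann sum of the integrand over s from 0 to 20 *)

It still costs a handful of layers per integral (and the "s" grid would need LearningRateMultipliers -> {"s" -> None} to stay fixed), so a leaner route would still be welcome.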

6/24: I think I may have misunderstood the point of the NetEncoder[] and NetDecoder[] functions. I've gotten by on this simpler fit by using a ThreadingLayer[] that won't be stripped off of either the input of the loss function or the output of the network. But I don't think that solves my overall problem: I need an NIntegrate[] and a compiled function to make the final problem work correctly. I'm sure those things can be built from the standard layers, but that still sounds like hell and really long training times.

Here's what I was trying to stuff into the NetDecoder[], since its custom function doesn't have restrictions like ThreadingLayer[]:

CustomFunction = NetGraph[<|
    "s" -> PartLayer[1],
    "A" -> PartLayer[2],
    "\[Gamma]" -> PartLayer[3],
    "MJ\[CapitalPsi]" -> PartLayer[4],
    "Function" -> ThreadingLayer[#2*#3/(\[Pi]*((#1 - #4^2)^2 + #3^2)) &, "Inputs" -> 4]|>,
  {NetPort["Input"] -> "s" -> NetPort["Function", "1"],
   NetPort["Input"] -> "A" -> NetPort["Function", "2"],
   NetPort["Input"] -> "\[Gamma]" -> NetPort["Function", "3"],
   NetPort["Input"] -> "MJ\[CapitalPsi]" -> NetPort["Function", "4"]}]

At this point, something seems wrong and I could use some advice. The NN that does A(P), gamma(P), MJPsi(P), trained off the functions of the parameters, converged very quickly to good fits. I don't have functions of the parameters to train off in the final problem; I only have f(s,P). Training off f(s,P) is 1) going slowly (5000 rounds in "5 hours" instead of 10 minutes), and 2) not converging (no progress in 300 rounds instead of instant improvement). Does anyone know how to make it converge, and in a timely manner? Parameter Function Fit.nb is the first case and is there to show that the NN can hold a fit of the 3 functions. Function Fit.nb is the one that is a step closer to the final problem, where I fit the parameters as if they are unknown except for their effect on the final f(s,P).

6/23: I'm trying to train a NN that will fit a given f(s,P) as f(s,A(P),gamma(P),M(P)). The reason for this is that it is the simpler version of this problem: I know that f(s,P) has a certain form as a function of s and that its parameters are functions of P.

At this point, I think I'm ready to train it, but I'm generating the error NetTrain::invindim3. I searched the forum and I think I found this related post (https://community.wolfram.com/groups/-/m/t/1262168). In the replies, the OP comes across a similar problem and is told he needs a SoftmaxLayer before the output decoder so NetTrain doesn't munch something. My NetDecoder is a custom function representing the functional form I know is there. I suppose I need to make a training network to prevent NetTrain from munching my output layer, but if there are warnings, hazards, other solutions, or advice I should be aware of, please tell me. I trained the NN to reproduce the parameters as functions of P; I previously trained it directly off the parameters. That is cheating for the final problem, but I can get it into the network and reproduce the final function correctly, so I know the NN is working correctly. I just need to train the whole thing against my data before I try the final problem.

To get your notebook ready to replicate the error (I understand that the custom decoder is a version 12 feature):

A[P_]:=.5/Sqrt[1.2^2+P^2]+.2
\[Gamma][P_]:=.05+(.6Log[2.1 .194 5])/Log[(P^2+(2.1 .194)^2)25]
MJ\[CapitalPsi][P_]:=3.04-.5Exp[-((P-5)^2/5)]
\[Sigma][s_,P_]:=(A[P]\[Gamma][P])/(\[Pi]((s-MJ\[CapitalPsi][P]^2)^2+\[Gamma][P]^2))
Data=Flatten[RandomSample[Table[{{s,P},\[Sigma][s,P]},{s,0,20,.1},{P,0,30,.1}]],1];
TrainingData=<|"Input"->Data[[;;50000,1]],"Output"->Data[[;;50000,2]]|>;
ValidationData=<|"Input"->Data[[50001;;,1]],"Output"->Data[[50001;;,2]]|>;
BlackBox=NetChain[{150, Tanh, 150, Tanh, 3}]
Decoder=NetDecoder[{"Function",(#[[2]]#[[3]])/(\[Pi]((#[[1]]-#[[4]]^2)^2+#[[3]]^2))&}]
EmbededBlackBox = NetGraph[<|
    "BlackBox" -> BlackBox,
    "Join" -> PrependLayer[],
    "s" -> PartLayer[1],
    "P" -> PartLayer[2]|>,
  {NetPort["Input"] -> "P" -> "BlackBox" -> NetPort["Join", "Input"],
   NetPort["Input"] -> "s" -> NetPort["Join", "Element"]},
  "Input" -> 2, "Output" -> Decoder]

This is the line where everything falls apart:

result=NetTrain[EmbededBlackBox,TrainingData,All,ValidationSet->ValidationData,MaxTrainingRounds->5000,TrainingProgressMeasurements->"MeanSquare"]

NetTrain::invindim3: Data provided to port "Output" should be a non-empty list of length-4 vectors of real numbers, but was a length-50000 vector of real numbers.

I understand the custom decode layer may be unusual, but it can't be replaced with standard layers without a great deal of pain, as it will become a function that includes an NIntegrate or a related custom GPU kernel in the next step.
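For reference, here is a sketch of the workaround described in the 6/27 update above: put the functional form inside the trained graph as a ThreadingLayer instead of in a custom NetDecoder, so the net's Output port is the scalar \[Sigma] value that TrainingData actually contains. The layer names and the smaller NetChain are just how I've sketched it here:

InsideBlackBox = NetGraph[<|
    "s" -> PartLayer[1],
    "P" -> PartLayer[2],
    "BlackBox" -> NetChain[{20, Tanh, 20, Tanh, 3}],
    "A" -> PartLayer[1],
    "\[Gamma]" -> PartLayer[2],
    "MJ\[CapitalPsi]" -> PartLayer[3],
    "Function" -> ThreadingLayer[#2*#3/(\[Pi]*((#1 - #4^2)^2 + #3^2)) &]|>,
  {NetPort["Input"] -> "s" -> NetPort["Function", "1"],
   NetPort["Input"] -> "P" -> "BlackBox",
   "BlackBox" -> "A" -> NetPort["Function", "2"],
   "BlackBox" -> "\[Gamma]" -> NetPort["Function", "3"],
   "BlackBox" -> "MJ\[CapitalPsi]" -> NetPort["Function", "4"]},
  "Input" -> 2];
result = NetTrain[InsideBlackBox, TrainingData, All,
  ValidationSet -> ValidationData, MaxTrainingRounds -> 5000]

This doesn't touch the NIntegrate step; that is where something like the quadrature sketch above would have to slot in for "Function".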

POSTED BY: Isaac Sarver