Yes, there is an example, and the documentation does provide some assistance, but it's not complete enough. Look at NetTrain: it shows four potential forms of training data. The example in the documentation for LossFunction covers Case 2. There is no port explicitly named "Target"; rather, the kernel figures out which data is the target. But what about Cases 3 and 4, for example, where nothing is actually named "Target"? How do we hook up the network then?
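For instance, here is my best guess at the association form of the data, where every port is named explicitly so nothing has to be inferred; the tiny net and the numbers are my own invention, not from the documentation:

net = NetChain[{LinearLayer[2], SoftmaxLayer[]}, "Input" -> 2];
loss = CrossEntropyLossLayer["Probabilities"];
(* association form: keys name the ports; "Output" supplies the targets *)
trained = NetTrain[net,
  <|"Input" -> {{1, 2}, {2, 3}, {4, 2}},
    "Output" -> {{1., 0.}, {1., 0.}, {0., 1.}}|>,
  LossFunction -> loss]

If that guess is right, the target data for the attached loss layer comes from the entry whose key matches the net's output port name, but the documentation never walks through such a case.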
I find the documentation confusing on the specification of the loss function. Here's one example from the documentation.
loss = CrossEntropyLossLayer["Probabilities"];
trained = NetTrain[net,
  {{1, 2} -> {1., 0.}, {2, 3} -> {1., 0.}, {4, 2} -> {0., 1.},
   {3, 1} -> {0., 1.}, {2, 2} -> {0.5, 0.5}, {3, 3} -> {0.5, 0.5}},
  LossFunction -> loss]
So here we train on loss. But look at the next example: an additional port "Loss" (distinct from the lowercase layer "loss") has been added. Why do we need that? Why can't we train on "loss" as before?
lossNet = NetGraph[
  <|"net" -> net, "loss" -> ThreadingLayer[(#1 - #2)^2 &]|>,
  {{"net", NetPort["Target"]} -> "loss" -> NetPort["Loss"]}]
The next line of code in the documentation further complicates matters by not telling the user what exactly is being trained on. Are you training on "loss", on "Loss", or on something else? And, so far as I can figure out, there is no function in the System` context of 11.3 that reveals what loss function NetTrain is using or has used to do its training.
data = Flatten@Table[{x, y} -> Exp[-(x^2 + y^2)],
   {x, -2, 2, .01}, {y, -2, 2, .01}];
trainedLossNet = NetTrain[lossNet, data]
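If I had to guess, the implicit choice can be made explicit by naming the output port to be minimized, something like the following; but this is my guess, not anything the page says:

(* explicitly designate the "Loss" output port as the quantity to minimize *)
trainedLossNet = NetTrain[lossNet, data, LossFunction -> "Loss"]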
Perhaps some assistance can be found in the tutorial here, in which you make clear that you are training on the final output ("Loss" in the above example, "WeightedLoss" in that one), but there is no clear explanation of why this additional layer needs to be added.
The documentation for NetTrain reads: "When specifying target outputs using the specification Subscript[port, i]->{Subscript[data, i1],Subscript[data, i2],[Ellipsis]}, any provided custom loss layers should take the Subscript[port, i] as inputs in order to compute the loss." Does this mean you do it in the second argument to NetGraph? Is it some sort of optional argument to the layer that is an entry point in the custom loss function? An example or two would definitely help.
The documentation for NetTrain reads "When loss layers are automatically attached by NetTrain to output ports, their "Target" ports will be taken from the training data using the same name as the original output port." Honestly, I have read this numerous times and I still don't know what it means. This stuff is unavoidably complicated and is difficult to translate from programming language into English. Examples can really help clarify matters.
So, please, trust me. I've read the documentation. Maybe not perfectly, but quite a bit. And if a pretty experienced Wolfram Language user struggles mightily to figure it out -- and cares enough to write lengthy posts on the subject -- there is at least some possibility that the issue lies at least in part with documentation that needs improvement. I hear from Taliesin Beynon in another post that in fact there is work being done on that front. Good! I can only encourage you to try as hard as you can to imagine matters from a user's perspective. As the NeuralNetworks infrastructure matures please provide, as Wolfram generally does so well, examples and explanations that tackle the borderline and confusing cases in ways that bring clarity.
Note:
Here's an example of a documentation issue. In another post I suggested that one needed to use a function from the NeuralNetworks` context in order to extract connectivity information from a NetGraph. A developer helpfully noted that one could instead use EdgeList from the System` context to do so. Great. But the documentation for EdgeList indicates that it works on Graph objects, not on NetGraph objects, and the documentation for NetGraph says nothing about the use of EdgeList.
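For what it's worth, the developer's suggestion does appear to be as simple as:

(* extract the connectivity of a NetGraph, per the developer's suggestion *)
EdgeList[lossNet]

which, if I understand the developer correctly, returns the connections; but a user reading only the documentation would never discover this.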