
Discussions
Sorry, you should try `TransposeLayer[{2, 3, 1}]` in the discriminator instead of `TransposeLayer[{3, 2, 1}]` (which is the same as `TransposeLayer[1 <-> 3]`). And `TransposeLayer[{3, 1, 2}]` in the generator. BTW, when I try to use your...
It's true that the example in section "Train a classifier model with the subword embeddings" should be completed to show how to "reconstruct" the full classifier and apply it. The part that is missing for you is the following: Extract the...
I confirm that there is currently no way to plug a custom optimizer into NetTrain. We will check how to support more optimizers, like Adamax. However, your "divergence" problem is not due to the optimization algorithm, as you discovered in the...
@Mike Sollami Thank you for digging into this. In order to make Sabrina's code work again in v13, you can use the following definitions (before evaluating the layers/networks): `ScalarTimesLayer[s_] := ElementwiseLayer[s*#&]` ...
This is a bug (memory leak) in `NetEncoder["Image"]` when the inputs are files. The bug is present in 12.3 and 13.0. It was fixed in 13.0.1. Unfortunately, I don't know of any workaround.
A bug was found in BERT's pre-processing when the front-end language is not English (for instance, when it is Chinese). We will update BERT to fix it (and people who already downloaded the model will have to re-download it after having...
When the net specified in LossFunction produces an array (not a scalar number), a SummationLayer[] is automatically added, so the total of the numbers in the output array is used as the loss. Thanks for spotting this; we will add a note in the...
The thing is that the batch losses during training were NOT computed with the final trained net, but with a partially trained network, which is different for every training batch. You can get the partially trained network using the association key...
You are referring to the right tutorial page (tutorial/NeuralNetworksSequenceLearning#1680168479 : subsection "Integer Addition with Variable-Length Output" of "Sequence-to-Sequence Learning"). So you should also define an architecture in two...
Indeed, things changed in 12.2 with respect to the order of NetGraph's ports. The rule is the following: the ports are in the order in which they appear in the list of edges. Note that some ports, like NetPort["Input"] (or NetPort["Output"]), can be...