WOLFRAM COMMUNITY

NetDecoder[ ] failed to decode NetEncoder["AudioSTFT"] output?

John M.

Posted 3 years ago

POSTED BY: John M.

John M.

Posted 3 years ago

POSTED BY: John M.

Jérôme Louradour

Jérôme Louradour, Wolfram Research

Posted 3 years ago

Sorry, you should try `TransposeLayer[{2, 3, 1}]` in the discriminator instead of `TransposeLayer[{3, 2, 1}]` (which is the same as `TransposeLayer[1 <-> 3]`), and `TransposeLayer[{3, 1, 2}]` in the generator.

BTW, when I try to use your EXAMPLE2.nb, I don't understand how the dimensions can fit. I get this error:

NetInitialize[discriminator][Audio[File["ExampleData/car.mp3"]]]

During evaluation of In[17]:= NetChain::invindata3: Data supplied to port "Input" could not be encoded; "Function" encoder did not produce an output that was a 256*256*2 array of real numbers.

Out[17]= $Failed

This happens because the NetEncoder is not producing arrays of size {256,256,2} (the first dimension varies with the length of the signal):

Dimensions[enc[Audio[File["ExampleData/car.mp3"]]]]

Out[18]= {2693,256,2}

Do the audio signals you use (`FileNames["*.wav", NotebookDirectory[]]`) all have the same length? Also, do you understand why your "c" seems to be 2 while it's 1 in the paper?
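One way to get a fixed {256, 256, 2} array regardless of signal length is to trim or zero-pad the encoded array along its first dimension. The sketch below assumes the encoder output has the shape reported above and uses a random stand-in array instead of real audio. `fixFrames` is a hypothetical helper, not something from the notebook, and keeping only the first 256 frames is just one possible choice.

```wolfram
(* Stand-in for enc[audio]: 2693 frames x 256 bins x 2, as in Out[18] above *)
encoded = RandomReal[{-1, 1}, {2693, 256, 2}];

(* Hypothetical helper: keep at most n frames, then zero-pad short inputs up to n *)
fixFrames[arr_, n_Integer] :=
 N[PadRight[Take[arr, UpTo[n]], Join[{n}, Rest[Dimensions[arr]]]]]

(* A long signal is trimmed to 256 frames *)
Dimensions[fixFrames[encoded, 256]]
(* {256, 256, 2} *)

(* A short signal (100 frames) is padded with zeros up to 256 frames *)
Dimensions[fixFrames[RandomReal[1, {100, 256, 2}], 256]]
(* {256, 256, 2} *)
```

If something like this is used, it would have to run inside the encoding function so that every training example reaches the discriminator with the same dimensions.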

POSTED BY: Jérôme Louradour

Jérôme Louradour

Jérôme Louradour, Wolfram Research

Posted 3 years ago

Happy to see GANs with audio in the Wolfram Language :) Quick guess: can you try `TransposeLayer[{3, 1, 2}]` instead of `TransposeLayer[{1 <-> 3}]` and `TransposeLayer[{3 <-> 1}]`?

POSTED BY: Jérôme Louradour

John M.

Posted 3 years ago

For sure! I wish there were more examples of how to use `NetGANOperator[]` online, & I was excited when it was implemented.

I tried changing the `TransposeLayer[]`s from `{3 <-> 1}` to `{3, 1, 2}` & it gave this error:

NetChain::valfail: Validation failed for ConvolutionLayer: kernel size 4*4 cannot exceed input size 1*128 plus padding size 2*2.
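The numbers in this message are consistent with ordinary convolution size arithmetic. The sketch below assumes the encoder output has dimensions {256, 256, 2} and that `TransposeLayer` follows the same level convention as `Transpose`. `convOut` is a hypothetical helper implementing the usual output-size formula.

```wolfram
(* Assumption: the encoder output has dimensions {256, 256, 2} *)
dims = Dimensions[Transpose[ConstantArray[0, {256, 256, 2}], {3, 1, 2}]]
(* {256, 2, 256}: 256 "channels" on a 2 x 256 spatial grid *)

(* Hypothetical helper: convolution output size for input n, kernel k,
   stride s, and padding p on each side *)
convOut[n_, k_, s_, p_] := Floor[(n + 2 p - k)/s] + 1

(* Spatial size after the first ConvolutionLayer (kernel 4, stride 2, padding 1) *)
afterFirst = convOut[Rest[dims], 4, 2, 1]
(* {1, 128} *)

(* The next layer needs n + 2 p >= k in every spatial dimension *)
Map[# + 2*1 >= 4 &, afterFirst]
(* {False, True} *)
```

Under these assumptions the second convolution sees a 1 x 128 grid, and 1 plus a total padding of 2 is smaller than the kernel size 4, which matches the reported failure.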

Then I changed them from `{3 <-> 1}` to `{3, 2, 1}` & I could evaluate the nets, but I still got bad results from the generator after training. I even tried adjusting my parameters:

kern = {4, 4};
chan = 128;
α = 0.2;

& restructuring the generator & discriminator to follow the example more closely:

discriminator =
 NetChain[
  {
   TransposeLayer[{3, 2, 1}, "Input" -> {256, 256, 2}],
   ConvolutionLayer[chan, kern, "Stride" -> 2, PaddingSize -> 1],
   ParametricRampLayer[{}, "Slope" -> \[Alpha]],
   ConvolutionLayer[chan*2, kern, "Stride" -> 2, PaddingSize -> 1],
   ParametricRampLayer[{}, "Slope" -> \[Alpha]],
   ConvolutionLayer[chan*4, kern, "Stride" -> 2, PaddingSize -> 1],
   ParametricRampLayer[{}, "Slope" -> \[Alpha]],
   ConvolutionLayer[chan*8, kern, "Stride" -> 2, PaddingSize -> 1],
   ParametricRampLayer[{}, "Slope" -> \[Alpha]],
   ConvolutionLayer[chan*16, kern, "Stride" -> 2, PaddingSize -> 1],
   ParametricRampLayer[{}, "Slope" -> \[Alpha]],
   ConvolutionLayer[chan*32, kern, "Stride" -> 2, PaddingSize -> 1],
   ParametricRampLayer[{}, "Slope" -> \[Alpha]],
   ReshapeLayer[{4*4*128*32, 1}],
   LinearLayer[{}]
   },
  "Input" -> enc
  ]

generator =
 NetChain[
  {
   LinearLayer[{4096*4*4}],
   ReshapeLayer[{4096, 4, 4}],
   ElementwiseLayer["ReLU"],
   DeconvolutionLayer[chan*32, kern, "Stride" -> 2, PaddingSize -> 1],
   ElementwiseLayer["ReLU"],
   DeconvolutionLayer[chan*16, kern, "Stride" -> 2, PaddingSize -> 1],
   ElementwiseLayer["ReLU"],
   DeconvolutionLayer[chan*8, kern, "Stride" -> 2, PaddingSize -> 1],
   ElementwiseLayer["ReLU"],
   DeconvolutionLayer[chan*4, kern, "Stride" -> 2, PaddingSize -> 1],
   ElementwiseLayer["ReLU"],
   DeconvolutionLayer[chan*2, kern, "Stride" -> 2, PaddingSize -> 1],
   ElementwiseLayer["ReLU"],
   DeconvolutionLayer[2, kern, "Stride" -> 2, PaddingSize -> 1],
   ElementwiseLayer[Tanh],
   TransposeLayer[{3, 2, 1}]
   },
  "Input" -> 100,
  "Output" -> dec
  ]
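As a sanity check on the layer stacks above, the spatial sizes can be traced by hand using the standard output-size formulas for strided convolution and deconvolution. This is only a sketch: `convOut` and `deconvOut` are hypothetical helpers, and the 256 x 256 starting grid assumes the transposed encoder output is {2, 256, 256}.

```wolfram
(* Hypothetical helpers: input size n, kernel k, stride s, padding p per side *)
convOut[n_, k_, s_, p_] := Floor[(n + 2 p - k)/s] + 1
deconvOut[n_, k_, s_, p_] := (n - 1) s - 2 p + k

(* Discriminator: six stride-2 convolutions starting from a 256 x 256 grid *)
NestList[convOut[#, 4, 2, 1] &, 256, 6]
(* {256, 128, 64, 32, 16, 8, 4} *)

(* Generator: six stride-2 deconvolutions starting from a 4 x 4 grid *)
NestList[deconvOut[#, 4, 2, 1] &, 4, 6]
(* {4, 8, 16, 32, 64, 128, 256} *)

(* Flattened sizes used by ReshapeLayer and LinearLayer, with chan = 128 *)
{4*4*128*32, 4096*4*4}
(* {65536, 65536} *)
```

So the sizes themselves are mutually consistent. What this check cannot show is whether the two 256-sized axes are in the order the decoder expects.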

After training, though, the generator only generated noise. I'm certain it has something to do with the dimensions {256,256,2} getting switched around somewhere in the net, but I don't know where or how. In the MATLAB example, the `TransposeLayer[]` equivalents come at the opposite ends of the generator & discriminator (i.e., BEFORE the `DeconvolutionLayer[]`s in the generator & AFTER the `ConvolutionLayer[]`s in the discriminator). I tried building the nets that way, but I get errors & can't evaluate the cells with my `NetChain[]`s until I do it in reverse. The dimensions in the `NetChain[]` box are also the reverse of the way the dimensions are outlined in the paper, e.g.,

[image]
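The effect of each permutation tried in this thread can be checked on plain arrays with `Transpose`, assuming `TransposeLayer` uses the same level convention. The arrays below are stand-ins, not real encoder output.

```wolfram
arr = ConstantArray[0, {256, 256, 2}];

(* Dimensions produced by each permutation mentioned in the thread *)
(# -> Dimensions[Transpose[arr, #]]) & /@ {{3, 1, 2}, {3, 2, 1}, {2, 3, 1}}
(* {{3, 1, 2} -> {256, 2, 256},
    {3, 2, 1} -> {2, 256, 256},
    {2, 3, 1} -> {2, 256, 256}} *)

(* {3, 2, 1} and {2, 3, 1} only differ in the order of the two large axes,
   which shows up once those axes have different lengths *)
small = ArrayReshape[Range[24], {4, 3, 2}];
Dimensions[Transpose[small, {3, 2, 1}]]
(* {2, 3, 4} *)
Dimensions[Transpose[small, {2, 3, 1}]]
(* {2, 4, 3} *)

(* {3, 1, 2} is the inverse permutation of {2, 3, 1}, so the pair round-trips *)
Transpose[Transpose[small, {2, 3, 1}], {3, 1, 2}] === small
(* True *)
```

With a square 256 x 256 grid both `{3, 2, 1}` and `{2, 3, 1}` pass the dimension checks, so a swapped pair of axes would not raise any error.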

I'm sure it's just a simple transposition issue. Any tips would be greatly appreciated; I'd love to get this going in Mathematica, but there are obviously some details here I'm missing.

I've included an updated EXAMPLE notebook. Thanks.

Attachments: EXAMPLE2.nb

POSTED BY: John M.

Jérôme Louradour

Jérôme Louradour, Wolfram Research

Posted 3 years ago

POSTED BY: Jérôme Louradour

John M.

Posted 3 years ago

Attachments: EXAMPLE.nb

POSTED BY: John M.
