
Comments on paper about "Fourier feature networks"

Posted 3 years ago

So I stumbled upon this work: https://people.eecs.berkeley.edu/~bmild/fourfeat/

It is related to a previous paper about Sinusoidal Representation Networks (SIREN): https://vsitzmann.github.io/siren/

I had previously tried to implement SIRENs in Mathematica, and I wasn't convinced they were practical, at least without a GPU.

The Fourier feature networks paper differs mostly in that the authors encode position with sinusoidal functions only in the first layer of the network.
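As a rough sketch of that idea, the Gaussian mapping from the paper, γ(v) = [cos(2πBv), sin(2πBv)], could be written as follows. This is not the authors' code: the names bMat and fourierFeatures, the feature count m, and the scale sigma are all placeholders chosen for illustration.

```mathematica
(* Sketch of a Gaussian Fourier feature mapping; m and sigma are illustrative guesses *)
SeedRandom[1];
m = 64;       (* number of random frequencies (assumed) *)
sigma = 10;   (* standard deviation of the frequencies (assumed) *)

(* Fixed, untrained m x 2 matrix with entries drawn from N(0, sigma^2) *)
bMat = RandomVariate[NormalDistribution[0, sigma], {m, 2}];

(* Map a 2D coordinate v to 2 m features: cosines followed by sines *)
fourierFeatures[v_?VectorQ] := Join[Cos[2 Pi bMat . v], Sin[2 Pi bMat . v]]

Length[fourierFeatures[{0.25, 0.5}]]   (* 2 m = 128 *)
```

The matrix here is sampled once and never trained, and the features feed an otherwise ordinary network.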

Anyway, here is my simplified implementation in Mathematica:

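(* Load the source photo and crop it to a centered 512x512 square *)
fox = ImageCrop[
   Import[
    "https://live.staticflickr.com/7492/15677707699_d9d67acf9d_b.jpg"],
   {512, 512}
   ];

(* Train on a downsampled size x size copy *)
size = 128;
img = ImageResize[fox, size];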

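(* A Sin layer first, then depth - 1 Tanh hidden layers, then a 3-channel Ramp output.
   Each {n, f} pair acts as a small sub-chain: a linear layer with n outputs followed by f. *)
net[width_ : size, depth_ : 4] := NetChain[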
  {
   {width, Sin},
   ConstantArray[{width, Tanh}, depth - 1],
   {3, Ramp}
   }, 
  "Input" -> 2
  ]

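(* n coordinates centered on zero with unit spacing.
   n_Integer restricts the argument to integers; n_ : Integer would instead
   declare an optional argument whose default value is the symbol Integer. *)
f[n_Integer] := Range[-#, #, 1] &[(n - 1)/2]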

(* Fit the map from pixel coordinates to RGB values *)
trainedNet = NetTrain[
   net[size],
   Tuples[f /@ ImageDimensions@img] -> Flatten[ImageData@img, 1],
   MaxTrainingRounds -> 1000,
   BatchSize -> Quotient[size^2, 10]
   ];

(* Show the network's reconstruction next to the training image *)
ImageAssemble[
 {
  Image[Partition[trainedNet /@ Tuples[f /@ ImageDimensions@img], 
    size]],
  img
  }
 ]
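To make the training inputs concrete, the snippet below restates the coordinate helper under the hypothetical name grid and evaluates it for a few sizes. It is only an illustration of what Tuples hands to NetTrain, not part of the implementation itself.

```mathematica
(* Same construction as f above, under an illustrative name *)
grid[n_Integer] := Range[-#, #, 1] &[(n - 1)/2]

grid[5]            (* {-2, -1, 0, 1, 2} *)
grid[4]            (* {-3/2, -1/2, 1/2, 3/2} *)
MinMax[grid[128]]  (* {-127/2, 127/2} *)

(* Every coordinate pair for a 128 x 128 image: 128^2 = 16384 inputs *)
Length[Tuples[grid /@ {128, 128}]]
```

With size = 128 the inputs therefore run from -63.5 to 63.5 in unit steps, rather than being squeezed into a unit interval.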

A few points:

  • I don't normalize the input coordinates, as it seems to me that some amplitude is needed to get significant values at low frequencies.

  • I do the positional encoding simply by using Sin as the first activation function. It seems to me that this accomplishes essentially the same thing as what the authors do with their Gaussian random matrix. It is, after all, what was done in the SIREN paper (except that there it was done in all layers).

  • I don't get why the authors use a Ramp in the hidden layers. That seems like a terrible idea, as it makes these layers incapable of distinguishing collinear vectors (I realized this when I saw that the generated pictures exhibited some radial symmetry). Instead I used Tanh, which seems to train much better.

  • I use a Ramp in the output layer. I don't know why the authors use a logistic sigmoid.

  • I got a very good result at 128x128 with only a few minutes of training on a CPU.
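The collinearity remark about Ramp can be checked on a toy layer. The sketch below assumes a single bias-free layer with random weights; the names w, rampLayer, and tanhLayer are made up for illustration. Ramp is positively homogeneous, so scaling the input by a positive constant only scales the output. With nonzero biases, as in a trained network, this would hold only approximately.

```mathematica
SeedRandom[2];
w = RandomReal[{-1, 1}, {4, 2}];   (* hypothetical weights, no bias *)
rampLayer[x_] := Ramp[w . x]
tanhLayer[x_] := Tanh[w . x]

x = {0.3, -0.7};

(* Ramp: tripling the input triples the output, up to rounding error *)
Norm[rampLayer[3 x] - 3 rampLayer[x]] < 10^-12   (* True *)

(* Tanh: the same comparison leaves a nonzero residual *)
Norm[tanhLayer[3 x] - 3 tanhLayer[x]]
```

In the bias-free case, two inputs along the same ray from the origin give Ramp outputs that differ only by a scale factor, which is consistent with the radial patterns mentioned above. Tanh saturates and so breaks that proportionality.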

128x128 training result

POSTED BY: Lucien Grondin