GROUPS: Computer Science, Data Science, Neural Networks

Posted 2 months ago

Maybe the theorem fails on a set of measure zero? I have three nested functions, each quadratic in its inputs. A standard multi-layer NN (linear layers plus logistic activations) trained with the ADAM optimizer can't seem to minimize the loss, or else it gets stuck in a local minimum. I've tried varying the number of layers and their widths, to no avail. What's going on?
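
For readers without the attached file, here is a minimal sketch of the kind of target described: three nested functions, each quadratic in its input. The coefficients, the scalar input, and the sampling interval are all hypothetical, since the post does not give the actual functions. The sketch only shows two checkable facts about such a composition: it is a degree-8 polynomial, and its values can span several orders of magnitude on a modest interval, so the scale of the targets may be worth inspecting before training.

```python
import numpy as np

# Hypothetical coefficients (highest power first); not taken from the post.
p1 = np.poly1d([1.0, 0.5, -1.0])   # innermost:  x**2 + 0.5*x - 1
p2 = np.poly1d([0.5, -1.0, 0.25])  # middle:     0.5*u**2 - u + 0.25
p3 = np.poly1d([2.0, 0.0, 1.0])    # outermost:  2*v**2 + 1

# Calling a poly1d on another poly1d returns their composition.
composed = p3(p2(p1))
degree = composed.order            # each quadratic doubles the degree: 2*2*2

# Sample the nested target on an assumed input interval.
x = np.linspace(-3.0, 3.0, 601)
y = p3(p2(p1(x)))

print("degree of composition:", degree)
print("target range:", y.min(), "to", y.max())

# Standardized copy of the targets (zero mean, unit variance),
# a common preprocessing step when the raw range is this wide.
y_std = (y - y.mean()) / y.std()
```

Whether the real functions behave like this depends on their coefficients and input range, so treat the numbers printed here as illustrative only.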

POSTED BY: Iuval Clejan

Posted 1 month ago

I get better behavior when I change the activation function to ReLU. Generalization is still awful, but that's another matter.
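
One possible reason the activation matters, offered as a guess rather than a diagnosis of the attached file: the derivative of the logistic function never exceeds 0.25 and is nearly zero for large inputs, whereas the ReLU derivative is exactly 1 for any positive input. The sketch below just evaluates both derivatives; the sample inputs and the depth of 5 are arbitrary choices.

```python
import numpy as np

def logistic(z):
    """Logistic (sigmoid) function 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-np.asarray(z, dtype=float)))

def logistic_grad(z):
    """Derivative of the logistic function: s * (1 - s)."""
    s = logistic(z)
    return s * (1.0 - s)

def relu_grad(z):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return (np.asarray(z, dtype=float) > 0).astype(float)

z = np.array([-10.0, -2.0, 0.0, 2.0, 10.0])  # arbitrary pre-activations
print("logistic':", logistic_grad(z))
print("relu':    ", relu_grad(z))

# Upper bound on the product of activation derivatives along one path
# through `depth` logistic layers (weights are ignored here).
depth = 5
logistic_path_bound = 0.25 ** depth
print("best-case logistic factor over", depth, "layers:", logistic_path_bound)
```

The bound covers only the activation derivatives; the weights also multiply into the gradient, so this is a partial picture at best.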

POSTED BY: Iuval Clejan

Posted 2 months ago

Just run the file and you'll see that the loss stays constant (and high). This doesn't occur if even one of the intermediate functions is linear; in that case the algorithm reduces the loss to a small number.
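
A quick check that may help interpret a loss that stays constant and high, assuming the loss is mean squared error (the post does not say which loss is used): a network that outputs the same value for every input can do no better than the variance of the targets. If the plateau sits at that value, the network has probably collapsed to a constant output. The helper names and the tolerance below are hypothetical.

```python
import numpy as np

def constant_baseline_mse(y):
    """MSE of the best constant predictor, which is the mean of y."""
    y = np.asarray(y, dtype=float)
    return float(np.mean((y - y.mean()) ** 2))

def is_stuck_at_baseline(loss, y, rtol=0.05):
    """True if `loss` is within a relative tolerance of the constant baseline."""
    baseline = constant_baseline_mse(y)
    return abs(loss - baseline) <= rtol * baseline

# Toy targets, only to show the calls.
y_toy = np.array([1.0, 2.0, 3.0, 4.0])
baseline = constant_baseline_mse(y_toy)
print("baseline MSE:", baseline)
print("loss 1.24 looks stuck:", is_stuck_at_baseline(1.24, y_toy))
print("loss 0.10 looks stuck:", is_stuck_at_baseline(0.10, y_toy))
```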

POSTED BY: Updating Name
