How to implement a custom regularisation for a given LinearLayer?

Posted 3 years ago

Hi everybody,

I would like to implement a custom norm regularisation on the weights of a LinearLayer. Let's suppose, for ease of discussion, that I want to implement an L1 regularisation. In this case my objective is to minimise a loss of the form: (loss of task) + Total[Abs[weights]].
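
Written out with an explicit regularisation strength λ (a hyperparameter I would eventually want to tune; the expression above corresponds to λ = 1), the objective is:

totalLoss = taskLoss + λ Total[Abs[Flatten[weights]]]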

I tried to play around with NetArray, without much success. Here's an example:

net = NetGraph[
  Association[
   (* linear layer whose weight matrix should be the shared array "c" *)
   "linear" -> LinearLayer[1, "Weights" -> NetArray["c"]],
   (* L1 penalty computed from what I intend to be the same shared array *)
   "reg" ->
    FunctionLayer[{Total[
        Abs[NetArray[<|"Name" -> "c", "Dimensions" -> 100|>]]]} &],
   (* add the penalty to the layer output *)
   "thread" -> ThreadingLayer[#1 + #2 &]
   ],
  {
   NetPort["Input"] -> "linear",
   "reg" -> "thread",
   NetPort["linear", "Output"] -> "thread"
   }]

That can be trained, for example, with:

dataTrain = Table[RandomReal[1, 100] -> {RandomReal[]}, 100];
trained = NetTrain[net, dataTrain]

But when I inspect the actual values of the weights, I get different results. That is,

NetExtract[trained, {"linear", "Weights"}]

is different from

NetExtract[trained, {"reg", "Net", 1, "Array"}]

How can I implement this correctly? Do you have any ideas/comments/observations?

(I'm pretty sure that the example I gave is wrong, as I would like to minimise the loss of the output of the linear layer plus the sum of the absolute values of the weights, but I think the essence is the same.)
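
For concreteness, here is an untested sketch of the structure I am actually after. It assumes that the shared array has to be declared with the same dimensions as the layer's weight matrix (which for LinearLayer[1, "Input" -> 100] is {1, 100}), uses MeanSquaredLossLayer as a stand-in for the task loss, and exposes the combined quantity through an explicit "Loss" port:

netReg = NetGraph[
  Association[
   "linear" -> LinearLayer[1, "Weights" -> NetArray["c"], "Input" -> 100],
   (* task loss *)
   "mse" -> MeanSquaredLossLayer[],
   (* L1 penalty on the shared array; dimensions match the layer's weight matrix *)
   "reg" -> FunctionLayer[
     Total[Abs[NetArray[<|"Name" -> "c", "Dimensions" -> {1, 100}|>]], 2] &],
   (* total loss = task loss + L1 penalty *)
   "total" -> ThreadingLayer[#1 + #2 &]
   ],
  {
   NetPort["Input"] -> "linear",
   {"linear", NetPort["Target"]} -> "mse",
   {"mse", "reg"} -> "total",
   "total" -> NetPort["Loss"]
   }];

trainedReg = NetTrain[netReg, dataTrain, LossFunction -> "Loss"]

If that works, the plain linear layer could then be pulled out for inference with NetExtract[trainedReg, "linear"].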

POSTED BY: Ettore Mariotti
4 Replies

Here's a file. I believe it is close to correct, but I am still trying to figure out why the answer differs from the one obtained using Fit with the Regularization option, so it is quite possible I am doing something wrong. I have not had time to work through the issues, but perhaps this will help you overcome some of the barriers I encountered.
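
The comparison I have in mind is against a plain LASSO fit, for example something like this toy one-dimensional case (the regularisation strength 1 here is arbitrary):

data = Table[{x, 2 x + RandomReal[{-0.2, 0.2}]}, {x, 0., 1., 0.05}];
Fit[data, {1, x}, x, Regularization -> {"LASSO", 1}]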

Attachments:
POSTED BY: Seth Chandler

This question looks very similar to one I posted a few weeks ago: https://community.wolfram.com/groups/-/m/t/2179969. I have made some progress on it but have not had time to post a notebook of results. My approach looks similar to yours, with shared arrays. If you message me, I can send you a no-warranties notebook.

POSTED BY: Seth Chandler

Yes, I would be interested in knowing your approach!

POSTED BY: Ettore Mariotti