Kolmogorov-Arnold networks (KANs) in Wolfram language

POSTED BY: Andreas Hafver
7 Replies

Dear Andreas,

I implemented the MLP and KAN models using my own dataset. Please see the attached notebook.

The KAN model generated the following results: {{0.}, {0.}, {0.}, {0.}}

I’m not sure what the problem is.

Thank you for your help.

POSTED BY: M.A. Ghorbani
Posted 7 days ago

Dear Ghorbani,

I checked your Mathematica code and found that your KAN model does not reduce the loss. It seems that a single KANlayer[2,1,15,3] is not complex enough to learn the data.

kan = NetTrain[NetChain[{KANlayer[2,1,15,3]},"Input"->2],training]

After I added one more KANlayer, the loss decreased.

dataSDL = ArrayReshape[N@Range[300], {100, 3}];

Two KANlayers:

kanSDLEE = 
 NetTrain[NetChain[{KANlayer[2, 5, 15, 3], KANlayer[5, 1, 15, 3]}], 
  dataSDL[[All, 1 ;; 2]] -> Transpose@{dataSDL[[All, 3]]}]

Three KAN layers:

kanSDLEE03 = 
 NetTrain[
  NetChain[{KANlayer[2, 5, 15, 3], KANlayer[5, 5, 15, 3], 
    KANlayer[5, 1, 15, 3]}], 
  dataSDL[[All, 1 ;; 2]] -> Transpose@{dataSDL[[All, 3]]}]
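
One way to check whether a given architecture actually reduces the loss is to ask NetTrain for a results object and inspect the per-round loss. The following is only a sketch: KANlayer is defined in the original notebook and is not available here, so a small chain of built-in layers stands in for it (that stand-in net, and the names trainingSDL, net, result and losses, are assumptions for illustration, not part of the original code).

```wolfram
(* Same synthetic data as above *)
dataSDL = ArrayReshape[N@Range[300], {100, 3}];
trainingSDL = dataSDL[[All, 1 ;; 2]] -> dataSDL[[All, {3}]];

(* ASSUMPTION: stand-in net made of built-in layers; replace it with the
   KANlayer chain from the notebook to test the real model *)
net = NetChain[{LinearLayer[5], Tanh, LinearLayer[1]}, "Input" -> 2];

(* Passing All as the third argument returns a NetTrainResultsObject
   instead of only the trained net *)
result = NetTrain[net, trainingSDL, All, MaxTrainingRounds -> 200];

(* Mean training loss for each round *)
losses = result["RoundLossList"];

(* Compare the loss at the start and at the end of training *)
{First[losses], Last[losses]}

(* Loss curve over the whole training run *)
result["LossPlot"]
```

If the last loss is not clearly smaller than the first, the net is not learning, which would be consistent with predictions that collapse to a constant.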
POSTED BY: Sangdon Lee

Hi Sangdon,

I appreciate your help. You did an awesome job. :)

I’m going to start working on the problem now.

Regards

POSTED BY: M.A. Ghorbani
Posted 3 days ago

Hi Dr. Ghorbani,

Note that I quickly created the following data just to run the MLP and KAN, and it is ill-conditioned: the columns are perfect linear combinations of one another, so there are infinitely many solutions, and any least-squares estimation or its variants will be unstable. Please use a different training dataset that you are familiar with.

dataSDL = ArrayReshape[N@Range[300], {100, 3}];
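
The linear dependence can be seen directly with a short check. This is a sketch only: it uses exact integers instead of N@Range[300] so that the rank computation is exact, and the names exactSDL, x, y and design are introduced here for illustration.

```wolfram
(* Exact-integer version of the data: rows {1,2,3}, {4,5,6}, ... *)
exactSDL = ArrayReshape[Range[300], {100, 3}];

x = exactSDL[[All, 1 ;; 2]];   (* the two input columns *)
y = exactSDL[[All, 3]];        (* the target column *)

(* Design matrix with an intercept column; a well-posed linear fit
   with three coefficients needs rank 3 *)
design = Join[ConstantArray[1, {100, 1}], x, 2];
MatrixRank[design]   (* 2, because column 2 equals column 1 plus 1 *)

(* Two different coefficient sets reproduce the target exactly *)
Max@Abs[y - (x[[All, 1]] + 2)]               (* 0 *)
Max@Abs[y - (2 x[[All, 2]] - x[[All, 1]])]   (* 0 *)
```

Since both formulas fit the target with zero error, the data cannot distinguish between them, which is why a fit on this dataset is unstable.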

I remember reading a paper that claims KAN is better than MLP, but I also remember a paper claiming that it is not. I guess it depends on the goal: precise prediction with a complex model, or an understandable structure with a transparent model.

I think you are in the process of fine-tuning the net structure, and that will be time-consuming.

It is good practice to standardize or normalize the input data before an analysis. In my experience, adding a batch normalization layer improves the fit.

kanSDLEE04 = NetTrain[
  NetChain[
   {
    BatchNormalizationLayer[],
    KANlayer[2, 50, 10, 3],
    KANlayer[50, 1, 10, 3]
    }
   ],
  dataSDL[[All, 1 ;; 2]] -> Transpose@{dataSDL[[All, 3]]}, All, 
  MaxTrainingRounds -> 1000
  ]
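
As an alternative to a normalization layer inside the net, the inputs can also be standardized before training. The following is a minimal sketch using the built-in Standardize; the names inputs, inputsStd, mu, sigma and standardizeNew are hypothetical helpers introduced here, not part of the original notebook.

```wolfram
dataSDL = ArrayReshape[N@Range[300], {100, 3}];
inputs = dataSDL[[All, 1 ;; 2]];

(* Standardize rescales each column to zero mean and unit sample variance *)
inputsStd = Standardize[inputs];

Chop[Mean[inputsStd]]   (* close to {0, 0} *)
Variance[inputsStd]     (* close to {1, 1} *)

(* Keep the column statistics so that new inputs can be transformed
   in the same way at prediction time *)
mu = Mean[inputs];
sigma = StandardDeviation[inputs];
standardizeNew[v_?VectorQ] := (v - mu)/sigma;

(* The first training row is mapped to the first standardized row *)
standardizeNew[inputs[[1]]] - inputsStd[[1]]   (* close to {0, 0} *)
```

Because the NetTrain call above passes All as its third argument, kanSDLEE04 is a NetTrainResultsObject rather than a net; the trained network itself can be extracted with kanSDLEE04["TrainedNet"].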
POSTED BY: Sangdon Lee

Hi, I really appreciate your time.

I got the attached message when I ran the new command. I even tried different data, but the program still gave me the same error.

Best

POSTED BY: M.A. Ghorbani
Posted 5 months ago

Dear Andreas: I use a modification of your code for my paper: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4419282 (unpublished). What (pseudo-)metric do you use to fit the polynomials: MSE, Kullback-Leibler, or something different? If you can answer, please write to me at pblerner18@gmail.com because I rarely use this place.

POSTED BY: Peter Lerner

You have earned the Featured Contributor Badge. Your exceptional post has been selected for our editorial column Staff Picks http://wolfr.am/StaffPicks and your profile is now distinguished by a Featured Contributor Badge and is displayed on the Featured Contributor Board. Thank you!

POSTED BY: EDITORIAL BOARD