Best optimization for maximum entropy?

Hello. I am attempting to implement the maximum entropy ("MaxEnt") approach to species distribution modeling in the Wolfram Language (WL). The attached notebook has a function which calculates a penalized log-likelihood. It is based on the Excel workbook here: https://www.ecography.org/appendix/e7872. The notebook also includes the example data.

My question is how best to optimize over the parameters, which in this case take the form of a twelve-element vector. There are now so many optimization routines and options that it is a bit overwhelming!

POSTED BY: Gareth Russell
6 Replies

It seems to me (check it first) that instead of

expSum = Exp[sum - Max[sum]];
q = expSum/Total[expSum];
lnQ = Log[q];
logL = Total[Take[lnQ, n]]/n;

you can write

logL = Total[Take[sum, n]]/n - Log[Total[Exp[sum]]];

which is symbolically simpler. I don't know if this helps with the optimization.
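To make the "check it first" step concrete, here is a short derivation of why the two forms agree. The symbols are introduced here for illustration and are not from the notebook: $s_j$ is the $j$-th entry of `sum`, $m$ is its length, and $M = \max_j s_j$.

$$
\begin{aligned}
q_j &= \frac{e^{s_j - M}}{\sum_{k=1}^{m} e^{s_k - M}} = \frac{e^{s_j}}{\sum_{k=1}^{m} e^{s_k}}, \\
\ln q_j &= s_j - \ln \sum_{k=1}^{m} e^{s_k}, \\
\texttt{logL} &= \frac{1}{n} \sum_{i=1}^{n} \ln q_i = \frac{1}{n} \sum_{i=1}^{n} s_i - \ln \sum_{k=1}^{m} e^{s_k}.
\end{aligned}
$$

The factor $e^{-M}$ appears in both numerator and denominator of $q_j$, so it cancels exactly, and the last line is the one-line expression above.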

POSTED BY: Gianluca Gorni

OMG! It is so much faster, presumably because, according to ByteCount, the symbolic expression is less than 4% of the size of my original version.

POSTED BY: Gareth Russell

It's also a slightly better fit again. Still, I would like to know how it compares to a purely numerical fit to a compiled version of the code as given, since the main calculation is a matrix multiplication, which should be very fast. (Surely the symbolic expression generated internally by FindMaximum loses the matrix form?)

But thanks so much!

POSTED BY: Gareth Russell

If I am not mistaken, you had better get rid of the line

sum = sum - Max[sum]

because the Max[sum] term cancels out anyway in the next step (in theory). In practice it may survive and make calculations more complicated than necessary.
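A quick numerical check of this claim might look like the sketch below. The test vector and the value of `n` are made up for illustration and merely stand in for the notebook's linear predictor; the names `long` and `short` are also hypothetical.

```wolfram
(* Hypothetical test vector standing in for the notebook's linear predictor *)
SeedRandom[2];
sum = RandomReal[{-5, 5}, 100];
n = 20;

(* Original form, with the maximum subtracted before exponentiating *)
expSum = Exp[sum - Max[sum]];
q = expSum/Total[expSum];
long = Total[Take[Log[q], n]]/n;

(* Shortened form, with no Max term at all *)
short = Total[Take[sum, n]]/n - Log[Total[Exp[sum]]];

(* The two values should agree up to floating-point rounding *)
Abs[long - short] < 10^-10
```

Subtracting the maximum before exponentiating is a common guard against overflow in fixed-precision arithmetic, which may be why a spreadsheet template would include it. That is a guess about the template, not something stated in this thread.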

POSTED BY: Gianluca Gorni

Ha! You are right: it doesn't change the output, and removing it makes it 10x faster again. So the lesson here, folks, is that improvements in the underlying function can make the biggest difference. In my defense, I was following a template made by others, and now I have to figure out why, for example, the 'divide by the max value' step was in there. Maybe it is used later. But even if it is, it can be left out of the fitting process!

Having said this, I still want to know how one would code this as a direct numerical optimization, ideally specifying the 12 parameters as a single vector (list). Is it possible?
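One way this might be approached is sketched below, under several assumptions. The feature matrix is random stand-in data, the names `features`, `nPoints`, `nPres`, `vars`, `start` and `lambdaHat` are hypothetical, and the penalty term of the notebook's function is left out, so this is the simplified log-likelihood from earlier in the thread rather than the full penalized version. The idea is to define the objective only for numeric vectors, so that FindMaximum never expands it symbolically and the main calculation remains a numeric matrix product.

```wolfram
(* Hypothetical stand-in data; the notebook's real feature matrix is not shown in the thread *)
SeedRandom[1];
nPoints = 200;  (* all rows: presence plus background points *)
nPres = 50;     (* the first nPres rows are the presence points *)
k = 12;         (* number of parameters *)
features = RandomReal[{-1, 1}, {nPoints, k}];

(* The pattern test keeps logL unevaluated until it receives an all-numeric vector, *)
(* so the matrix product is a numeric Dot rather than a symbolic expansion *)
logL[lambda_?(VectorQ[#, NumericQ] &)] :=
  Module[{sum = features . lambda},
    Total[Take[sum, nPres]]/nPres - Log[Total[Exp[sum]]]];

(* FindMaximum still needs named variables, so generate x[1], ..., x[12] *)
vars = Array[x, k];
start = ConstantArray[0., k];

(* Each variable is paired with its starting value: {{x[1], 0.}, {x[2], 0.}, ...} *)
{maxLogL, rules} =
  FindMaximum[logL[vars], Evaluate[Transpose[{vars, start}]]];

(* Recover the fitted parameters as a single list *)
lambdaHat = vars /. rules
```

Because the objective is a black box to FindMaximum in this form, derivatives are estimated numerically rather than symbolically, so it is worth comparing the timing and the optimum against the symbolic version on the real data.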

POSTED BY: Gareth Russell

Attempting to answer my own question by some trial and error. First up: Wolfram is finding better optima than Excel! (No surprise there.)

POSTED BY: Gareth Russell