Message Boards Message Boards


Large Scale Optimization in Mathematica

Posted 9 years ago
6 Replies
1 Total Likes


I am trying to perform a large scale optimization on Mathematica - however the speed is very poor - in fact, when the number of variables is high then it just makes computer really slow, and I have to restart. Code snippets are below:

NoVariables = 10;
NoExamples = 5000;
allData = RandomReal[1, {NoExamples, NoVariables}];
X = Transpose[Prepend[Transpose[allData], Table[1, {NoExamples}]]];
Dimensions @ X  (* X is 200 by 101 matrix, with first column of only \
1's *)
Thetas = Table[Subscript[\[Theta], i], {i, 0, NoVariables}];
Dimensions @ Thetas (* Thetas are 101 size vector, with Subscript[\
\[Theta], 0] representing the coefficient of the constant term *)
Y = RandomChoice[{0., 1.}, NoExamples ];
Dimensions @ Y (*y is a vector of size 200, being either 0 or 1 *)

This creates the main data variables: Y (vector), X(Matrix). Thetas is a vector of parameters I want to optimize. Below I create the objective function:

(* Convex Objective Function, row wise, is: y Log[1+\[ExponentialE]^-\
\[Theta].x]-(-1+y) Log[1+\[ExponentialE]^(\[Theta].x)] 
We need to sum this over each row - there are 200 (NoExamples)
We will actually be scaling this by 1/NoExamples

We can vectorize this like follows to create the objective function \
element wise format: *)

ElementWiseObjective = 
  Y*Log[1 + Exp[-X.Thetas]] - (-1 + Y)*Log[1 + Exp[X.Thetas]];
Dimensions @ ElementWiseObjective

    (* Perform summation and scaling to get final form objective \
function: *)

Objective  = (1/NoExamples)*(Plus @@ ElementWiseObjective);

And then I run the optimization:

(* Perform Optimization *)
Timing @ FindMinimum[Objective, Thetas]

When NoVariables = 10 this takes about 7 seconds, and when NoVariables = 10, it takes about 40 seconds. When NoVariables = 100 then the computer becomes really slow, and it is impossible to abort or access other programs. Ideally I want NoVariables to be very large, perhaps even 500 or larger.

Is there a way I can tweak my code to tun this optimization, and then run it quicker? Any help would be much appreciated!

Thanks! Priyan.

POSTED BY: Priyan Fernando
6 Replies

I still can't run this when NoVariables = 400. This is a convex objective function, so thought the optimization should be quite straightforward.

This method seems to run quite fast in Matlab:

"The Polack-Ribiere flavour of conjugate gradients is used to compute search directions, and a line search using quadratic and cubic polynomial approximations and the Wolfe-Powell stopping criteria is used together with the slope ratio method for guessing initial step sizes."

I couldn't find a similar method in Mathematica, only Method -> "ConjugateGradient". But this still does not produce any answer at all for 400 variables, even if I set MaxIterations to 1 (computer seems to freeze even though I'm working on a moderate spec one).

Any help would be much appreciated! Thanks.

POSTED BY: Priyan Fernando

Doesn't FindMinimum automatically compile the objective function?

POSTED BY: Frank Kampas


Dimensions@X  (*X is 200 by 101 matrix,with first column of only 1's*)
Dimensions@Thetas (*Thetas are 101 size vector,with Subscript[\[Theta],0] representing the coefficient of the constant term*)
Dimensions@Y (*y is a vector of size 200,being either 0 or 1*)
Timing@FindMinimum[Objective[ X,Y,Thetas],Thetas,Method->"PrincipalAxis"]

needs 2.5 seconds (in V10 there is a strange CompiledFunction::cfta warning message, which is not there in V9)

POSTED BY: Rolf Mertig

I guess that becomes a very large function. Possibly if you can provide a sparse gradient via the Gradient option, that might offer some improvement. But it may just be too large.

POSTED BY: Daniel Lichtblau

Thanks Daniel. I still don't seem to be able to do an example with NoVariables = 400. Do you think this may be out of bounds for Mathematica?

POSTED BY: Priyan Fernando

It's a big objective function. I found that these settings improve somewhat on the speed.

FindMinimum[Objective, Thetas, Gradient -> "FiniteDifference",  Method -> "QuasiNewton"]
POSTED BY: Daniel Lichtblau
Reply to this discussion
Community posts can be styled and formatted using the Markdown syntax.
Reply Preview
or Discard

Group Abstract Group Abstract