
Find global and local optima of a function?

I have to find local and global optima of a function, e.g.:

f[x_, y_, z_] := x^3 + y^2 + z^2 + 12 x y + 2 z

I use the following Wolfram functions:

Maximize[f[x, y, z], {x, y, z}]
NMaximize[f[x, y, z], {x, y, z}]
FindMaximum[f[x, y, z], {x, y, z}]

Minimize[f[x, y, z], {x, y, z}]
NMinimize[f[x, y, z], {x, y, z}]
FindMinimum[f[x, y, z], {x, y, z}]
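Before looking at the results, note that the cubic term makes f unbounded in both directions, so no finite global optimum exists. A quick check (added here for illustration):

Limit[f[x, 0, 0], x -> Infinity]
(* Infinity *)

Limit[f[x, 0, 0], x -> -Infinity]
(* -Infinity *)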

Accordingly, none of these functions finds any optimal points. So instead, I try to find the stationary/critical points of the function:

In[10]:= criticalPoints = Solve[D[f[x, y, z], {{x, y, z}}] == {0, 0, 0}, {x, y, z}]
matrixHessian = D[D[f[x, y, z], {{x, y, z}}], {{x, y, z}}]

Out[10]= {{x -> 0, y -> 0, z -> -1}, {x -> 24, y -> -144, z -> -1}}
Out[11]= {{6 x, 12, 0}, {12, 2, 0}, {0, 0, 2}}

There are two stationary/critical points. Now I have to examine the definiteness of the Hessian at these points:

In[22]:= PositiveDefiniteMatrixQ[matrixHessian /. criticalPoints[[1]]]
NegativeDefiniteMatrixQ[matrixHessian /. criticalPoints[[1]]]
PositiveDefiniteMatrixQ[matrixHessian /. criticalPoints[[2]]]
NegativeDefiniteMatrixQ[matrixHessian /. criticalPoints[[2]]]

Out[22]= False
Out[23]= False
Out[24]= True
Out[25]= False

In[20]:= NegativeSemidefiniteMatrixQ[
 matrixHessian /. criticalPoints[[1]]]
PositiveSemidefiniteMatrixQ[matrixHessian /. criticalPoints[[1]]]

Out[20]= False
Out[21]= False

The Hessian at the first point is neither positive nor negative semidefinite, i.e. it is indefinite, so {x -> 0, y -> 0, z -> -1} is a saddle point. The Hessian at the second point is positive definite, so {x -> 24, y -> -144, z -> -1} is a local minimum.
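An equivalent check (a small addition to the original reasoning) inspects the Hessian eigenvalues directly: mixed signs indicate a saddle point, all-positive eigenvalues a local minimum.

Eigenvalues[matrixHessian /. criticalPoints[[1]]]
(* {2, 1 + Sqrt[145], 1 - Sqrt[145]} up to ordering: mixed signs, a saddle point *)

Eigenvalues[matrixHessian /. criticalPoints[[2]]]
(* {2, 73 + Sqrt[5185], 73 - Sqrt[5185]} up to ordering: all positive, a local minimum *)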

This may also be confirmed by applying FindMinimum[] with an appropriate initial point:

In[27]:= FindMinimum[f[x, y, z], {{x, 24}, {y, -23}, {z, 0}}]

Out[27]= {-6913., {x -> 24., y -> -144., z -> -1.}}

I solved the stated problem, but some dissatisfaction remains: is this an efficient approach to finding the optimal points of a function?

5 Replies

One more example that highlights the beauty of the Wolfram Language and Wolfram Mathematica.

We need to find the optimal points of a classical constrained optimization problem:

Minimize $f[x,y]$

Subject to $g[x,y]=0,$

where

f[x_, y_] := x Sin[y]
g[x_, y_] := 3 x^2 - 4 Cos[y] - 1
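Before optimizing, it can help to visualize the feasible set (an added illustration, not part of the original post). The constraint curve appears as closed loops repeating with period 2 Pi in y:

ContourPlot[g[x, y] == 0, {x, -2, 2}, {y, -7, 7}]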

First, let us apply the built-in Wolfram functions:

In[3]:= Maximize[{f[x, y], g[x, y] == 0}, {x, y}]
Minimize[{f[x, y], g[x, y] == 0}, {x, y}]

Out[3]= Maximize[{x Sin[y], -1 + 3 x^2 - 4 Cos[y] == 0}, {x, y}]
Out[4]= Minimize[{x Sin[y], -1 + 3 x^2 - 4 Cos[y] == 0}, {x, y}]

Unfortunately, these functions return unevaluated and do not find the solutions.

Let us apply the corresponding numerical Wolfram functions:

In[5]:= NMaximize[{f[x, y], g[x, y] == 0}, {x, y}]
NMinimize[{f[x, y], g[x, y] == 0}, {x, y}]

Out[5]= {0.866025, {x -> 0.999992, y -> 1.04721}}
Out[6]= {-0.866025, {x -> 1., y -> -1.0472}}

In[7]:= FindMaximum[{f[x, y], g[x, y] == 0}, {x, y}]
FindMinimum[{f[x, y], g[x, y] == 0}, {x, y}]

Out[7]= {0.866025, {x -> 1., y -> 1.0472}}
Out[8]= {-0.866025, {x -> 1., y -> 5.23599}}

We found one global maximum:

{0.866025, {x->1.,y->1.04721}}

and two global minima:

{-0.866025, {x->1.,y->-1.0472}}

{-0.866025, {x->1.,y->5.23599}}.
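Note that the numerical values of y match $\pm\pi/3$ modulo the $2\pi$-periodicity of the problem; a quick check (added here):

N[{Pi/3, -Pi/3, 2 Pi - Pi/3}]
(* {1.0472, -1.0472, 5.23599} *)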

But some questions still remain... Did we find all optimal points?

To answer this and related questions, let us apply the Lagrange function and the Lagrange principle.

Let us consider the regular Lagrange function $L[\lambda, x,y] = f[x,y] + \lambda g[x, y]$:

L[\[Lambda]_, x_, y_] := f[x, y] + \[Lambda] g[x, y]
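Written out, the stationarity conditions of L are as follows (an illustrative step added here):

D[L[\[Lambda], x, y], {{\[Lambda], x, y}}]
(* {-1 + 3 x^2 - 4 Cos[y], 6 x \[Lambda] + Sin[y], x Cos[y] + 4 \[Lambda] Sin[y]} *)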

We can find the critical points:

In[10]:= criticalPoints = Solve[D[L[\[Lambda], x, y], {{\[Lambda], x, y}}] == 0, {\[Lambda], x, y}, Reals]

Out[10]= {
 {\[Lambda] -> ConditionalExpression[-(1/6) Cos[\[Pi]/6 - 2 \[Pi] C[1]], C[1] \[Element] Integers],
  x -> ConditionalExpression[1, C[1] \[Element] Integers],
  y -> ConditionalExpression[\[Pi]/3 + 2 \[Pi] C[1], C[1] \[Element] Integers]},
 {\[Lambda] -> ConditionalExpression[1/6 Cos[\[Pi]/6 - 2 \[Pi] C[1]], C[1] \[Element] Integers],
  x -> ConditionalExpression[-1, C[1] \[Element] Integers],
  y -> ConditionalExpression[\[Pi]/3 + 2 \[Pi] C[1], C[1] \[Element] Integers]},
 {\[Lambda] -> ConditionalExpression[-(1/6) Cos[\[Pi]/6 + 2 \[Pi] C[1]], C[1] \[Element] Integers],
  x -> ConditionalExpression[-1, C[1] \[Element] Integers],
  y -> ConditionalExpression[-(\[Pi]/3) + 2 \[Pi] C[1], C[1] \[Element] Integers]},
 {\[Lambda] -> ConditionalExpression[1/6 Cos[\[Pi]/6 + 2 \[Pi] C[1]], C[1] \[Element] Integers],
  x -> ConditionalExpression[1, C[1] \[Element] Integers],
  y -> ConditionalExpression[-(\[Pi]/3) + 2 \[Pi] C[1], C[1] \[Element] Integers]}}

Remark that the number of critical points is infinite. Nevertheless, they form only four groups of points: $x$ takes only two values, 1 and -1, and $y$ takes infinitely many values, all built on two base values, $\pi/3$ and $-\pi/3$. Formally, we can represent the critical points in a simplified manner (a quick constraint check follows the list):

  1. {x -> 1, y -> $\pi/3 + 2 k \pi$}, k = 0, $\pm$1, $\pm$2, ...
  2. {x -> -1, y -> $\pi/3 + 2 k \pi$}, k = 0, $\pm$1, $\pm$2, ...
  3. {x -> -1, y -> $-\pi/3 + 2 k \pi$}, k = 0, $\pm$1, $\pm$2, ...
  4. {x -> 1, y -> $-\pi/3 + 2 k \pi$}, k = 0, $\pm$1, $\pm$2, ...
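As a quick sanity check (added here), the representative point of each group satisfies the constraint:

g @@@ {{1, Pi/3}, {-1, Pi/3}, {-1, -Pi/3}, {1, -Pi/3}}
(* {0, 0, 0, 0} *)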

What is the nature of these points? To answer this question, let us consider the Hessian of L with respect to x and y:

In[11]:= matrixHessian = D[D[L[\[Lambda], x, y], {{x, y}}], {{x, y}}]

Out[11]= {{6 \[Lambda], Cos[y]}, {Cos[y],  4 \[Lambda] Cos[y] - x Sin[y]}}

Observe now that the Hessian can be negative definite for the first and third groups of points, since the corresponding value of $\lambda$ is negative, and positive definite for the second and fourth groups, for the analogous reason.
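The sign of $\lambda$ in each group can be verified directly (a check added here); for integer C[1] each cosine equals Cos[$\pi$/6] = $\sqrt{3}/2$:

FullSimplify[\[Lambda] /. criticalPoints]
(* expected: {-(Sqrt[3]/12), Sqrt[3]/12, -(Sqrt[3]/12), Sqrt[3]/12},
   possibly still wrapped in ConditionalExpression *)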

Observe also that the Hessian determinant is positive, with the same value $3/4$, at all the critical points:

In[12]:= FullSimplify[
   Det[D[D[L[\[Lambda], x, y], {{x, y}}], {{x, y}}]] /.
    criticalPoints[[#]]] & /@ Range[4]

Out[12]= {ConditionalExpression[3/4, C[1] \[Element] Integers],
  ConditionalExpression[3/4, C[1] \[Element] Integers],
  ConditionalExpression[3/4, C[1] \[Element] Integers],
  ConditionalExpression[3/4, C[1] \[Element] Integers]}

According to Sylvester's criterion, since the determinant is positive, the definiteness is decided by the sign of the leading entry $6\lambda$: the Hessian of the Lagrange function is negative definite for the first and third groups of points, and positive definite for the second and fourth groups.

In conclusion, the first and third groups of points are maxima, and the second and fourth groups are minima. It is easy to compute the value of the objective function at these points:

f[x, y] /. criticalPoints

{ConditionalExpression[Cos[\[Pi]/6 - 2 \[Pi] C[1]], 
  C[1] \[Element] Integers], 
 ConditionalExpression[-Cos[\[Pi]/6 - 2 \[Pi] C[1]], 
  C[1] \[Element] Integers], 
 ConditionalExpression[Cos[\[Pi]/6 + 2 \[Pi] C[1]], 
  C[1] \[Element] Integers], 
 ConditionalExpression[-Cos[\[Pi]/6 + 2 \[Pi] C[1]], 
  C[1] \[Element] Integers]}

So, the objective function equals $\frac{\sqrt{3}}{2}$ at all the maximum points and $-\frac{\sqrt{3}}{2}$ at all the minimum points.

The following Manipulate[] additionally illustrates that all the points found are global optima:

Manipulate[
 ContourPlot[{x Sin[y] == \[Alpha], 
   3 x^2 - 4 Cos[y] - 1 == 0}, {x, -20, 20}, {y, -20, 20}, 
  PerformanceGoal -> "Quality"], {\[Alpha], -2, 2}]

The documentation for Minimize and Maximize states that they use the Kuhn-Tucker equations for bounded problems. However, since they find a global optimum, that doesn't help with your problem.

POSTED BY: Frank Kampas

I use Wolfram Mathematica to teach a course on Optimization Methods. The example presented above is only an elementary sample that I use to explain how to find the optima of a function. I use the same approach to find the optima of a constrained problem by means of the Lagrange function and the KKT conditions. I addressed the question to the community because it seemed strange to me that, for such a simple problem, the built-in functions leave an inexperienced user a chance of not finding the solution.

Minimize, Maximize, NMinimize and NMaximize look for global solutions. As you discovered, FindMinimum will work if you start it near your local minimum. If you want all the critical points for an optimization problem, use the Kuhn-Tucker equations.

POSTED BY: Frank Kampas

Thank you for the comment, Frank. Sure, for a constrained optimization problem I can use the Karush-Kuhn-Tucker conditions. But the Wolfram Language is a symbolic language, and it is somewhat strange that the built-in functions do not apply the KKT conditions to find the optima.
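For reference, here is a minimal sketch (an addition of mine, not from the replies) of assembling the KKT system by hand for the constrained example above. With a single equality constraint the system reduces to the Lagrange conditions already used; inequality constraints would additionally require multiplier-sign and complementary-slackness conditions:

kktSystem = Join[
   Thread[D[f[x, y] + \[Lambda] g[x, y], {{x, y}}] == 0],
   {g[x, y] == 0}];
Solve[kktSystem, {x, y, \[Lambda]}, Reals]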

