[✓] Use of Gradient in FindMinimum?

I am calling FindMinimum with objective and gradient functions that take a matrix and a vector as inputs, as in

FindMinimum[f[mat, vec], {{mat, matstart}, {vec, vecstart}}, Gradient :> g[mat, vec], Method -> "QuasiNewton"]

The arguments to f[mat, vec] and g[mat, vec] are a matrix (mat) and a vector (vec). FindMinimum wants the Gradient function g to return a vector, which makes sense. The question is: in which order should the derivatives be specified? The most logical option appears to be the order of Flatten[{mat, vec}], but I'm not having much luck with that.

Thanks for any help.

-Eric

POSTED BY: Eric Michielssen
15 days ago

This seems to be working for me. Am I doing something different from what you are doing?

In[1]:= f[
  {{m11_, m12_}, {m21_, m22_}},
  {v1_, v2_}
  ] = (2 - 3 m11)^2 + (5 - 7 m12)^2 + (11 - 13 m21)^2 + (17 - 
     19 m22)^2 + (23 - 29 v1)^2 + (31 - 37 v2)^2

Out[1]= (2 - 3 m11)^2 + (5 - 7 m12)^2 + (11 - 13 m21)^2 + (17 - 
   19 m22)^2 + (23 - 29 v1)^2 + (31 - 37 v2)^2

In[2]:= g[
  {{m11_, m12_}, {m21_, m22_}},
  {v1_, v2_}
  ] = Flatten[
  ({
      D[#, {  {{m11, m12}, {m21, m22}}  }],
      D[#, {  {v1, v2}  }]
      } &)[
   f[{{m11, m12}, {m21, m22}}, {v1, v2}]
   ]
  ]

Out[2]= {-6 (2 - 3 m11), -14 (5 - 7 m12), -26 (11 - 
    13 m21), -38 (17 - 19 m22), -58 (23 - 29 v1), -74 (31 - 37 v2)}

In[3]:= FindMinimum[
 f[m, v],
 {m, RandomReal[{-99, 99}, {2, 2}]},
 {v, RandomReal[{-99, 99}, 2]},
 Gradient :> g[m, v],
 Method -> "QuasiNewton"
 ]
Rationalize[%, 10^-6]

Out[3]= {6.54678*10^-20, {m -> {{0.666667, 0.714286}, {0.846154, 
     0.894737}}, v -> {0.793103, 0.837838}}}

Out[4]= {0, {m -> {{2/3, 5/7}, {11/13, 17/19}}, v -> {23/29, 31/37}}}

In[5]:= (* If I pass in an incorrect gradient, it gives me an incorrect answer: *)
With[{m0 = RandomReal[{-99, 99}, {2, 2}], 
  v0 = RandomReal[{-99, 99}, 2]},
 Table[LogPlot[
   First@FindMinimum[
     f[m, v],
     {m, m0}, {v, v0},
     Gradient :> (g[m, v] + wrong UnitVector[6, k]),
     Method -> "QuasiNewton"
     ],
   {wrong, -1, 1}
   ], {k, 6}]
 ]

The last set of plots looks like this: [image not preserved]
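
The ordering question can also be checked symbolically, without running FindMinimum. The following is a minimal sketch, assuming the same quadratic objective as f above; the names matSym, vecSym, obj, vars and grad are illustrative and not part of the original code. It compares the gradient taken with respect to Flatten[{mat, vec}] against the gradient assembled block by block, the way g builds it.

```mathematica
(* Symbolic matrix and vector; these names are hypothetical, chosen for the sketch *)
matSym = {{m11, m12}, {m21, m22}};
vecSym = {v1, v2};

(* Same objective as f above, written directly in terms of the symbols *)
obj = (2 - 3 m11)^2 + (5 - 7 m12)^2 + (11 - 13 m21)^2 +
      (17 - 19 m22)^2 + (23 - 29 v1)^2 + (31 - 37 v2)^2;

(* Variable order: matrix entries row by row, then the vector entries *)
vars = Flatten[{matSym, vecSym}]
(* {m11, m12, m21, m22, v1, v2} *)

(* Gradient with respect to the flattened variable list *)
grad = D[obj, {vars}];

(* Difference from the gradient built block by block, as in g;
   a list of zeros means the two orderings agree *)
Simplify[grad - Flatten[{D[obj, {matSym}], D[obj, {vecSym}]}]]
(* {0, 0, 0, 0, 0, 0} *)
```

If the difference is not all zeros for your own objective, the components returned by the gradient function are probably not in the Flatten[{mat, vec}] order.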

POSTED BY: Brad Chalfan
14 days ago

Beautiful. Thank you! Yes, in the end I was able to make this work in my code, too. Thanks again.

POSTED BY: Eric Michielssen
6 days ago
