[✓] Use of Gradient in FindMinimum?

Posted 10 months ago

I call FindMinimum with objective and gradient functions that take a matrix and a vector as inputs, as in

FindMinimum[f[mat, vec], {{mat, matstart}, {vec, vecstart}}, Gradient :> g[mat, vec], Method -> "QuasiNewton"]

The arguments to f[mat, vec] and g[mat, vec] are a matrix (mat) and a vector (vec). FindMinimum wants the gradient function g to return a vector, which makes sense. The question is: in which order should the derivatives be specified? The most logical option appears to be the order corresponding to Flatten[{mat, vec}], but I'm not having much luck with that.

Thanks for any help.

-Eric

2 Replies
Posted 10 months ago

This does seem to work for me. Am I doing something different from what you are doing?

In[1]:= f[
  {{m11_, m12_}, {m21_, m22_}},
  {v1_, v2_}
  ] = (2 - 3 m11)^2 + (5 - 7 m12)^2 + (11 - 13 m21)^2 + (17 - 
     19 m22)^2 + (23 - 29 v1)^2 + (31 - 37 v2)^2

Out[1]= (2 - 3 m11)^2 + (5 - 7 m12)^2 + (11 - 13 m21)^2 + (17 - 
   19 m22)^2 + (23 - 29 v1)^2 + (31 - 37 v2)^2

In[2]:= g[
  {{m11_, m12_}, {m21_, m22_}},
  {v1_, v2_}
  ] = Flatten[
  ({
      D[#, {  {{m11, m12}, {m21, m22}}  }],
      D[#, {  {v1, v2}  }]
      } &)[
   f[{{m11, m12}, {m21, m22}}, {v1, v2}]
   ]
  ]

Out[2]= {-6 (2 - 3 m11), -14 (5 - 7 m12), -26 (11 - 
    13 m21), -38 (17 - 19 m22), -58 (23 - 29 v1), -74 (31 - 37 v2)}

In[3]:= FindMinimum[
 f[m, v],
 {m, RandomReal[{-99, 99}, {2, 2}]},
 {v, RandomReal[{-99, 99}, 2]},
 Gradient :> g[m, v],
 Method -> "QuasiNewton"
 ]
Rationalize[%, 10^-6]

Out[3]= {6.54678*10^-20, {m -> {{0.666667, 0.714286}, {0.846154, 
     0.894737}}, v -> {0.793103, 0.837838}}}

Out[4]= {0, {m -> {{2/3, 5/7}, {11/13, 17/19}}, v -> {23/29, 31/37}}}

In[5]:= (* If I pass in an incorrect gradient, it gives me an incorrect answer: *)
With[{m0 = RandomReal[{-99, 99}, {2, 2}], 
  v0 = RandomReal[{-99, 99}, 2]},
 Table[LogPlot[
   First@FindMinimum[
     f[m, v],
     {m, m0}, {v, v0},
     Gradient :> (g[m, v] + wrong UnitVector[6, k]),
     Method -> "QuasiNewton"
     ],
   {wrong, -1, 1}
   ], {k, 6}]
 ]

The resulting plots look like this: [image not included]
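One way to check the ordering independently is to compare the analytic gradient against a finite-difference estimate taken entry by entry in the order of Flatten[{mat, vec}]. The following is only a minimal sketch: numericGradient and its step size h are hypothetical names introduced here for illustration, and it assumes the f and g defined above are already loaded.

```wolfram
(* Hypothetical helper, not from the thread: central-difference gradient of fun
   with respect to the entries of Flatten[{mat, vec}], in that order. *)
numericGradient[fun_, mat_, vec_, h_ : 10^-6] :=
 Module[{x0 = Flatten[{mat, vec}], dims = Dimensions[mat], n, unflatten},
  n = Times @@ dims;
  (* Split a flat list back into {matrix, vector}. *)
  unflatten[x_] := {ArrayReshape[Take[x, n], dims], Drop[x, n]};
  Table[
   (fun @@ unflatten[x0 + h UnitVector[Length[x0], k]] -
      fun @@ unflatten[x0 - h UnitVector[Length[x0], k]])/(2 h),
   {k, Length[x0]}]]

m0 = RandomReal[{-1, 1}, {2, 2}];
v0 = RandomReal[{-1, 1}, 2];

(* Largest absolute disagreement between the numeric and analytic gradients. *)
Max[Abs[numericGradient[f, m0, v0] - g[m0, v0]]]
```

If g returns its entries in the Flatten[{mat, vec}] order, the reported difference should be tiny compared with the gradient entries themselves. A difference comparable in size to those entries would suggest that the ordering (or the gradient) is off.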

Beautiful. Thank you! Yes, in the end I was able to make this work in my code, too. Thanks again.
