# [✓] Use of Gradient in FindMinimum?

Posted 10 months ago
I use FindMinimum with objective and gradient functions that take a matrix and a vector as inputs, as in

    FindMinimum[f[mat, vec], {{mat, matstart}, {vec, vecstart}},
     Gradient :> g[mat, vec], Method -> "QuasiNewton"]

The arguments to f[mat, vec] and g[mat, vec] are a matrix (mat) and a vector (vec). FindMinimum wants the gradient function g to return a vector, which makes sense. The question is: in which order should the derivatives be specified? The most logical option appears to be the order of Flatten[{mat, vec}], but I'm not having much luck with that.

Thanks for any help.

-Eric
Posted 10 months ago
This seems to be working for me. Am I doing something different from what you are doing?

    In[1]:= f[{{m11_, m12_}, {m21_, m22_}}, {v1_, v2_}] =
             (2 - 3 m11)^2 + (5 - 7 m12)^2 + (11 - 13 m21)^2 +
              (17 - 19 m22)^2 + (23 - 29 v1)^2 + (31 - 37 v2)^2

    Out[1]= (2 - 3 m11)^2 + (5 - 7 m12)^2 + (11 - 13 m21)^2 +
             (17 - 19 m22)^2 + (23 - 29 v1)^2 + (31 - 37 v2)^2

    In[2]:= g[{{m11_, m12_}, {m21_, m22_}}, {v1_, v2_}] =
             Flatten[
              ({D[#, {{{m11, m12}, {m21, m22}}}], D[#, {{v1, v2}}]} &)[
               f[{{m11, m12}, {m21, m22}}, {v1, v2}]]]

    Out[2]= {-6 (2 - 3 m11), -14 (5 - 7 m12), -26 (11 - 13 m21),
             -38 (17 - 19 m22), -58 (23 - 29 v1), -74 (31 - 37 v2)}

    In[3]:= FindMinimum[f[m, v],
             {m, RandomReal[{-99, 99}, {2, 2}]},
             {v, RandomReal[{-99, 99}, 2]},
             Gradient :> g[m, v], Method -> "QuasiNewton"]
            Rationalize[%, 10^-6]

    Out[3]= {6.54678*10^-20, {m -> {{0.666667, 0.714286}, {0.846154, 0.894737}},
             v -> {0.793103, 0.837838}}}

    Out[4]= {0, {m -> {{2/3, 5/7}, {11/13, 17/19}}, v -> {23/29, 31/37}}}

    In[5]:= (* if I pass in an incorrect gradient, it gives me an incorrect answer: *)
            With[{m0 = RandomReal[{-99, 99}, {2, 2}],
              v0 = RandomReal[{-99, 99}, 2]},
             Table[
              LogPlot[
               First@FindMinimum[f[m, v], {m, m0}, {v, v0},
                 Gradient :> (g[m, v] + wrong UnitVector[6, k]),
                 Method -> "QuasiNewton"],
               {wrong, -1, 1}],
              {k, 6}]]

The last input produces six plots, one for each value of k (the plot images are not included here).
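
To make the ordering in the example above explicit, the gradient can be written out component by component. This is only a restatement of Out[2], assuming the variables are ordered as in Flatten[{m, v}] (matrix entries row by row, then the vector entries); the symbol \nabla f is notation introduced here for illustration and does not appear in the original posts.

```latex
\[
\nabla f =
\left(
\frac{\partial f}{\partial m_{11}},\
\frac{\partial f}{\partial m_{12}},\
\frac{\partial f}{\partial m_{21}},\
\frac{\partial f}{\partial m_{22}},\
\frac{\partial f}{\partial v_{1}},\
\frac{\partial f}{\partial v_{2}}
\right)
\]
\[
\nabla f =
\bigl(
-6\,(2 - 3 m_{11}),\
-14\,(5 - 7 m_{12}),\
-26\,(11 - 13 m_{21}),\
-38\,(17 - 19 m_{22}),\
-58\,(23 - 29 v_{1}),\
-74\,(31 - 37 v_{2})
\bigr)
\]
```

Setting each component to zero gives m11 = 2/3, m12 = 5/7, m21 = 11/13, m22 = 17/19, v1 = 23/29 and v2 = 31/37, which agrees with the rationalized result in Out[4].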