# Discovering symbolic square roots?

Posted 4 months ago | 1010 Views | 6 Total Likes
Is there any technology in Mathematica (or even in pure research) that would let me discover formulas for the square root of a matrix specified as a formula?

For example, for any vector $p$ with positive entries adding up to 1, the following factorization seems to hold:

$$A = \text{diag}(p) - p p^T = B^T B$$

where

$$B = \text{diag}(\sqrt{p}) - (\sqrt{p})\, p^T$$

as you can check with the code below:

```
p = {1, 10, 3};
p = With[{q = Abs[p]}, q/Total[q]]; (* positive + normalized *)
A = DiagonalMatrix[p] - Outer[Times, p, p];
B = DiagonalMatrix[Sqrt[p]] - Outer[Times, Sqrt[p], p];
A == Transpose[B] . B
```

I got this formula by expanding the components and staring at them for a while, which obviously doesn't scale. This was a factorization of the cross-entropy Hessian, and there are at least 20 more loss functions I'd like to do. Can anyone think of a way such formulas could be discovered automatically?

This begs the question of what a "formula" is. If we restrict attention to factorizations of Hessians, a symbolic way to derive them is this. You can see that if your original loss was specified in terms of pointwise ops and some tensor contractions, then its Hessian will consist of a sum of terms with some pointwise ops and some tensor contractions. The author posted the formula for the 3rd derivative of the cross-entropy loss on Mathematica Stack Exchange, so you can see that small formulas remain small after differentiation.
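The thread's code is Wolfram Language, but as a quick numerical cross-check, here is a NumPy sketch (the helper name `check_factorization` is mine, not from the thread) confirming that the identity $A = B^T B$ holds for random normalized vectors $p$ of various sizes, not just the one example:

```python
import numpy as np

rng = np.random.default_rng(0)

def check_factorization(n):
    # Draw a random strictly positive vector and normalize it to sum to 1.
    p = rng.random(n) + 0.1
    p = p / p.sum()
    sq = np.sqrt(p)
    # A = diag(p) - p p^T  (the cross-entropy Hessian from the question)
    A = np.diag(p) - np.outer(p, p)
    # B = diag(sqrt(p)) - sqrt(p) p^T  (the conjectured square-root factor)
    B = np.diag(sq) - np.outer(sq, p)
    # The claimed identity: A == B^T B, up to floating-point tolerance
    return np.allclose(A, B.T @ B)

print(all(check_factorization(n) for n in range(2, 10)))  # True
```

Of course a numerical check over random instances is evidence, not a proof; the symbolic derivation in the answer below the question is what actually establishes the identity.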
I think you are looking for symbolic matrix capabilities, and the Wolfram Language can do some nontrivial computations in this difficult area. There are purely symbolic capabilities based on group-theoretical algorithms in TensorReduce and TensorExpand, and there are also indexed-differentiation capabilities in D itself.

I just posted the notebook https://wolframcloud.com/obj/b50c9dd5-bfec-4651-940d-ccf640238be3 in the Wolfram Cloud with some examples of indexed differentiation related to the question you mention at the end of your post. This is an area under development for us. The results can probably be expressed more compactly using symmetrizations.

Regarding square roots of matrices, I think the problem you pose is too general. Can you restrict it to some family of objects with special properties that could help in finding a solution? I do not immediately understand the information you mention about those loss functions.

Let us show the identity you found using the functions defined in the notebook above. Define your initial and final objects:

```
A[i_, j_] := KroneckerDelta[i, j] p[i] - p[i] p[j]
B[i_, j_] := KroneckerDelta[i, j] Sqrt[p[i]] - Sqrt[p[i]] p[j]
```

Then this is the identity you have found, using the notations in the notebook and the simplification functions defined there:

```
In[]:= A[i, k] - sum[B[j, i] B[j, k], {j, n}] // SumExpand // DeltaSimplify // Simplify

Out[]= -p[i] p[k] (-1 + Inactive[Sum][p[j], {j, n}])
```

That's zero if p is normalized so that its components add up to 1.