Another Mathematica user suggested a way to speed up the pairwise correlation algorithm using associations.
We begin by downloading returns data for the S&P 500 members in legacy (i.e. list) format:
tickers = FinancialData["^GSPC", "Members"];
stockdata =
  FinancialData[tickers, "Return",
   DatePlus[Today, {-753, "BusinessDay"}], Method -> "Legacy"];
Then define:
PairwiseCorrelation[stockdata_] :=
 Module[{assocStocks, pairs, correl},
  assocStocks = Apply[Rule, stockdata, {2}] // Map[Association];
  pairs = Subsets[Range@Length@assocStocks, {2}];
  correl =
   Map[Correlation @@ Values@KeyIntersection[assocStocks[[#]]] &,
    pairs];
  {correl, pairs}]
Here we use the KeyIntersection function to restrict each pair of series to the dates they have in common, which is much faster than other ways of aligning them. Accordingly:
In[317]:= AbsoluteTiming[{correl, pairs} =
PairwiseCorrelation[stockdata];]
Out[317]= {112.836, Null}
In[318]:= Length@correl
Out[318]= 127260
In[319]:= Through[{Mean, Median, Min, Max}[correl]]
Out[319]= {0.428747, 0.43533, -0.167036, 0.996379}
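For anyone wondering what the KeyIntersection step is doing, here is a minimal sketch on two made-up series. The names seriesA, seriesB, assocs and common are hypothetical, the dates and returns are invented, and I am assuming the legacy format is a list of {date, value} pairs with both series already in date order:

```mathematica
(* Toy data: invented dates and returns, assumed {date, value} layout *)
seriesA = {{{2020, 1, 2}, 0.010}, {{2020, 1, 3}, -0.020},
   {{2020, 1, 6}, 0.015}, {{2020, 1, 7}, 0.005}};
seriesB = {{{2020, 1, 3}, -0.010}, {{2020, 1, 6}, 0.020},
   {{2020, 1, 7}, 0.000}, {{2020, 1, 8}, 0.030}};

(* Convert each series to an Association keyed by date,
   the same conversion used inside PairwiseCorrelation *)
assocs = Association /@ Apply[Rule, {seriesA, seriesB}, {2}];

(* Drop every date that is not present in both series *)
common = KeyIntersection[assocs];

(* Each association should now hold only the three shared dates *)
Keys /@ common

(* Correlate the two aligned value lists *)
Correlation @@ Values[common]
```

The point of the sketch is that the date matching is done by key lookup rather than by searching one list for each date in the other, which is presumably where the speed-up comes from.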
This is many times faster than the original algorithm and, although still much slower (40x to 50x) than equivalent algorithms in other languages, it gets the job done in a reasonable time.
So I still think we need a Method -> "Pairwise" option for the Correlation function.