Pairwise Correlation of Financial Data

POSTED BY: Jonathan Kinlay
11 Replies

Guys, just FYI: is this discussion related to the blog post "Graph Theory and Finance in Mathematica"?

https://blog.wolfram.com/2012/06/01/graph-theory-and-finance-in-mathematica

POSTED BY: Sam Carrettie

Hi Sam, no, it was a very general question that I raised independently.

POSTED BY: Jonathan Kinlay

POSTED BY: Martijn Froeling

Another excellent implementation!

POSTED BY: Jonathan Kinlay
Posted 4 years ago

Hi,

For fun and as an exercise, I was curious to see if I could speed up the pairwise correlation by trying some alternative approaches. There was indeed some room for improvement.

In short, using some rather basic Mathematica code and the (old) compiler (but without compilation to C), I was finally able, with some fine tuning, to reach a speed-up of about 100x (compared to your last approach and with the same data).

For example, running the computation for all 505 S&P 500 members over 753 business days took me 1.1 s instead of 120 s (your PairwiseCorrelation) in the free basic Wolfram Cloud (Mathematica v12.3), and 2 s instead of 200 s in Wolfram Player 12.0.0 on my old desktop.

Also for comparison, you said that in another scientific language it takes you under 3 seconds to produce the correlation coefficients for all 500 S&P stocks, but it is not clear for how many business days (in your first example you get the stock data for 2753 days). In my case, the computation (in the free basic Wolfram Cloud) took about 2 s for 1500 days, 4 s for 2000 days, 6 s for 2500 days and 9 s for 3000 days. So it scales worse than linearly (doubling the days from 1500 to 3000 multiplies the time by about 4.5), but the timings remain almost acceptable, I guess.

These results simply suggest that a core function in Mathematica (the one you wish existed) would probably speed up the computation even more, so that the timings would be comparable to those of other programming languages optimized for speed (with no need for external evaluation).

Concerning my approach, the "challenge" was mostly to speed up the computation of the intersections. As you can easily check, this takes 75% of the total time in your PairwiseCorrelation function, which is very long in absolute terms in your case. If I am not mistaken, the speed-up here is about 180x, using a few tricks and basic Mathematica code without any compilation. I also did some fine tuning at every step (comparing which Mathematica code gives the best computation time for the same task), which allowed me to gain fractions of a second here and there. I used compilation only to speed up the correlation computation, where the speed-up factor is about 25x. (I couldn't experiment with the new compiler for technical reasons, but it looks very promising.)
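Chris's code is not reproduced in this thread, so the following is only a hedged sketch of the general idea of compiling the correlation step with the classic Compile function. It is not his implementation. The name pearsonCompiled and the random test vectors are hypothetical, and the sketch assumes the two input vectors have already been aligned on their common dates.

```mathematica
(* Hypothetical helper, not Chris's code: Pearson correlation of two
   equal-length real vectors, compiled with the classic Compile *)
pearsonCompiled = Compile[{{x, _Real, 1}, {y, _Real, 1}},
   Module[{dx = x - Total[x]/Length[x], dy = y - Total[y]/Length[y]},
    (* centred dot product divided by the product of the norms *)
    (dx . dy)/Sqrt[(dx . dx) (dy . dy)]]];

(* Sanity check against the built-in Correlation on random toy data *)
SeedRandom[1];
u = RandomReal[{-1, 1}, 750];
v = 0.5 u + RandomReal[{-1, 1}, 750];

(* compares the two results to within 10^-10 *)
Abs[pearsonCompiled[u, v] - Correlation[u, v]] < 10^-10
```

Whether a helper like this reproduces the 25x factor quoted above would have to be measured, for example with AbsoluteTiming on the real data.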

Of course, I don't claim to have the most elegant or fastest approach, and if you want I can publish my code here. Tell me if you are interested; I will just have to make it more readable and add comments ;)

Chris

POSTED BY: Chris P

Chris, very well done! That would indeed be a significant step forward.

Yes, I think many of us would be very interested to review and test the code. Also, you might want to consider adding the function to the Wolfram Function Repository.

Again, great job.

Jonathan

POSTED BY: Jonathan Kinlay
Posted 4 years ago

Here it goes!

POSTED BY: Chris P

Great solution, Chris. Impressive work.

POSTED BY: Jonathan Kinlay

You have earned a Featured Contributor Badge. Your exceptional post has been selected for our editorial column Staff Picks http://wolfr.am/StaffPicks and your profile is now distinguished by a Featured Contributor Badge and is displayed on the Featured Contributor Board. Thank you!

POSTED BY: EDITORIAL BOARD

Another Mathematica user suggested a way to speed up the pairwise correlation algorithm using associations. We begin by downloading returns data for the S&P500 membership in legacy (i.e. list) format:

(* all index members; Take with a single argument does not evaluate *)
tickers = FinancialData["^GSPC", "Members"];

stockdata = 
  FinancialData[tickers, "Return", 
   DatePlus[Today, {-753, "BusinessDay"}], Method -> "Legacy"];

Then define:

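PairwiseCorrelation[stockdata_] := 
 Module[{assocStocks, pairs, correl}, 
  (* turn each list of {date, return} into an Association date -> return *)
  assocStocks = Apply[Rule, stockdata, {2}] // Map[Association];
  (* all index pairs {i, j} with i < j *)
  pairs = Subsets[Range@Length@assocStocks, {2}];
  (* correlate each pair over the dates the two series share *)
  correl = 
   Map[Correlation @@ Values@KeyIntersection[assocStocks[[#]]] &, 
    pairs];
  {correl, pairs}]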
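As a small illustration of the intersection step inside the function, here is a sketch on made-up data. The names seriesA, seriesB and common, and all the return values, are hypothetical. The sketch assumes, as with the legacy FinancialData output, that each series is already in date order.

```mathematica
(* Toy data: two return series keyed by date, in date order (values are made up) *)
seriesA = <|{2021, 1, 4} -> 0.010, {2021, 1, 5} -> -0.004,
   {2021, 1, 6} -> 0.007, {2021, 1, 7} -> 0.002|>;
seriesB = <|{2021, 1, 5} -> 0.001, {2021, 1, 6} -> 0.009,
   {2021, 1, 7} -> -0.003, {2021, 1, 8} -> 0.005|>;

(* Keep only the dates present in both series *)
common = KeyIntersection[{seriesA, seriesB}];

(* Both entries now list the three shared dates: Jan 5, 6 and 7 *)
Keys /@ common

(* Values threads over the list, giving two equal-length vectors
   that can be passed to Correlation *)
Correlation @@ Values[common]
```

The non-overlapping dates (Jan 4 in the first series, Jan 8 in the second) are dropped before the correlation is computed, which is what makes the calculation "pairwise".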

Here we are using the KeyIntersection function to identify common dates between two series, which is much faster than other methods. Accordingly:

In[317]:= AbsoluteTiming[{correl, pairs} = 
   PairwiseCorrelation[stockdata];]

Out[317]= {112.836, Null}

In[318]:= Length@correl

Out[318]= 127260

In[319]:= Through[{Mean, Median, Min, Max}[correl]]

Out[319]= {0.428747, 0.43533, -0.167036, 0.996379}

This is many times faster than the original algorithm and, although much slower (40x to 50x) than equivalent algorithms in other languages, gets the job done in reasonable time.

So I still think we need a Method-> "Pairwise" option for the Correlation function.
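Until such an option exists, the {correl, pairs} output can be arranged into the symmetric matrix that Correlation normally returns. The following is a minimal sketch with toy values. The names toyPairs, toyCorrel and corrMatrix are hypothetical stand-ins, and the correlation values are made up.

```mathematica
(* Toy stand-ins for the {correl, pairs} returned by PairwiseCorrelation *)
toyPairs = Subsets[Range[3], {2}];   (* {{1, 2}, {1, 3}, {2, 3}} *)
toyCorrel = {0.42, -0.10, 0.65};     (* made-up values *)

n = Max[toyPairs];                   (* number of series *)
corrMatrix = Normal@SparseArray[
    Join[
     Thread[toyPairs -> toyCorrel],               (* upper triangle *)
     Thread[(Reverse /@ toyPairs) -> toyCorrel],  (* mirrored lower triangle *)
     Table[{i, i} -> 1., {i, n}]],                (* unit diagonal *)
    {n, n}];

corrMatrix // MatrixForm
```

With the real output, replacing toyPairs and toyCorrel by pairs and correl should give a 505 x 505 matrix indexed in the same order as tickers.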

POSTED BY: Jonathan Kinlay

Great post Jonathan! Looking forward to more of them in the near future
