Hi,
For fun and as an exercise, I was curious whether I could speed up the pairwise correlation by trying some alternative approaches. There was indeed some room for improvement.
In short, using fairly basic Mathematica code plus the (old) compiler (but without compilation to C), I was finally able, with some fine tuning, to reach a speedup of about 100x (compared to your last approach, on the same data).
For example, running the computation for all 505 stocks in the S&P index over 753 business days took 1.1 s instead of 120 s (your PairwiseCorrelation) in the free basic Wolfram Cloud (Mathematica v12.3), and 2 s instead of 200 s in Wolfram Player 12.0.0 on my old desktop.
Also, for comparison: you said that in another scientific language it takes you under 3 seconds to produce the correlation coefficients for all 500 S&P stocks, but it is not clear for how many business days (in your first example you get the stock data for 2753 days). In my case, the computation (in the free basic Wolfram Cloud) took about 2 s for 1500 days, 4 s for 2000 days, 6 s for 2500 days and 9 s for 3000 days. So it does not scale well, but I guess the timings remain almost acceptable.
These results simply suggest that a core function in Mathematica (which you wished existed) would probably speed up the computation even more, so that the timings would be comparable to those of other programming languages optimized for speed (with no need for external evaluation).
Concerning my approach, the "challenge" was mostly to speed up the computation of the intersections. As you can easily check, this step takes 75% of the total time in your PairwiseCorrelation function, which is a lot in absolute terms in your case. If I am not mistaken, the speedup for this step is about 180x, using a few tricks and basic Mathematica code, without any compilation. I also did some fine tuning at every step (that is, comparing which Mathematica code gives the best computation time for the same task), which let me gain fractions of a second here and there. I used compilation only to speed up the correlation computation itself, where the speedup factor is about 25x. (I couldn't experiment with the new compiler for technical reasons, but it looks very promising.)
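To make the two steps concrete (intersection first, then correlation), here is a minimal, untuned sketch. It is not my optimized code and uses none of the tricks mentioned above. It assumes each stock is stored as an Association mapping dates to prices, which may differ from your data layout; the name pairCorrelation and the toy data are purely illustrative.

```mathematica
(* Hypothetical helper, NOT the tuned code: a and b are assumed to be
   Associations mapping dates to prices, e.g. <|date1 -> p1, ...|> *)
pairCorrelation[a_Association, b_Association] :=
  Module[{common},
    (* Step 1: keep only the dates present in both series (the intersection) *)
    common = Intersection[Keys[a], Keys[b]];
    (* Step 2: correlate the prices aligned on those common dates *)
    Correlation[Lookup[a, common], Lookup[b, common]]
  ]

(* Toy data invented for illustration; integer keys stand in for dates *)
s1 = <|1 -> 10., 2 -> 11., 3 -> 12., 5 -> 14.|>;
s2 = <|2 -> 20., 3 -> 22., 4 -> 23., 5 -> 26.|>;
pairCorrelation[s1, s2]
```

In this toy case the common dates are 2, 3 and 5, and the prices on those dates are exactly linearly related, so the result should be 1 up to rounding. Doing step 1 naively for every pair of stocks is the part that dominates the run time.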
Of course, I don't claim to have the most elegant or the fastest approach, and if you want I can publish my code here. Tell me if you are interested; I will just have to make it more readable and add comments ;)
Chris