# FinancialData and daily return values (was:Sorority Simulator)

Posted 9 years ago
9330 Views
|
5 Replies
|
4 Total Likes
|
 I'm just a freshman and my sorority sisters and I need your help.Not really, but please help me. I'm trying to get the daily return values for the Vanguard fund VO for the last ten years, and I got it:FinancialData["VO", "Return", "Jun. 26, 2004"]The problem is that I am trying get a data set of just the daily returns (not the dates), and I am having the darndest time. This is what I'm trying to do:"Use the past 10 years of daily total returns from the VO fund to create a distribution. Randomly draw 252 returns from that distribution and multiply them together to get an annual return. Do this over and over to create a distribution of annual returns to expect from VO."I am brand new to Mathematica, any help is very much appreciated, for more info on the project:http://seekingalpha.com/article/2287813-the-best-passive-retirement-strategy-in-the-worldhttp://seekingalpha.com/instablog/1117866-joe-springer/3023693-mathematica-what-is-the-area-for-circle-of-competenceThank you!Joe
5 Replies
Sort By:
Posted 9 years ago
Posted 9 years ago
 Dear Joe, please do also note that the "returns" that you download: datatimes = FinancialData["VO", "Return", "Jun. 26, 2004"]; are different from what Jason Cawley generates by his command VO = FinancialData["VO", {2004,6,26}]; VOret = Drop[VO[[All,2]],1]/Drop[VO[[All,2]],-1]; They differ. I suppose that this is because the FinancialData function in fact returns the logarithmic return. If we plot the logarithm of Jason's time series vs the time series from FinancialData they coincide. ListLinePlot[{Log[VOret[[1 ;; 100]]], data[[1 ;; 100]]}] That would also explain why he gets, more or less, a log-normal distribution whereas I get, more or less, a normal distribution. The idea behind the log normal distribution is after all that if you take the logarithm of the values they are Gaussian distributed. And that is what you see if you compare the two answers. For the definitions you might want to have a look at this wiki page. It also explains why Jason has to multiply (which is what you, Joe, suggested in your first post) and I had to add the numbers. I am not quite sure whether in the mathematica help system for FinancialData this is made suffienently clear: "Return" daily return on a particular day, allowing dividends I believe that what is given is actually the logarithmic return - I might be wrong though. Jason, do you agree?Also the histogram of the original data we get from the FinancialData function is most definitely not Gaussian distributed. It peaks more and might be closer to a TsallisQGaussianDistribution - see the PS below. Only after the summation the distribution becomes "more normal", so much so that the test does not reject the Null; see also the central limit theorem. Furthermore, if the log returns are not actually normally distributed, but say Tsallis/Gaussian, I would think that Jason's data is also not log-normally distributed. Cheers, Marco PS: You might also want to have a look at this website.
Posted 9 years ago
 This is unbelievably helpful, thank you both so much!
Posted 9 years ago
 Dear Joe, here are some ideas:1) this is what you download: datatimes = FinancialData["VO", "Return", "Jun. 26, 2004"]; as you say it contains the dates.2) you only take the magnitudes for each day. data = datatimes[[All, 2]]; 3) you can calculate a smooth kernel distribution Plot[PDF[SmoothKernelDistribution[data], x], {x, -0.1, 0.1}, PlotRange -> All] 4) this generates the product of a random choice of 252 returns: Product[RandomChoice[data, 252][[i]], {i, 1, 252}] It does not help a lot because it is numerically nearly always zero - on the bright side it is probably not what you want to calculate anyway. The mean of the returns is Mean[data] which evaluates to 0.000551085. The standard deviation is StandardDeviation[data] which is 0.0144021. The product of many numbers that come from such a narrow distribution around zero can become very small. Also Min[Abs[data]] is 0. Actually, Sort[Abs[data]] gives:{0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1.1102210^-16, 1.1102210^-16, 0.00010582, 0.000108319, 0.00011383, 0.000118779, 0.000120048, 0.000123977, 0.000124425, 0.000124,....}If any of the first numbers are in your random choice you get zero. 5) Luckily, to get an average return you might not want to multiply the data but rather sum them up Sum[RandomChoice[data, 252][[i]], {i, 1, 252}] 6) The histogram of that is: Histogram[Table[Sum[RandomChoice[data, 252][[i]], {i, 1, 252}], {k, 1, 500}], 20] 7) The mean over 500 of these realisations can be obtained like so: Mean[Table[Sum[RandomChoice[data, 252][[i]], {i, 1, 252}], {k, 1, 500}]] I got 0.149755 when I ran it for my realisation. This seems to be more or less ok, because the average daily return was 0.000551085. Multiplying this by 252 gives 0.138873.8) Let's see. If we run the entire thing say 100 timesMonitor[Table[Mean[Table[Sum[RandomChoice[data, 252][[i]], {i, 1, 252}], {k, 1, 500}]], {j,1, 100}], j]we get {0.130739, 0.1443, 0.12358, 0.127578, 0.123799, 0.127366, 0.137378, \ 0.142802, 0.123663, 0.131705, 0.143079, 0.129468, 0.133854, 0.152172, \ 0.14965, 0.124151, 0.148038, 0.129823, 0.124735, 0.142115, 0.13393, \ 0.146552, 0.142295, 0.145668, 0.148012, 0.149947, 0.157339, 0.144625, \ 0.131332, 0.152722, 0.152528, 0.132397, 0.149237, 0.133508, 0.147617, \ 0.133868, 0.1329, 0.155013, 0.144509, 0.139821, 0.1457, 0.160008, \ 0.140802, 0.122112, 0.139138, 0.147673, 0.136278, 0.142777, 0.117216, \ 0.113688, 0.142883, 0.132171, 0.140114, 0.146726, 0.142973, 0.15172, \ 0.136722, 0.141169, 0.128717, 0.1394, 0.138362, 0.145236, 0.151213, \ 0.13936, 0.123638, 0.12851, 0.140283, 0.139783, 0.12457, 0.137845, \ 0.13261, 0.153618, 0.126994, 0.127699, 0.137892, 0.15243, 0.151824, \ 0.131615, 0.135664, 0.134355, 0.144779, 0.126877, 0.135637, 0.129136, \ 0.144117, 0.139079, 0.144863, 0.13009, 0.142233, 0.127004, 0.118718, \ 0.154026, 0.137453, 0.111452, 0.148349, 0.137895, 0.140912, 0.116243, \ 0.134876, 0.129615} The mean of that Mean[%] is 0.137787. And the variance is Variance[%%] 0.000106633 and the standard deviation is 0.0103263. So after altogether 500*100=50000 realisations we are quite close to the theoretical value of 0.138873. We now can calculate the histogram from point 6 for 50k realisations: Monitor[Histogram[Table[Sum[RandomChoice[data, 252][[i]], {i, 1, 252}], {k, 1, 50000}], 20], k] This gives the really smooth histogram8) I suppose that to a very good approximation that is Gaussian distributed. Let's check that. DistributionFitTest[datalist, Automatic, "TestConclusion", SignificanceLevel -> 0.05] Results: The null hypothesis that the data is distributed according to the NormalDistribution[[FormalX],[FormalY]] is not rejected at the 5. percent level based on the CramÃ©r-von Mises test. If we fit a Gaussian and then plot them together we get: Show[Histogram[datalist], Plot[1000*PDF[EstimatedDistribution[datalist, NormalDistribution[\[Mu], \[Sigma]]], x], {x, -1, 1.2}, PlotStyle -> {Red, Thick}]] This give this nice figure:With this it becomes easy to make all sorts of nice predictions. Oh, yes, here are the parameters I got for the distribution: EstimatedDistribution[datalist, NormalDistribution[\[Mu], \[Sigma]]] gives: NormalDistribution[0.137692, 0.22852].I hope that helps a bit. It is quite safe to ignore point 8. That one is just for fun. Cheers, Marco
Posted 9 years ago
 Thank you so much Marco!!