Group Abstract

Message Boards

WOLFRAM COMMUNITY

11.2K Views

1 Reply

9 Total Likes

View groups...

Follow this post

Share this post:

GROUPS:

Staff Picks Data Science Finance Mathematics Mathematica Add-Ons Wolfram Language Modeling Packages Statistics and Probability

Doing Bayesian inference with the nested sampling algorithm

Sjoerd Smit

Sjoerd Smit, Wolfram Research Europe Ltd.

Posted 7 years ago

I recently finished a major overhaul of my Bayesian inference package on Github: https://github.com/ssmit1986/BayesianInference and wanted to give a small overview of what it can do. If you're interested, the repository has a notebook with other examples to browse through. As an example, let's do some regression on financial data: << BayesianInference`; data = FinancialData["AS:ASML", {Today - Quantity[3, "Months"], Today}]; DateListPlot[data] Preparation of the data: data = TimeSeriesRescale[data, {0, Length[data] - 1}]; {start, ts} = TakeDrop[data, 1]; start = start[[1, 2]]; ts = TimeSeries[ts]; We will try to model the data with a `GeometricBrownianMotionProcess` (GBP): process = GeometricBrownianMotionProcess[mu, Exp[logsigma], start]; (* using Exp[logsigma] makes it easier to search for sigma values over a large range of values *) The goal of what follows below is to find values for `mu` and `sigma` that can fit the data well. However, we want to do this the Bayesian way and find a proper probability distribution over these two parameters that tells us precisely how much we can know about `mu` and `sigma`. My package uses the function `defineInferenceProblem` to specify all of the relevant inputs for the problem. You can check the types of recognised input with `defineInferenceProblem[]`: In[77]:= defineInferenceProblem[] Out[77]= {Data,Parameters, IndependentVariables, LogLikelihoodFunction, LogPriorPDFFunction, GeneratingDistribution, PriorDistribution, PriorPDF} You don't need to specify everything in this list: only whatever is necessary to compute the loglikelihood function and the logPDF of the prior (it's important to work with these log functions to avoid numerical rounding errors). For example, this is how we can define the inference problem for the GBP regression: obj = defineInferenceProblem[ "Data" -> ts, "GeneratingDistribution" -> process[t], "Parameters" -> {{mu, -\[Infinity], \[Infinity]}, {logsigma, -\[Infinity], \[Infinity]}}, "IndependentVariables" -> {t}, "PriorDistribution" -> {NormalDistribution[0, 10], NormalDistribution[0, 10]} ] This is a somewhat naive way to do the regression, because it models the data points as independent realizations of a log normal distribution rather than as sequentially dependent realizations of the GBP. In the example code on Github there is an explanation on how to model the GBP properly. The returned object is basically an Association that keeps all of the relevant data together. To inference the posterior distribution of `mu` and `sigma`, we use the nested sampling algorithm by John Skilling. This algorithm will produce (weighted) samples from the posterior and compute the evidence integral (which measures the quality of the fit). I implemented a normal version and a parallel version. It is called like so: result = parallelNestedSampling[obj]; Visualise the samples with a bubble chart (where the sizes are the weights): posteriorBubbleChart[takePosteriorFraction[result, 0.99], {mu, logsigma}] The posterior is mainly useful to have because you can use it to make predictions. Here's a plot of the regression when you "replay" it from the first data point: predictions = predictiveDistribution[takePosteriorFraction[result, 0.95], ts["Times"]]; plot = regressionPlot1D[result, predictions] You can also extrapolate from the last known value: It's instructive to compare these results with a point-estimate using the sample with the highest loglikelihood found. As you can see below, the point-estimate gives a very similar result, so in this case we can conclude that the posterior can be approximated with a single point quite well: proc = GeometricBrownianMotionProcess[#1, Exp[#2], start] & @@ First[TakeLargestBy[result["Samples"], #LogLikelihood &, 1]]["Point"]; Show[ plot, regressionPlot1D[result, AssociationMap[proc, ts["Times"]], PlotStyle -> Dashed] ] Here is the result if you fit the process properly (i.e., you model each point conditionally on the previous one rather than pretending they are independent of each other). In the first plot you see the prediction error when you predict forward from the last known data point. Again, the dashed line shows you the maximum-likelihood estimate: That doesn't seem like a big difference between the two prediction methods. However, this is what happens when you predict forward from the first data point only: Here the difference between the point-estimate and the full Bayesian prediction becomes more apparent. The point-estimate is too optimistic because it doesn't include the uncertainty in the fitting parameters. I hope this code is useful to some of you. If you're going to give it a try, please let me know what you think!

POSTED BY: Sjoerd Smit

1 Reply

Sort By:

EDITORIAL BOARD

EDITORIAL BOARD, WOLFRAM

Posted 7 years ago

- Congratulations! This post is now a Staff Pick as distinguished by a badge on your profile! Thank you, keep it coming, and consider contributing your work to the The Notebook Archive!

POSTED BY: EDITORIAL BOARD

Reply to this discussion

Reply Preview

Attachments

Remove Add a file to this post

Follow this discussion

or Discard

Feedback