# Stochastic Epidemiology Models with Applications to the COVID-19 Pandemic

Posted 10 months ago

As a follow-on to my earlier post about Epidemiological Models for Influenza and COVID-19, I’d like to show how these compartmental models can be used in a stochastic setting to study how the inherent randomness of events can affect the interpretation of modeled outcomes. As noted before, the deterministic ODE-based models rest on a number of assumptions:

◼ All of the individuals of a given type are identical
◼ The population of individuals is well mixed, that is, instantaneously and uniformly mixed
◼ The number of individuals is so large that the density can be treated as a continuous variable

The last assumption rarely holds in practice for epidemiology models, where the number of individuals is typically on the order of hundreds to millions. These population sizes are vastly smaller than those usually encountered in chemical kinetics, where the number of molecules in a beaker is on the order of 10^23.

So, what’s to be done? In the 1970s, Daniel Gillespie [1] developed methods based on the chemical master equation to simulate chemical reactions stochastically when the last assumption, especially, does not hold. The simulation method correctly accounts for the inherent fluctuations and correlations that are ignored in the deterministic formulation. There is a nice review of the method with implementation by Higham [2], and a good book by Wilkinson with applications to systems biology [3]. Finally, we have used a hybrid deterministic-stochastic approach for viral dynamics modeling [4, 5]. One of the advantages of the stochastic modeling approach is access to the distributions of individuals in the various compartments over time. Let’s look at the SEIR model. It should be pointed out that these stochastic model simulations are not the same as agent-based models (see Christopher Wolfram's recent post for an example).
The main difference is that the latter use a uniform time step and all agents update at each step, while the former use varying length time steps and only one event occurs at each step.
## Sonata—SEIR

We begin as usual by loading the CompartmentalModeling` package and defining the SEIR model, whose transitions are 𝒮 → ℰ (at rate β λ[t]), ℰ → ℐ (at rate ζ), and ℐ → ℛ (at rate γ):

    In[]:= << CompartmentalModeling`

    In[]:= modelName = "SEIR"; model = ...;
    In[]:= modelData = KineticCompartmentalModel[model /. forceOfInfection, t];
           vars = modelData["Variables"];
           (odes = modelData["Equations"]) // Column
We’ll parameterize the model in terms of the basic reproduction number ℜ0, the mean latent period 1/ζ, and the mean infectious period 1/γ:

    In[]:= betaSoln = First@Solve[ℜ0 == β/γ, β]
    Out[]= {β → γ ℜ0}

The parameter values are ℜ0 = 5, a latent period of 5.1 days, an infectious period of 7 days, a population of 10^3, and a single initial infection I0 = 1:

    In[]:= params = <|ℜ0 → 5, ζ → 1/5.1, γ → 1/7., β → γ ℜ0, n → 10^3, I0 → 1|>;
           paramsN = Normal@(params /. params)

    In[]:= inits = Join[{𝒮[0] == n - I0, ℐ[0] == I0}, Thread[Through@Complement[vars, {𝒮, ℐ}][0] == 0]]
    Out[]= {𝒮[0] == -I0 + n, ℐ[0] == I0, ℰ[0] == 0, ℛ[0] == 0}

First, let’s see how the usual ODE solution behaves over 120 days (the code for epidemicModelPlot is in the initialization section):

    In[]:= tmax = 120;
    In[]:= solnODE = First@NDSolve[{odes, inits} /. paramsN, vars, {t, 0, tmax}];
    In[]:= epidemicModelPlot[solnODE, {0, tmax}, "modelName" → modelName, ImageSize → Large]
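As a quick numerical cross-check on this parameterization (not part of the original notebook; the values ℜ0 = 5, a 5.1-day latent period, a 7-day infectious period, and n = 1000 are taken as assumptions), a few lines of plain Python reproduce the rate constants that appear in the propensities later on:

```python
# Cross-check of the SEIR rate parameterization (assumed values:
# R0 = 5, latent period 5.1 days, infectious period 7 days, n = 1000).
R0 = 5.0
zeta = 1 / 5.1     # E -> I rate (inverse latent period)
gamma = 1 / 7.0    # I -> R rate (inverse infectious period)
beta = R0 * gamma  # transmission rate implied by R0 = beta / gamma
n = 1000

print(round(zeta, 6))      # 0.196078
print(round(gamma, 6))     # 0.142857
print(round(beta / n, 9))  # 0.000714286 (the per-pair rate beta/n)
```

These are exactly the coefficients 0.196078, 0.142857, and 0.000714286 that show up in the propensity vectors below.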
Nothing unusual: the outbreak reaches a maximum around day 40, and nearly everyone in the population gets infected. Now, let’s see what the stochastic simulation shows. We’ll use the same parameters and the same initial conditions, and run 500 simulations (typically one would use more runs, but we'll keep it on the small side for this essay; we'll also use SeedRandom from time to time). The code for stochasticSolve is in the initialization section.

    In[]:= SeedRandom[1425367]
           solnsCME = Table[
             stochasticSolve[modelData["Rates"] /. paramsN, modelData["StoichiometricMatrix"],
               Through@modelData["Variables"][0] /. Rule @@@ inits /. paramsN,
               modelData["Variables"], {t, 0, tmax}], 500]; // Timing
    Out[]= {35.578, Null}

Here is a plot of the first 20 runs:

    In[]:= epidemicModelPlot[Join @@ Take[solnsCME, 20], {0, tmax}, "modelName" → modelName, ImageSize → Large]
We can see that there is quite a bit of variability in the individual runs, that the timing of the outbreak maximum varies by about three weeks, and that there is possibly a hint of a bimodal shape. Moreover, some of the runs did not produce an outbreak. We can examine the average behavior by plotting the median and the middle 95-percentile band:

    In[]:= TabView[{
             "median with bands" → epidemicModelPlot[solnsCME, {0, tmax}, "modelName" → modelName, ImageSize → Large],
             "median" → epidemicModelPlot[solnsCME, {0, tmax}, "modelName" → modelName, ImageSize → Large, "showBands" → False]}]
The middle 95-percentile bands show that there is quite a bit of variability in the ensemble of solutions. Looking at the distributions of susceptible 𝒮 and recovered ℛ individuals at t = tmax:

    In[]:= Histogram[Quiet[#[tmax] /. solnsCME], {10}, ChartStyle → epiColor[#],
             PlotLabel → #, PlotRange → 400, ImageSize → 4 × 72] & /@ {𝒮, ℛ} // Row[#, Spacer[18]] &

We can partition the ensemble into “outbreak” and “not an outbreak” subsets:

    In[]:= {outbreak, notOutbreak} = Quiet[SortBy[solnsCME, (ℛ[tmax] /. #) < 50 &] //
             SplitBy[#, (ℛ[150] /. #) < 50 &] &]; Length /@ %
    Out[]= {394, 106}

About a fifth of the time the infection dies out before an outbreak can be established. Now, looking at the average behavior of the outbreak group: while it is very similar to that of the deterministic solution, the substantial variability remains.

    In[]:= MapThread[#2 → epidemicModelPlot[#1, {0, tmax}, "modelName" → modelName, ImageSize → Large] &,
             {{outbreak, notOutbreak, solnODE}, {"outbreak", "not outbreak", "deterministic"}}] // TabView
So, why are there two different regimes of behavior? Now would be a good time to examine the Gillespie algorithm.
## Adagio—Details of the Gillespie Algorithm
### Elements of a stochastic simulation

The simulation is a Markov chain process, that is, a random process in which only the current state determines the next state. The events that occur at each step of the chain are, in this case, the transitions of the SEIR model. The likelihood of a particular event occurring depends on its propensity, which in turn depends on the current state, that is, the numbers of individuals in each compartment. The time between events depends on the total propensity of the system in the current state. The propensity changes from step to step, and the process ends when the maximum time of the simulation is reached, or when the total propensity is 0. The information needed to step from the current state to the next state can be found in the modelData properties "ComponentTransitions", "Rates", "StoichiometricMatrix", and "Variables":

    In[]:= MapThread[Join, {List /@ modelData["ComponentTransitions"],
             RotateRight /@ modelData["StoichiometricMatrix"], List /@ modelData["Rates"]}] // TableForm
### Computing the propensity

To demonstrate the steps, we begin by assuming a current state and time:

    In[]:= state = RandomInteger[{0, 1000}, Length@modelData["Variables"]]
           time = RandomReal[{0, 50}]
    Out[]= {551, 554, 561, 940}
    Out[]= 44.5866

The propensity vector is obtained by evaluating the rates at the current state and time:

    In[]:= prop = modelData["Rates"] /. Thread[Through[modelData["Variables"][t]] → state] /. paramsN /. t → time
    Out[]= {108.039, 79.1429, 371.971}

The individual propensities define an empirical distribution over the events:

    In[]:= distEvent = EmpiricalDistribution[prop → Range@Length@prop]
    Out[]= DataDistribution[...]
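The event-selection step is language independent, so here is a standalone plain-Python sketch of the same idea (an illustration, not the notebook code): normalize the propensities into a discrete distribution and draw an event index from it by inverting the cumulative sum.

```python
import random

def pick_event(propensities, rng=random.Random(0)):
    """Select a transition index with probability proportional to its propensity."""
    total = sum(propensities)
    u = rng.random() * total      # uniform draw on [0, total)
    cumulative = 0.0
    for i, a in enumerate(propensities):
        cumulative += a
        if u < cumulative:
            return i
    return len(propensities) - 1  # guard against floating-point round-off

# The propensity vector from the walk-through above:
prop = [108.039, 79.1429, 371.971]
counts = [0, 0, 0]
rng = random.Random(1)
for _ in range(100_000):
    counts[pick_event(prop, rng)] += 1
print([c / 100_000 for c in counts])  # roughly [0.19, 0.14, 0.67]
```

The observed frequencies match the propensities divided by their total, which is exactly what the EmpiricalDistribution above encodes.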
Graphically, we can see how the propensities on the left contribute to the cumulative distribution on the right:

(plot: bar chart of the individual propensities alongside the cumulative distribution of the event index)

We can use RandomVariate to draw events from this distribution:

    In[]:= Table[RandomVariate[distEvent], 10]
    Out[]= {1, 3, 3, 3, 3, 3, 3, 1, 2, 3}

The next state is computed by incrementing the current state with the stoichiometry of the selected transition:

    In[]:= oldState = state;
           state += modelData["StoichiometricMatrix"]〚Echo[#, "", modelData["ComponentTransitions"]〚#〛 &] &@RandomVariate[distEvent]〛
    » ℐ → ℛ (at rate γ)
    Out[]//TableForm= (the old and new states, side by side)

The events occur randomly in time, and are therefore a Poisson process. It is the same process followed by radioactive decay, and the waiting time between events follows an exponential distribution whose rate is the total propensity:

    In[]:= distTime = ExponentialDistribution[Total[prop]]
    Out[]= ExponentialDistribution[559.154]

(plot: the waiting-time density out to tmax = Log[2]/Total[prop], the half-life of the current state)

Again, RandomVariate generates the time increment to the next event:

    In[]:= oldTime = time; time += Echo@RandomVariate[distTime]
    » 0.000653458
    Out[]= 44.5873
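The waiting-time step can likewise be sketched in plain Python (a standalone illustration; the total propensity 559.154 is taken from the walk-through above). The mean time to the next event is the reciprocal of the total propensity, and exponential variates can be drawn by inverse-transform sampling:

```python
import math
import random

total_propensity = 559.154          # Total[prop] from the walk-through above
mean_wait = 1.0 / total_propensity  # mean of the exponential waiting time

rng = random.Random(42)
# Inverse-transform sampling: dt = -ln(1 - U)/rate for U ~ Uniform[0, 1)
samples = [-math.log(1.0 - rng.random()) / total_propensity
           for _ in range(200_000)]

print(mean_wait)                    # ~0.0017884 days between events
print(sum(samples) / len(samples))  # sample mean, close to the value above
```

This is why a large total propensity (many individuals, high rates) produces many small time steps, which is one of the performance considerations discussed later.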
### Coding it up

#### Propensity

First, we need to compute the propensity at each step, so let’s make a function that generates the model-specific propensity function. This function will take two arguments: the current state and the current time. We include the time because some models may have time-dependent rates for treatment or mitigation strategies.

For most epidemiology models, the elements of the stoichiometric matrix are 0, 1, or -1, so if one or more of the compartments for a transition is exhausted, the resulting propensity for that transition is 0 and that transition cannot be selected; that’s good. However, if the stoichiometry of one of the contributing compartments is 2 or more, then a non-zero propensity will still be computed when there is only 1 individual left in that state, and if that transition gets selected, the resulting new state will have a negative entry; that’s not good. So we need to define a model-specific mask function that returns a list of 1’s and 0’s according to whether or not there are sufficient individuals present in the current state to support a transition.

    In[]:= definePropensityFunction[rates : {__}, (sMatrix_)?(MatrixQ[#1, IntegerQ] &), varsIn : {__}, time_Symbol] :=
             Module[{mask, vars = Through@varsIn[time]},
               mask = Function[state, Boole[And @@ NonNegative /@ (state + #)] & /@ sMatrix];
               Function[{s, t}, (rates /. Thread[vars → s] /. time → t) mask[s]]]

This code generates the propensity function for our SEIR model:

    In[]:= propensity = definePropensityFunction[modelData["Rates"] /. forceOfInfection /. paramsN,
             modelData["StoichiometricMatrix"], modelData["Variables"], t]
    Out[]= Function[{s$, t$}, ({0.196078 ℰ[t], 0.142857 ℐ[t], 0.000714286 ℐ[t] 𝒮[t]} /. Thread[vars$265514 → s$] /. t → t$) mask$265514[s$]]

And here is an example of its use:

    In[]:= propensity[state, time]
    Out[]= {108.039, 79., 371.3}
#### Simulation loop

To begin the stochastic simulation, we need to instantiate the starting state and evolve it over time. We’ll capture each new state in a "DynamicArray" DataStructure:

    In[]:= SeedRandom[1234567]
           stateList = Module[{time, state, idx, prop, totalProp, stateList, distEvent, distTime, result},
               idx = Range@Length@modelData["StoichiometricMatrix"];
               stateList = CreateDataStructure["DynamicArray"];
               state = Replace[modelData["Variables"], {𝒮 → n - I0, ℐ → I0, _ → 0}, {1}] /. paramsN;
               time = 0.;
               stateList["Append", {time, state}];
               While[time < 120 && (totalProp = Total[prop = propensity[state, time]]) > 0,
                 distEvent = EmpiricalDistribution[prop → idx];
                 state += modelData["StoichiometricMatrix"]〚RandomVariate[distEvent]〛;
                 distTime = ExponentialDistribution[totalProp];
                 time += RandomVariate[distTime];
                 stateList["Append", {time, state}];];
               result = Normal@stateList;
               stateList["DropAll"];
               result]; // AbsoluteTiming
    Out[]= {1.12691, Null}

    In[]:= Length@stateList
    Out[]= 2978

    In[]:= Last@stateList
    Out[]= {112.293, {0, 0, 993, 7}}

At the end of the simulation, the state vector shows that there are no exposed or infectious individuals, 993 recovered, and 7 susceptible.
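Putting the pieces together, the essence of the direct Gillespie method for this SEIR model fits in a short standalone Python sketch (an illustration with the parameter values assumed from the setup above, not the notebook's stochasticSolve):

```python
import math
import random

def gillespie_seir(n=1000, i0=1, r0=5.0, zeta=1/5.1, gamma=1/7.0,
                   t_max=120.0, rng=None):
    """Direct Gillespie simulation of SEIR; returns a list of (time, state)."""
    rng = rng or random.Random()
    beta = r0 * gamma
    E, I, R, S = 0, i0, 0, n - i0        # state ordered (E, I, R, S) as above
    t = 0.0
    history = [(t, (E, I, R, S))]
    while t < t_max:
        # Propensities: progression E->I, recovery I->R, infection S->E
        a = [zeta * E, gamma * I, beta * I * S / n]
        total = sum(a)
        if total == 0:                   # no further events are possible
            break
        t += -math.log(1.0 - rng.random()) / total  # exponential waiting time
        u = rng.random() * total                    # select which event fires
        if u < a[0]:
            E, I = E - 1, I + 1
        elif u < a[0] + a[1]:
            I, R = I - 1, R + 1
        else:
            S, E = S - 1, E + 1
        history.append((t, (E, I, R, S)))
    return history

hist = gillespie_seir(rng=random.Random(7))
t_end, (E, I, R, S) = hist[-1]
print(t_end, E, I, R, S)  # either E = I = 0 (epidemic over) or t >= 120
```

One event fires per iteration and the time steps have varying length, which is exactly the contrast with agent-based models drawn at the start of the post.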
#### Result in the same form as used by NDSolve

The raw result from the iteration is a list of time and state pairs. A more convenient form would be a list of rules mapping each variable to an InterpolatingFunction, as returned by NDSolve:

    In[]:= {times, states} = Transpose@Normal@stateList; states = Transpose@states;

Next, we compute an InterpolatingFunction for each variable, with InterpolationOrder → 0 since the state is constant between events:

    In[]:= soln = First@{Thread[vars → (Interpolation[
               Join @@ Replace[SplitBy[Thread[{times, #}], Last], {a_, ___, z_} → {a}, {1}],
               InterpolationOrder → 0] & /@ states)]}
    Out[]= {ℰ → InterpolatingFunction[...], ℐ → InterpolatingFunction[...], ℛ → InterpolatingFunction[...], 𝒮 → InterpolatingFunction[...]}
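Because the state is piecewise constant between events, evaluating the stochastic solution at an arbitrary time is just a zero-order hold. A hypothetical Python equivalent of the InterpolationOrder → 0 construction (illustrative names and made-up event data) looks up the last event at or before the query time with a bisection search:

```python
from bisect import bisect_right

def step_function(times, values):
    """Zero-order-hold interpolant: value of the last event at or before t."""
    def f(t):
        i = bisect_right(times, t) - 1
        return values[max(i, 0)]
    return f

# Made-up event times and the number infectious after each event:
times = [0.0, 0.8, 1.5, 4.2]
infectious = [1, 2, 1, 0]
I_of_t = step_function(times, infectious)

print(I_of_t(1.0))   # 2  (the state set at t = 0.8 still holds)
print(I_of_t(10.0))  # 0  (constant extrapolation past the last event)
```

The constant extrapolation past the last event is also why querying ℛ beyond the end of a run, as in the outbreak partition above, still returns a sensible value.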
    In[]:= epidemicModelPlot[#, {0, 115}, "modelName" → modelName, ImageSize → Large] & /@ {soln, solnODE} // FlipView

Click on the output above to flip between the stochastic and deterministic simulations.
### Code optimization

The SEIR model is fairly simple and has an absorbing compartment ℛ (i.e., a sink), so it evaluates fairly quickly and often terminates before the maximum time threshold is reached. If we use a population 100 times larger, the simulation will take much longer to evaluate, as will running hundreds or thousands of simulations to study average behavior, or simulating a bigger model (e.g., a gene regulatory network or a cell signalling network). The simulation above took almost 1 second, so just imagine the effect of any one of these factors on the running time! There are two bottlenecks in the code:

◼ RandomVariate: we can speed up the computation by going to the heart of the matter and skipping the overhead
◼ propensity: we can speed up the computation by creating a function with a precomputed body

I won’t go into the details here, but the code in the initialization section at the end of the notebook has been optimized, giving a substantial speed up. It’s still top-level Wolfram Language code, however, and compilation is still an option for further improvement. Here is the usage message for the solver:

    In[]:= ?stochasticSolve
## Scherzo—Flattening the curve

So, back to the SEIR model and the failure to achieve an outbreak in some runs. It all has to do with the probability of infection events happening, and therefore with whether a sufficient number of them occurs relative to the number of recovery events. Recall that the initial state has just one infected individual:

    In[]:= state0 = Through@modelData["Variables"][0] /. Rule @@@ inits /. paramsN
    Out[]= {0, 1, 0, 999}

    In[]:= prop = propensity[state0, 0]
    Out[]= {0., 0.142857, 0.713571}

    In[]:= distEvent = EmpiricalDistribution[prop → Range@Length@prop]
    Out[]= DataDistribution[...]
(plot: bar chart of the initial propensities alongside the cumulative event distribution)

While the most likely event will be an infection, there is a good chance that it could be a recovery instead, which will eradicate the infection before it can become an outbreak. So, about one sixth (0.142857/(0.142857 + 0.713571) ≈ 0.167) of the time, the very first event is a recovery and the epidemic ends immediately. Sampling 10,000 events from the initial distribution and tallying the recovery transitions ℐ → ℛ confirms this:

    In[]:= Counts[modelData["ComponentTransitions"]〚Table[RandomVariate[distEvent], 10000]〛]
    Out[]= <|𝒮 → ℰ (rate β λ ℐ[t]) ..., ℐ → ℛ (rate γ) ...|>

    In[]:= %[ℐ → ℛ]/10000 // N
    Out[]= 0.167
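The one-sixth figure follows directly from the initial propensities (a side calculation, using the two non-zero propensity values from above): with a single infectious individual and no exposed individuals, only recovery and infection compete, so the chance that the first event is a recovery is γ divided by the sum.

```python
# Probability that the very first event is a recovery rather than an infection,
# using the initial propensity values 0.142857 (recovery) and 0.713571
# (infection) from the walk-through above.
p_recovery = 0.142857 / (0.142857 + 0.713571)
print(round(p_recovery, 3))  # 0.167
```

This matches the sampled fraction, and is the leading contribution to the roughly one-in-five no-outbreak runs seen in the ensemble (extinction can also occur after a few early infections).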
### Effectiveness of early international travel restrictions

Examining how the propensity varies over the time course of the simulation offers some interesting insights. Here is a plot of the individual and total propensities from the first run of the stochastic simulation:

    In[]:= With[{prop = propensity[Through@modelData["Variables"][t], t] /. First@solnsCME},
             Plot[Append[prop, Total@prop] // Evaluate, {t, 0, 100}, ...]]
(plot: the three individual propensities and their total over the first run)

In a way, this plot gives us a window into the process that we can use to design mitigation strategies. It’s quite clear that early in the process, up to about day 15, the process is almost exclusively the infection step (the transition 𝒮 → ℰ at rate β λ ℐ[t]).

In the early days of the pandemic, international travelers were introducing many more than one infected person into uninfected populations. Here is the effect of starting with 1, 2, 3, 4, or 5 initial infections:

    In[]:= solnsCME⎵I0 = Table[
             With[{paramsScenario = <|params, I0 → i0|>},
               With[{paramsN = Normal@(paramsScenario /. paramsScenario)},
                 Table[stochasticSolve[modelData["Rates"] /. paramsN, modelData["StoichiometricMatrix"],
                   Through@modelData["Variables"][0] /. Rule @@@ inits /. paramsN,
                   modelData["Variables"], {t, 0, tmax}], 500]]], {i0, 5}]; // Timing
           Dimensions@solnsCME⎵I0
    Out[]= {214.566, Null}
    Out[]= {5, 500,
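A branching-process approximation makes the trend with the number of introductions quantitative (a side calculation, not in the original post, assuming ℜ0 = 5 as in the simulations): in the early phase each infectious individual produces a geometrically distributed number of secondary cases with mean ℜ0, so a single lineage dies out with probability 1/ℜ0, and with i0 independent introductions the outbreak fails only if all lineages die out.

```python
# Branching-process estimate of the no-outbreak probability for i0 = 1..5
# initial infections, assuming R0 = 5 as in the simulations above.
R0 = 5.0
for i0 in range(1, 6):
    p_extinct = (1.0 / R0) ** i0
    print(i0, p_extinct)
# i0 = 1 gives 0.2, consistent with the ~106/500 no-outbreak runs seen earlier
```

The no-outbreak probability shrinks geometrically with each additional introduced infection, which is the quantitative version of why early travel restrictions matter so much.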