MODERATOR NOTE: coronavirus resources & updates: https://wolfr.am/coronavirus
June 15, 22: (The notebook is not yet ready today, June 22; hopefully soon.) As of June 22 I will be on holiday, or doing something else, until August 17. Some sections will not be updated daily. In the table of contents ("INCLUDED IN THIS POST", just below), I indicate how often each section will be updated, and I will also make a note to this effect in each of the corresponding sections. Some forecasts will be checked on the date given for the forecast. At times I will be in wilderness areas without electricity, so some updates during that time might not arrive promptly. Before June 22 (or a bit later, apologies) I will post a notebook which calculates parameters for some of the models automatically. There will be enough in it to get you started with your own data. The table of contents has been updated (and corrected) accordingly.
June 3-5: In a reply to the Europe section, a picture with forecasts for fatalities per capita for USA, UK, Sweden, and Italy, updated daily. Minor corrections in the text were made on June 5.
May 29: The contents have been reorganized slightly, and the table of contents is now up to date again. See the detailed update notices after the main text and in each section (this post contains several sections where results are shown; see the table of contents, "INCLUDED IN THIS POST", just a bit below).
May 20: I have updated the table of contents ("INCLUDED IN THIS POST" below).
NEW, May 10: Daily new cases forecast trends obtained from optimally fitted "TRUE" models (to learn about this, read the next paragraph and go to the response posted today at the bottom of the post ... it will be up in a few minutes). When I have time to finish putting it together, a notebook will be provided with some guidance on how to obtain optimal fits almost automatically. WARNING: There are a lot of details involved in what I am doing, which gives rise to mistakes, especially when something new comes around. I just corrected some blatant ones. Hopefully they dwindle to zero with time.
On April 21, 26, and, importantly, on May 3 and 4, I tried to improve and restructure this presentation and to make a note of various conceptual issues. I have also added "TRUE" (or truer) models of the outbreak (where the number of cases is matched to the R curve - read ahead). I realize that as it was before today (and maybe still), it was poorly and hastily written. I will continue to make improvements as I have time. If nothing else, please read this paragraph. Hopefully it is easier to read now and clearer. I specify where to find material in various sections, now close to the beginning and not in too many places.
NOTE and DISCLAIMER: we are using a formalism to model data ... this is not exactly the same as having a model of the outbreak itself ... only an approximation that allows us to have some understanding of the dynamics of the outbreak and to make some forecasts with reasonable accuracy. Modulo the explanations of compartmental models ahead and in the linked post, for the purposes of modeling the outbreak, the only data we have is the number of detected cases. This corresponds to the R curve in the outbreak: each detected infection is an individual that de facto gets quarantined and removed from the infective process. We have no other data available. We don't have data that tells us when an individual becomes exposed or infectious. However, we are using a compartmental model differently: we consider the R curve to be those individuals that have recovered from the infection or died, and the I curve to be the number of detected cases minus those that have recovered or died. The point is that these definitions of our compartments are sound in the sense that they are disjoint (that is, true compartments), and the dynamics of how individuals move from one of our so-defined compartments to another can be described with the equations of the SEIR and SIR models. And so, we have a model of the data which we are able to collect.
To make this clear, we present in this section, along with other content outlined below, two models for the outbreak in Italy (for which the data is very good), a "TRUE" model of the outbreak, in which the data is matched to the R curve (removed individuals), and OUR VERSION of the model of the data as we have endeavored to look at it for the most part in this post. Without further ado:
SEIR MODELS:
For an explanation of SEIR (and SIR) models see Robert Nachbar's post:
https://community.wolfram.com/groups/-/m/t/1896178
The equations of the slightly modified SEIR model are given ahead. It is possible to model the data by assuming a low enough susceptibility. I explain in the discussion with Robert Nachbar below why the main effect of containment measures is to lower the effective number of susceptible individuals when they are imposed (relatively speaking, at the beginning). Using this idea, it is also possible to model what could happen if you lift restrictions too early, by letting the susceptibility increase (see picture), and then reintroduce them (this is not a forecast, just a possible scenario). As a CAVEAT, these models are models using the DETECTED number of cases, not the TRUE number of cases, AT THE MOMENT THEY ARE DETECTED, not at the moment they are exposed or become infectious. Also, our compartments do not correspond to the compartments of a "true" model of the outbreak. We are taking the I compartment to be the number of detected cases minus the number of recovered and fatal cases, the sum of which is the R compartment. Our ASSUMPTION is that we can model the I compartment moving to the R compartment as individuals moving from being infected to being recovered or dead ... so intuitively we are tacitly assuming that our model gives us a picture of the outbreak with a delay, reflected in the data as it becomes available.
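To make the compartment definitions above concrete, here is a minimal sketch in Python (the post's own notebooks are in Mathematica, so this is only an illustration). The function name compartments and the sample numbers are made up for the example; they are not real data and not the code used for the pictures.

```python
def compartments(confirmed, recovered, deaths):
    """Build the I and R curves from cumulative reported series.

    R = recovered + deaths, and I = confirmed - R, as described in the text.
    All inputs are cumulative daily counts of equal length.
    """
    if not (len(confirmed) == len(recovered) == len(deaths)):
        raise ValueError("all series must have the same length")
    r_curve = [rec + dead for rec, dead in zip(recovered, deaths)]
    i_curve = [c - r for c, r in zip(confirmed, r_curve)]
    return i_curve, r_curve


# Made-up illustrative numbers, NOT real data.
confirmed = [10, 25, 60, 110]
recovered = [0, 2, 8, 20]
deaths = [0, 1, 2, 5]

i_curve, r_curve = compartments(confirmed, recovered, deaths)
print("I curve:", i_curve)
print("R curve:", r_curve)
```

By construction the two curves are disjoint: every detected case is counted either in I or in R, never in both.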
Regardless of these considerations, our model allows us to understand how the disease evolves in time as it pertains to the data we have at hand. At least, we were able, in the Chinese model, to predict an end of outbreak time well in advance (the evidence is in Rimmer's response to this post where a similar prediction is made based on our model).
INCLUDED IN THIS POST:
1) SEIR models for data from China. A SIR model for Italy. A "TRUE" SIR model for Italy and for the US. And a daily new cases forecast obtained from the "TRUE" model for the US. (The Finland model now lives only in its own section (3), see ahead. The Spanish model has been replaced and lives in its own section (2). On the last weekend of May a new notebook will be available in the notebook section (5).) As of June 22 and until August 17, this section will continue to be maintained as frequently as possible, daily if possible. If not, I will indicate in the updates below if there is to be a pause.
2) In a response below, various models for Spain, UK, France, Germany, and Austria; see that section for details. This section will be maintained once a week, on Mondays, starting June 22 and until August 17.
3) In a separate response below, two models for Finland (SIR and "TRUE"), a model for Norway, and "TRUE" models for Denmark and Sweden. In a reply to that section, there is detailed information for Sweden (case forecasts and fatality forecasts). From June 22 to August 17, both of these sections will be maintained on a daily basis if possible. Otherwise, you will be notified as to when updates will occur.
4) In a separate response, a brief discussion of SIR models (with models for China, Italy, and two more models for Finland). Also there, in the SIR models section, a document of cases/tests ratios for various countries. This section includes a picture of positivity rates for several countries, updated once a week. The positivity rates picture will be updated for the last time on June 22; weekly updates resume on August 17.
5) In a separate response, towards the end of the post, a notebook. This section includes a pdf document with daily new cases for many countries, updated once a week. By June 22, this section will contain a notebook which shows how to fit parameters automatically. From June 22 to August 17, the pdf document will not be updated. The notebook and pdf document are also posted in a reply to Kaurov, above the Scandinavian countries section.
6) Daily new cases forecast trends obtained from optimally fitted "TRUE" models in the latest response (May 10). Read more in the new section. Notebook will be provided to make automatic fits. From June 22 to August 17, this section will not be updated.
7) Fatalities per million for USA, UK, Sweden, and Italy. Details in the section, a reply to the Europe section. This section will be updated as frequently as possible between June 22 and August 17.
SIR MODELS (see SIR section for equations)
At the end of the post, in a new response, I discuss a simpler SIR-like model for the Chinese, Italian, and Finnish data which is practically as good - the equations are there. It has several advantages over the SEIR model: a) the classic SIR model has analytic solutions, so (somewhat) straightforward computational optimization can be carried out to estimate the parameters - although our equations are not the classical ones, as they have a delay; b) for the data we are trying to model, it yields values of R0 that are congruent with the observed ones: 5.43 for China, compared to 5.7 obtained in the just published study led by Steven Sanche and Lin Yeng-Ting, Los Alamos N. L. (arXiv:2002.03268), in Emerging Infectious Diseases, V26, Num 7 (there is a note about this for the SEIR model below, and more on this ahead); and c) (UPDATE 3) if we look at the susceptibility curves (S in the diagrams), we see that they do not necessarily reach 0. If they are asymptotic to a positive value, that means there is a herd immunity effect - the value of the asymptote being the number of people who remain susceptible under containment and will not get infected; moreover, we can see that the susceptibility curve is very close to its asymptote near the peak of the infections curve (I in the diagrams), so that targeted testing is warranted as an effective measure of containment at that stage. That section contains a model for China (final), a model for Italy (that is not updated), and two models for Finland: one using the JHU data instead of the Finnish authorities' data, and another using a different recovery schedule, estimated from the scant recovery data for Finland.
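As a rough illustration of point a), the following Python sketch fits Beta and Gamma of a classical SIR model to an R curve by brute-force grid search. Everything in it is an assumption made for the example: it uses the textbook SIR equations without the delay used in our models, a simple Euler integrator, synthetic data generated from known parameters, and a coarse grid standing in for a proper optimizer such as NMinimize. It is not the code used to produce the fits in this post.

```python
def simulate_sir(beta, gamma, p, days, i0=1.0, steps_per_day=10):
    """Euler integration of the classical (no-delay) SIR equations.

    Returns the removed curve r(t) sampled once per day, with s(0)=p, i(0)=i0.
    """
    dt = 1.0 / steps_per_day
    s, i, r = p, i0, 0.0
    daily_r = [r]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / p * dt
            new_removals = gamma * i * dt
            s -= new_infections
            i += new_infections - new_removals
            r += new_removals
        daily_r.append(r)
    return daily_r


def sum_squared_error(model, data):
    return sum((m - d) ** 2 for m, d in zip(model, data))


def grid_fit(data, p, betas, gammas):
    """Return (error, beta, gamma) for the grid point that best matches the data."""
    days = len(data) - 1
    best = None
    for beta in betas:
        for gamma in gammas:
            err = sum_squared_error(simulate_sir(beta, gamma, p, days), data)
            if best is None or err < best[0]:
                best = (err, beta, gamma)
    return best


# Synthetic "data" from known parameters, so we can check the fit recovers them.
p = 10000.0
synthetic_r = simulate_sir(0.5, 0.1, p, 60)
err, beta_hat, gamma_hat = grid_fit(
    synthetic_r, p, betas=[0.3, 0.4, 0.5, 0.6], gammas=[0.05, 0.1, 0.15]
)
print("beta:", beta_hat, "gamma:", gamma_hat, "R0 ~ beta/gamma:", beta_hat / gamma_hat)
```

With real data the grid would have to be much finer (or replaced by a proper minimizer), and p, the effective susceptible population, would itself be a fitted parameter.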
A SIR model for Finland and the SEIR model alternate every so often here; the SIR model uses THL (Finnish health authorities) data. Both models are available in the Finland section, and the model for Germany in its section is, alternatingly, either a SEIR or a SIR model. We have removed, in the Finland section, an SIR model that shows what it looks like to reach a plateau or steady state, rather than a peak; the equations for this are necessarily different from the simple SIR equations given in the SIR section. In the SIR section there are also other models for Finland, one using JHU data and the other using a different recovery schedule.
The newer models are adjusted quite frequently, especially with respect to the number of susceptible individuals, as they continue to grow. They tend to stabilize about three weeks after control measures have been in place. After the I curve peaks, it is possible to begin to get an idea of how long the outbreak will last.
EQUATIONS and PARAMETERS OF MODIFIED SEIR MODEL
Now the equations (for the SIR model, see the SIR section at the end of the post and after most of the discussions).
s'(t) = -Beta * s(t) * i(t) / p,
e'(t) = Beta * s(t) * i(t) / p - Sigma * e(t),
i'(t) = Sigma * e(t - m) - Gamma * i(t - n),
r'(t) = Gamma * i(t - n)
The function s(t) is the number of susceptible people (the people that can get exposed to the pathogen) at time t. e(t) is the number of people that have been exposed to the pathogen and can become infected; i(t) is the number of people who are infected; r(t) is the number of people who have become resistant to the pathogen: they have recovered and developed immunity or died. Now the parameters.
Beta is usually considered to be the rate of infection or "force of infection"; Sigma is usually the rate at which an exposed individual becomes infective; Gamma is usually the removal rate; p is the initial number of susceptible individuals, s(0)=p. We introduced m and n, shift or delay parameters, to line up the model curves with the data.
Here, we are operating with a delay. In our model, an individual is in the I compartment when it gets detected (a case of infection) and we continue to consider it infective until it gets "removed" when it has recovered or passed (not when it gets caught). In reality (in a true model of the outbreak, as in the second example for Italy in the pictures), individuals become infective before they get caught, and they get removed when they get detected. If we assume some kind of uniform delay in the process, we can try to fit the model to the data as we have compartmentalized it (cases and recovered+deceased). Thus we get a description of the dynamics of the outbreak as described by the data we can collect. IN ANY CASE, OUR MODELS ARE MODELS OF THE DATA ... the SEIR (SIR) formalism works well, and they have predictive value. The parameter values are in the titles of the pictures for each country. In general we assume e(0)=i(0)=1 unless stated otherwise in the model label. Also, s(0)=p, and r(0)=0.
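For readers who want to experiment outside Mathematica, here is a minimal Python sketch of how the delayed equations above could be integrated numerically. It uses a plain Euler scheme with a stored history, which is my choice for the illustration and not necessarily what the notebook does. It ASSUMES a constant pre-history (e(t)=e(0) and i(t)=i(0) for t<0), and the parameter values in the example call are arbitrary, not the fitted values shown in the picture titles.

```python
def simulate_delayed_seir(beta, sigma, gamma, p, m, n, days,
                          e0=1.0, i0=1.0, steps_per_day=20):
    """Euler integration of the modified SEIR equations with delays m and n (in days).

    s' = -beta*s*i/p
    e' =  beta*s*i/p - sigma*e
    i' =  sigma*e(t-m) - gamma*i(t-n)
    r' =  gamma*i(t-n)

    Assumption: for t < 0 the history is constant, e(t)=e0 and i(t)=i0.
    Returns a dict of curves sampled once per day (days + 1 points each).
    """
    dt = 1.0 / steps_per_day
    lag_m = int(round(m * steps_per_day))  # delay m expressed in integration steps
    lag_n = int(round(n * steps_per_day))  # delay n expressed in integration steps
    s, e, i, r = [float(p)], [e0], [i0], [0.0]
    for k in range(days * steps_per_day):
        # Index 0 holds the initial value, which doubles as the constant pre-history.
        e_lag = e[max(k - lag_m, 0)]
        i_lag = i[max(k - lag_n, 0)]
        infection = beta * s[k] * i[k] / p
        s.append(s[k] - dt * infection)
        e.append(e[k] + dt * (infection - sigma * e[k]))
        i.append(i[k] + dt * (sigma * e_lag - gamma * i_lag))
        r.append(r[k] + dt * gamma * i_lag)
    return {"s": s[::steps_per_day], "e": e[::steps_per_day],
            "i": i[::steps_per_day], "r": r[::steps_per_day]}


# Arbitrary illustrative parameters, NOT fitted values from this post.
curves = simulate_delayed_seir(beta=1.0, sigma=0.2, gamma=0.1,
                               p=1000.0, m=2, n=3, days=120)
peak_day = max(range(len(curves["i"])), key=lambda d: curves["i"][d])
print("day of the I-curve peak in this toy run:", peak_day)
```

Setting m = n = 0 recovers the ordinary SEIR equations, which is a convenient sanity check: in that case s + e + i + r stays constant.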
R0 in our SEIR-like and SIR-like MODELS:
In the SEIR models, the basic reproduction number (R0) is constant and depends on the parameters of the equations above. If we do the usual calculation (roughly Beta/Gamma in the equations above), R0 in our models is about an order of magnitude larger than the estimated, observed R0. There is an intuitive explanation for that. If we were to model the DETECTED number of cases using the BELIEVED or TRUE number of susceptible individuals, thought to be an order of magnitude higher than the detected ones, then we would need to scale down Beta by an order of magnitude to get our results, among other things. That would give us the R0 that is being measured (my understanding is that R0 was estimated on the DETECTED number of cases - but if this is wrong, then my explanation for the disparity is not correct). The main effect of lockdown is to lower the number of people that can be exposed to the pathogen when it is imposed, roughly at the beginning of the outbreak (see the reply to Robert Nachbar's response for a thought experiment that explains this). Recall, the basic reproduction number (R0) is constant.
The R0 numbers obtained in the SIR models discussed in a separate section are congruent with the values that are proposed in the research literature (more about that in the SIR section).
A NOTE ABOUT SOME DATA
Some countries do not provide any or most data pertaining to recoveries. We have estimated this data, sometimes extrapolating from available data, sometimes using an estimating function based on average rates from countries that do provide the data, etc. It would take too long to discuss what we have done in each case where recovery data seems to be missing or partial. We explain the Finnish case.
The data for the Finnish model comes from the Finnish Department of Health and Welfare (THL is its acronym in Finnish). There is a delay of 1-2 days in the release of the data. However, the recovery data comes from Johns Hopkins University and occasional reports from the Finnish authorities. According to the medical chief of staff of the infectious diseases clinic at the Helsinki and Uusimaa hospital district, it was "important to define what people mean when they talk about recovery", and "eventually it would be important to compile statistics to better understand the disease"; he "was taking the numbers with a grain of salt", noting that "the criteria underlying the data are not always clear and they are not always the same in each country". He also said that "tracking recovered patients was not a top priority". (The source of the quotes is Yle news, the state-run news agency.) We have serialized the occasional recovery data according to how cases might have arisen in time to obtain a recovery rate function. We verify the accuracy of this function every time a new datum becomes available. We also use this function to estimate Norway's recoveries.
The US model now uses an alternative recovery schedule based on an average of the recovery schedules of countries which are providing these data, as the US recovery data seems lower than it ought to be. See my comment in the day's update (April 15). We also use an estimate for UK data which is not available. Some countries have changed the way they count in the middle of the process, and we have adjusted for this (or not) as we see fit - again, it would take too long to discuss this. For the most part, we use the data that is available and take it from there.
SOME EXTRAS:
In the notebook section, where there is space, I include a pdf document with a smoothed version (14 day moving average) of the daily tallies for several countries in Europe, as well as USA and South Korea. In the SIR section, where there is space, there is a picture of the current positivity rates (number of cases/number of tests conducted so far). It is a useful diagnostic of where a country stands in the process.
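Both extras are simple to compute. The Python sketch below shows one way to do it; the function names are made up for the illustration, the numbers are not real data, and the choice of a trailing window (rather than a centered one) that shortens at the start of the series is an assumption, not necessarily what was used for the pdf document.

```python
def daily_from_cumulative(cumulative):
    """Convert a cumulative case series into daily tallies."""
    return [cumulative[0]] + [b - a for a, b in zip(cumulative, cumulative[1:])]


def moving_average(values, window=14):
    """Trailing moving average.

    For the first window-1 days, the average is taken over the days available so far.
    """
    smoothed = []
    running = 0.0
    for k, value in enumerate(values):
        running += value
        if k >= window:
            running -= values[k - window]  # drop the value that left the window
        smoothed.append(running / min(k + 1, window))
    return smoothed


def positivity_rate(total_cases, total_tests):
    """Number of cases divided by number of tests conducted so far."""
    if total_tests <= 0:
        raise ValueError("total_tests must be positive")
    return total_cases / total_tests


# Made-up illustrative numbers, NOT real data.
cumulative_cases = [3, 5, 9, 20, 26, 40]
daily = daily_from_cumulative(cumulative_cases)
print("daily tallies:", daily)
print("smoothed (3 day window for the toy example):", moving_average(daily, window=3))
print("positivity rate:", positivity_rate(total_cases=50, total_tests=1000))
```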
June 19-July 12: Updated. Our model for the US will have to be recalculated once things stabilize. Right now there is very substantial growth in the number of cases. Results for Italy will be posted with a delay of one day. This section will continue to be updated daily after June 22, unless a note to the contrary is made, for example, during travel in the wilderness without access to electricity.
June 13-19: Updating. It seems there is an uptick of cases in the US. We note there is a fatalities per million model in a reply to the European section for several countries (Italy, USA, Brazil, Sweden, and the UK). This modeling, which matches the R curve of a SIR model to cumulative fatalities per million, provides a forecast of fatalities for those countries. Our forecast is compared to those of IHME and other institutions. Details are in that section.
May 29-June 12: Updating. I have removed the Finland model from this section; it can still be found in the section for Finland and the Nordics. And I have moved up to this section the daily new cases forecast that comes out of the "TRUE" model for the US. It might be illustrative to show what you can get out of this model. At the very bottom of this post, in their own section, similar forecasts for Italy and the Nordics can be found. We also have a new fit for the "TRUE" model for Italy.
May 28: There is a new model fit for Finland. There is also a new model fit for USA. At the bottom of the post, in the last section, there are new forecasts for the daily number of cases ... you can compare the old and the new model fits.
May 23-28: Updating. We will wait until the end of May to fit a new "TRUE" model for the US. A semiautomatic, almost optimal fit for the Finland model has been obtained. We will try to fit models automatically from now on, slowly but surely. An automatically fitted, almost optimal SIR model for Italy is now posted.
May 19-22: updating. On May 20 the table of contents above ("INCLUDED IN THIS POST" section) was updated.
May 18: there is no data for Finland May 17 yet.
May 16-17: The Italy "TRUE" model has been fitted again this weekend; next weekend we do the U.S.
May 10-15. Updating. The US and Italy "TRUE" models are now optimally fitted. The Italy model is fitted to 4 May when restrictions were lifted. A notebook to do this will be provided later. From these models one can derive the daily new cases forecast trends in the new section at the end of the post (for more info, read there, it will be up shortly). It takes about 2-4 hours of compute time to make some of these fits. These models will not be fitted again as restrictions are slowly lifting ... which changes the forecast, hopefully in a noticeable way (or hopefully not, from the state of things point of view). The Italy SEIR model has been replaced by a SIR model. Earlier on May 10 I had posted the wrong file for the US model ... it is now correct. And apologies, had the wrong label on the Italy "TRUE" model, now corrected (hopefully) ...
May 8-9: I will leave the "TRUE" model for the US now. It is perhaps the most reliable picture of what lies ahead. One of my usual SEIR models forecasts a higher (4.4 million) susceptible population, but that number does not square with the "TRUE" model, although soon I will do an automatic and optimal fit of it, which might push this number up. Over the weekend, a new model for Italy will be forthcoming and "TRUE" models will start to be produced in a fully automated way (I will later post a notebook with the code that does the optimization; it is written within the simplicity of built-in Mathematica functionality, which means it is somewhat slow and NMinimize needs help).
May 6-7: Updating. Soon we will have to update our standard model for Italy. The "TRUE" model looks very reliable now, enough to make long term forecasts and provide a picture as to what to expect in the longer run.
May 5: The US SEIR models will now alternate with a "TRUE" SIR model (see first paragraph of text above and subsequently for explanation). Also, there is an SEIR model for Italy and now a "TRUE" SIR model as well.
April 30 - May 4: updating, US model alternating every so often
April 29: updating. Today I put back the US model with the actual recovery data that is provided. The two models will alternate. One of our alternative Finnish models squares with the latest THL recovery estimates, so we are showing that model instead of the model we had yesterday. This picture will be updated again at 2 PM EEST.
April 28: Tomorrow I will start alternating the US model with the model obtained from the recovery data that is provided. I found a source of daily increments for the Spanish data. The Italian model has been stable now for weeks, since before the peak of the I curve. Their official data is quite good, comparatively speaking.
April 27: updating. I will not update Spain after today until I get hold of the data from local authorities, if I can. The JHU data is inconsistent both in number of cases and in recoveries. It seems the historical series is being updated retrospectively, but according to the Spanish authorities, it is not yet ready. The lump sums provided temporarily make for very poor data. When the series becomes ready, I will continue to update this model. If I can obtain reliable information from press reports, I will update my data by hand.
April 25-26: updating. Today, April 26, the recovery data from Spain is highly anomalous, for the second time (in the past, counting method changed). Unless this datum is corrected, from now on I will use an estimate based on a recovery rate function that can be computed from the data up to yesterday, or constant adjustment as of today based on today's estimate. Using this function, we obtain today's picture.
April 24: updating. It seems the model for Spain might require a steeper rise up again.
April 23: updating. I have posted yet a new model for Finland which is probably more accurate. It is hard to say, as the entire time series changes each day due to delays in testing reports. The date in the Spanish model is now correct. I seem to have made, unfortunately, a correct forecast of the consequences of going back to work too soon!
April 22: updating. Spain went back to work ten days ago. We see new growth and forecast it will continue so ... may we be wrong.
April 21: updating. I changed the text above to improve it and hopefully make it more readable, and highlight important issues. I changed the standard SEIR Finland model for an SIR model that to me, seems more realistic, given the daily tally trends. The problem with Finnish data is that the entire time series gets corrected every day, not just the last day. While this makes for accuracy, it makes modeling difficult. I will alternate with the usual SEIR model.
April 19+20: updating. There is yet another model for Finland using another estimate for the recovery schedule. At the end of the SIR section there is a picture of the current positivity rates (number of cases/total number of tests). This should be a useful diagnostic. I will start keeping a history of these data from now on (I only have a history for Finland).
April 18: updating. There is an additional model for Finland in the SIR section using JHU data instead of THL data
April 17: updating. This section now has the SIR model for Finland, we believe it is a more accurate model for the time being, and based on a just published estimate of recoveries. Our extrapolating function seems to be working quite well and we have adjusted it to reflect this last change.
April 16: updating. The German model in its section is now an SIR model. In the Finland section there is a SIR model in which a plateau, rather than a peak, is reached
April 15: updating. Today I will show an alternative model for the USA that uses an average recovery rate obtained from other countries rather than the reported data, which seem low (understandably so, it is not a priority to test people who have tested positive and are recovering at home). The daily tally in the US has slowed down somewhat, which would lend credibility to the model, which shows the infection curve getting close to a peak. Also, in the previous model, the number of susceptible individuals was probably too high. I will compute an estimated peak date tomorrow based on this model. I will continue to track the old model, but it doesn't fit here. I am thinking of adding another response to the post with a number of models which don't fit here, but I haven't made up my mind about it yet.
April 13-14: Today is the last day the China model will be updated (April 13). April 14: I am adding a SIR model of Italy in the response with the SIR model for China. There is also a SIR model for Finland in the Finland section. It is possible to compute an effective R that is time dependent (but that won't be in the post, although I will make a notebook available in that section at a later time with this). I will add SIR models for other countries as well. I am working on an optimization program for the SIR model to further automate the determination of parameters - if I ever complete this it will also be in the SIR notebook eventually. On another note, I plan to add a section with models for other Scandinavian countries as soon as I have time.
April 12: updating. Today I am adding a response with a section which discusses the simpler SIR-like model (which I managed to make work almost just as well as the SEIR model, although it is somewhat more difficult to get it to work). The SIR-like model has the advantage that analytical solutions are known for SIR models which might be modified for our specific instance of the model, and in the case of our investigations, it yields an adequate value for R0 without the need for any further explanations.
April 11: updating. The daily tally pdf document in the Finland section is now per million inhabitants. Again note the disparity between European countries wishing to pursue an exit strategy at the moment, and South Korea, the role model country. I have added a picture which explores a scenario in which Spain lifts restrictions (as it has announced) today. We are able to model (with some mathematical ingenuity) the effect of this on the S curve, and subsequent effect on the number of infections. We hope this does not happen, but it might.
April 10: updating. I added some explanations in the text and a picture that illustrates what could happen when restrictions are lifted too early and then reintroduced - this is not a forecast, just a plausible scenario. I moved the notebook to a new response at the end of the post.
April 9: updating. In the Finland section there is a pdf document with the smooth version of the daily tallies for several Euro countries, USA, and South Korea. The Italy model seems very stable now.
April 8. Updating. Today I added, in the main part of the text above, an "intuitive" note about the basic reproduction number (R0) in these models and why they are about an order of magnitude larger than the measured rates.
April 6-7. Updating throughout the day. France and UK models temporarily suspended due to missing or inconsistent data, until more data is available
April 5: Updating throughout the day. There is a new model for Austria in the Europe section. Its I curve has peaked, and it will be the first country in Europe without an outbreak, at the end of this month, most likely.
April 4: Updating. The daily tallies for Finland, with the corrected time series, have been published now (3 PM EET) and the graph updated. The model was adjusted a tiny bit. In the model for Italy there is now a possible end of outbreak date, should the data stay on the curve. It is calculated using a threshold, which I do not explain here.
April 3: Updating. As expected, there is no recovery data for Finland so we are extrapolating from previous data.
April 2: Updating. As of April 1 I will use the official data from the Finnish Department of Health and Welfare (THL) for the Finnish model. The model is almost the same as before. There was a lack of recovery data, but yesterday the number of recoveries was released as a lump sum. We have stretched this datum into data according to how cases might have arisen in time to obtain a model. The recovery period appears to be 17 days or so (see the more detailed explanation above). We are modifying the Finland section to reflect this new reality and will explore other models. Italy seems to be reaching the peak of its I curve at around the time I estimated it would, maybe a week later.
March 30-31: Updated
March 24-27 AM EET: All models updated; PM EET: Europe models (some are at the bottom of the discussion) updated. I UPDATED and CORRECTED the NOTEBOOK.
March 23 AM EET: China and US pictures updated, as well as the reply to the post at the bottom with the five major European countries other than Italy, and Finland. Italy picture updated in the pm. The hopefully good news is that all major