[UPDATES] Resources For Novel Coronavirus COVID-19

Posted 23 days ago
17 Replies
101 Total Likes


This post is intended to be the hub for Wolfram resources related to the novel coronavirus COVID-19 (a.k.a. 2019-nCoV) from Wuhan, China. The larger aim is to provide a forum for sharing ways in which Wolfram technologies and coding can be used to shed light on the virus and the epidemic. Possibilities include using the Wolfram Language for data mining, modeling, analysis, visualization, and so forth. Among other things, we encourage comments and feedback on these resources. Please note that this thread is intended for technical analysis and discussion supported by computation. Topics outside this scope that are better suited to other forums should be avoided. Thank you for your contribution!

Data Sources

We have published and are continuously updating the following Wolfram Data Repository entries:

Genetic Sequences for the SARS-CoV-2 Coronavirus

Epidemic Data for Novel Coronavirus COVID-19

Patient Medical Data for Novel Coronavirus COVID-19
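As a minimal sketch of how one of these entries might be loaded, assuming network access and that the entry is still published under the name listed above (repository names can change over time):

```wolfram
(* Requires network access; the name is copied from the list above and may have been renamed. *)
name = "Epidemic Data for Novel Coronavirus COVID-19";

ro = ResourceObject[name];  (* metadata for the repository entry *)
data = ResourceData[ro];    (* default content element of the entry *)

Head[data]                  (* inspect the structure before querying it *)
```

The structure of the returned data is not assumed here; checking `Head[data]` first shows whether it is a `Dataset`, an `Association`, or something else.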

Computational Articles:

We encourage you to share your computational explorations relevant to coronavirus on Wolfram Community as stand-alone articles and then comment with their URL links on this discussion thread. We will summarize these articles in the following list:

Genome analysis and the SARS-nCoV-2 by Daniel Lichtblau

A walk-through of the SARS-CoV-2 nucleotide Wolfram resource by John Cassel

Geometrical analysis of genome for COVID-19 vs SARS-like viruses by Mads Bahrami

Chaos Game For Clustering of Novel Coronavirus COVID-19 by Mads Bahrami

Coronavirus logistic growth model: China by Robert Rimmer

Coronavirus logistic growth model: Italy and South Korea by Robert Rimmer

Mapping Novel Coronavirus COVID-19 Outbreak by Jofre Espigule-Pons

Visualizing Sequence Alignments from the COVID-19 by Jessica Shi

Video Recordings

Other useful resources:

There is also raw data being collected here in the form of a Google Sheet. It relies on data extracted by hand (by a work-study student at the University of Houston working under my supervision) from the daily reports produced by the World Health Organization. I attach a notebook that shows how the data can be imported from the Google Sheet and turned into a Wolfram Language Dataset. From there, I run a few basic queries.
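The general pattern of turning tabular text into a Dataset and querying it can be sketched roughly as follows. This is not the attached notebook: the column names and values below are made-up placeholders, and an inline CSV string stands in for the sheet so the example runs offline. For a real sheet, one would presumably call `Import` on its CSV export link instead of `ImportString`.

```wolfram
(* Placeholder CSV text standing in for the sheet export; all values are made up. *)
csv = "Date,Region,Confirmed
2020-02-01,RegionA,10
2020-02-01,RegionB,4
2020-02-02,RegionA,15
2020-02-02,RegionB,7";

rows = ImportString[csv, "CSV"];  (* list of rows; the first row holds the headers *)

(* One association per data row, keyed by the header names *)
ds = Dataset[AssociationThread[First[rows], #] & /@ Rest[rows]];

(* Basic query: total confirmed count per date *)
ds[GroupBy["Date"], Total, "Confirmed"]
```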

I just wanted to note for anyone who might be interested that the latest release of IGraph/M from a few days ago now exposes the igraph C library's SIR modelling functionality. It is fairly simple at the moment. It can run several simultaneous stochastic SIR simulations on a network, and only returns the S, I, R values at each timestep (not individual node states). It can be used to study the effect of network structure on the spreading.

UPDATE: I just added another example to the documentation to clarify what this functionality is good for. If you've opened the above link before, please do a hard refresh of the page (Shift-F5 on Linux/Windows or Command-Shift-R on Mac).
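For readers who want to see the idea without installing anything, here is a rough sketch of a discrete-time stochastic SIR process on a network in plain Wolfram Language. It does not use IGraph/M or reproduce its implementation; the function name `sirStep`, the update rule, and the parameter values are assumptions chosen for illustration. Like the functionality described above, it only keeps the S, I, R totals at each time step.

```wolfram
(* One synchronous update step. NOT the IGraph/M implementation; an illustrative rule:
   a susceptible node with k infected neighbours becomes infected with
   probability 1 - (1 - beta)^k, and an infected node recovers with probability gamma. *)
sirStep[g_Graph, state_Association, beta_, gamma_] := Association @ Map[
   Function[v,
    v -> Switch[state[v],
      "S", If[RandomReal[] < 1 - (1 - beta)^Count[state /@ AdjacencyList[g, v], "I"], "I", "S"],
      "I", If[RandomReal[] < gamma, "R", "I"],
      "R", "R"]],
   VertexList[g]]

(* Small-world test network with 200 nodes; any Graph can be substituted *)
g = RandomGraph[WattsStrogatzGraphDistribution[200, 0.1, 2]];

(* Everyone susceptible except a single infected seed node *)
init = Association[# -> "S" & /@ VertexList[g]];
init[First[VertexList[g]]] = "I";

(* Run 60 steps and record the {S, I, R} totals at each step *)
traj = NestList[sirStep[g, #, 0.2, 0.1] &, init, 60];
counts = Lookup[Counts[Values[#]], {"S", "I", "R"}, 0] & /@ traj;

ListLinePlot[Transpose[counts], PlotLegends -> {"S", "I", "R"}]
```

Swapping in a different graph for `g` is one way to see how network structure changes the shape of the curves.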

I made a simple chart of how 2019-nCoV compares with SARS and MERS. Here are the results and source code.

2019-nCoV vs SARS, MERS

ChartLabels->{Placed[{"2019-nCoV","SARS","MERS","Avian Flu"},{{0.5,0},{0.8,1.2}},Rotate[#,(1.75/7) Pi]&],Placed[{"",""},Above]},
LabelingFunction->(Placed[Rotate[#,0 Pi],If[#1>1,Center,Above]]&),
ChartLegends->Placed[{"Infections","Fatalities"},Right],
PlotLabel->Style["2019-nCoV Infections",FontFamily->"Helvetica",Thin,24],

I have compiled some of the work done so far into a compact cloud dashboard:

It is mainly built to give an overview of some information from our WDR resources, with corresponding daily updates. It is still a work in progress; I will be adding more visualizations and interactivity in the coming days. (The code is rather messy, but I'll also be publishing a cleaned-up notebook with some sample code for creating similar elements.)

Aside from the visual elements, folks here might find the "Resources" tab helpful. It includes several of the Wolfram resources listed here, but also has some external resources I've seen floating around in several threads about the outbreak. I'll be continuously adding to that section as well.

Feel free to comment if you think of anything you'd like to see added! (Or if you see something that isn't working--e.g. the tooltips for the world map, which I'm looking to fix.)


It is very nice to see how fast Wolfram Inc is moving in gathering and curating data on the coronavirus outbreak. Thank you very much!

Still, for me to use, e.g. the

ResourceObject["Patient Medical Data for Novel Coronavirus 2019-nCoV from Wuhan, China"]

it is paramount that I can trust the data source, especially in this Age of Misinformation. You give a name and a link to a Google Sheet, but who is behind that? Which organization? How have you curated that specific data set?

Best, Per Møldrup-Dalum

Hi @Per Møldrup-Dalum, I am glad you like our resources and we highly appreciate user feedback, thank you! For this specific type of question I recommend reaching out directly to our Wolfram Data Repository team at: Please note, Wolfram Data Repository entries are continuously updated and new information can appear on their pages in the future.

I have a new notebook titled 'china-province-graph.nb' here:

It contains the 'bordering provinces graph' (not a built-in dataset).

Might be useful with your IGraph package?

In case it's not covered by the data resources in the OP, here is a historical data source that someone crawled from Ding Xiang Yuan (DXY), with detail down to every city in every province of China.

COVID-19/2019-nCoV Infection Data Realtime Crawler

Note that the data source is unofficial. DXY, as far as I know, is an online non-governmental community of doctors and nurses from mainland China. Their data may differ from the officially published figures.

Update: I made a livestream recording on Twitch, related to data analysis techniques for the coronavirus in the Wolfram Language:

Posted 3 days ago

I have published two notebooks on the Wolfram Cloud that use a logistic growth model to track the coronavirus epidemic with the data from the GitHub repository:
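For anyone unfamiliar with the approach, a logistic growth fit can be sketched along these lines. This is not the code from the notebooks: the data here is synthetic (a known logistic curve plus noise, used only so the example runs on its own), and the parameter names `k` (final size), `r` (growth rate), and `t0` (inflection time) as well as the starting guesses are assumptions.

```wolfram
SeedRandom[1];

(* Synthetic stand-in for cumulative case counts: a known logistic curve plus noise *)
truth[t_] := 1000/(1 + Exp[-0.3 (t - 20)]);
data = Table[{t, truth[t] + RandomVariate[NormalDistribution[0, 10]]}, {t, 0, 40}];

(* Fit k/(1 + Exp[-r (t - t0)]) starting from rough initial guesses *)
fit = NonlinearModelFit[data,
   k/(1 + Exp[-r (t - t0)]),
   {{k, 800}, {r, 0.2}, {t0, 15}}, t];

fit["BestFitParameters"]

(* Compare the fitted curve with the data *)
Show[ListPlot[data], Plot[fit[t], {t, 0, 40}]]
```

With real data, fits made early in an outbreak (before the inflection point) tend to leave `k` poorly constrained, so the parameter confidence intervals from `fit["ParameterConfidenceIntervals"]` are worth checking.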

Hi Vitaliy, thank you for answering. I can see that the dataset now has a link to source and metadata information! Fantastic!

I have studied the genetic sequences of COVID-19 and SARS-like viruses using Chaos Game Representation and Z-curve methods (hyperlinks to my Community posts). Z-curves provide a fascinating visualization of genomes that is very helpful for classification and clustering. The hierarchical clustering of viruses identifies Bat coronavirus RaTG13 as the most likely culprit of COVID-19. My results strongly support the hypothesis of a bat origin of COVID-19. I would appreciate any comments or feedback :-)
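For context, both representations are short to write down. The sketch below is not the code from the linked posts: the function names `cgr` and `zcurve` are mine, the corner assignment for the chaos game is one common convention, and the sequence is random rather than a real genome.

```wolfram
(* Chaos Game Representation: start at the centre of the unit square and,
   for each nucleotide, move halfway toward that nucleotide's corner.
   The corner assignment is one common convention, assumed here. *)
corners = <|"A" -> {0, 0}, "C" -> {0, 1}, "G" -> {1, 1}, "T" -> {1, 0}|>;
cgr[seq_String] := Rest @ FoldList[(#1 + corners[#2])/2 &, {1/2, 1/2}, Characters[seq]]

(* Z-curve: cumulative 3D walk with
   x = purine (A,G) vs pyrimidine (C,T),
   y = amino (A,C) vs keto (G,T),
   z = weak (A,T) vs strong (C,G) hydrogen bonding *)
steps = <|"A" -> {1, 1, 1}, "C" -> {-1, 1, -1}, "G" -> {1, -1, -1}, "T" -> {-1, -1, 1}|>;
zcurve[seq_String] := Accumulate[Lookup[steps, Characters[seq]]]

(* Random placeholder sequence; a real genome string would go here *)
seq = StringJoin[RandomChoice[{"A", "C", "G", "T"}, 500]];

ListPlot[cgr[seq], AspectRatio -> 1]
Graphics3D[Line[zcurve[seq]], Axes -> True]
```

As a quick check, `zcurve["ACGT"]` ends at `{0, 0, 0}`, since each base appears exactly once.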

Very neat, thanks for sharing!

Extremely interesting. Thanks for the original work and for sharing your model.

Posted 22 days ago

It would be neat to see a SEIR type analysis
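As a starting point, a deterministic SEIR model can be sketched with `NDSolve`. The parameter values (`beta`, `sigma`, `gamma`), the population size `n`, and the initial conditions below are illustrative assumptions, not estimates for COVID-19.

```wolfram
(* S -> E at rate beta s i / n, E -> I at rate sigma e, I -> R at rate gamma i.
   All parameter values are illustrative placeholders. *)
params = {beta -> 0.5, sigma -> 1/5, gamma -> 1/10, n -> 10^6};

seir = NDSolve[{
     s'[t] == -beta s[t] i[t]/n,
     e'[t] == beta s[t] i[t]/n - sigma e[t],
     i'[t] == sigma e[t] - gamma i[t],
     r'[t] == gamma i[t],
     s[0] == n - 1, e[0] == 0, i[0] == 1, r[0] == 0} /. params,
   {s, e, i, r}, {t, 0, 300}];

Plot[Evaluate[{s[t], e[t], i[t], r[t]} /. First[seir]], {t, 0, 300},
 PlotRange -> All, PlotLegends -> {"S", "E", "I", "R"}]
```

With these placeholder values the basic reproduction number is beta/gamma = 5; fitting the parameters to reported data would be the actual analysis.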

Many thanks for this.
