COVID-19 data and the Newcomb Benford Distribution

What an excellent idea to perform that test!

To be fair, one should look at as many of the reporting regions as possible. I've heard some people say they are suspicious of reports from countries other than China, so I would like to see if their hunches hold up.

I also wouldn't be surprised if some US state-level data were fudged, just so some officials could say they are reporting the data in a timely way.

We all have our pet hypotheses; let's put them to the test.

I have provided some updates on this topic (see my latest reply). Since this relates to an accusation against a country, could you please also highlight my response in the same way that Gustavo's work was introduced?

Two other studies using Benford's Law gave different conclusions:

Koch & Okamura ( ): China’s distribution of first digits for confirmed cases is in line with Benford’s Law. Thus we reject the hypothesis that the Chinese data has been manipulated. It also matches the distribution found in the United States and Italy.

Junyi Zhang ( ): In this article, we propose a test of the reported case number of coronavirus disease 2019 in China with Newcomb-Benford law. We find a p-value of 92.8% in favour that the cumulative case numbers abide by the Newcomb-Benford law. Even though the reported case number can be lower than the real number of affected people due to various reasons, this test does not seem to indicate the detection of frauds.
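For anyone who wants to try this on their own series, here is a minimal sketch of a first-digit check. It is not necessarily the procedure either study used: it assumes a plain Pearson chi-square goodness-of-fit test on leading digits, and the function names and the sample series (powers of 2, not real case counts) are made up for illustration.

```python
import math
from collections import Counter

# Benford's expected probability for leading digit d is log10(1 + 1/d).
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(n):
    """Return the leading decimal digit of a positive integer."""
    return int(str(n)[0])


def benford_chi_square(counts):
    """Pearson chi-square test of leading digits against Benford's law.

    `counts` is an iterable of integers (hypothetical input format);
    values that are zero or negative are skipped.
    Returns (statistic, p_value).
    """
    digits = [first_digit(c) for c in counts if c > 0]
    n = len(digits)
    if n == 0:
        raise ValueError("need at least one positive count")
    observed = Counter(digits)
    stat = sum(
        (observed.get(d, 0) - n * p) ** 2 / (n * p)
        for d, p in BENFORD.items()
    )
    # Nine digit classes give 8 degrees of freedom. For an even number of
    # degrees of freedom the chi-square survival function has a closed form:
    # exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!  with k = df / 2 = 4.
    half = stat / 2
    p_value = math.exp(-half) * sum(
        half ** i / math.factorial(i) for i in range(4)
    )
    return stat, p_value


if __name__ == "__main__":
    # Illustrative series only: powers of 2 are known to follow Benford closely.
    sample = [2 ** k for k in range(1, 101)]
    stat, p = benford_chi_square(sample)
    print(f"chi-square = {stat:.3f}, p-value = {p:.3f}")
```

A large p-value only means the leading digits are compatible with Benford's law; it says nothing about whether the reported totals are complete. The test is also weak on short series, so a few dozen daily reports from one region will rarely give a decisive answer either way.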

I hope Gustavo can check and update his numbers. Thanks!

