Turning data into insights - What does the Drosten study have to do with this?
The results of the preliminary study suggest that children and adults are equally infectious - or are they not?
What is the preliminary study about? It investigates whether the amount of virus differs between children and adults. The results will determine whether kindergartens, for example, will be kept closed or not. Dr. Christian Drosten was heavily criticized for the statistical methods he applied in this preliminary study to the differences in the amount of virus between children and adults.
We will look only at the statistics and use the study to show how statistics generates insights and which pitfalls one can fall into.
To illustrate, we will briefly leave epidemiology and look at a typical scenario in industry: for a manufacturing process we need shafts that have to be clean, since dirty shafts can cause massive problems. We are therefore interested in whether the shafts from the manufacturer "SauberWelle" are cleaner than those from the manufacturer "WellenRein". We want to know whether statistics can detect any difference.
To find out, our inspectors Peter and Michaela examine the shafts and classify them into cleanliness categories (very clean, still clean, a little dirty and very dirty).
What do we need to be able to make a sound statement using statistics?
Just as virologists cannot test the entire world population, we cannot always test every part in production. That is expensive and often unnecessary. Instead, we need enough shafts from both manufacturers that we can inspect. From each manufacturer, we select the shafts to be examined as randomly as possible. For example, if we only inspect parts produced on Friday evenings, we cannot make a general statement, because the upcoming weekend may influence the result.
Statistical jargon: The sample must represent the population.
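Random selection of this kind can be sketched in a few lines of Python. This is only an illustration: the shaft IDs, the batch size of 500 and the sample size of 30 are invented for the example, and `draw_sample` is a hypothetical helper, not part of any inspection tool.

```python
import random


def draw_sample(shaft_ids, sample_size, seed=None):
    """Return a simple random sample of shaft IDs, drawn without replacement."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(shaft_ids, sample_size)


# Hypothetical delivery lists: 500 shafts per manufacturer (assumed numbers)
sauberwelle = [f"SW-{i:03d}" for i in range(500)]
wellenrein = [f"WR-{i:03d}" for i in range(500)]

# Every shaft has the same chance of being inspected, regardless of
# when it was produced (for example, on a Friday evening).
sample_sw = draw_sample(sauberwelle, 30, seed=1)
sample_wr = draw_sample(wellenrein, 30, seed=2)

print(len(sample_sw), len(sample_wr))
```

Drawing from the complete delivery list, rather than from one shift or one day, is what lets the sample stand in for the population.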
When assessing cleanliness, we have to make sure that Peter and Michaela "measure using the same criteria", i.e. that both would judge the same shaft the same way. Compare a bathroom scale that reads 5 kg too high: because of the faulty measurement, we wrongly skip a tasty lunch, believing we are above our ideal weight.
Statistical jargon: We need a capable measuring system.
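One common way to check whether two inspectors judge alike is Cohen's kappa, which compares their observed agreement with the agreement expected by pure chance. The sketch below is an assumption about how such a check could look; the ten ratings are made up for illustration and are not data from any real inspection.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who judged the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must judge the same, non-empty set of items")
    n = len(ratings_a)
    # Share of items on which both raters chose the same category
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected if both rated independently at their own category rates
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Invented ratings of the same ten shafts by both inspectors
peter = ["very clean", "very clean", "still clean", "still clean",
         "a little dirty", "a little dirty", "very dirty", "very dirty",
         "still clean", "very clean"]
michaela = ["very clean", "still clean", "still clean", "still clean",
            "a little dirty", "very dirty", "very dirty", "very dirty",
            "still clean", "very clean"]

kappa = cohens_kappa(peter, michaela)
print(round(kappa, 2))  # about 0.73 for these invented ratings
```

A kappa of 1 means perfect agreement and a value near 0 means the inspectors agree no more often than chance would produce. Which threshold counts as "capable" depends on the application.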
In statistics, the difference between groups must be large enough for us to "prove" it. If the difference is too small, we have to assume that it is random: any individual shaft can turn out better or worse, whichever manufacturer it comes from. So even if the inspected shafts from "SauberWelle" are somewhat cleaner, the difference may be so small that we must attribute it to chance.
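To make "random difference" concrete, here is a minimal sketch of a permutation test. It assumes that the cleanliness categories can be coded as scores from 0 (very clean) to 3 (very dirty) and that comparing mean scores is acceptable; both are simplifications for illustration. The scores are invented, and this is not the method used in the study.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference of the group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        # Shuffle the manufacturer labels: if the manufacturer does not
        # matter, any split of the pooled scores is equally plausible.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            hits += 1
    # Share of random splits at least as extreme as the observed one
    return (hits + 1) / (n_permutations + 1)


# Invented scores: 0 = very clean, 1 = still clean, 2 = a little dirty, 3 = very dirty
sauberwelle_scores = [0, 0, 1, 1, 0, 2, 1, 0, 1, 1, 0, 1]
wellenrein_scores = [1, 0, 1, 2, 1, 1, 0, 2, 1, 1, 1, 0]

p_value = permutation_p_value(sauberwelle_scores, wellenrein_scores)
print(p_value)
```

A large p-value means that randomly relabeled shafts often show a difference as big as the observed one, so the observed difference cannot be distinguished from chance.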
Back to the study
In addition to the chosen approach, the following aspects were criticized:
How is this to be assessed and which area from our scenario is affected?
The criticism of the swabs concerns the measured values, i.e. point 2 of our scenario ("We need measured values we can rely on"). If swabs are less reliable in children, the result of the investigation is also less reliable.
The other two points are related: because only a few children were examined, we need a very large difference between the groups before we can statistically prove a difference (point 3).
This is also the reason for the initially somewhat strange situation that, despite a difference in the data, no difference can be statistically proven. Unfortunately, this does not mean that there is no difference. As the saying goes: "the absence of evidence is not evidence of absence" (see also Drosten study).
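The effect of sample size can be illustrated with a small sketch that returns to the shaft scenario. It uses a two-proportion z-test with a normal approximation, which is an assumption chosen for brevity and not the analysis from the study, and all counts are invented: in both cases 60% of the shafts from one manufacturer are clean versus 50% from the other.

```python
import math


def two_proportion_p_value(successes_a, n_a, successes_b, n_b):
    """Two-sided p-value of a two-proportion z-test (normal approximation)."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_error == 0:
        return 1.0  # no variation at all, so no difference can be shown
    z = (p_a - p_b) / std_error
    # For a standard normal Z: P(|Z| >= |z|) = erfc(|z| / sqrt(2))
    return math.erfc(abs(z) / math.sqrt(2))


# Same observed difference (60% vs 50% clean), different sample sizes
p_small = two_proportion_p_value(12, 20, 10, 20)          # 20 shafts each
p_large = two_proportion_p_value(1200, 2000, 1000, 2000)  # 2000 shafts each

print(p_small > 0.05, p_large < 0.05)
```

With 20 shafts per manufacturer the ten-point difference is well within what chance can produce, while with 2000 shafts per manufacturer the same difference is clearly detectable. A non-significant result from a small sample therefore says little about whether a real difference exists.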
Conclusion
The criticism expressed is directed at a preliminary study. Here, criticism of the work is not a scandal but is actually desired, so that the final publication is as good as possible. None of these points of criticism makes the statements of the study fundamentally wrong; rather, they show that we lack experience in dealing with the epidemic. They also show how important reliable data are as a basis for statistics.