Without Statistics, Science Would Be Just Religion
A very important article published today in Nature caught my attention. It talks, again, about the misuse of statistics by scientists and the volume of irreproducible studies that this misuse is producing.
You have probably heard the joke: “there are three kinds of lies: lies, damned lies and statistics”. Even though it is unfair (does anyone expect a joke to be fair?), it is funny. The truth, however, is that reality is full of uncertainty and statistics is the best tool we have to represent it.
But reality is also complex, and it would be unlikely that the tool that best explains uncertainty and complexity would be simple. Statistics is difficult, but it is so, So, SO important for those who claim to investigate reality as it is, whom we call scientists, that it should be mandatory. And it is! You won’t publish an article in a scientific journal without statistics in it. But not necessarily the correct statistics.
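Every complicated tool can be explained in a simpler way. And the simplest explanation in statistics is the p-value: roughly, the probability of seeing a result at least as extreme as the one observed if there were no real effect at all. It is often misread as the probability that a claim is wrong, which is not what it measures. For a result to be conventionally called significant, its p-value has to be low: below 5% or 1%.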
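To make the idea concrete, here is a minimal sketch of one way to estimate a p-value, a permutation test on the difference between two group means. The function name, the sample numbers and the choice of test are all assumptions made for illustration; none of them come from the Nature article. The logic is: if the group labels did not matter (no real effect), how often would shuffling them produce a difference at least as large as the one actually observed?

```python
import random


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Estimate a two-sided p-value for the difference in means.

    Hypothetical helper for illustration: it shuffles the pooled
    measurements and counts how often a random relabelling gives a
    difference at least as large as the observed one.
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    n_b = len(group_b)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)

    pooled = list(group_a) + list(group_b)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if diff >= observed:
            at_least_as_extreme += 1

    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


if __name__ == "__main__":
    # Made-up measurements, purely for demonstration.
    treated = [10, 11, 12, 13, 14, 15]
    control = [0, 1, 2, 3, 4, 5]
    print(permutation_p_value(treated, control))
```

A low value says only that the data would be surprising if there were no effect. It does not say how likely the claim itself is to be true or false.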
“People want something that they can’t really get… They want certainty.”
However, like any modeling tool, statistics depends on how you use it. If you satisfy all the assumptions of the analysis, understand its power and respect its limitations (not to mention the quality of the data you feed it), your results should be meaningful to reality. But many scientists, pressured by whatever excuse they can come up with (“publish or perish”, grant committees, tenure committees…), believe that statistics is a tool to confirm their narratives and expectations about experiments, not reality. They refuse to learn more statistics before applying it, and they refuse to accept that their expectations could be wrong when that is what the data tell them.
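One way this can go wrong is easy to simulate. The sketch below is an illustration under stated assumptions, not a result from the Nature article: it assumes that no effect is real, that each test is independent, and that under those conditions a p-value is uniformly distributed between 0 and 1. It then asks how often a study that runs several tests ends up with at least one "significant" result to report. The function name and parameters are hypothetical.

```python
import random


def chance_of_false_positive(n_tests, alpha=0.05, n_studies=20_000, seed=0):
    """Simulate studies in which NO effect is real.

    Assumption: under a true null hypothesis each p-value is uniform on
    [0, 1], so it is modelled here with rng.random(). Returns the share
    of simulated studies with at least one p-value below alpha.
    """
    rng = random.Random(seed)
    studies_with_a_hit = 0
    for _ in range(n_studies):
        if any(rng.random() < alpha for _ in range(n_tests)):
            studies_with_a_hit += 1
    return studies_with_a_hit / n_studies


if __name__ == "__main__":
    for n_tests in (1, 5, 20):
        print(n_tests, chance_of_false_positive(n_tests))
```

With a single planned test the rate stays near the 5% threshold. With 20 independent tests the theoretical rate is 1 - 0.95^20, about 64%, so trying analyses until one agrees with the expected narrative will "confirm" it most of the time even when there is nothing to find.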
The misuse of statistics by scientists has been a persistent problem in science, bigger than copying, plagiarism or plain fraud, simply because most scientists refuse to accept they are doing something wrong. And they keep doing it! The practice is costing billions of dollars in failed attempts by industry to reproduce results while developing innovations, but they keep doing it. One of the most read articles of recent times, Why Most Published Research Findings Are False, by John Ioannidis, is exactly about this. And guess what… they keep doing it.
“Researchers should be instructed to treat statistics as a science, and not a recipe,” says the author. Maybe if scientific journals start to demand the presentation of the full statistical analysis in scientific articles, as the text in Nature proposes, things will change. But I foresee a LOT of resistance from exactly those who should support the need for more accurate results: scientists themselves!
Originally published on LinkedIn in March 2016