Statistical testing is tricky – as a newspaper article from yesterday shows.
The public, journalists and researchers all tend to overemphasize the value of single studies. The public wants simple answers, journalists want spectacular news, and researchers want results that will further their careers.
Science does not work that way. Knowledge accumulates slowly, through trial and error, missteps and detours. Established truths are never fully established. Answers are valid until further notice. I quote the crucial parts:
How can two studies of the same topic reach opposite conclusions?
Stanford University epidemiologist John Ioannidis has famously estimated that 90 per cent of published medical research is wrong, thanks to factors such as sloppy statistics, inadequate study size and duration, and bias – both conscious and unconscious.
Three common research failings.
Last year, the Canadian Medical Association Journal published a review article showing that cigarette smoking can help marathoners run faster.
- Numerous studies show that smoking boosts lung volume and hemoglobin, and stimulates weight loss – all factors known to improve running performance.
- University of Calgary medical resident Ken Myers wrote the article as a spoof, to show how easy it is to support virtually any hypothesis by carefully selecting your data.
- For the record, the increased lung volume and hemoglobin in smokers are signs of respiratory problems – unlike in runners, where they signal adaptation to training.
Correlation versus causation
In 2009, a 10-year study of 500,000 people made headlines with the finding that eating more meat increases your risk of death.
- The results were “adjusted” to take into account confounding factors such as age, education, weight, exercise habits and so on.
- In the meat study, a closer look at the data reveals that eating more red meat also seemingly raises your risk of accidental death from car crashes and guns – a clear sign that the statistical adjustment failed to find all the underlying risk factors that affect meat-eaters.
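To see how a hidden confounder can manufacture this kind of spurious link, here is a minimal Python sketch. All the numbers and the "risk tolerance" trait are invented for illustration: the hidden trait drives both red-meat consumption and accident risk, so the two correlate even though neither causes the other.

```python
import random

random.seed(42)

# Invented illustration: "risk_tolerance" is a hidden confounder that
# raises both red-meat consumption and the chance of a fatal accident.
n = 100_000
meat, accident = [], []
for _ in range(n):
    risk_tolerance = random.random()                      # hidden trait, 0..1
    servings = 10 * risk_tolerance + random.gauss(0, 1)   # meat per week
    had_accident = random.random() < 0.01 + 0.02 * risk_tolerance
    meat.append(servings)
    accident.append(had_accident)

# Split at the median meat intake and compare accident rates.
median = sorted(meat)[n // 2]
high = [a for m, a in zip(meat, accident) if m >= median]
low = [a for m, a in zip(meat, accident) if m < median]
high_rate = sum(high) / len(high)
low_rate = sum(low) / len(low)
print(f"high-meat accident rate: {high_rate:.3%}")
print(f"low-meat accident rate:  {low_rate:.3%}")
```

The heavy meat-eaters have more accidents, yet meat plays no causal role in the simulation. Statistical "adjustment" only removes confounders the researchers thought to measure.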
A 2008 Harvard study of 38K women looked for links between caffeine and breast cancer.
- They found three different patterns that were “statistically significant,” meaning there was less than a 1-in-20 chance that the apparent pattern was just a random fluke.
- The researchers had analyzed 50 different possible caffeine-cancer links.
- Since, at that threshold, you expect one out of every 20 tests of a nonexistent effect to produce a false positive, the researchers should have expected about 50/20 = 2.5 of these false alarms – which is pretty much what they saw.
- The reported links are therefore most likely due to chance.
Calling the article “How to spot bogus research” is also a case of journalitis, however. What the columnist describes is not bogus, but very ordinary research. The problem does not (usually) lie in the single project, but at the collective level.
Researchers are exposed to the same tensions and interests as everybody else. For practical reasons, we get very many studies of groups that are easily available to researchers – such as students, patients, clients and sociable people from the middle classes. We study data that have been collected for administrative purposes (cheap) rather than data we have to gather ourselves (expensive). We tend to avoid studies that challenge the organizations or groups that fund research (risky). And so on.
Enlightened scientists and governments are aware of this. In medicine and other health sciences, meta-analyses (Cochrane) get lots of support. Other subject areas are more politicized.
One answer, as I see it, lies in more autonomy at the collective level. That means
- more support for arenas of debate that are protected from the forces that try to control professional knowledge
- more discussion about, and more studies of, these forces