I was going to give this a rest for a while, but this is too good not to post a brief note about.
Posted in the comments of my piece debunking the Geiers’ pseudoscience and their laughable “scientific” article claiming to show a decrease in the rate of new cases of autism since late 2002 (when thimerosal was removed from all vaccines other than some flu vaccines) was this gem of a comment by one MarkCC, which stated the essence of what was wrong with the Geiers’ so-called “statistical analysis” of the VAERS database:
Here’s the key, fundamental issue: when you’re doing statistical analysis, you don’t get to look at the data and choose a split point. What the Geiers did is to look at the data, and find the best point for splitting the dataset to create the result they wanted. There is no justification for choosing that point except that it’s the point that produces the result that they, a priori, decided they wanted to produce.
Time trend analysis is extremely tricky to do – but the most important thing in getting it right is doing it in a way that eliminates the ability of the analysis to be biased in the direction of a particular a priori conclusion. (In general, you do that not to screen out cheaters, but to ensure that whatever correlation you’re demonstrating is real, not just an accidental correlation created by the human ability to notice patterns. It’s very easy for a human being to see patterns, even where there aren’t any.)
Redo the Geiers analysis using any decent time-trend analysis technique – even a trivial one like doing multiple overlapping three-year regressions (i.e., plot the data from ’92 to ’95, ’93 to ’96, ’94 to ’97, etc) and you’ll find that that nice clean break point in the data doesn’t really exist – you’ll get a series of trend lines with different slopes, without any clear break in slope or correlation.
So – to sum up the problem in one brief sentence: in statistical time analysis, you do not get to pick break points in the time sequence by looking at the data and choosing the break point that is most favorable to your desired conclusion.
Exactly! Unfortunately, that’s exactly what the Geiers did.
A proper statistical analysis of such data, looking for time points at which a rate of change in a variable changes, is designed such that there is no bias in selecting a time point at which a significant change in slope is observed. As much as the Geiers might want to believe that there is a marked change in the slope of the curve beginning around late 2002 to early 2003, they can’t assume that there is such a breakpoint before doing the analysis.
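For the curious, here is a minimal sketch of the "trivial" check MarkCC describes: fit a separate regression line to each overlapping window of years and look at how the slopes vary. The function names, the window size, and above all the numbers are my own inventions for illustration; this is not the VAERS data, and it is not a substitute for a real change-point analysis. The point is only that every window is treated identically, so nobody gets to pick a favorite split.

```python
def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def rolling_slopes(years, counts, window=4):
    """Fit one regression per run of `window` consecutive points.

    With annual data, window=4 covers a three-year span such as
    1992 through 1995. Returns (first_year, last_year, slope) tuples.
    """
    results = []
    for start in range(len(years) - window + 1):
        xs = years[start:start + window]
        ys = counts[start:start + window]
        results.append((xs[0], xs[-1], ols_slope(xs, ys)))
    return results


# Synthetic, made-up counts for illustration only (NOT VAERS data).
years = list(range(1992, 2002))
counts = [10, 14, 13, 19, 22, 21, 27, 30, 28, 33]

for first, last, slope in rolling_slopes(years, counts):
    print(f"{first}-{last}: slope = {slope:+.2f}")
```

If a real break existed, the window slopes would shift abruptly and stay shifted once the windows passed it. A series of slopes that simply wander from window to window is what noisy data with no clean break looks like.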
Once again, what pseudoscientists like the Geiers never seem to understand is that all those precautions we scientists take with control groups and statistical analyses designed to minimize investigator bias exist because we realize how easy it is for a scientist, particularly a medical scientist who is invested in finding a cure for a particular disease or condition, to be seduced into believing something that is not supported by data. (If they did understand, they wouldn’t use such simplistic and easily debunked “scientific” methodology.) It’s a very human tendency, and the scientific method is designed to minimize it. Because the tendency comes so naturally, it takes a great deal of training to overcome.
Some scientists never do overcome this tendency, and if they fall deeply enough into belief over evidence they become pseudoscientists.
Like the Geiers.