Note: Grant writing ruled again this weekend, so I took this post, which first appeared elsewhere, and decided to revise and repost it. It seems appropriate, given what I’ve been discussing lately. Enjoy, and hopefully there’ll be something new tomorrow.
I’ve been complaining a lot about a certain journalist lately, specifically one named David Freedman. Before the most recent paean to unscientific medicine written by him, he wrote another article. The article, which was trumpeted by Tara Parker-Pope, came under the heading of “Brave Thinkers” and is entitled Lies, Damned Lies, and Medical Science. It is being promoted in news stories like this, where the story is spun as indicating that medical science is so flawed that even the cell-phone cancer data can’t be trusted:
Let me mention two things before I delve into the meat of the article. First, these days I’m not nearly as enamored of The Atlantic as I used to be. I was a long-time subscriber (at least 20 years) until last fall, when The Atlantic published an article so egregiously bad on the H1N1 vaccine that Mark Crislip decided to annotate it in his own inimitable fashion. Fortunately, this article isn’t as bad (it’s a mixed bag, actually, making some good points and then undermining some of them by overreaching), although it does lay on the praise for Ioannidis and the attacks on SBM a bit thick. Be that as it may, clearly The Atlantic has developed a penchant for “brave maverick doctors” and using them to cast doubt on science-based medicine. Second, I actually happen to love John Ioannidis’ work, so much so that I’ve written about it at least twice over the last three years, including The life cycle of translational research and Does popularity lead to unreliability in scientific research?, where I introduced the topic using Ioannidis’ work. Indeed, I find nothing at all threatening to me as an advocate of science-based medicine in Ioannidis’ two most famous papers, Contradicted and Initially Stronger Effects in Highly Cited Clinical Research and Why Most Published Research Findings Are False. The conclusions of these papers to me are akin to concluding that water is wet and everybody dies. It is, however, quite good that Ioannidis is there to spell out these difficulties with SBM, because he tries to keep us honest.
Unfortunately, both papers are frequently wielded like a shibboleth by advocates of alternative medicine against science-based medicine (SBM) as “evidence” that it is corrupt and defective to the very core and that therefore their woo is at least on equal footing with science-based medicine. Ioannidis has formalized the study of problems with the application of science to medicine that most physicians intuitively sense but have not ever really thought about in a rigorous, systematic fashion. Contrast this to so-called “complementary and alternative medicine” (i.e., CAM), where you will never see such a questioning of the methodology and evidence base behind it (mainly because its methodology is primarily anecdotal and its evidence base nonexistent or fatally flawed) and most practitioners never change their practice as a result of any research, and you’ll see my point.
Right from the beginning, the perspective of the author David H. Freedman is clear. I first note the title of the article (Lies, Damned Lies, and Medical Science) is intentionally and unnecessarily inflammatory. On the other hand, I suppose that entitling it something like “Why science-based medicine is really complicated and most medical studies ultimately turn out to be wrong” wouldn’t have been as eye-catching. Even Ioannidis, who restrained himself more, gave his PLoS review the almost as exaggerated title Why Most Published Research Findings Are False, which has made it laughably easy for cranks to misuse and abuse his article. My annoyance at the title and general tone of Freedman’s article, and at the sort of news coverage it’s getting, notwithstanding, there are still important messages in Freedman’s article worth considering, if you get past the spin, which begins very early in describing Ioannidis and his team thusly:
Last spring, I sat in on one of the team’s weekly meetings on the medical school’s campus, which is plunked crazily across a series of sharp hills. The building in which we met, like most at the school, had the look of a barracks and was festooned with political graffiti. But the group convened in a spacious conference room that would have been at home at a Silicon Valley start-up. Sprawled around a large table were Tatsioni and eight other youngish Greek researchers and physicians who, in contrast to the pasty younger staff frequently seen in U.S. hospitals, looked like the casually glamorous cast of a television medical drama. The professor, a dapper and soft-spoken man named John Ioannidis, loosely presided.
I’m guessing the only reason Freedman didn’t liken this team to Dr. Greg House and his minions is that, unlike Dr. House, Ioannidis is dapper and soft-spoken, although, like Dr. House’s team, Ioannidis’ team is apparently full of good-looking young doctors. After describing how Ioannidis delved into the medical literature and was shocked by the number of seemingly important and significant published findings that were later reversed in subsequent studies, Freedman boils down what I consider to be the two most important messages that derive from Ioannidis’ work:
This array suggested a bigger, underlying dysfunction, and Ioannidis thought he knew what it was. “The studies were biased,” he says. “Sometimes they were overtly biased. Sometimes it was difficult to see the bias, but it was there.” Researchers headed into their studies wanting certain results–and, lo and behold, they were getting them. We think of the scientific process as being objective, rigorous, and even ruthless in separating out what is true from what we merely wish to be true, but in fact it’s easy to manipulate results, even unintentionally or unconsciously. “At every step in the process, there is room to distort results, a way to make a stronger claim or to select what is going to be concluded,” says Ioannidis. “There is an intellectual conflict of interest that pressures researchers to find whatever it is that is most likely to get them funded.”
Perhaps only a minority of researchers were succumbing to this bias, but their distorted findings were having an outsize effect on published research. To get funding and tenured positions, and often merely to stay afloat, researchers have to get their work published in well-regarded journals, where rejection rates can climb above 90 percent. Not surprisingly, the studies that tend to make the grade are those with eye-catching findings. But while coming up with eye-catching theories is relatively easy, getting reality to bear them out is another matter. The great majority collapse under the weight of contradictory data when studied rigorously. Imagine, though, that five different research teams test an interesting theory that’s making the rounds, and four of the groups correctly prove the idea false, while the one less cautious group incorrectly “proves” it true through some combination of error, fluke, and clever selection of data. Guess whose findings your doctor ends up reading about in the journal, and you end up hearing about on the evening news? Researchers can sometimes win attention by refuting a prominent finding, which can help to at least raise doubts about results, but in general it is far more rewarding to add a new insight or exciting-sounding twist to existing research than to retest its basic premises–after all, simply re-proving someone else’s results is unlikely to get you published, and attempting to undermine the work of respected colleagues can have ugly professional repercussions.
Of course, I’ve discussed the problems of publication bias before multiple times right here on this very blog. Contrary to the pharma conspiracy-mongering of many CAM advocates, more commonly the reason for bias in the medical literature is what is described above: Simply confirming previously published results is not nearly as interesting as publishing something new and provocative. Scientists know it; journal editors know it. In fact, this is far more likely a problem than the fear of undermining the work of respected colleagues, although I have little doubt that that fear is sometimes operative. The reason is, again, because novel and controversial findings are more interesting and therefore more attractive to publish. A young investigator doesn’t make a name for himself by simply agreeing with respected colleagues. He makes a name for himself by carving out a niche and even more so if he shows that commonly accepted science has been wrong. Indeed, I would argue that this is the very reason that comparative effectiveness research (CER) is given such short shrift in the medical literature, so much so that the government has decided to encourage it in the Patient Protection and Affordable Care Act. CER is nothing more than comparing already existing and validated therapies head-to-head against each other to see which is more effective. To most scientists, nothing could be more boring, no matter how important CER is. Until recently, doing CER was a good way to bury a medical academic career in the backwaters. Hopefully, that will change, but to my mind the very problems Ioannidis points out are part of the reason why CER has had such rough sledding in achieving respectability.
More importantly, what Freedman appears (at least to me) to portray as a serious, nigh unfixable problem in the medical research that undergirds SBM is actually its greatest strength: it changes with the evidence. Yes, there is a bias towards publishing striking new findings and not publishing (or at least not publishing in highly prestigious journals) less striking or negative findings. This has been a well-known bias that’s been bemoaned for decades; indeed, I remember learning about it in medical school, and you don’t want to know how long ago I went to medical school.
Even so, Freedman inadvertently echoes a message that we at SBM have discussed many times, namely that high quality evidence is essential. In the article, Freedman points out that 80% of nonrandomized trials turn out to be wrong, as are “25 percent of supposedly gold-standard randomized trials, and as much as 10 percent of the platinum-standard large randomized trials.” Big surprise, right? Less rigorous designs produce false positives more often! Also remember, in an absolutely ideal world with a perfectly designed randomized clinical trial (RCT), by choosing p<0.05 as the cutoff for statistical significance, we would expect about 5% of RCTs testing a treatment that truly does nothing to come up positive by random chance alone. Add type II errors (false negatives) to that and the number of wrong trials is expected to be even higher, again, just by random chance alone. When you consider these facts, then having only 10% of large randomized trials turn out to be incorrect is actually not too bad at all. Even if 25% of all randomized trials turn out to be wrong, that isn’t all that bad either; these include smaller trials. After all, the real world is messy; trials are never perfect, nor is their analysis. The real messages should be that lesser quality trials that are unrandomized are highly unreliable and that even randomized trials should be replicated if at all possible. Unfortunately, resources are such that such trials can’t always be replicated or expanded upon, which means that we as scientists need to do our damnedest to work on improving the quality of such trials. Also, don’t forget that the probability of a trial being wrong increases as the implausibility of the hypothesis being tested increases, as Steve Novella and Alex Tabarrok have pointed out in discussing Ioannidis’ results. Unfortunately, with the rise of CAM, more and more studies are being done on highly implausible hypotheses, which will make the problem of false-positive studies even worse. Is this contributing to the problem overall? I don’t know, but that would be a really interesting hypothesis for Ioannidis and his group to study, don’t you think?
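For readers who like to see the arithmetic behind the point about implausible hypotheses, here is a minimal sketch in Python. It applies Bayes' theorem to estimate how likely a "statistically significant" trial is to reflect a real effect, given how plausible the hypothesis was going in. To be clear about what is assumed: the function name, the 80% power figure, and the prior probabilities are my own illustrative choices, not numbers taken from Freedman's article or from Ioannidis' papers, and the model ignores bias entirely, so real-world numbers would be worse.

```python
def positive_predictive_value(prior, power=0.8, alpha=0.05):
    """Probability that a statistically significant result is a true positive.

    prior -- assumed probability, before the trial, that the hypothesis is true
    power -- assumed probability the trial detects a real effect (1 - type II error rate)
    alpha -- significance cutoff, i.e. the false-positive rate when there is no effect
    """
    for name, value in (("prior", prior), ("power", power), ("alpha", alpha)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")

    true_positives = power * prior            # real effects that reach significance
    false_positives = alpha * (1.0 - prior)   # null effects that reach significance anyway
    total_positives = true_positives + false_positives
    if total_positives == 0.0:
        raise ValueError("no positive results are possible with these inputs")
    return true_positives / total_positives


if __name__ == "__main__":
    # Hypothetical priors chosen only to show the trend, not measured values.
    scenarios = [
        ("plausible hypothesis", 0.5),
        ("long shot", 0.1),
        ("highly implausible hypothesis", 0.01),
    ]
    for label, prior in scenarios:
        ppv = positive_predictive_value(prior)
        print(f"{label:32s} prior={prior:.2f}  chance a positive trial is right={ppv:.2f}")
```

Under these assumptions, the lower the prior plausibility, the larger the share of positive trials that are false positives, even when every trial is run perfectly. That is the reasoning behind insisting that plausibility be weighed along with the p-value.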
Another important lesson from Ioannidis’ work cited by Freedman is that hard outcomes are much more important than soft outcomes in medical studies. For example, death is the hardest outcome of all. If a treatment for a chronic condition is going to claim benefit, it behooves researchers to demonstrate that it has a measurable effect on mortality. I discussed this issue a bit in the context of the controversy over Avastin and breast cancer, where the RCTs used to justify approving Avastin for use against stage IV breast cancer found an effect on disease-free survival but not overall survival. However, this issue is not important just in cancer trials, but in any trial for an intervention that is being used to reduce mortality. “Softer” outcomes, be they disease-free survival, reductions in blood lipid levels, reductions in blood pressure, or whatever, are always easier to demonstrate than decreased mortality.
Unfortunately, one thing that comes through in Freedman’s article is something similar to other work I’ve seen from him. For instance, when Freedman wrote about Andrew Wakefield back in May, he got it so wrong that he was not even wrong when he described The Real Lesson of the Vaccines-Cause-Autism Debacle. To him the discovery of Andrew Wakefield’s malfeasance is as nothing compared to what he sees as the corruption and level of error present in the current medical literature. In other words, Freedman presented Wakefield not as a pseudoscience maven, an aberration, someone outside the system who somehow managed to get his pseudoscience published in a respectable medical journal and thereby caused enormous damage to vaccination programs in the U.K. and beyond. Oh, no. To Freedman, Wakefield is representative of the system. One wonders how, given how much he distrusts the medical literature, Freedman actually knew Wakefield was wrong. After all, all the studies that refute Wakefield presumably suffer from the same intractable problems that Freedman sees in all medical literature. In any case, perhaps this apparent view explains why, while Freedman gets some things right in his profile of Ioannidis, he gets one thing enormously wrong:
Ioannidis initially thought the community might come out fighting. Instead, it seemed relieved, as if it had been guiltily waiting for someone to blow the whistle, and eager to hear more. David Gorski, a surgeon and researcher at Detroit’s Barbara Ann Karmanos Cancer Institute, noted in his prominent medical blog that when he presented Ioannidis’s paper on highly cited research at a professional meeting, “not a single one of my surgical colleagues was the least bit surprised or disturbed by its findings.” Ioannidis offers a theory for the relatively calm reception. “I think that people didn’t feel I was only trying to provoke them, because I showed that it was a community problem, instead of pointing fingers at individual examples of bad research,” he says. In a sense, he gave scientists an opportunity to cluck about the wrongness without having to acknowledge that they themselves succumb to it–it was something everyone else did.
To say that Ioannidis’s work has been embraced would be an understatement. His PLoS Medicine paper is the most downloaded in the journal’s history, and it’s not even Ioannidis’s most-cited work–that would be a paper he published in Nature Genetics on the problems with gene-link studies. Other researchers are eager to work with him: he has published papers with 1,328 different co-authors at 538 institutions in 43 countries, he says. Last year he received, by his estimate, invitations to speak at 1,000 conferences and institutions around the world, and he was accepting an average of about five invitations a month until a case last year of excessive-travel-induced vertigo led him to cut back.
Freedman includes an anecdote about how medical practitioners are unsurprised by some of Ioannidis’ results. Unfortunately, instead of the interpretation intended, namely that physicians are aware of the problems in the medical literature described by Ioannidis and take such information into account when interpreting studies (i.e., that Ioannidis’ work is simply reinforcement of what they know or suspect anyway), Freedman instead interprets physicians’ reaction to Ioannidis as “an opportunity to cluck about the wrongness without having to acknowledge that they themselves succumb to it–it was something everyone else did.” I suppose it’s possible that there is a grain of truth in that, but only a small grain. In reality, at least from my observations, scientists and skeptics have not only refrained from attacking Ioannidis but have actually embraced him and his findings of deficiencies in how we do clinical trials, and they have done so for the right reasons. We want to be better, and we are not afraid of criticism. Try, for instance, to imagine an Ioannidis in the world of CAM. Pretty hard, isn’t it? Then picture how a CAM-Ioannidis would be received by CAM practitioners. I bet you can’t imagine that they would shower him with praise, publications in their best journals, and far more invitations to speak at prestigious medical conferences than one person could ever possibly accept.
Yet that’s how science-based practitioners have received John Ioannidis.
In the end, Ioannidis has a message that is more about how little the general public understands the nature of science than it is about the flaws in SBM:
We could solve much of the wrongness problem, Ioannidis says, if the world simply stopped expecting scientists to be right. That’s because being wrong in science is fine, and even necessary–as long as scientists recognize that they blew it, report their mistake openly instead of disguising it as a success, and then move on to the next thing, until they come up with the very occasional genuine breakthrough. But as long as careers remain contingent on producing a stream of research that’s dressed up to seem more right than it is, scientists will keep delivering exactly that.
“Science is a noble endeavor, but it’s also a low-yield endeavor,” he says. “I’m not sure that more than a very small percentage of medical research is ever likely to lead to major improvements in clinical outcomes and quality of life. We should be very comfortable with that fact.”
We should indeed. On the other hand, those of us in the trenches with individual patients don’t have the luxury of ignoring many studies that conflict (as Ioannidis suggests elsewhere in the article). Moreover, it is science that gives us our authority with patients. If patients lose trust in science, then there is little reason not to go to a homeopath. Consequently, we need to do the best we can with what exists. Nor does Ioannidis’ work mean that SBM is so hopelessly flawed that we might as well all throw up our hands and become reiki masters, which is what Freedman seems to be implying. SBM is our tool to bring the best existing care to our patients, and it is important that we know the limitations of this tool. Contrary to what CAM advocates claim, there currently is no better tool. If there were, and it could be demonstrated conclusively to be superior, I’d happily switch to using it.
To paraphrase Winston Churchill’s famous speech, many forms of medicine have been tried and will be tried in this world of sin and woe. No one, certainly not those of us at SBM, pretends that SBM is perfect or all-wise. Indeed, it has been said (mainly by me) that SBM is the worst form of medicine except all those other forms that have been tried from time to time. I add to this my own little challenge: Got a better system than SBM? Show me! Prove that it’s better! In the meantime, we should be grateful to John Ioannidis for exposing defects and problems with our system while at the same time expressing irritation at people like Freedman for overhyping them.