Over the years, the criticism of “evidence-based medicine” (EBM) that I have repeated here, and that I and others have repeated at my not-so-super-secret other blog, is that its levels of evidence relegate basic science considerations to the lowest level of evidence and elevate randomized clinical trial evidence to the highest rung, in essence fetishizing it above all else, a form of thinking that I like to call methodolatry. Now, when EBM works correctly, this is not an entirely unreasonable way to look at things. After all, we just want to know what works in patients. Basically, when EBM is working properly, its underlying assumption is that treatments don’t reach the level of double-blind randomized clinical trials (RCTs) without first having gone through several steps, beginning with basic science considerations, progressing to early-stage clinical trials, and finally reaching the stage of large RCTs. In other words, preclinical studies (basic biochemistry and animal studies) produce biological plausibility that justifies testing a new drug or treatment in clinical trials.
Another important point is that basic science alone can’t demonstrate efficacy. However, it can show that a proposed treatment is so implausible based on its purported mechanism of action as to be utterly not worth testing in RCTs, particularly given that clinical equipoise is essential in clinical trials. Think of it this way. For homeopathy to be a valid treatment, our understanding of physics and chemistry would have to be not just wrong, but spectacularly wrong. While it is theoretically possible for scientists to be so wrong about such fundamental physical laws and theories, what’s more likely, that scientists are so completely wrong about laws and theories that rest upon a very solid evidence base or that homeopathy is bunk? The same thing can be said of mystical modalities like “energy healing.” For example, no one has ever demonstrated the existence of this “life energy” that is redirected to heal or the “universal source” that supposedly provides the energy that reiki masters claim to be able to use to heal. In a nutshell, basic science can tell us that it is, for all practical intents and purposes, impossible that a treatment can work. Basically, it can tell us a treatment doesn’t work or can’t work, but it can’t by itself tell us if a treatment works. Thus has the phenomenon of quackademic medicine entered medical academia through the blind spot in EBM, namely its assumption that a treatment won’t reach the stage of RCTs without first having “proven its plausibility” through preclinical basic science investigation. In a sense, EBM was blindsided by “complementary and alternative medicine” (CAM), which is why I’ve supported the concept of science-based medicine (SBM), which takes into account prior plausibility of proposed treatments.
So it was that I took a lot of interest in an article by the director of a not infrequent topic of this blog, namely the misbegotten NIH center known as the National Center for Complementary and Alternative Medicine (NCCAM), a branch of the NIH that studies magic. Ever since I first became aware of its existence, I’ve kept an eye on the NCCAM research blog, where Josephine Briggs, MD, the director of NCCAM, occasionally posts. This time around Dr. Briggs tackles the plausibility issue head-on and comes out of it worse for wear, so poor are her arguments in a post entitled Bayes’ Rule and Being Ready To Change Our Minds. Regular readers might remember that, basically, applying Bayesian analysis to an RCT involves assigning a prior probability that a clinical trial is likely to be positive and using that estimate to weight the statistics. Basically, the lower the prior probability, the less likely that a “positive” trial (with the classic p-value less than 0.05) is to represent a “true” positive.
Where Bayesian considerations are most useful in the discussion of CAM is for modalities that have very low prior probabilities, particularly those that are about as close to zero as you can imagine, like homeopathy, reiki, acupuncture, and the like. If you take Bayes’ rule into account, the “positive” RCTs touted by CAM practitioners and quackademics are almost certainly not “true positives.” Rather, they’re false positives, noise. It’s also important to point out that we’re not arguing over whether a treatment with an estimated prior probability of, say, 10% is too “improbable” to be worth testing. It probably isn’t. What we’re talking about is something like homeopathy, whose pre-trial probability, based on the sheer scientific nonsensicalness of its purported mechanism, is so close to zero as to be, for all intents and purposes, indistinguishable from zero.
This brings us back to Dr. Briggs, who’s flogging a study that has also been a fairly frequent topic of this blog over the years, namely the Trial To Assess Chelation Therapy (TACT) trial, a trial that tested chelation therapy, a common quack therapy used by naturopaths and others to treat cardiovascular disease, for its effects on cardiovascular complications and death. For the gory details of why this $30 million boondoggle was a complete waste of taxpayer money that endangered patients testing a treatment with close to zero prior plausibility, you can go back and read Kimball Atwood’s criticism of the trial design itself and my discussions of the completely underwhelming results here, here, here, and here. (If you doubt me, you really should check out Dr. R. W. Donnell’s Magical Mystery Tour of NCCAM Chelation Study Sites, Part 1, Part 2, Part 3, Part 4, Part 5, Part 6, and Part 7, asking yourself if you would trust any data coming from such sites.) As Dr. Donnell points out, only 12 of the 110 TACT study sites were academic medical centers. Many of the study sites were highly dubious clinics touting equally dubious therapies, including heavy metal analysis for chronic fatigue, intravenous infusions of vitamins and minerals (I could never figure out how infusing minerals could be reconciled with chelation therapy to remove minerals, but that’s just me), anti-aging therapies, assessment of hormone status by saliva testing, and much more. Dr. Donnell also points out that the blinding of the study groups to local investigators was likely to have been faulty. So right off the bat, this study was dubious for so many reasons, not the least of which was that some of its site investigators were felons, a problem blithely dismissed by the NIH as being in essence irrelevant to whether the study could be done safely.
For those who aren’t inclined to click on a bunch of links and read fairly lengthy deconstructions, there were many problems with the design of the study, and the study turned out to be basically a negative study, with no decrease in a composite outcome of aggregated cardiovascular events in nondiabetics and a questionable (at best) claimed improvement, found by pre-specified subgroup analysis, in cardiovascular outcomes only in diabetics. As I’ve pointed out before, if CAM practitioners considered this study valid, they would have stopped using chelation therapy for cardiovascular disease in nondiabetics right away. They didn’t, and the results in diabetics were not persuasive for a number of reasons.
So what does Dr. Briggs have to say about this trial?
She begins by contrasting TACT to what usually happens, namely that treatments expected to be useful often fail to show evidence of efficacy in clinical trials, saying that every once in a while “the opposite happens.” Her lead-in to TACT thus established, she cites the most recent publication based on TACT. There have already been multiple publications, a clear example of what I like to call publishing the MPU (minimal publishable unit), and this looks like yet another reanalysis of the very same data coming to the same conclusion. Seriously, this is touted as a “factorial analysis,” but it’s the same vinegary wine in a different bottle, which is why I don’t plan on doing a particularly deep analysis of the paper. Although the authors tout the results as showing a benefit in the entire TACT experimental population (which sure sounds to me like post hoc analysis compared to the first primary analysis published in JAMA), they simply repeat the claim that the results are especially compelling in diabetics. I’ve looked at these results before for the previous MPUs from TACT. Suffice it to say that there were no statistically significant differences in all-cause mortality, MI, stroke, or hospitalization for angina. There was a statistically significant difference in coronary revascularization, but what that means is uncertain, particularly given that, if there was any failure of blinding, patients in the treatment group might be less likely to be referred for mild symptoms. Only the same composite endpoint, a mixture of a “hard” endpoint like death and much “softer” endpoints subject to judgment calls, such as hospitalization and coronary revascularization, showed a statistically significant difference. Be that as it may, Dr. Briggs seems inordinately and unjustifiably impressed by the results:
The authors found that those receiving the active treatment clearly fared better than those receiving placebo. The accompanying editorial in the AHJ reminds readers about the value of equipoise and the need to “test our beliefs against evidence.”3
Most physicians did not expect benefit from chelation treatment for cardiovascular disease. I readily admit, initially, I also did not expect we would find evidence that these treatments reduce heart attack, strokes, or death. So, the evidence of benefit coming from analyses of the TACT trial has been a surprise to many of us. The subgroup analyses are suggesting sizable benefit for diabetic patients—and also, importantly, no benefit for the non-diabetic patient. Clearly subgroup analyses, even if prespecified, do not give us the final answer. But it is also clear that more research is needed to test these important findings.
No. It. Is. Not.
Dr. Steven Nissen explained why in an editorial that accompanied one of the first MPUs published from TACT. I’ve explained why ad nauseam in the links I’ve included above. This was, in essence, a negative trial. Indeed, its results were as both Kimball Atwood and I predicted: negative overall but with one subgroup showing a suggestion of benefit that lets the authors claim that “more research is needed,” which is the same thing that Dr. Briggs is saying. She has to, given that TACT began at NCCAM (albeit before her tenure began) before being taken over by NHLBI. Of course, let’s say that, against all reason, you take TACT and its findings at face value. Remember that the trial took $30 million and a decade to do. Are the TACT findings reported by Gervasio Lamas and colleagues, even if completely reliable, compelling enough to justify spending a similar amount of money to follow up? Even if a future study were limited to just diabetic patients and could thus be smaller, we’re still talking several million precious research dollars, minimum, to do the followup study, probably at least $10 million or more. Do the findings of TACT justify such an expenditure? I argue that they most definitely do not. It would be, at best, investing a lot of money to study a question that is simply not that compelling and not that likely to help very many people (under the most charitable interpretation of the results) and, at worst, throwing good money after bad, endangering more patients in the process and thus destroying equipoise.
Dr. Briggs then makes an argument that, while seeming persuasive on the surface, is actually less so if you look at it closely:
And TACT findings are indeed a reminder of the importance of retaining equipoise, seeking further research aimed at replicating the findings, and neither accepting nor rejecting findings based on personal biases. The scientific process is designed to weed out our preconceived notions and replace them with evidence.
Note the not-so-subtle implication that critics of TACT are rejecting its findings based not on their amazing unimpressiveness, coupled with the very low prior plausibility, but rather because of “personal biases” and how the scientific method (as represented by TACT, naturally) will weed out those “preconceived notions” and replace them with evidence. Dr. Briggs is very obviously trying to paint critics of TACT as unscientific zealots with an ax to grind. To do this, she cleverly tries to reclaim Bayes for herself, knowing that Bayes is a frequent argument against not just TACT and chelation therapy for heart disease but against CAM itself:
Bayesian methods are getting a lot of attention in the clinical research literature these days. The Bayes rule involves estimating the probability of a result—the prior—then modifying it with each round of new evidence. Another editorialist, a statistician, examined the TACT results, using a Bayesian approach, and comments: “If we start from a position of skepticism, the results of the TACT trial reduces the degree of skepticism. This is exactly how Bayes analysis helps modify prior beliefs by incorporating new evidence and upgrading knowledge.”4
One paper that Briggs cites echoes the sentiment:
When evidence conflicts with expectations, the findings are typically discounted. This response is rational from a Bayesian perspective—if the pretest probability (read “pretrial beliefs”) is low, a positive test (trial) should revise the posttest probability upward, but the result is not conclusive. Scientific paradigms shift only after the weight of evidence builds up sufficiently to move from hypothesis to proven fact. There are some classic examples of clinical trials that overturned conventional wisdom. Postmenopausal estrogen therapy was believed to prevent coronary disease events based on observational studies, but randomized clinical trials showed harm rather than benefit.8 and 9 Antiarrhythmic drug therapy suppresses ventricular ectopy after MI and was therefore widely believed to reduce the risk of sudden cardiac death, but a randomized controlled trial showed that it did not.10 β-Blockers were contraindicated in heart failure until randomized controlled trials proved they were indicated.11, 12 and 13 There are many examples of interventions commonly used—or not used—in practice that failed to show the expected result when tested in carefully conducted randomized controlled trials. In the case of TACT, the intervention (chelation therapy) is not commonly used in practice and most physicians expected the trial to show no benefit, yet a benefit was seen. Either way, we should not let our biases blind us to the possibility that unexpected results might provide an important clue for a new approach.
It is critical to use the scientific method to test our beliefs against the evidence. Simply dismissing results that we did not expect would ignore opportunities to expand knowledge and the armamentarium of effective therapies. This latest report is a useful extension of the previously published work from TACT and should prompt new research to replicate the initial provocative findings and base decisions about chelation on strong scientific evidence, not on beliefs, either pro or con.
Um, no. Not quite. Yes, the results of a prior trial can modify Bayesian considerations for a future trial, but in reality the most parsimonious interpretation of the results of TACT is that chelation therapy for cardiovascular disease does not work. There was basically no effect on mortality, no effect on myocardial infarction, no effect on stroke, no effect on any individual cardiovascular outcome, and only a relatively marginal effect observed in diabetics that could well be spurious. This is thin gruel to put up against all the basic science that fails to find a plausible mechanism for chelation therapy in cardiovascular disease. Basically, Briggs is using a variant of the “science was wrong before” argument, while the authors of the editorial that she cites, David J. Maron and Mark A. Hlatky, mistakenly accept the TACT trial at face value, ignoring its inherent flaws and focusing on Bayes like a laser beam to dismiss TACT critics as hopelessly biased and so hostile to the thought of chelation therapy that we are upset by the results of the study. Believe me, I’m not particularly upset by the results of the study. Equivocal results that show up in only one subgroup or require some factorial prestidigitation to be demonstrated are exactly what critics of TACT predicted given the trial design and the problems in its implementation.
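Even granting the editorialists their framing, the arithmetic shows how little a weak positive result moves a low prior. A sketch of the update in odds form, where the 1% prior for chelation and the Bayes factor of 3 (conventionally regarded as evidence “barely worth mentioning”) are illustrative assumptions of mine, not values derived from TACT:

```python
def bayes_update(prior, bayes_factor):
    """Update a prior probability with a Bayes factor (likelihood ratio):
    posterior odds = prior odds * Bayes factor."""
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * bayes_factor
    return posterior_odds / (1 + posterior_odds)

# Illustrative assumption: 1% prior that chelation works for cardiovascular
# disease, updated by a weakly positive trial (Bayes factor of 3).
print(bayes_update(0.01, 3))  # ~0.029: "reduced skepticism," still near zero
```

So yes, skepticism is “reduced,” exactly as the quoted statistician says, but from roughly 1% to roughly 3%, which is hardly a mandate for another multimillion-dollar trial.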
I often think: What would it take for me to believe that, for example, homeopathy works—or at least to start changing my mind? Given its incredible scientific implausibility, to me it would take an utterly undeniable result, such as cures of several patients with stage IV pancreatic cancer. Chelation therapy isn’t quite as implausible as homeopathy, because chelation, at least, doesn’t involve magic and the memory of water. It is, however, pretty damned implausible. The pharmacology doesn’t work. The mechanism is not remotely plausible. It’s basically a load of fetid dingo’s kidneys. So while Dr. Briggs is correct, as far as it goes, that a surprising result in a clinical trial can modify the plausibility calculations for further clinical trials, TACT just isn’t particularly persuasive that we should do so. To illustrate, let me just ask a single question: Should we add chelation therapy to the armamentarium of treatments used in cardiovascular disease, based on this study? Even Dr. Lamas doesn’t think this study is enough, nor does Dr. Briggs. Then ask yourself this as well: Should we do another trial costing many millions of dollars to nail down this result? The answer is obvious: No. There are lots of other pressing questions to study for which the funds could be better used.
Proponents of TACT or those who don’t know much about chelation who were surprised by the results of the study like to paint themselves as being open-minded and following “true science” while we nasty critics are portrayed as hopelessly biased and close-minded. In reality, proponents of TACT are being so open-minded that their brains have fallen out.