How do we know a medicine works?
Randomized trials are the "gold standard" but they aren't always possible. Are there other, creative ways to figure out whether a medicine works?
If you’ve been enjoying this Substack, then you’ll definitely enjoy our book, which just came out this past week: Random Acts of Medicine: The Hidden Forces That Sway Doctors, Impact Patients, and Shape Our Health. Available now at your favorite bookseller!
We’re sure many of you are familiar with randomized controlled trials, or “RCTs” as they’re known in many scientific fields, medicine included.
The basic idea of an RCT is that if you want to find out if something “works” (whether it’s a drug, a surgery, a social program, or a new website design), you take a bunch of people and randomly assign them either to a group that receives your intervention of interest or to a group that doesn’t get the intervention (i.e., a control group). Because people were assigned randomly, we can attribute any difference between what happens to the two groups to the intervention of interest, and not some other factor.
If you’re not already familiar with randomized trials, this video explains it well in 7 minutes with some intuitive visuals:
For example, the CARE trial wanted to see whether a cholesterol-lowering drug, pravastatin, lowered the risk of future heart attacks in patients who had just had a heart attack. The researchers took 4,159 patients with heart attacks and randomized them to receive either pravastatin (the intervention) or an inert placebo pill (the control). After following them for about 5 years, 10.2% of the patients who got pravastatin had another heart attack, compared with 13.2% of patients who got placebo pills. The study suggested pravastatin worked at lowering the risk of future heart attacks.
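To make those two percentages more concrete, here is a small sketch of the standard arithmetic for comparing two groups in a trial. The only inputs are the 10.2% and 13.2% event rates quoted above; the variable names are our own, and the calculation ignores statistical uncertainty (no confidence intervals).

```python
# Event rates quoted in the text for the CARE trial.
risk_placebo = 0.132      # share of placebo patients with another heart attack
risk_pravastatin = 0.102  # share of pravastatin patients with another heart attack

# Absolute risk reduction: the raw gap between the two groups.
absolute_risk_reduction = risk_placebo - risk_pravastatin

# Relative risk reduction: that gap as a share of the placebo risk.
relative_risk_reduction = absolute_risk_reduction / risk_placebo

# Number needed to treat: patients treated to prevent one event.
number_needed_to_treat = 1 / absolute_risk_reduction

print(f"Absolute risk reduction: {absolute_risk_reduction:.1%}")
print(f"Relative risk reduction: {relative_risk_reduction:.1%}")
print(f"Number needed to treat:  {number_needed_to_treat:.0f}")
```

In other words, the gap is about 3 percentage points, which is roughly a 23% relative reduction, or about 33 patients treated for about 5 years to prevent one event.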
More than one way to randomize
As great and as helpful as they are, RCTs aren’t always an option. They can be prohibitively expensive to conduct, can take too long to answer urgent questions, and in some cases are unethical to do (particularly when we’re trying to understand how harmful something is).
As readers of the Random Acts of Medicine book will know, we don’t always have to rely on researchers to randomize patients to one treatment or another. Randomization happens all the time by accident, producing chance “natural experiments” that we can learn from.
Here are a few ways to figure out the effect of a drug without running a randomized trial.
A silver lining to drug shortages
This example comes from the ICU, where Chris works. Patients with septic shock, a severe complication of an infection, need special drugs to keep their blood pressure high enough to pump blood around the body (these drugs are called vasopressors or “pressors” for short in the ICU). Years ago, there was debate as to whether one of these drugs, called norepinephrine, should be our go-to vasopressor for septic shock.
A randomized trial could answer that question, but a chance event answered it for us. In 2011, there was a drug shortage of norepinephrine, meaning it wasn’t always available for use in ICUs. Because the timing of the shortage was random as far as patients are concerned (they didn’t time their septic shock to correspond with norepinephrine availability), patients who happened to get sick during the shortage were, by chance, less likely to get norepinephrine than other patients who got sick when the drug was widely available.
A clever study found that in hospitals affected by the drug shortage, use of norepinephrine (predictably) went down; meanwhile, the mortality rate of septic shock went up. And when norepinephrine became available again? Patients started receiving it again, and mortality rates went back to baseline. (In hospitals that didn’t experience the shortage, serving as another type of control, there was no change in mortality in that same time period.) Norepinephrine worked for septic shock.
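One common way to formalize this kind of comparison is a “difference-in-differences” calculation: take the change in the affected hospitals and subtract the change in the unaffected hospitals over the same period. We are not claiming this is exactly how the study was analyzed. The sketch below uses made-up mortality rates purely for illustration, and the function name is our own.

```python
def difference_in_differences(affected_before, affected_during,
                              unaffected_before, unaffected_during):
    """Change in the affected group minus change in the comparison group."""
    change_affected = affected_during - affected_before
    change_unaffected = unaffected_during - unaffected_before
    return change_affected - change_unaffected


# HYPOTHETICAL mortality rates, chosen only to illustrate the arithmetic.
# They are NOT the numbers reported in the shortage study.
estimate = difference_in_differences(
    affected_before=0.36,    # shortage hospitals, before the shortage
    affected_during=0.40,    # shortage hospitals, during the shortage
    unaffected_before=0.36,  # comparison hospitals, same "before" period
    unaffected_during=0.36,  # comparison hospitals, same "during" period
)

print(f"Estimated change attributable to the shortage: {estimate:+.1%}")
```

Subtracting the comparison hospitals' change is what guards against blaming the shortage for something that was happening everywhere at the time, such as a bad flu season.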
A similar study looked at what happened before and during a shortage of furosemide, a diuretic drug that is given to help hospitalized patients with congestive heart failure remove excess fluid from their body through urinating. People hospitalized with congestive heart failure often have swelling of their lower extremities and build-up of fluid in their lungs (termed pulmonary edema). An important treatment is to remove that fluid and that can be done via either oral or IV diuretic drugs. Whether those two formulations are readily interchangeable isn’t well known. It turns out that the furosemide drug shortage affected only the IV version of the drug; the oral pill version was still available. During the shortage, IV furosemide use dropped, and oral furosemide increased. As for the patients? Nothing really changed—suggesting oral furosemide pills worked just as well as the IV version.
Doctor roulette
As patients, we get some say in which doctors we see on a non-emergent basis—we might see a few different primary care doctors until we settle on one we like. But in the emergency department (ED), we don’t get to choose which doctor we see—we see whoever is on duty when we happen to have an emergency. So patients who happen to go to a given ED on a Tuesday are essentially randomized to the doctor who happened to work Tuesday.
A study by Bapu and colleagues took advantage of the random assignment of emergency doctors, some of whom are more likely to prescribe an opioid pain medication, others who are less likely. Patients, then, randomly saw a high opioid prescribing doctor (the “intervention” group) or a low prescribing doctor (the “controls”).
The variation in how doctors prescribed opioids created an opportunity to study how opioids work in the real world. But rather than focusing on how they work to treat pain, the study focused on how they work to cause problems. One of those problems is that an initial prescription can lead to long-term opioid use.
When patients happened to see a high-prescribing doctor, their odds of long-term opioid use within the next year were 30% higher than when patients happened to see a low-prescribing doctor. An initial opioid prescription, therefore, “worked” in leading to long-term use.
In a similar study, Bapu and colleagues studied antibiotic prescribing for patients with acute respiratory illnesses, most of which are viral and don’t require antibiotics, though doctors sometimes prescribe them in case the illness is caused by bacteria. Focusing on people seen at urgent care clinics, where who treats you on a given day is effectively random, the study found that urgent care providers varied a lot in their propensity to prescribe antibiotics. Some providers prescribed antibiotics often, others did not. Although the study was primarily interested in understanding whether randomization to a high-prescriber influenced your likelihood of seeking antibiotics in the future (because any improvement in symptoms might be attributed to the antibiotic, making you more likely to seek out antibiotics the next time you were sick), the set-up would also allow one to study the short- and long-term effects of being prescribed an antibiotic.
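To see why the luck of the draw is so useful here, consider a toy simulation. Every number in it is invented for illustration (the prescribing rates, the risks, and the function name are all our assumptions, not figures from the study). Patients are assigned to a doctor by coin flip, so the two groups differ only in how likely they are to leave with a prescription.

```python
import random


def simulate_doctor_roulette(n_patients=200_000, seed=0):
    """Toy model: as-if random doctor assignment, HYPOTHETICAL rates throughout."""
    rng = random.Random(seed)
    prescribing_rate = {"high": 0.30, "low": 0.10}  # hypothetical
    risk_if_prescribed = 0.05                       # hypothetical
    risk_if_not_prescribed = 0.01                   # hypothetical

    patients = {"high": 0, "low": 0}
    long_term_users = {"high": 0, "low": 0}
    for _ in range(n_patients):
        doctor = rng.choice(["high", "low"])  # whoever happens to be on duty
        prescribed = rng.random() < prescribing_rate[doctor]
        risk = risk_if_prescribed if prescribed else risk_if_not_prescribed
        patients[doctor] += 1
        long_term_users[doctor] += rng.random() < risk  # True counts as 1
    return {d: long_term_users[d] / patients[d] for d in patients}


rates = simulate_doctor_roulette()
print(f"Long-term use, high-prescribing doctors: {rates['high']:.2%}")
print(f"Long-term use, low-prescribing doctors:  {rates['low']:.2%}")
```

Because nothing about the patient influences which doctor they see in this toy world, the gap between the two groups can only come from the difference in prescribing. That is the same logic the real study leans on.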
Scratching the surface
These are just a few examples of ways accidental randomization can help us figure out whether treatments work (or cause harm) without randomized trials. The other benefit of applying these tools to large data is that they can tell us more about the effects of these treatments in the real world—outside the highly supervised setting of randomized trials.
There are all kinds of ways randomization can happen by accident. Here’s an untested idea one of you may run with.
Let’s say a new weight loss drug can only be given to people with a body mass index (BMI) of 30 or above, the cutoff for a diagnosis of obesity. This means that if a patient has a BMI of 29.5, they may not be eligible for the drug. It’s normal for weight to fluctuate, so for a patient whose BMI is approximately 30, it’s essentially random whether their BMI measures at 29.5 or 30.0. One way to measure how well the weight loss drug works in the real world, then, would be to look at differences in outcomes between patients with a BMI of 29.5 and those with a BMI of 30.0, who are essentially the same weight but are randomly assigned to have access to the drug or not.
This sort of approach doesn’t just apply to BMI, but any place where above or below a discrete cutoff people might receive very different treatments. It could be a cutoff based on blood pressure, cholesterol, hemoglobin A1c (a measure of diabetes), or one of many more.
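Researchers usually call this kind of cutoff comparison a “regression discontinuity” design. Here is a deliberately simplified simulation of the BMI idea. All of the numbers are hypothetical, and for simplicity it assumes everyone who measures at or above the cutoff actually gets the drug (in real life, uptake would be partial). The function name and parameters are our own.

```python
import random
import statistics


def simulate_bmi_cutoff(n_patients=50_000, cutoff=30.0, bandwidth=0.5,
                        true_effect=-2.0, seed=1):
    """Compare patients just above vs. just below a BMI cutoff (toy model)."""
    rng = random.Random(seed)
    just_below, just_above = [], []
    for _ in range(n_patients):
        usual_bmi = rng.uniform(28.0, 32.0)
        measured_bmi = usual_bmi + rng.gauss(0, 0.3)  # day-to-day fluctuation
        gets_drug = measured_bmi >= cutoff            # simplification: full uptake
        # HYPOTHETICAL one-year BMI change: noise, plus the drug effect if treated.
        bmi_change = rng.gauss(0, 1.0) + (true_effect if gets_drug else 0.0)
        if cutoff - bandwidth <= measured_bmi < cutoff:
            just_below.append(bmi_change)
        elif cutoff <= measured_bmi < cutoff + bandwidth:
            just_above.append(bmi_change)
    return statistics.mean(just_above) - statistics.mean(just_below)


estimate = simulate_bmi_cutoff()
print(f"Estimated effect of drug access on one-year BMI change: {estimate:+.2f}")
```

The comparison only uses patients within half a BMI point of the cutoff, so the estimate should land close to the effect we built into the simulation. A real analysis would also need to check that nothing else changes at the cutoff and that measurements aren’t being nudged over the line.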
One of our goals with this Substack is to explore ideas like these. They may not all be phenomena we or someone else can study, but this type of thinking takes practice. So expect to see us throw ideas out there—and we’d love to hear any ideas you might have in the comments. And if we can come up with an idea that we can actually examine in the data, we’ll take a look and report back!
Though not as clear-cut as you might want, and more gradual than your examples, pre- and post-WWII Okinawa provides a natural experiment, in particular when it comes to the food environment.
The socioeconomically relatively poor Okinawans had, up to WWII, access largely to vegetables, including a staple, sweet potatoes, and had, for example, exceptionally high longevity and number of centenarians per capita. After WWII their food access gradually Americanized, but also Japanized, and in 2015 male Okinawans ranked 36th of 47 prefectures in life expectancy. For example, Okinawa has Japan's highest number of hamburger restaurants per capita. [1]
(All weight gain can be explained with increased supply.[1])
[1] For references to the claims, see the linked draft, section 1.5.2, first paragraph:
http://dx.doi.org/10.31219/osf.io/bq438
PS: Great NYT article!
Another important aspect of RCTs, and of applying them to the care of individual patients, is answering the question, “Were patients like the one in front of me in the RCT?” Often study participants have more social capital and higher levels of functioning than the population at large. The examples you cite help address this concern, as well as the “Study Effect,” which has a favorable influence regardless of the group to which participants are randomized.