In the United States, Americans who turn 65 qualify for federal health insurance that covers hospital stays, certain doctors’ services, and many prescription drugs. One thing the basic Medicare plan doesn’t cover is routine dental care, so many people skip dentist visits rather than pay for them out of pocket. “Multiple attempts to develop a Medicare dental benefit have failed,” write Harvard’s Lisa Simon, Zirui Song, and Michael L. Barnett in a paper looking at the issue. They note that “poor oral health among older adults is a long-standing public health challenge,” but that there has been a lack of hard evidence that this can be addressed by adding Medicare dental coverage.

Simon, Song, and Barnett are all physicians and health-policy experts. The government, they well know, is not going to gather evidence by randomly granting some people dental insurance but not others, and then comparing the results. So evidence would need to be found in data that already exist.

The researchers found some: they analyzed survey and other information from nearly 100,000 people ages 50–85, almost half of whom had dental coverage prior to turning 65. “Before age sixty-five, dental services use trended upward, consistent with increasing oral health needs with age,” the researchers write. “However, the slope of use began to decline after age sixty-five. This raises concerns for increasingly unmet dental needs among Medicare beneficiaries that may lead to poorer downstream outcomes, consistent with evidence of oral health utilization and outcomes evolving years after policy changes.”

To arrive at this conclusion, the researchers employed two statistical approaches, one being regression discontinuity design. RDD is a methodology that economists have long used to isolate an effect and link it back to what caused it, and it is increasingly enabling researchers outside of economics to tackle a host of new causal questions, some with major policy implications. Over the past decade, as data have become ubiquitous, economic tools including RDD have spread to fields such as finance and accounting. At the same time, a group of econometricians—Columbia’s Sebastian Calonico, Princeton’s Matias D. Cattaneo, Chicago Booth’s Max Farrell, and Princeton’s Rocío Titiunik—have refined RDD and created software libraries that put scientifically advanced versions of it at anyone’s fingertips.

“Most people can vaguely understand that you can’t just look at correlation” to know if two things (such as dental insurance and dental care) are linked, says Boston University’s Tal Gross, an economist. “You need more than that.” RDD, he observes, is “a rare example of a highly technical method that is accessible to anyone who can read a graph.” (For a depiction of the approach in Simon, Song, and Barnett’s dental-care research, see “How RDD works: An example,” below.)

Linking causes and effects

In the scientific method, someone raises a question and then devises a way to answer it, and the method used varies depending on the question. Many questions have to do with causal effects, as in, does a particular drug reduce the risk of diabetes?

Researchers can in many cases run a randomized controlled trial, the gold standard, to answer such causal-inference questions. This involves recruiting study participants, randomly dividing them into groups, and subjecting some to a treatment while the others serve as a control group. Then the researchers compare the outcomes. Randomized trials are often used in medicine and can be used in economics too—for example, by assigning some students to receive extra tutoring while others go without it.

But while such a study design is ideal, it doesn’t always work in practice, and beyond that, it ignores a vast amount of information that already exists. Researchers can also identify and analyze natural experiments, which take advantage of preexisting data that weren’t necessarily created to be part of a study. In 2021, University of California at Berkeley’s David Card, MIT’s Joshua D. Angrist, and Stanford’s Guido W. Imbens shared the Nobel Prize in Economic Sciences for their pioneering contributions in the area of natural experiments.

Several tools can be used in natural experiments, and RDD is one of them. It first surfaced in the 1960s but was little used for decades, according to the Central Bank of Colombia’s Mauricio Villamizar-Villegas, Maria Alejandra Ruiz-Sanchez, and CEMLA’s Freddy A. Pinzon-Puerto, who documented RDD’s development and formalization. They note that it was used in a small first wave of projects in the 1970s but was revived more widely in the early 2000s.

One key requirement in causal inference is randomization: who ends up receiving the treatment must be as good as random. Say you want to study the effects of the COVID-19 vaccine outside of a randomized trial. In reality, the shot wasn’t assigned randomly, and people who took it differ from those who didn’t in some ways that can be easily measured and others that can’t be. Were the people who received the shot better educated? Did they live in areas with more access to health care? Are they more risk averse? Those factors also affect outcomes.

How RDD works: An example  

Regression discontinuity design is a statistical technique that uses existing data to compare similar people on either side of a sharp cutoff to measure the effect of a “treatment.” That treatment could be a policy such as Medicare, which is available to Americans starting at age 65 but doesn’t cover routine dental care. Could this lack of dental coverage lead to worse oral-health outcomes, such as complete edentulism (the loss of all teeth), at 65? Here’s a simplified example of how researchers in one study used RDD to help policy makers answer this question, demonstrating that complete edentulism increased at the cutoff. 

In RDD, the population studied must be essentially the same up until the point of treatment. The treatment then creates a cutoff line that sorts people into treatment and control groups. Importantly, whatever happens to create this cutoff line is something that people couldn’t have anticipated, so they couldn’t have done anything to intentionally land themselves in one group or the other.
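The mechanics can be sketched in a few lines of code. The simulation below is purely illustrative—the age range, trend, noise level, five-year bandwidth, and jump of 2.0 are all made up, and real analyses use far more refined estimators—but it shows the core move: fit a separate line on each side of the cutoff and measure the gap between them at the cutoff itself.

```python
# Illustrative sketch of a sharp regression discontinuity estimate.
# All numbers here are hypothetical, chosen only to demonstrate the idea.
import random

random.seed(0)
CUTOFF, TRUE_JUMP, BANDWIDTH = 65.0, 2.0, 5.0

# Simulate (age, outcome) pairs: a smooth trend, plus a jump at the cutoff,
# plus noise. In a real study this would be observed data, not simulated.
data = []
for _ in range(4000):
    age = random.uniform(50, 85)
    outcome = 0.1 * age + (TRUE_JUMP if age >= CUTOFF else 0.0) + random.gauss(0, 0.5)
    data.append((age, outcome))

def linear_fit(points):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    b = sum((x - mx) * (y - my) for x, y in points) / sum((x - mx) ** 2 for x, _ in points)
    return my - b * mx, b

# Fit separate lines just below and just above the cutoff, within the bandwidth.
below = [(x, y) for x, y in data if CUTOFF - BANDWIDTH <= x < CUTOFF]
above = [(x, y) for x, y in data if CUTOFF <= x <= CUTOFF + BANDWIDTH]
a0, b0 = linear_fit(below)
a1, b1 = linear_fit(above)

# The RDD estimate is the gap between the two fitted lines at the cutoff.
jump = (a1 + b1 * CUTOFF) - (a0 + b0 * CUTOFF)
print(f"estimated jump at cutoff: {jump:.2f}")
```

Because people just below and just above the cutoff are essentially alike, the gap between the two fitted lines can be read as the causal effect of crossing it.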

Randomized controlled trials are more precise than RDD, write Villamizar-Villegas, Ruiz-Sanchez, and Pinzon-Puerto. If you compare two studies with equal sample sizes, the controlled trial will be more precise because RDD effectively analyzes only a portion of the data originally sampled.

Additionally, some researchers have argued that RDD results are only valid close to the cutoff line. The research about Medicare finds that receiving dental benefits helped people at age 65, for example. It doesn’t establish whether offering the same services helped people of any other age, such as people aged 70 or 80. Villamizar-Villegas, Ruiz-Sanchez, and Pinzon-Puerto argue that RDD results can in many situations be generalized beyond their most technical application, but this needs to be evaluated on a case-by-case basis. Ultimately, policy makers must use discretion when it comes to applying RDD results to a real issue.

The widespread use of RDD would suggest that many researchers are comfortable with the method, despite its limitations. It is being used to look at all kinds of questions, from how COVID lockdowns affected flight cancellations to how being offered credit changes consumers’ savings patterns. In 2007, University of Chicago Harris School of Public Policy’s Jens Ludwig and Cornell’s Douglas L. Miller used RDD to analyze the effects of Head Start, the program that President Lyndon Johnson launched almost 60 years ago to fight child poverty in the United States.

To learn whether intervening early and supporting the formative years of a child’s development leads to long-term benefits later in life, the researchers identified an environment where people had been essentially randomized. In 1965, the Office of Economic Opportunity provided grant-writing assistance and other help to develop Head Start proposals, but only to the 300 poorest counties in the country. Those 300 counties were similar to many other poor counties that didn’t make the cut, and yet they qualified for assistance while the others did not. Thus the researchers were able to use RDD to determine that the assistance led to higher Head Start funding rates, which in turn led to lower child mortality rates and higher educational attainment.

The unanswered questions of RDD

The robustness of RDD, however, depends on some key decisions. In each project, it is up to researchers to make a few crucial choices, one of which has to do with “bandwidth,” the distance from the cutoff line that determines how many data points are included in the study. A narrow bandwidth means fewer data, which makes the estimate noisier and therefore less precise. But widening the bandwidth creates its own complication: it can pull in so many data points that it breaks the assumption that the two samples are comparable.

In the case of the Head Start study, the cutoff line ran between the 300th-poorest county and the 301st-poorest county. The researchers had a decision to make: How many counties should be included on either side of the line? One? Ten? Three hundred? What was important is that the counties included be substantially similar. If their study ended up comparing poor counties to richer ones, the analysis would no longer be sound.
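That tradeoff can be seen in a stylized simulation (every number below is hypothetical). Here the outcome trend curves away from a straight line above the cutoff, so a line fit over a very wide bandwidth mistakes the curvature for part of the trend and misses the jump, while narrow bandwidths stay close to the truth at the cost of using less data.

```python
# Stylized bandwidth tradeoff in RDD: narrow = noisy, too wide = biased.
# All parameters are invented for illustration.
import random

random.seed(1)
TRUE_JUMP = 2.0

def simulate(n=4000):
    pts = []
    for _ in range(n):
        x = random.uniform(-30, 30)          # running variable, cutoff at 0
        if x >= 0:
            y = TRUE_JUMP + 0.3 * x + 0.01 * x * x   # trend curves above the cutoff
        else:
            y = 0.3 * x                              # linear trend below
        pts.append((x, y + random.gauss(0, 0.5)))
    return pts

def rdd_estimate(points, bandwidth):
    """Local linear RDD: fit a line on each side of the cutoff (x = 0)."""
    def intercept(pts):
        n = len(pts)
        mx = sum(x for x, _ in pts) / n
        my = sum(y for _, y in pts) / n
        b = sum((x - mx) * (y - my) for x, y in pts) / sum((x - mx) ** 2 for x, _ in pts)
        return my - b * mx                   # fitted value at the cutoff
    below = [(x, y) for x, y in points if -bandwidth <= x < 0]
    above = [(x, y) for x, y in points if 0 <= x <= bandwidth]
    return intercept(above) - intercept(below)

data = simulate()
for h in (2, 5, 30):
    print(f"bandwidth {h:>2}: estimated jump = {rdd_estimate(data, h):.2f}")
```

With this setup, the widest bandwidth misstates the jump badly because the straight-line fit cannot absorb the curvature far from the cutoff; the narrower bandwidths come much closer. Optimal bandwidth selection formalizes exactly this balance.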

For years, researchers established these parameters themselves, but ad hoc decision-making introduces subjectivity into the process. In a series of papers, the first of which was published in 2014, Calonico, Cattaneo, Farrell, and Titiunik eliminated the guesswork, fleshing out a standard, data-driven approach to derive the optimal bandwidth.

“This is making it more universal and transparent,” says Farrell. “You can trust what comes out of the analysis.”

A second crucial methodological decision has to do with covariates, or attributes that might influence the target outcome. The education level of parents often affects a child’s educational attainment, so that’s a covariate to note when assessing the success of Head Start, to continue the example.


In a typical regression analysis, it’s accepted practice for researchers to identify and consider these covariates. But should they do the same when running a regression discontinuity analysis? Without a formal approach, researchers again made their own call. But Calonico, Cattaneo, Farrell, and Titiunik addressed this too. Their research provides theoretical grounding for considering covariates in RDD analysis through “rigorous, applicable, transparent tools for research,” Farrell says. And their study finds that including covariates generally improves precision, just as it does in a typical regression.

The researchers also find, however, that there are certain covariates that should not be included, namely those that are directly affected by the treatment itself. For example, a researcher assessing Head Start wouldn’t want to separately consider a child’s educational performance because that, of course, would be influenced by participation in the programming.
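A stylized simulation shows why a predetermined covariate helps. In the sketch below (all numbers invented), a covariate set before treatment explains much of the outcome’s variation; stripping out its contribution leaves less noise, so repeated RDD estimates scatter less around the true jump. This is only the intuition—in practice the covariate’s coefficient is estimated rather than known, and the formal method handles covariates, bandwidth, and inference jointly.

```python
# Stylized illustration: adjusting for a predetermined covariate tightens
# RDD estimates. All parameters are hypothetical.
import random
import statistics

random.seed(2)
TRUE_JUMP = 2.0

def rdd_jump(points, bandwidth=10.0):
    """Local linear sharp RDD estimate of the jump at x = 0."""
    def intercept(pts):
        n = len(pts)
        mx = sum(x for x, _ in pts) / n
        my = sum(y for _, y in pts) / n
        b = sum((x - mx) * (y - my) for x, y in pts) / sum((x - mx) ** 2 for x, _ in pts)
        return my - b * mx
    below = [(x, y) for x, y in points if -bandwidth <= x < 0]
    above = [(x, y) for x, y in points if 0 <= x <= bandwidth]
    return intercept(above) - intercept(below)

raw_estimates, adjusted_estimates = [], []
for _ in range(200):                          # many simulated datasets
    raw, adjusted = [], []
    for _ in range(600):
        x = random.uniform(-10, 10)           # running variable, cutoff at 0
        w = random.gauss(0, 1)                # covariate determined before treatment
        y = TRUE_JUMP * (x >= 0) + 0.3 * x + 1.0 * w + random.gauss(0, 0.3)
        raw.append((x, y))
        adjusted.append((x, y - 1.0 * w))     # remove the covariate's (known) contribution
    raw_estimates.append(rdd_jump(raw))
    adjusted_estimates.append(rdd_jump(adjusted))

print("spread without covariate:", round(statistics.stdev(raw_estimates), 3))
print("spread with covariate:   ", round(statistics.stdev(adjusted_estimates), 3))
```

The same logic shows why post-treatment covariates are off-limits: subtracting a variable that itself responds to the treatment would subtract away part of the very jump being measured.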

To demonstrate how the method could be used, Calonico, Cattaneo, Farrell, and Titiunik revisited Ludwig and Miller’s work on Head Start, using the same data but designing an experiment according to their methods. They find that using covariates improved the precision of the regressions by up to 10 percent. They also demonstrate how various parts of RDD, including setting the bandwidth, should be done differently when covariates are used.

RDD in practice

The researchers have coded up their work and put it online, making it far easier to use their method. Researchers no longer have to set parameters or hire a programmer and work through bugs. They can simply download the software, feed it data, and generate scientifically sound and precise answers. This software has become the standard in academia, used in thousands of analyses. “It is hard to overstate the impact this work has had on empirical research in the past decade,” says Farrell.

Boston College’s Samuel Hartzmark and Booth’s Abigail Sussman, in a paper published in 2019, used RDD to determine that investors value sustainability. In 2016, Morningstar began scoring funds on a 100-point scale according to their sustainability, then divided the funds into five groups marked by visually salient globe icons. The top 10 percent of funds in terms of sustainability were put in the five-globe group, and the bottom 10 percent in the one-globe group.

The researchers looked at what happened when the globe rating system was introduced and used RDD to identify discontinuities at the cutoff lines between groups. Funds that are close to a cutoff line can have similar sustainability scores but be placed in different groups, and the researchers used this to demonstrate that investors react more strongly to the easy-to-grasp globe icons than to the scores underlying them.

“Prior to the rating publication, funds received similar levels of flows. After the publication, funds rated highest in terms of sustainability experienced substantial inflows of roughly 4 percent of fund size over the next 11 months. In contrast, funds rated lowest in sustainability experienced outflows of about 6 percent of fund size,” write Hartzmark and Sussman.

RDD informed research by Purdue’s Jillian B. Carr and Vanderbilt’s Analisa Packham, who find that food-purchasing assistance payments from the Supplemental Nutrition Assistance Program affected the timing and level of crimes committed in Illinois and Indiana. About the software, “we didn’t use it for the main specification, but we did use it for some robustness checks and for all bandwidth selection,” says Carr.

Granted, not every situation calls for RDD. At many businesses, experimentation has gotten so easy that there has to be a reason, such as cost or risk, to use something like RDD, says University of Chicago’s John A. List. And the software isn’t appropriate for every RDD project.

Villamizar-Villegas, who helped document the two waves of RDD growth, says, “I think that for us it was easier to pinpoint ‘formalization waves’ when the method was still in the beginning stages of development.” He’s unconvinced, but also wouldn’t rule out, that there’s a third wave underway.

For now, the use of the RDD software can be tracked through a growing list of academic citations. Simon, coauthor of the research on dental coverage, says, “I was familiar with the method previously, but read Calonico et al.’s papers extensively to better understand what the methodology is and why using their packages was the most rigorous approach to my research question.” Her research findings haven’t succeeded in moving the Medicare-coverage debate forward—but that may be because there’s not yet an equivalent, scientifically advanced tool that can be used to find answers hidden in politics.
