How Statistics Is Getting Better at Linking Cause and Effect

In the United States, Americans who turn 65 qualify for federal health insurance that covers hospital stays, certain doctors’ services, and many prescription drugs. One thing the basic Medicare plan doesn’t cover is routine dental care, so many people skip dentist visits rather than pay for them out of pocket. “Multiple attempts to develop a Medicare dental benefit have failed,” write Harvard’s Lisa Simon, Zirui Song, and Michael L. Barnett in a paper looking at the issue. They note that “poor oral health among older adults is a long-standing public health challenge,” but that there has been a lack of hard evidence this can be addressed by adding Medicare dental coverage.

Simon, Song, and Barnett are all physicians and health-policy experts. The government, they well know, is not going to gather evidence by randomly granting some people dental insurance but not others, and then comparing the results. So evidence would need to be found in data that already exist.

The researchers found some: they analyzed survey and other information from nearly 100,000 people ages 50–85, almost half of whom had dental coverage prior to turning 65. “Before age sixty-five, dental services use trended upward, consistent with increasing oral health needs with age,” the researchers write. “However, the slope of use began to decline after age sixty-five. This raises concerns for increasingly unmet dental needs among Medicare beneficiaries that may lead to poorer downstream outcomes, consistent with evidence of oral health utilization and outcomes evolving years after policy changes.”

To arrive at this conclusion, the researchers employed two statistical approaches, one being regression discontinuity design. RDD is a methodology that economists have long used to isolate an effect and link it back to what caused it, and it is increasingly enabling researchers outside of economics to tackle a host of new causal questions, some with major policy implications. Over the past decade, as data have become ubiquitous, economic tools including RDD have spread to fields such as finance and accounting. At the same time, a group of econometricians—Columbia’s Sebastian Calonico, Princeton’s Matias D. Cattaneo, Chicago Booth’s Max Farrell, and Princeton’s Rocío Titiunik—have refined RDD and created software libraries that put scientifically advanced versions of it at anyone’s fingertips.

“Most people can vaguely understand that you can’t just look at correlation” to know if two things (such as dental insurance and dental care) are linked, says Boston University’s Tal Gross, an economist. “You need more than that.” RDD, he observes, is “a rare example of a highly technical method that is accessible to anyone who can read a graph.” (For a depiction of the approach in Simon, Song, and Barnett’s dental-care research, see “How RDD works: An example,” below.)

Linking causes and effects

In the scientific method, someone raises a question and then devises a way to answer it, and the method used varies depending on the question. Many questions have to do with causal effects, as in, does a particular drug reduce diabetes?

Researchers can in many cases run a randomized control trial, the gold standard, to answer such causal-inference questions. This involves recruiting study participants, randomly dividing them into groups, and subjecting some to a treatment while others are a control group. Then the researchers compare the outcomes. Randomized trials are often used in medicine and can be used in economics too—for example, by assigning some students to receive extra tutoring while others go without it.

But while such a study design is ideal, it doesn’t always work in practice, and beyond that, it ignores a vast amount of information that already exists. Researchers can also identify and analyze natural experiments, which take advantage of preexisting data that weren’t necessarily created to be part of a study. In 2021, University of California at Berkeley’s David Card, MIT’s Joshua D. Angrist, and Stanford’s Guido W. Imbens shared the Nobel Prize in Economic Sciences for their pioneering contributions in the area of natural experiments.

Several tools can be used in natural experiments, and RDD is one of them. It first surfaced in the 1960s but was little used for decades, according to the Central Bank of Colombia’s Mauricio Villamizar-Villegas, Maria Alejandra Ruiz-Sanchez, and CEMLA’s Freddy A. Pinzon-Puerto, who documented RDD’s development and formalization. They note that it was used in a small first wave of projects in the 1970s, but revived more widely in the early 2000s.

One key factor in causal inference is randomization: who receives a treatment must be essentially random, so that the treated and untreated groups are comparable before the treatment is applied. Say you want to study the effects of the COVID-19 vaccine outside of a randomized trial. In reality, the shot wasn’t assigned randomly, and people who took it differ from those who didn’t in some ways that can be easily measured and in others that can’t be. Were the people who received the shot better educated? Did they live in areas with more access to health care? Were they more risk averse? Those factors also affect outcomes.

How RDD works: An example

Regression discontinuity design is a statistical technique that uses existing data to compare similar people on either side of a sharp cutoff to measure the effect of a “treatment.” That treatment could be a policy such as Medicare, which is available to Americans starting at age 65 but doesn’t cover routine dental care. Could this lack of dental coverage lead to worse oral-health outcomes, such as complete edentulism (the loss of all teeth), at 65? Here’s a simplified example of how researchers in one study used RDD to help policy makers answer this question, demonstrating that complete edentulism increased at the cutoff.

In RDD, the population studied must be essentially the same up until the point of treatment. The treatment then creates a cutoff line that sorts people into treatment and control groups. Importantly, whatever happens to create this cutoff line is something that people couldn’t have anticipated, so they couldn’t have done anything to intentionally land themselves in one group or the other.
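To make the idea concrete, here is a minimal sketch in Python of the comparison RDD performs. It uses simulated data, not the data from the dental-care study: the ages, the outcome scale, the size of the drop at 65, and the five-year window are all assumptions chosen for illustration, and the function names are hypothetical. The sketch fits a straight line to the observations just below the cutoff and another to those just above it, then reads off the gap between the two lines at the cutoff.

```python
import random


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rdd_jump(running, outcome, cutoff, bandwidth):
    """Gap at the cutoff between lines fit separately on each side."""
    below_x, below_y, above_x, above_y = [], [], [], []
    for r, y in zip(running, outcome):
        d = r - cutoff  # distance from the cutoff
        if -bandwidth <= d < 0:
            below_x.append(d)
            below_y.append(y)
        elif 0 <= d <= bandwidth:
            above_x.append(d)
            above_y.append(y)
    # With distance centered at the cutoff, each intercept is the fitted
    # value of the outcome exactly at the cutoff.
    level_below, _ = fit_line(below_x, below_y)
    level_above, _ = fit_line(above_x, above_y)
    return level_above - level_below


# Simulated data (an assumption, not the study's data): an outcome that
# rises gently with age and drops by TRUE_JUMP at age 65.
rng = random.Random(0)
TRUE_JUMP = -4.0
ages = [rng.uniform(50, 85) for _ in range(20000)]
outcome = [
    40 + 0.3 * (age - 65) + (TRUE_JUMP if age >= 65 else 0.0) + rng.gauss(0, 5)
    for age in ages
]

estimate = rdd_jump(ages, outcome, cutoff=65, bandwidth=5)
print(f"estimated jump at 65: {estimate:.2f} (simulated truth: {TRUE_JUMP})")
```

Because the simulated outcome has a built-in drop at 65, the estimate should land near that value. With real data the true jump is unknown, which is why the comparability of people on either side of the line matters so much.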

Randomized control trials are more precise than RDD, write Villamizar-Villegas, Ruiz-Sanchez, and Pinzon-Puerto. If you compare two studies with equal sample sizes, the controlled trial will be more precise because RDD will effectively analyze only a portion of the data originally sampled.
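A small simulation can illustrate this precision gap. The sketch below is a hedged illustration, not the authors’ calculation: every number in it (sample size, noise level, effect size, the five-year window) is an assumption, and the function names are hypothetical. It draws the same number of observations for a randomized trial and for an RDD, repeats the exercise many times, and compares how much the two estimates vary from one repetition to the next.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def simulate_once(rng, n, effect, cutoff=65.0, bandwidth=5.0):
    """Return (trial estimate, RDD estimate) from one simulated sample of size n."""
    ages = [rng.uniform(50, 85) for _ in range(n)]
    base = [40 + 0.3 * (a - cutoff) + rng.gauss(0, 5) for a in ages]

    # Randomized trial: a coin flip assigns treatment, and all n people are used.
    treated, control = [], []
    for y in base:
        if rng.random() < 0.5:
            treated.append(y + effect)
        else:
            control.append(y)
    trial = sum(treated) / len(treated) - sum(control) / len(control)

    # RDD: age assigns treatment, and only people near the cutoff are used.
    below_x, below_y, above_x, above_y = [], [], [], []
    for a, y in zip(ages, base):
        d = a - cutoff
        if -bandwidth <= d < 0:
            below_x.append(d)
            below_y.append(y)
        elif 0 <= d <= bandwidth:
            above_x.append(d)
            above_y.append(y + effect)
    rdd = fit_line(above_x, above_y)[0] - fit_line(below_x, below_y)[0]
    return trial, rdd


rng = random.Random(42)
EFFECT = -4.0  # assumed true effect in the simulation
trial_estimates, rdd_estimates = [], []
for _ in range(300):
    trial, rdd = simulate_once(rng, n=1000, effect=EFFECT)
    trial_estimates.append(trial)
    rdd_estimates.append(rdd)

trial_spread = statistics.stdev(trial_estimates)
rdd_spread = statistics.stdev(rdd_estimates)
print(f"spread of trial estimates: {trial_spread:.2f}")
print(f"spread of RDD estimates:   {rdd_spread:.2f}")
```

In this setup both approaches center on the assumed effect, but the RDD estimates scatter more widely because each one rests on the slice of the sample that falls inside the window.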

Additionally, some researchers have argued that RDD results are only valid close to the cutoff line. The research about Medicare finds that receiving dental benefits helped people at age 65, for example. It doesn’t establish whether offering the same services helped people of any other age, such as people aged 70 or 80. Villamizar-Villegas, Ruiz-Sanchez, and Pinzon-Puerto argue that RDD results can in many situations be generalized beyond their most technical application, but this needs to be evaluated on a case-by-case basis. Ultimately, policy makers must use discretion when it comes to applying RDD results to a real issue.

The widespread use of RDD would suggest that many researchers are comfortable with the method, despite its limitations. It is being used to look at all kinds of questions, from how COVID lockdowns affected flight cancellations to how being offered credit changes consumers’ savings patterns. In 2007, University of Chicago Harris School of Public Policy’s Jens Ludwig and Cornell’s Douglas L. Miller used RDD to analyze the effects of Head Start, the program that President Lyndon Johnson launched almost 60 years ago to fight child poverty in the United States.

To learn whether intervening early and supporting the formative years of a child’s development leads to long-term benefits later in life, the researchers identified an environment where people had been essentially randomized. In 1965, the Office of Economic Opportunity provided grant-writing assistance and other help to develop Head Start proposals, but only to the 300 poorest counties in the country. Those 300 counties were similar to many other poor counties that didn’t make the cut, and yet they qualified for assistance while the others did not. Thus the researchers were able to use RDD to determine that the assistance led to higher Head Start funding rates, which in turn led to lower child mortality rates and higher educational attainment.

The unanswered questions of RDD

The robustness of RDD, however, depends on some key decisions. In each project, it is up to researchers to make a few crucial choices, one of which has to do with “bandwidth,” or the distance from the cutoff line, establishing the number of data points included in the study. A narrow bandwidth means fewer data, which could increase the variability of the two samples and therefore produce a less accurate analysis. But making the bandwidth wider could create new complications because it could lead to including so many data points that it breaches the assumption that the two samples are comparable.
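The trade-off can be seen in a short simulation. This is a sketch under stated assumptions rather than anyone’s published procedure: the data are simulated with a curved trend, the three bandwidths are picked by hand, and the helper names are hypothetical. It is not a data-driven bandwidth selector; it only shows why the choice matters.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rdd_jump(xs, ys, bandwidth):
    """Gap at a cutoff of 0 between lines fit on each side, within the bandwidth."""
    below = [(x, y) for x, y in zip(xs, ys) if -bandwidth <= x < 0]
    above = [(x, y) for x, y in zip(xs, ys) if 0 <= x <= bandwidth]
    level_below, _ = fit_line(*zip(*below))
    level_above, _ = fit_line(*zip(*above))
    return level_above - level_below


TRUE_JUMP = 2.0  # assumed true effect in the simulation


def simulate(rng, n):
    """Curved trend (3*x**3) plus a jump at 0 plus noise; all values are invented."""
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    ys = [3 * x ** 3 + (TRUE_JUMP if x >= 0 else 0.0) + rng.gauss(0, 1) for x in xs]
    return xs, ys


rng = random.Random(1)
bandwidths = [0.1, 0.3, 1.0]
estimates = {h: [] for h in bandwidths}
for _ in range(200):
    xs, ys = simulate(rng, 2000)
    for h in bandwidths:
        estimates[h].append(rdd_jump(xs, ys, h))

summary = {}
for h in bandwidths:
    bias = statistics.mean(estimates[h]) - TRUE_JUMP
    spread = statistics.stdev(estimates[h])
    summary[h] = (bias, spread)
    print(f"bandwidth {h}: average error {bias:+.2f}, spread {spread:.2f}")
```

In this simulated setting the narrowest window gives estimates that are right on average but noisy, while the widest window gives stable estimates that are pulled off target, because a straight line is a poor description of points far from the cutoff.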

In the case of the Head Start study, the cutoff line ran between the 300th-poorest county and the 301st-poorest county. The researchers had a decision to make: How many counties should be included on either side of the line? One? Ten? Three hundred? What was important is that the counties included be substantially similar. If their study ended up comparing poor counties to richer ones, the analysis would no longer be sound.

For years, researchers established these parameters themselves, but ad-hoc decision-making introduces subjectivity into the process. In a series of papers, the first of which was published in 2014, Calonico, Cattaneo, Farrell, and Titiunik standardized this process to eliminate guesswork, fleshing out a standard, data-driven approach to derive the optimal bandwidth.

“This is making it more universal and transparent,” says Farrell. “You can trust what comes out of the analysis.”

A second crucial methodological decision has to do with covariates, or attributes that might influence the target outcome. The education level of parents often affects a child’s educational attainment, so that’s a covariate to note when assessing the success of Head Start, to continue the example.

In a typical regression analysis, it’s accepted practice for researchers to identify and consider these covariates. But should they do the same when running a regression discontinuity analysis? Without a formal approach, researchers again made their own call. But Calonico, Cattaneo, Farrell, and Titiunik addressed this too. Their research provides theoretical grounding for considering covariates in RDD analysis through “rigorous, applicable, transparent tools for research,” Farrell says. And their study finds that including covariates will largely improve precision just as it would in a typical regression.

The researchers also find, however, that there are certain covariates that should not be included, namely those that are directly affected by the treatment itself. For example, a researcher assessing Head Start wouldn’t want to separately consider a child’s educational performance because that, of course, would be influenced by participation in the programming.

To demonstrate how the method could be used, Calonico, Cattaneo, Farrell, and Titiunik revisited Ludwig and Miller’s work on Head Start, using the same data but designing an experiment according to their methods. They find that using covariates improved the precision of the regressions by up to 10 percent. They also demonstrate how various parts of RDD, including setting the bandwidth, should be done differently when covariates are used.
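The precision gain from covariates can be sketched with simulated data. The example below is an illustration under assumptions, not a reproduction of the Head Start reanalysis and not the researchers’ software: the covariate, its strength, and the bandwidth are invented, the covariate is determined before treatment, and it enters the regression as a simple additive term. The size of the gain here reflects those invented numbers and should not be compared with the figure reported above.

```python
import random
import statistics


def ols(rows, ys):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [
        [sum(row[i] * row[j] for row in rows) for j in range(k)]
        + [sum(row[i] * y for row, y in zip(rows, ys))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(a[idx][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for other in range(k):
            if other != col:
                factor = a[other][col] / a[col][col]
                for c in range(col, k + 1):
                    a[other][c] -= factor * a[col][c]
    return [a[i][k] / a[i][i] for i in range(k)]


def rdd_jump(xs, ys, zs, bandwidth, use_covariate):
    """Jump at a cutoff of 0 from one regression with a separate slope per side."""
    rows, targets = [], []
    for x, y, z in zip(xs, ys, zs):
        if abs(x) <= bandwidth:
            t = 1.0 if x >= 0 else 0.0
            row = [1.0, t, x, t * x]
            if use_covariate:
                row.append(z)  # pre-treatment covariate, added as one extra term
            rows.append(row)
            targets.append(y)
    return ols(rows, targets)[1]  # coefficient on the treatment indicator


TRUE_JUMP = 2.0  # assumed true effect in the simulation


def simulate(rng, n):
    """Outcome depends on the running variable, the jump, and a covariate z."""
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    zs = [rng.gauss(0, 1) for _ in range(n)]
    ys = [
        1 + 0.5 * x + (TRUE_JUMP if x >= 0 else 0.0) + 2 * z + rng.gauss(0, 1)
        for x, z in zip(xs, zs)
    ]
    return xs, ys, zs


rng = random.Random(7)
plain, adjusted = [], []
for _ in range(150):
    xs, ys, zs = simulate(rng, 1000)
    plain.append(rdd_jump(xs, ys, zs, bandwidth=0.5, use_covariate=False))
    adjusted.append(rdd_jump(xs, ys, zs, bandwidth=0.5, use_covariate=True))

plain_spread = statistics.stdev(plain)
adjusted_spread = statistics.stdev(adjusted)
print(f"spread without covariate: {plain_spread:.2f}")
print(f"spread with covariate:    {adjusted_spread:.2f}")
```

Both versions aim at the same jump. The adjusted one varies less across repetitions because the covariate soaks up variation in the outcome that has nothing to do with the treatment.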

RDD in practice

The researchers have coded up their work and put it online, making it far easier to use their method. Researchers no longer have to set parameters or hire a programmer and work through bugs. They can simply download the software, feed it data, and generate scientifically sound and precise answers. This software has become the standard in academia, used in thousands of analyses. “It is hard to overstate the impact this work has had on empirical research in the past decade,” says Farrell.

Boston College’s Samuel Hartzmark and Booth’s Abigail Sussman, in a paper published in 2019, used RDD to determine that investors value sustainability. In 2016, Morningstar began scoring funds on a 100-point scale according to their sustainability, then divided the funds into five groups marked by visually salient globe icons. The top 10 percent of funds in terms of sustainability were put in the five-globe group, and the bottom 10 percent in the one-globe group.

The researchers looked at what happened when the globe rating system was introduced and used RDD to identify discontinuities at the cutoff lines between groups. Funds that are close to a cutoff line can have similar sustainability scores but be placed in different groups, and the researchers used this to demonstrate that investors react more strongly to the easy-to-grasp globe icons than to the scores underlying them.

“Prior to the rating publication, funds received similar levels of flows. After the publication, funds rated highest in terms of sustainability experienced substantial inflows of roughly 4 percent of fund size over the next 11 months. In contrast, funds rated lowest in sustainability experienced outflows of about 6 percent of fund size,” write Hartzmark and Sussman.

RDD informed research by Purdue’s Jillian B. Carr and Vanderbilt’s Analisa Packham, who find that food-purchasing assistance payments from the Supplemental Nutrition Assistance Program affected the timing and level of crimes committed in Illinois and Indiana. About the software, “we didn’t use it for the main specification, but we did use it for some robustness checks and for all bandwidth selection,” says Carr.

Granted, not every situation calls for RDD. At many businesses, experimentation has gotten so easy that there has to be a reason, such as cost or risk, to use something like RDD, says University of Chicago’s John A. List. And the software isn’t appropriate for every RDD project.

Villamizar-Villegas, who helped document the two waves of RDD growth, says, “I think that for us it was easier to pinpoint ‘formalization waves’ when the method was still in the beginning stages of development.” He’s unconvinced that a third wave is underway, but he wouldn’t rule it out.

For now, the use of the RDD software can be tracked through a growing list of academic citations. Simon, coauthor of the research on dental coverage, says, “I was familiar with the method previously, but read Calonico et al.’s papers extensively to better understand what the methodology is and why using their packages was the most rigorous approach to my research question.” Her research findings haven’t succeeded in moving the Medicare-coverage debate forward—but that may be because there’s not yet an equivalent, scientifically advanced tool that can be used to find answers hidden in politics.

Works Cited

Sebastian Calonico, Matias D. Cattaneo, and Rocío Titiunik, “Robust Data-Driven Inference in the Regression-Discontinuity Design,” Stata Journal, December 2014.
Sebastian Calonico, Matias D. Cattaneo, and Max Farrell, “Optimal Bandwidth Choice for Robust Bias-Corrected Inference in Regression Discontinuity Designs,” Econometrics Journal, May 2020.
Sebastian Calonico, Matias D. Cattaneo, Max Farrell, and Rocío Titiunik, “rdrobust: Software for Regression Discontinuity Designs,” Stata Journal, June 2017.
———, “Regression Discontinuity Designs Using Covariates,” Review of Economics and Statistics, July 2019.
Jillian B. Carr and Analisa Packham, “SNAP Benefits and Crime: Evidence from Changing Disbursement Schedules,” Review of Economics and Statistics, May 2019.
Samuel Hartzmark and Abigail Sussman, “Do Investors Value Sustainability? A Natural Experiment Examining Ranking and Fund Flows,” Journal of Finance, August 2019.
Jens Ludwig and Douglas L. Miller, “Does Head Start Improve Children’s Life Chances? Evidence from a Regression Discontinuity Design,” Quarterly Journal of Economics, February 2007.
Lisa Simon, Zirui Song, and Michael L. Barnett, “Dental Services Use: Medicare Beneficiaries Experience Immediate and Long-Term Reductions after Enrollment,” Health Affairs, February 2023.
Donald L. Thistlethwaite and Donald T. Campbell, “Regression-Discontinuity Analysis: An Alternative to the Ex Post Facto Experiment,” Journal of Educational Psychology, 1960.
Mauricio Villamizar-Villegas, Freddy A. Pinzon-Puerto, and Maria Alejandra Ruiz-Sanchez, “A Comprehensive History of Regression Discontinuity Designs: An Empirical Survey of the Last 60 Years,” Journal of Economic Surveys, August 2021.
