One of Justice Ketanji Brown Jackson’s most defining attributes on the US Supreme Court, particularly so early in her tenure, is her identity as a Black woman—the first to have climbed to such towering heights in the US justice system. But she also participated in a selection process that was unusual, enough so to have attracted attention from seasoned Court watchers. For decades, presidents only considered people who had risen up through the federal court system, pointed out journalist Jeff Greenfield at Politico. President Joe Biden, in fulfilling his intention to appoint a Black woman, deviated from that standard and considered a state-court justice before ultimately picking Jackson, an alumna of both the federal bench and (as is true for almost all of the justices) an Ivy League school.

Professional and educational filters can be helpful in identifying candidates who are well prepared for a role. But they’re also a barrier, easier for certain candidates to pass through than others. If students from advantaged backgrounds have more opportunities to shine in applications to colleges and specifically Ivy League schools, and then as graduates are privileged in job searches, the playing field is tilted in their favor. Some allege this is exactly what is happening. In July 2023, on the heels of a Supreme Court decision gutting affirmative action at US universities, a decision from which Jackson dissented, several minority advocacy groups filed a civil-rights complaint with the US Department of Education alleging that Harvard favors “overwhelmingly white applicants” who take advantage of the school’s “legacy and donor preferences” in undergraduate admissions. A Harvard spokesperson says the school is “in the process of reviewing aspects of our admissions policies to assure compliance with the law and carry forward Harvard’s longstanding commitment to welcoming students of extraordinary talent and promise who come from a wide range of backgrounds, perspectives, and life experiences.”

The layered relationship over time between identity and opportunity makes up the infrastructure of systemic discrimination, a phenomenon that social scientists have studied since the 1950s and that is increasingly acknowledged across American society, despite resistance from the Right. But in economics, practitioners have traditionally studied only direct discrimination, with projects that have a narrower scope. Take, for example, a study from three Federal Reserve economists—Neil Bhutta, Aurel Hizmo, and Daniel Ringo—that analyzes the extent to which lenders provided differential treatment by race, illegal under US fair-lending laws, in 2018 and 2019. The study establishes a steep decline in racial discrimination in mortgage issuance, as compared with research findings from a study of home loan applications in 1990, which is encouraging. But the researchers in both studies controlled for factors such as applicants’ credit scores and leverage, a standard economic approach but one that drew ridicule from journalist Michael Hobbes, who tweeted, “Yes[,] once you remove the influence of all of the other racist systems, racism doesn’t exist.”

The thinking among economists about how to account for such factors may be changing, however. University of Pennsylvania’s J. Aislinn Bohren, Brown’s Peter Hull, and Chicago Booth’s Alex Imas are among the economists who are proposing new approaches to measuring discrimination that take systemic factors into account. They are looking at the mechanisms by which historical discrimination continues to create unequal outcomes while also acknowledging the limits of economists’ traditional measurement tools and extending the tape measure—rethinking their models so that quantitative data can better illuminate whether the American dream is available to all. The research by Bohren, Hull, and Imas indicates that traditional estimates can undercount discrimination, and not by just a little: they sometimes miss the majority of the total.

The actions of prejudiced individuals

The standard economic approach has been critiqued by sociologists including Mario Small at Columbia and the late Devah Pager, who taught at Harvard. Their 2020 paper advising economists on how to improve their study of discrimination using techniques from sociology suggests that a 70-year focus on why people discriminate has led economics to focus too much on individual gatekeepers and ignore norms and practices that transcend them. According to the paper, “Limiting the study of discrimination to the actions of potentially prejudiced individuals dramatically understates the extent to which people experience discrimination [and] . . . to which discrimination may account for social inequality.”

Economists began studying direct discrimination as early as the 1950s, led by the late Gary S. Becker, a professor at Booth. In examining the racial wage gap, Becker gave the field tools for measuring its financial impact not just on those discriminated against but on the people and institutions who discriminate—and on the economy as a whole.

In the early 1970s, Edmund S. Phelps at Columbia and Stanford’s Kenneth J. Arrow, working independently, moved the conversation beyond discrimination stemming from what Becker called “taste”—or a dislike of people from other identity groups. They began interrogating the degree to which people from certain groups are prepared for, or perceived to be prepared for, particular activities. This is what Phelps called statistical discrimination, since it is based on statistical differences between groups’ preparedness in the past, or at least on what one person believes those differences to have been. Not hiring a young woman as your deputy because you prefer to work with young men would be taste-based discrimination; not hiring her because you think she is more likely than a man to leave after a few years to start a family would be statistical discrimination, since historically (or “statistically”) women take career breaks to raise children more than men do.

The idea is that decision-makers may have no personal amity or animus toward any one group but use identity markers such as race or religion to predict an individual’s suitability for a job, a school, or other opportunities. “In the absence of perfect individual information regarding job applicants’ labor performance, group information (e.g., ethnicity, gender, age) is considered to be a cheap source of information to infer individual productivity,” writes Radboud University postdoctoral researcher Lex Thijssen in a review for the EU Commission of frameworks for understanding racism’s impact on labor markets.

For example, an employer trying to fill a marketing position might veer toward younger job applicants on the belief that they are more fluent in social media than baby boomers. Thus, the hiring manager is able to save time by narrowing in on one group, using what could be accurate information about group attributes, if not the individuals concerned. Of course, the underlying assumption used in some decisions could be misguided, based on stereotypes, or applied selectively. Thijssen, in an email, illustrates where statistical discrimination can go awry: “Given the overrepresentation of men in crime rates, no employer should hire men anymore,” he points out.
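The logic of statistical discrimination can be sketched with a standard textbook signal-extraction formula. This is an illustration under stated assumptions, not a model taken from the research discussed here: the function name, the normal-distribution setup, and every number below are hypothetical. The employer blends what it believes about a group's average with a noisy individual signal (say, a test score), leaning more on the group belief when the individual signal is less reliable.

```python
def expected_productivity(group_mean, signal, var_productivity, var_noise):
    """Employer's best guess of a worker's productivity (hypothetical setup).

    Assumes productivity is normally distributed around `group_mean` and the
    individual signal equals true productivity plus independent normal noise.
    """
    # Reliability of the individual signal: 1.0 means perfectly informative.
    reliability = var_productivity / (var_productivity + var_noise)
    # Weighted average of the group belief and the individual signal.
    return (1 - reliability) * group_mean + reliability * signal


# Two applicants with the SAME individual signal but different believed group means.
# All numbers are made up for illustration.
same_signal = 70.0
estimate_a = expected_productivity(60.0, same_signal, var_productivity=100.0, var_noise=100.0)
estimate_b = expected_productivity(50.0, same_signal, var_productivity=100.0, var_noise=100.0)

print(f"Group A applicant: {estimate_a:.1f}")
print(f"Group B applicant: {estimate_b:.1f}")
```

In this sketch the two applicants are rated differently despite identical individual evidence, purely because of what the employer believes about their groups. If the believed group means are wrong, the same arithmetic yields the "inaccurate" variety of statistical discrimination.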

This line of investigation has continued and deepened. Bohren, University of California at Los Angeles’s Kareem Haggag (a graduate of Booth’s PhD Program), Imas, and Booth’s Devin G. Pope distinguish between statistical discrimination that is “accurate”—as in, the generational assumptions about social media fluency that the hypothetical hiring manager in the previous example has made—and “inaccurate” statistical discrimination, such as the lower salaries paid to women on the faulty belief that they are less competent at certain jobs.

They reviewed three decades of economic research and find that hiring managers’ misconceptions of how well women, Black people, or immigrants can complete a task may be incorrectly interpreted by researchers as intractable sexism, racism, or xenophobia.

Of course, such narrow definitions of ingrained bias might be just the sort of thing that frustrates noneconomists. Is a landlord xenophobic if he avoids renting to South Asians—but not xenophobic if his decision is due to a mistaken assumption that the smell of Indian food will linger in his apartments? Does ignorance excuse discrimination? For economists, these distinctions are useful not to give people doing the discriminating the benefit of the doubt, but to help identify useful tools to reduce disparities: “If discrimination stems from inaccurate beliefs, an effective policy response could be to provide individuals with information about the correct distributions,” write Bohren, Haggag, Imas, and Pope.

Figure: The gender pay gap in two steps. In a lab experiment simulating the labor market, the researchers demonstrated how an initial bias against women can have a persistent effect.

Evidence of systemic factors

But providing people with accurate information may not stop them from discriminating. And beyond that, as sociologists have noted, to think of discrimination as a series of discrete episodes is to ignore its diffuse nature. A lot happens before people apply for an apartment or for a job.

Booth’s Erik Hurst, London School of Economics’ Yona Rubinstein, and MIT PhD student Kazuatsu Shimizu wanted to understand the persistent wage gap between Black and white men in the United States—since it can’t be explained by either taste-based discrimination, which surveys demonstrate has decreased over time, or statistical discrimination, which should also be diminishing as Black men’s and white men’s education and work experiences have converged. The researchers looked at jobs according to the skills they require, ranging from analytical to manual to interpersonal. “At the heart of our framework is the idea that the size and nature of racial discrimination faced by Black workers varies by the task requirements of each job,” the researchers write.

“Unless you’re going to make the argument that people are innately different, there has to be something that is causing different outcomes between Group A and Group B,” Hurst says. “The question is where it lies.”

They find telling trends in two areas with traditionally high barriers to entry for Black men: positions that required significant contact with others (think sales clerk) and those that required abstract thinking (say, software developer). Between 1960 and 1980, Hurst, Rubinstein, and Shimizu find, jobs that required strong interpersonal skills were taken up at increasingly similar rates by Black and white men, whereas jobs whose main tasks required complex analytical skills remained as dominated as ever by white men. Over those two decades, this two-paced development did not keep the wage gap from shrinking. But starting in the 1980s, as changes in the wider economy more richly rewarded jobs heavy on analytical tasks, the fact that Black men still held fewer of these began having a more noticeable impact on wage differentials overall.

The researchers looked back at skills surveys of teenagers to see if those explain why Black men might not have been getting jobs heavy on analytical tasks. Though they could not draw a direct line between education and hiring, their sources pointed to Black teenagers lacking the skills that would, come the 1980s, help land them a higher-paying job. This skill deficit and overall discrimination each account for about half of the racial gap in these jobs, the study concludes.

The study starts to unpack biased outcomes. The same is true of work by Harvard’s Claudia Goldin, whose research has, among other things, identified some of the decisions and forces that lead to lower pay for women than men. But economists’ go-to tools detect discrimination added by a single, biased individual (or individuals) at a particular point in the process, not discrimination that’s baked into a system.

For instance, audit studies involve using actors or fabricated résumés to test whether real-life employers discriminate among candidates who boast nearly identical qualifications but differ in terms of gender, race, age, or other personal attributes. One of the earliest and best-known of these experiments, conducted in 2001 and 2002 by Booth’s Marianne Bertrand and Sendhil Mullainathan, finds that résumés topped by names frequently given to white newborns in the late 1970s elicited interest from employers at a rate 50 percent higher than the same résumés with only the names changed—to those popular with Black parents in the same period.

This picks up direct discrimination in a laboratory-like setting. The trouble is that this setting assumes that achieving near-identical qualifications, regardless of gender or race, is always possible. The lack of tools to measure the impact of sexism or racism over time, or across different contemporaneous domains, can minimize discrimination’s role in labor markets and beyond, says Imas.

A multiplier effect

Let’s consider an automaker looking to add engineers to its team. The human-resources department outsources the first stages of hiring to an external recruiter who reviews candidates’ applications and suggests four people for the positions—two men and two women. The recruiter shares with the company’s HR director his predictions of how productive each candidate will be on the job. He is honest about the two women’s expected productivity, but because of his own biases, he inflates the two men’s.

The hiring manager uses these signals to offer pay packages she thinks reflect how much the candidates will contribute to the company’s success. If all the candidates are equally qualified but the sexist recruiter’s advice is taken seriously, the hiring manager will offer higher salaries to the male candidates—even if she believes an engineer’s gender is irrelevant, and even if she does not know each candidate’s gender.

The recruiter is engaged in direct discrimination, explain Bohren, Hull, and Imas, who together use a barer-bones version of this hypothetical scenario to propose new models and measurement tools for capturing discrimination. The HR director, on the other hand, and the company that employs her, have engaged in—however unintentionally—indirect discrimination, or systemic discrimination. Compound the two, say the researchers, and you’ve got a more objective, apolitical, and complete accounting of the tide of bias against which so many individuals need to swim.

To do that fuller accounting, the researchers first broke down total discrimination into direct and systemic components. They defined direct discrimination as explicitly differential treatment (inflating a male engineer’s likely productivity), and systemic discrimination as the result of acting on biased information (paying male engineers a higher rate than equally qualified women because of their inflated reputation).

The researchers then created three hiring experiments. They randomized about 800 study participants into the roles of worker, recruiter, or hiring manager. The workers self-reported their gender identities and completed two tasks. Recruiters, given access to both the gender data and the results of one of the tasks, were asked to state the highest salary they would be willing to pay the workers—and to select the two workers they would most likely hire.

In the first experiment, recruiters suggested wages for male candidates that were nearly 50 percent higher, on average, than those suggested for female workers. Because the researchers did not find any gender difference in performance (using the results of the tasks), the disparity suggests direct discrimination.

As for hiring managers, they offered wages to male workers that were 94 percent higher than—or nearly double—those they extended to female workers. Some of this difference could be traced to direct discrimination, but a greater portion was the result of the hiring managers seeing and reacting to the discriminatory performance signals generated by the recruiters. When the researchers controlled for the recruiters’ biases, they find that hiring managers made salary offers to men that were 41 percent higher, on average, than those they made to equally qualified women.

Standard economic measures of discrimination narrow in on the moment that identity dictates a person’s remunerative worth, but they would have missed this systemic discrimination from the hiring managers—which represents the majority of the discrimination in this experiment (about 56 percent: the 53-point difference between the 94 percent wage gap and the 41 percent wage gap, taken as a share of the 94 percent total).

The experiment suggests an HR department operating in a vacuum would pay male employees $1.41 for every dollar their female equals earn. And when there’s a longer history of bias, as represented in the setup by the additional involvement of a recruiter, male workers would earn $1.94 for every dollar their female counterparts take home.
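The arithmetic behind these figures can be checked in a few lines. The sketch below uses only the two wage gaps reported above (94 percent and 41 percent); the function and variable names are illustrative, and the simple subtraction is a reading of the article's description rather than the researchers' own estimation procedure.

```python
def decompose_gap(total_gap, direct_gap):
    """Split a total pay gap into direct and systemic parts.

    Both arguments are fractions (0.94 means men are offered 94 percent more).
    """
    systemic_gap = total_gap - direct_gap      # what remains once direct bias is removed
    systemic_share = systemic_gap / total_gap  # systemic part as a share of the total
    return systemic_gap, systemic_share


# Figures reported in the article for the first hiring experiment.
TOTAL_GAP = 0.94   # managers' offers when they see recruiters' biased signals
DIRECT_GAP = 0.41  # managers' offers after controlling for recruiters' biases

systemic_gap, systemic_share = decompose_gap(TOTAL_GAP, DIRECT_GAP)

print(f"Systemic gap: {systemic_gap * 100:.0f} percentage points")
print(f"Systemic share of total: {systemic_share * 100:.0f} percent")
print(f"Offer to men per $1 offered to women, direct only: ${1 + DIRECT_GAP:.2f}")
print(f"Offer to men per $1 offered to women, total: ${1 + TOTAL_GAP:.2f}")
```

Run as written, this gives a systemic gap of 53 percentage points, or roughly 56 percent of the total, alongside the $1.41 and $1.94 figures.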

The system’s invisible hurdles

Measuring systemic discrimination enables economists to establish who is most hurt by it. In this case, the research finds that well-qualified women can be hit harder than poorly qualified peers by a system that puts up hurdles they can’t see.

In the second hiring experiment, candidates were given a performance task—a short test of math skills and knowledge of business and history. But hiring managers only learned about the performance of workers whom the recruiters recommended, and recruiters recommended male workers at nearly twice the rate they did equally qualified female workers—64 percent versus 36 percent. Managers then hired one in three male candidates, compared with one in five women, without even knowing the performance results of many highly qualified female candidates.

The gender difference in hiring rates for individuals who scored a 2 out of 6 on the performance task was narrow, with poor-performing men and women alike having a less than one in five chance of being hired. A woman who scored a 6 on the task saw her chances of being hired rise to one in four. But a man who scored a 6 had closer to a 50 percent chance of getting the job.

“It’s not that the best-qualified women out there are more discriminated against directly,” says Imas. “It’s that they have the most to lose. They would thrive if systemic discrimination didn’t exist—but because of it, they don’t get the chance.”

This brings him back to what, in part, inspired the search for new measurement tools. In an online experiment conducted eight years ago, Bohren, Imas, and Michael Rosenberg observed that high-performing women participants were no longer discriminated against—and even benefited from reverse sexism—if they were given a chance to prove themselves (in this case, by solving math problems posted by members of an online community of more than 350,000 users), and thus build their reputations.

“We’ve got this leaky pipeline problem where well-qualified women keep dropping out or being held back,” says Imas. The newer research highlights that even though there are situations in which women might be favored—because of gender quotas, or simply by employers who make a concerted effort to hire qualified women—it’s not enough to make up for the higher hurdles they have to clear along the way.

Figure: Separating racial bias from systemic discrimination. The researchers used an “iterated audit” approach to measure how much of the difference in outcomes between a Black worker and a white worker was due to previous instances of discrimination carrying over to the present.

A new tool for economists

Those hurdles are why, according to Imas, policies ignoring race and gender may not create a more equitable system.

Think back to the finding about the declining rate of discrimination against Black applicants by lenders. Black applicants tend to have poorer credit and higher leverage, two things that were controlled for in the experiment—much to Michael Hobbes’s disdain—and a sociologist might link those factors to others, such as years of living with small slights and negative stereotypes that might steer them toward predatory loans. The interest rates Black borrowers pay on their car loans tend to be higher than those paid by non-Hispanic white borrowers; predatory lenders consistently target Black neighborhoods; and Black college graduates are more likely to funnel income back to their families than white graduates, who are more likely to be receiving money from their parents, which they can use to pay off debt and build wealth.

In this picture, US history and systems make it impossible to achieve a situation where everyone arrives at a critical moment of measurement on equal footing. The first two hiring experiments from Bohren, Imas, and Hull demonstrated that the sociologists have a point.

In their final hiring experiment, the researchers presented a formal process for distinguishing between and separately measuring direct and systemic discrimination. They developed an experimental two-step approach they call an iterated audit and demonstrated it by recruiting real-life hiring managers as participants and giving them the résumés of three groups of candidates. The first group had résumés with white-sounding names and strong work-experience sections. The second had Black-sounding names and equally strong work-experience sections. Comparing how managers react to these two worker profiles is a classic audit study that measures direct discrimination. “Economists are looking for the causal effect of race, so they hold as much as they can constant,” says Imas.

But the third set of candidates had résumés with Black-sounding names and poorer work-experience sections, reflecting what seminal research by the late Pager reveals: unequal access to the labor market means Black job applicants have fewer examples of relevant work experience on their résumés, on average, than white applicants. Importantly, the differences in work experience reflect the racial disparities in access to entry-level jobs among equally qualified candidates, as documented by Pager in her famous experimental study. Explains Imas, “The iterated audit points out when your control variables have discrimination baked in.”
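One simple way to read the three résumé groups is as a two-step subtraction. The sketch below is a hedged illustration, not the researchers' estimator: the function name is hypothetical, and the callback rates are invented placeholders rather than results from the study.

```python
def iterated_audit(rate_white_strong, rate_black_strong, rate_black_weaker):
    """Decompose a hiring gap using callback rates for three résumé groups.

    rate_white_strong: white-sounding names, strong work experience
    rate_black_strong: Black-sounding names, equally strong work experience
    rate_black_weaker: Black-sounding names, experience reduced to reflect
                       earlier discrimination in access to entry-level jobs
    """
    # Step 1 (classic audit): same résumé, different name.
    direct = rate_white_strong - rate_black_strong
    # Step 2 (iterated): let the résumé itself carry past discrimination.
    total = rate_white_strong - rate_black_weaker
    # Whatever the classic audit misses is attributed to systemic factors.
    systemic = total - direct
    return {"direct": direct, "systemic": systemic, "total": total}


# Hypothetical callback rates, chosen only to illustrate the calculation.
result = iterated_audit(rate_white_strong=0.30,
                        rate_black_strong=0.28,
                        rate_black_weaker=0.18)

for component, gap in result.items():
    print(f"{component}: {gap * 100:.0f} percentage points")
```

With these made-up inputs, a classic audit alone would report a small gap, while the iterated comparison shows most of the total arriving through the work-experience section that a conventional study would hold constant.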

The hiring managers in Bohren, Imas, and Hull’s experiment did not discriminate much against Black workers with the same level of experience as white workers. But they hired Black candidates with less work experience at significantly lower rates. This was despite the researchers explaining to the hiring managers that the work experience varied because of discrimination in the past. “These findings highlight the difficulty of using a similar tactic to mitigate total discrimination when it is caused by systemic factors,” they write.

Psychological frictions such as outcome bias—the tendency to attribute differences in relevant outcomes (such as work experience) to differences in personality factors (such as effort or ability) without taking the source of the differences into account—can blunt the effectiveness of information in counteracting systemic discrimination. The researchers conducted two other experiments, one of which highlights the scope of this challenge: participants were unable to counteract systemic discrimination even in a setting where they were provided information about the source of the bias and explicitly incentivized to identify productivity.

Imas points out that a multistep iterated audit can be used beyond recruitment and hiring. The approach can be applied to any setting where performance indicators such as standardized test scores, letters of recommendation, or criminal records derive from circumstances that themselves are defined by discrimination—in healthcare, criminal justice, education, and more.

Those performance indicators may be misleading, and not simply because a sexist recruiter passed along biased information, as in Bohren, Hull, and Imas’s first two experiments. Consider short, unpaid internships: when they appear on a job applicant’s résumé, they might suggest to companies that the candidate is more prepared than a rival for the position who has no internships to boast of. In fact, because they are so short, these stints say little about a candidate’s relative experience. And because they are easier to secure for children of professionals than for children of working-class parents, and working for free is more manageable for the former group, they also tell us little about a candidate’s relative skills or drive. The inequity around internships thus feeds systemic discrimination in hiring at the entry level.

Ohio State’s Trevon Logan suggests that the research from Bohren, Hull, and Imas will allow economists to ask and answer more questions about how inequity developed. For example, he points to 2023 research by Hull and a different group of coresearchers that applies the framework in a study of foster-care placements. The researchers find that between 2008 and 2017, Black children were 50 percent more likely than white children to be placed in foster care, a disparity that they write was “entirely driven by white investigators and by cases where maltreatment potential is present.” But University of Pennsylvania’s Dorothy E. Roberts, a legal scholar, argues in her books Torn Apart and Shattered Bonds that foster care was created to be a “family policing system” that punishes Black families.

“If it’s actually a family policing system, they’re finding the system is actually working as it was designed,” says Logan. “So my mind is now moving beyond discrimination and its effects to: Are these systems actually successful? That would motivate me to think about how we construct systems rooted in racial inequality.”

The race-based college-admissions process struck down by the Supreme Court represents one such system being widely discussed. The plaintiffs in the lawsuits that led to the ruling successfully argued that Harvard College and the University of North Carolina, by using race-conscious affirmative action, had discriminated against high-achieving Asian American applicants, including first-generation Americans. But some critics of the decision lament that doing away with affirmative action will create an additional hurdle for Black applicants to cross, especially if legacy and sports admissions are shown to be privileging white applicants. The fight has pitted disadvantaged groups against each other and raised the question of whether it’s possible to design a system of affirmative action that actually reverses systemic racism—a question that iterated audits could perhaps help answer. “Our tool allows you to put numbers on things,” observes Imas. And as the approach becomes more widely used, he predicts, those numbers will be harder to ignore.

Additional reporting by Alysson Reedy
