Paper Algorithmic Behavioral Science: Machine Learning as a Tool for Scientific Discovery
While hypothesis testing is a highly formalized activity, hypothesis generation remains largely informal. We propose a procedure that uses machine learning algorithms, and their capacity to notice patterns that people might not, to generate novel hypotheses about human behavior. We illustrate the procedure with a concrete empirical application: pre-trial decisions by judges. We begin with a striking fact. An algorithmic model reveals that a single factor explains nearly half of the predictable variation in who judges choose to jail: the pixels in the defendant’s mugshot. The mugshot remains highly predictive even after controlling for race, skin color, demographics, and facial features previously emphasized by psychologists. Moreover, human judgments about who will be jailed, based on the mugshots, do significantly worse than the algorithm’s. What, then, has the algorithm discovered about who judges choose to jail? To answer this question, we build a communication procedure that allows people to see what the algorithm “sees.” We find that subjects using the procedure appear to understand the algorithm: they are able to articulate facial features that turn out to explain the algorithm’s predictions. These novel features also explain the actual choices judges make: defendants with these features are jailed at significantly higher rates. Though our results are specific, our approach is general. The modern world produces troves of high-dimensional data, e.g., from cell phones, satellites, online behavior, news headlines, corporate filings, and high-frequency time series. Our framework provides a way to produce novel interpretable hypotheses from high-dimensional data such as these, a way to marry the predictive power of machine learning with human intuition. A central tenet of our paper is that hypothesis generation is in and of itself a valuable activity, and we hope this encourages future work in this “pre-scientific” stage of science.
- Authored by
- 2022
- CAAI - Behavioral Science