Demand Estimation with Text and Image Data
This study proposes a demand estimation approach that leverages unstructured data, specifically product images and textual descriptions, to infer consumer substitution patterns. Using pre-trained deep learning models, the method extracts numerical embeddings from these raw inputs and incorporates them into a mixed logit demand framework. The approach is particularly valuable in settings where structured product attribute data are unavailable or where consumers respond to hard-to-quantify characteristics such as visual design. The method is validated through a choice experiment in which it substantially outperforms conventional attribute-based demand models in predicting consumers' second choices, a direct test of substitution behavior. The approach is further applied across 40 product categories on a major e-commerce platform, consistently finding that unstructured data carry meaningful information about how consumers substitute across products. Together, these findings suggest that embeddings derived from product images and descriptions can serve as effective proxies for the dimensions of differentiation that drive consumer choice, even in the absence of traditional attribute data. The proposed framework offers a practical and scalable tool for researchers and practitioners seeking richer demand estimates in data-rich but attribute-sparse environments.
- 2026
- CAAI - Finance
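
The pipeline summarized in the abstract (embeddings feeding a mixed logit model, evaluated on second choices) can be illustrated with a minimal sketch. This is not the paper's implementation: the embeddings are synthetic random placeholders rather than outputs of a pre-trained model, the reduction to principal components is an assumed dimensionality step, and the function names `pca_components`, `simulate_probs`, and `second_choice_probs` are hypothetical. The sketch only shows how random coefficients on embedding-derived characteristics generate second-choice probabilities that depart from the plain logit pattern.

```python
import numpy as np

def pca_components(embeddings, k):
    """Project centered embeddings onto their first k principal components."""
    centered = embeddings - embeddings.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T  # shape (J, k)

def simulate_probs(delta, X, sigma, n_draws=2000, seed=0):
    """Per-draw mixed logit choice probabilities.

    delta : (J,) mean utilities
    X     : (J, K) product characteristics (here: embedding components)
    sigma : (K,) standard deviations of the random coefficients
    Returns an (n_draws, J + 1) array; column 0 is the outside option.
    """
    rng = np.random.default_rng(seed)
    nu = rng.standard_normal((n_draws, X.shape[1]))  # taste draws
    u = delta + (nu * sigma) @ X.T                   # (n_draws, J)
    u = np.column_stack([np.zeros(n_draws), u])      # outside utility = 0
    u -= u.max(axis=1, keepdims=True)                # numerical stability
    p = np.exp(u)
    return p / p.sum(axis=1, keepdims=True)

def second_choice_probs(p):
    """D[j, k] = P(second choice is k | first choice is j)."""
    n = p.shape[1]
    D = np.zeros((n, n))
    for j in range(n):
        cond = p / (1.0 - p[:, [j]])  # choice probabilities with j removed
        cond[:, j] = 0.0
        D[j] = (p[:, [j]] * cond).mean(axis=0) / p[:, j].mean()
    return D

# Synthetic stand-in for image/text embeddings: 6 products, 16 dimensions.
emb = np.random.default_rng(42).standard_normal((6, 16))
X = pca_components(emb, k=2)
delta = np.zeros(6)

p = simulate_probs(delta, X, sigma=np.array([0.5, 0.5]))
D = second_choice_probs(p)
```

Setting `sigma` to zero collapses the model to plain logit, where the second-choice probability from product j to product k is simply proportional to k's share. With positive `sigma`, products that sit close together in the embedding space draw more of each other's second choices, which is the kind of substitution pattern the second-choice validation is designed to test.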