Faculty & Research

Dacheng Xiu

Associate Professor of Econometrics and Statistics and IBM Corporation Faculty Scholar

Address:
5807 South Woodlawn Avenue
Chicago, IL 60637

Dacheng Xiu is interested in developing and applying statistical methodologies to exploit the economic implications of financial data. His prior research involves risk measurement and management with high-frequency data and econometric modeling of derivatives. His current work focuses on developing machine learning solutions to big-data problems in empirical asset pricing.

His work has appeared in Econometrica, the Journal of Econometrics, and the Journal of the American Statistical Association, and he has been invited to publish in the Journal of Business and Economic Statistics. Xiu has presented his work at various conferences and university seminars. He is an Associate Editor for the Journal of Econometrics and also serves as a referee for many journals in econometrics, statistics, and finance.

Xiu earned his PhD and MA in applied mathematics from Princeton University, where he studied at the Bendheim Center for Finance. Before that, he obtained a BS in mathematics from the University of Science and Technology of China in Hefei, China. Additionally, Xiu's professional experience includes work with TYKHE Capital LLC in New York and Citigroup in their capital markets and banking division.

Outside of academia, Xiu enjoys a variety of sports as well as photography.


2016 - 2017 Course Schedule

Number  Name                                     Quarter
41100   Applied Regression Analysis              2017 (Winter)
41600   Econometrics and Statistics Colloquium   2017 (Spring)
41902   Statistical Inference                    2017 (Winter)

Other Interests

Skiing, swimming, diving, basketball, and photography


Research Activities

Financial Econometrics, Statistics, Empirical Asset Pricing, and Quantitative Finance

"Hermite Polynomial Based Expansion of European Option Prices." Dacheng Xiu; Journal of Econometrics, 2014, 179(2), pp. 158-77.

"Quasi-Maximum Likelihood Estimation of GARCH Models with Heavy-Tailed Likelihoods." Jianqing Fan, Lei Qi and Dacheng Xiu; Journal of Business & Economic Statistics, 2013, 32(2), pp. 178-91.

"Likelihood-Based Volatility Estimators in the Presence of Market Microstructure Noise." Yacine Aït-Sahalia and Dacheng Xiu; in L. Bauwens, C. Hafner and S. Laurent: Handbook of Volatility Models and Their Applications. Hoboken, New Jersey: John Wiley & Sons, Inc., 2012, pp. 347-61.

"Quasi-Maximum Likelihood Estimation of Volatility with High Frequency Data." Dacheng Xiu; Journal of Econometrics, 2010, 159(1), pp. 235-50.

"High-Frequency Covariance Estimates with Noisy and Asynchronous Financial Data." Yacine Aït-Sahalia, Jianqing Fan and Dacheng Xiu; Journal of the American Statistical Association, 2010, 105(492), pp. 1504-17.

For a listing of research publications, please visit the university library listing page.

REVISION: Efficient Estimation of Integrated Volatility Functionals via Multiscale Jackknife
Date Posted: Apr 05, 2017
We propose semi-parametrically efficient estimators for general integrated volatility functionals of multivariate semimartingale processes. It is known that a plug-in method that uses nonparametric estimates of spot volatilities induces high-order biases which need to be corrected to obey a central limit theorem. Such bias terms arise from boundary effects, the diffusive and jump movements of stochastic volatility, and the sampling error from the nonparametric spot volatility estimation. We propose a novel jackknife method for bias-correction. The jackknife estimator is simply formed as a linear combination of a few uncorrected estimators associated with different local window sizes used in the estimation of spot volatility. We show theoretically that our estimator is asymptotically mixed Gaussian, semi-parametrically efficient, and more robust to the choice of local windows. To facilitate the practical use, we introduce a simulation-based estimator of the asymptotic variance, so ...
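The linear-combination idea in the abstract above can be illustrated with a minimal sketch. It assumes, purely for illustration, that an uncorrected estimator computed with local window size k carries a leading bias of the form c/k. The function names and this bias form are hypothetical and are not taken from the paper.

```python
def jackknife_weights(k1, k2):
    """Return weights (w1, w2) with w1 + w2 = 1 and w1/k1 + w2/k2 = 0.

    Assumption (illustrative only): the estimator built with window size k
    has bias c / k, so these weights cancel that leading bias term.
    """
    if k1 == k2:
        raise ValueError("window sizes must differ")
    w1 = k1 / (k1 - k2)
    w2 = k2 / (k2 - k1)
    return w1, w2


def jackknife_combine(est1, est2, k1, k2):
    """Linear combination of two uncorrected estimates from windows k1, k2."""
    w1, w2 = jackknife_weights(k1, k2)
    return w1 * est1 + w2 * est2


if __name__ == "__main__":
    theta, c = 1.0, 2.0          # hypothetical true value and bias constant
    k1, k2 = 10, 20              # two hypothetical local window sizes
    biased1 = theta + c / k1     # uncorrected estimate with window k1
    biased2 = theta + c / k2     # uncorrected estimate with window k2
    print(jackknife_combine(biased1, biased2, k1, k2))
```

Under the assumed bias form, the weights sum to one so the target is preserved, while the 1/k terms cancel. With more than two window sizes, additional bias orders could be removed in the same way.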

REVISION: Taming the Factor Zoo
Date Posted: Apr 05, 2017
The asset pricing literature has produced hundreds of potential risk factors. Organizing this "zoo of factors" and distinguishing between useful, useless, and redundant factors require econometric techniques that can deal with the curse of dimensionality. We propose a model-selection method that allows us to systematically evaluate the contribution to asset pricing of any new factor, above and beyond what is explained by a high-dimensional set of existing factors. Our procedure selects the best parsimonious model out of the large set of existing factors, and uses it as the control in making statistical inference about the contribution of new factors. Our inference allows for model selection mistakes, and is therefore more reliable in finite sample. We derive the asymptotic properties of our test and apply it to a large set of factors proposed in the literature. We show that despite the fact that hundreds of factors have been proposed in the last 30 years, some recent factors -- like ...

New: Knowing Factors or Factor Loadings, or Neither? Evaluating Estimators of Large Covariance Matrices with Noisy and Asynchronous Data
Date Posted: Feb 23, 2017
In this paper, we investigate factor-model-based large covariance (and precision) matrix estimators using high frequency data, which are asynchronous and potentially contaminated by market microstructure noise. Our estimation strategies rely on the pre-averaging method with refresh time to solve the microstructure problems, while using three different specifications of factor models, and their corresponding estimators, respectively, to battle against the curse of dimensionality. To estimate a factor model, we either adopt the time-series regression (TSR) to recover loadings if factors are known, or use the cross-sectional regression (CSR) to recover factors from known loadings, or use the principal component analysis (PCA) if neither factors nor their loadings are assumed known. We compare the convergence rates in these scenarios using the joint in-fill and increasing dimensionality asymptotics. To evaluate the empirical trade-off between robustness to model misspecification ...

REVISION: A Hausman Test for the Presence of Market Microstructure Noise in High Frequency Data
Date Posted: Feb 21, 2017
We develop tests that help assess whether a high frequency data sample can be treated as reasonably free of market microstructure noise at a given sampling frequency for the purpose of implementing high frequency volatility and other estimators. The tests are based on the Hausman principle of comparing two estimators, one that is efficient but not robust to the deviation being tested, and one that is robust but not as efficient. We investigate the asymptotic properties of the test statistic in a general nonparametric setting, and compare it with several alternatives that are also developed in the paper. Empirically, we find that improvements in stock market liquidity over the past decade have increased the frequency at which simple, uncorrected, volatility estimators can be safely employed.
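The Hausman principle mentioned above can be sketched in its textbook scalar form. This is not the paper's test statistic: the function names, the scalar setting, and the use of a chi-square critical value with one degree of freedom are assumptions made for illustration.

```python
# 95th percentile of the chi-square distribution with 1 degree of freedom.
CHI2_1_CRITICAL_5PCT = 3.841


def hausman_statistic(efficient, robust, var_efficient, var_robust):
    """Textbook scalar Hausman statistic.

    Under the null, the efficient estimator is uncorrelated with the
    difference, so Var(robust - efficient) = var_robust - var_efficient.
    """
    var_diff = var_robust - var_efficient
    if var_diff <= 0:
        raise ValueError("robust variance must exceed efficient variance")
    return (robust - efficient) ** 2 / var_diff


def rejects_null(efficient, robust, var_efficient, var_robust):
    """Reject at the 5% level when the statistic exceeds the critical value."""
    stat = hausman_statistic(efficient, robust, var_efficient, var_robust)
    return stat > CHI2_1_CRITICAL_5PCT


if __name__ == "__main__":
    # Hypothetical estimates: an efficient estimate that is not robust to
    # the deviation being tested, and a robust but less efficient one.
    print(hausman_statistic(1.0, 1.2, 0.01, 0.02))
    print(rejects_null(1.0, 1.2, 0.01, 0.02))
```

A large gap between the two estimates, relative to the extra variance of the robust one, is evidence against the null under which both estimators are consistent.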

REVISION: Resolution of Policy Uncertainty and Sudden Declines in Volatility
Date Posted: Feb 13, 2017
We introduce downward volatility jumps into a general non-affine modeling framework of the term structure of variance. With variance swaps and S&P 500 returns, we find that downward volatility jumps are associated with a resolution of policy uncertainty, mostly through statements from FOMC meetings and speeches of the Fed chairman. We also find that such jumps are priced with positive risk premia, which reflect the price of the "put protection" offered by the Fed. Ignoring them may lead to an incorrect interpretation of such tail events. Moreover, variance risk premia tend to be insignificant or even positive at the inception of crises. On the modeling side, we explore the structural differences and relative goodness-of-fits of factor specifications, and find that the log-volatility model with two Ornstein-Uhlenbeck factors and double-sided jumps is superior in capturing volatility dynamics and pricing variance swaps, compared to the affine model prevalent in the literature or ...

REVISION: Inference on Risk Premia in the Presence of Omitted Factors
Date Posted: Jan 24, 2017
We propose a three-pass method to estimate the risk premia of observable factors in a linear asset pricing model, which is valid even when the observed factors are just a subset of the true factors that drive asset prices. Standard methods to estimate risk premia are biased in the presence of omitted priced factors correlated with the observed factors. We show that the risk premium of a factor can be identified in a linear factor model regardless of the rotation of the other control factors as long as they together span the space of true factors. Motivated by this rotation invariance result, our approach uses principal components to recover the factor space and combines the estimated principal components with each observed factor to obtain a consistent estimate of its risk premium. This methodology also accounts for potential measurement error in the observed factors and detects when such factors are spurious or even useless. The methodology exploits the blessings of dimensionality, ...

REVISION: Econometric Analysis of Multivariate Realised QML: Estimation of the Covariation of Equity Prices under Asynchronous Trading
Date Posted: Dec 05, 2016
Estimating the covariance between assets using high frequency data is challenging due to market microstructure effects and asynchronous trading. In this paper we develop a multivariate realised quasi maximum likelihood (QML) approach, carrying out inference as if the observations arise from an asynchronously observed vector scaled Brownian model observed with error. Under stochastic volatility the resulting realised QML estimator is positive definite, uses all available data, is consistent and asymptotically mixed normal. The quasi-likelihood is computed using a Kalman filter and optimised using a relatively simple EM algorithm. We also propose an alternative estimator using a factor model, which scales well with the number of assets. We derive the theoretical properties of these estimators and prove that they achieve the efficient rate of convergence. Our estimators are also analysed using Monte Carlo methods and applied to equity data with varying levels of liquidity.

REVISION: Principal Component Analysis of High Frequency Data
Date Posted: Oct 11, 2016
We develop the necessary methodology to conduct principal component analysis at high frequency. We construct estimators of realized eigenvalues, eigenvectors, and principal components and provide the asymptotic distribution of these estimators. Empirically, we study the high frequency covariance structure of the constituents of the S&P 100 Index using as little as one week of high frequency data at a time. The explanatory power of the high frequency principal components varies over time. During the recent financial crisis, the first principal component becomes increasingly dominant, explaining up to 60% of the variation on its own, while the second principal component drives the common variation of financial sector stocks.
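The share of variation explained by the first principal component, as discussed above, can be illustrated with a minimal pure-Python sketch. It builds a realized covariance matrix as a sum of outer products of return vectors and extracts its largest eigenvalue by power iteration. It ignores microstructure noise, asynchronicity, and the paper's asymptotic theory; all function names are hypothetical.

```python
def realized_covariance(returns):
    """Sum of outer products of return vectors (rows: times, cols: assets)."""
    d = len(returns[0])
    cov = [[0.0] * d for _ in range(d)]
    for r in returns:
        for i in range(d):
            for j in range(d):
                cov[i][j] += r[i] * r[j]
    return cov


def top_eigenvalue(matrix, iterations=500):
    """Largest eigenvalue of a symmetric PSD matrix via power iteration."""
    d = len(matrix)
    v = [1.0 + 0.1 * i for i in range(d)]  # arbitrary starting vector
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in w]
    mv = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
    return sum(v[i] * mv[i] for i in range(d))  # Rayleigh quotient, |v| = 1


def first_pc_share(returns):
    """Fraction of total realized variance explained by the first component."""
    cov = realized_covariance(returns)
    trace = sum(cov[i][i] for i in range(len(cov)))
    if trace == 0.0:
        raise ValueError("returns have zero total variation")
    return top_eigenvalue(cov) / trace


if __name__ == "__main__":
    # Hypothetical intraday returns for two assets that move together.
    sample = [[0.01, 0.01], [0.01, 0.01], [-0.01, -0.01]]
    print(first_pc_share(sample))
```

Power iteration is used here only to keep the sketch dependency-free. In practice a full eigendecomposition routine from a linear algebra library would be the more usual choice.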

REVISION: Using Principal Component Analysis to Estimate a High Dimensional Factor Model with High-Frequency Data
Date Posted: Oct 11, 2016
This paper constructs an estimator for the number of common factors in a setting where both the sampling frequency and the number of variables increase. Empirically, we document that the covariance matrix of a large portfolio of US equities is well represented by a low rank common structure with sparse residual matrix. When employed for out-of-sample portfolio allocation, the proposed estimator largely outperforms the sample covariance estimator.

REVISION: Nonparametric Estimation of the Leverage Effect: A Trade-off between Robustness and Efficiency
Date Posted: Nov 24, 2015
We consider two new approaches to nonparametric estimation of the leverage effect. The first approach uses stock prices alone. The second approach uses the data on stock prices as well as a certain volatility instrument, such as the CBOE volatility index (VIX) or the Black-Scholes implied volatility. The theoretical justification for the instrument-based estimator relies on a certain invariance property, which can be exploited when high frequency data is available. The price-only estimator is more robust since it is valid under weaker assumptions. However, in the presence of a valid volatility instrument, the price-only estimator is inefficient as the instrument-based estimator has a faster rate of convergence. We consider two empirical applications, in which we study the relationship between the leverage effect and the debt-to-equity ratio, credit risk, and illiquidity.

REVISION: Incorporating Global Industrial Classification Standard into Portfolio Allocation: A Simple Factor-Based Large Covariance Matrix Estimator with High Frequency Data
Date Posted: Apr 22, 2015
We document a striking block-diagonal pattern in the factor model residual covariances of the S&P 500 Equity Index constituents, after sorting the assets by their assigned Global Industry Classification Standard (GICS) codes. Cognizant of this structure, we propose combining a location-based thresholding approach based on sector inclusion with the Fama-French and SPDR sector Exchange Traded Funds (ETFs). We investigate the performance of our estimators in an out-of-sample portfolio allocation study. We find that our simple and positive-definite covariance matrix estimator yields strong empirical results under a variety of factor models and thresholding schemes. Conversely, we find that the Fama-French factor model is only suitable for covariance estimation when used in conjunction with our proposed thresholding technique. Theoretically, we provide justification for the empirical results by jointly analyzing the in-fill and diverging dimension asymptotics.

REVISION: A Tale of Two Option Markets: Pricing Kernels and Volatility Risk
Date Posted: Apr 15, 2015
Using both S&P 500 option and recently introduced VIX option prices, we study pricing kernels and their dependence on multiple volatility factors. We first propose nonparametric estimates of marginal pricing kernels, conditional on the VIX and the slope of the variance swap term structure. Our estimates highlight the state-dependent nature of the pricing kernels. In particular, conditioning on volatility factors, the pricing kernel of market returns exhibits a downward sloping shape up to the extreme end of the right tail. Moreover, the volatility pricing kernel features a striking U-shape, implying that investors have high marginal utility in both high and low volatility states. This finding on the volatility pricing kernel presents a new empirical challenge to both existing equilibrium and reduced-form asset pricing models of volatility risk. Finally, using a full-fledged parametric model, we recover the joint pricing kernel, which is not otherwise identifiable.

New: Generalized Method of Integrated Moments for High-Frequency Data
Date Posted: Feb 06, 2015
We propose a semiparametric two-step inference procedure for a finite-dimensional parameter based on moment conditions constructed from high-frequency data. The population moment conditions take the form of temporally integrated functionals of state-variable processes that include the latent stochastic volatility process of an asset. In the first step, we nonparametrically recover the volatility path from high-frequency asset returns. The nonparametric volatility estimator is then used to form sample moment functions in the second-step GMM estimation, which requires the correction of a high-order nonlinearity bias from the first step. We show that the proposed estimator is consistent and asymptotically mixed Gaussian and propose a consistent estimator for the conditional asymptotic variance. We also construct a Bierens-type consistent specification test. These infill asymptotic results are based on a novel empirical-process-type theory for general integrated functionals of noisy ...

New: Increased Correlation Among Asset Classes: Are Volatility or Jumps to Blame, or Both?
Date Posted: Apr 16, 2014
We develop estimators and asymptotic theory to decompose the quadratic covariation between two assets into its continuous and jump components, in a manner that is robust to the presence of market microstructure noise. Using high frequency data on different asset classes, we find that the recent financial crisis led to an increase in both the quadratic variations of the assets and their correlations. However, we find little evidence to suggest a change between the relative contributions of the Brownian and jump components, as both comove. Co-jumps stem from surprising news announcements that occur primarily before the opening of the U.S. market, and are also accompanied by an increase in Brownian-driven correlations.

REVISION: Hermite Polynomial Based Expansion of European Option Prices
Date Posted: Dec 20, 2013
We seek a closed-form series approximation of European option prices under a variety of diffusion models. The proposed convergent series are derived using the Hermite polynomial approach. Departing from the usual option pricing routine in the literature, our model assumptions have no requirements for affine dynamics or explicit characteristic functions. Moreover, convergent expansions provide a distinct insight into how and on which order the model parameters affect option prices, in contrast with small-time asymptotic expansions in the literature. With closed-form expansions, we explicitly translate model features into option prices, such as mean-reverting drift and self-exciting or skewed jumps. Numerical examples illustrate the accuracy of this approach and its advantage over alternative expansion methods.

New: Spot Variance Regressions
Date Posted: Feb 12, 2013
We study a nonlinear vector regression model for discretely sampled high-frequency data with the latent spot variance of an asset as a covariate. We propose a two-stage inference procedure by first nonparametrically recovering the volatility path from asset returns and then conducting inference based on the generalized method of moments (GMM). The GMM estimator is nonstandard in that the second-order asymptotics is dominated by a bias term, rendering the standard inference implausible. We ...

REVISION: Quasi Maximum Likelihood Estimation of GARCH Models with Heavy-Tailed Likelihoods
Date Posted: Aug 06, 2012
The non-Gaussian maximum likelihood estimator is frequently used in GARCH models with the intention of capturing the heavy-tailed returns. However, unless the parametric likelihood family contains the true likelihood, the estimator is inconsistent due to density misspecification. To correct this bias, we identify an unknown scale parameter that is critical to the identification, and propose a two-step quasi maximum likelihood procedure with non-Gaussian likelihood functions. This novel ...

New: High Frequency Covariance Estimates with Noisy and Asynchronous Financial Data
Date Posted: Jun 28, 2010
This paper proposes a consistent and efficient estimator of the high frequency covariance (quadratic covariation) of two arbitrary assets, observed asynchronously with market microstructure noise. This estimator is built upon the marriage of the quasi-maximum likelihood estimator of the quadratic variation and the proposed Generalized Synchronization scheme. It is therefore not influenced by the Epps effect. Moreover, the estimation procedure is free of tuning parameters or bandwidths and ...

REVISION: Quasi-Maximum Likelihood Estimation of Volatility with High Frequency Data
Date Posted: Jun 27, 2010
This paper investigates the properties of the well-known maximum likelihood estimator in the presence of stochastic volatility and market microstructure noise, by extending the classic asymptotic results of quasi-maximum likelihood estimation. When trying to estimate the integrated volatility and the variance of noise, this parametric approach remains consistent, efficient and robust as a quasi-estimator under misspecified assumptions. Moreover, it shares the model-free feature with ...