2007 — 2011 
He, Xuming (co-PI); Wuebbles, Donald; Liang, Xin-Zhong; Shao, Xiaofeng 
CMG Collaborative Research: Statistical Evaluation of Model-Based Uncertainties Leading to Improved Climate Change Projections At Regional to Local Scales @ University of Illinois At Urbana-Champaign
This research project brings together an interdisciplinary team of atmospheric scientists and statisticians to attack an outstanding issue in the field of climate change research: namely, how to obtain statistically robust projections of future climate change at regional to local scales. It is well known that global change is modified by local and regional features in ways that even regional models are challenged to capture, producing unique patterns in each individual region. Quantifying these patterns of change is essential to identifying appropriate adaptation and mitigation strategies to cope with the likely impacts of climate change on both human and natural systems. Driven both by the persistent limitations in present-day modeling capacity and by the potential global-scale impacts of climate change, the investigators propose to develop a set of scientifically and statistically advanced techniques to reduce the uncertainties inherent in the use of global and regional climate model output fields to generate local-scale climate projections. Utilizing available observations, reanalysis data, and historical global and regional climate model simulations, the investigators will first develop a set of statistical techniques that will reduce the dimensionality of both global and regional model differences relative to observations. Statistical techniques to quantify model-observational differences and capture the range of future climate projections will include proven methods for spatial interpolation of observations, as well as new spectral and wavelet analyses and the development of an advanced quantile regression approach with Bayesian empirical likelihoods. Building on the investigators' previous research analyzing the ability of both global and regional climate models to simulate key atmospheric dynamical features, we will then assess the physical features of the models that are likely contributing to these differences. 
Both physical and statistical characterizations of model limitations will then be applied to reduce uncertainty in a range of IPCC AR4 global model simulations of future climate change, based on multiple realizations of future emissions scenarios and available regional climate model simulations. The final project goal is to synthesize the above methods into a generalized framework that combines physical and statistical analyses to assess historical global and regional model performance, and then use these characterizations of model performance to reduce the uncertainty in future projections of key surface climate variables at regional to local scales.
The work proposed addresses an ongoing and crucial need in climate change research to characterize and account for model limitations in order to reduce uncertainties at the regional to local scale, where the societal, economic, and environmental impacts of climate change occur. This project is unique from both a scientific and a statistical perspective, combining a well-established research program on global and regional climate model analysis with innovative statistical approaches. Advanced statistical methods will be used to merge all available information, including observations, data assimilations, global and regional climate model simulations, and other depictions of the internal variability of the climate system, to characterize model differences relative to observations and to produce improved high-resolution projections of future changes in surface climate. This project will involve the extensive use of high-performance computing capabilities. The capabilities that will be developed are designed to reduce uncertainties in the likely range of future climate change, enabling more effective analyses of the potential impacts of climate change at regional to local scales. At the same time, the project will challenge the state of the art in terms of the techniques and statistical tools developed and their application to the field of regional climate projections. The proposed collaborative research will also provide interdisciplinary training to students and postdoctoral fellows at several institutions, with the cross-disciplinary fertilization of ideas fostered through close interactions on this project providing invaluable insights into both the research and the educational processes.

2008 — 2011 
Shao, Xiaofeng 
Statistical Inference For Long Memory and Nonlinear Time Series @ University of Illinois At Urbana-Champaign
The proposal aims to develop methodological and theoretical tools for statistical inference of long memory and/or nonlinear time series, for which the traditional methods and theory developed for linear ARMA-type series are not known to be applicable. Since the applications of long memory and nonlinear models are rapidly growing, there is an urgent and crucial need either to provide a theoretical justification for existing methods or to propose novel methods that are able to accommodate long memory and nonlinear features. To meet this need, the investigator proposes to study the following research problems: confidence intervals for spectral means and ratio statistics; Whittle estimation and diagnostic checking for fractionally integrated time series with uncorrelated but dependent errors; new tests of independence and non-correlation between two stationary time series; and frequency domain semiparametric inference for bivariate fractionally integrated nonlinear time series. All of them are linked to the second-order properties of long/short memory time series with nonlinear features, and together they cover a wide spectrum of important inference issues for such series.
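Several of these problems revolve around spectral means, i.e., integrals of the spectral density against a weight function. As a rough, hypothetical illustration (not the proposed inference procedure itself), the sketch below estimates a spectral mean by a Riemann sum of the periodogram over the Fourier frequencies; with unit weight this recovers the sample variance exactly, since the spectral density integrates to the lag-0 autocovariance.

```python
import numpy as np

def periodogram(x):
    """Periodogram I(lambda_j) = |sum_t x_t e^{-i t lambda_j}|^2 / (2 pi n)
    of the demeaned series, at Fourier frequencies lambda_j = 2 pi j / n,
    j = 1..n-1."""
    n = len(x)
    dft = np.fft.fft(np.asarray(x) - np.mean(x))
    return (np.abs(dft) ** 2 / (2 * np.pi * n))[1:]

def spectral_mean(x, g=None):
    """Riemann-sum estimate of the spectral mean int_0^{2pi} g(l) f(l) dl
    using the periodogram; g = None means unit weight, which returns exactly
    the sample variance because f integrates to gamma(0)."""
    n = len(x)
    lam = 2 * np.pi * np.arange(1, n) / n
    w = np.ones_like(lam) if g is None else g(lam)
    return (2 * np.pi / n) * np.sum(w * periodogram(x))

rng = np.random.default_rng(0)
x = rng.normal(size=4096)      # hypothetical white-noise input
print(spectral_mean(x))        # equals the sample variance, close to 1
```

The exact agreement with the sample variance is just Parseval's identity; the proposal's contribution concerns confidence intervals for such quantities under dependence, which this sketch does not attempt.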
Time series with long memory and nonlinearities occur in various fields, including atmospheric science, environmental science, geophysics, hydrology, economics, and finance, among others. This work will greatly enhance the available methodologies and theories, provide more tools, and have potential applications in all such fields. The proposed research will also have a significant impact on education through the direct involvement of Ph.D. students in the research and the incorporation of results into graduate statistics courses.

2011 — 2015 
Shao, Xiaofeng 
Statistical Inference For Temporally Dependent Functional Data @ University of Illinois At Urbana-Champaign
The PI develops a systematic body of methods and related theory for inference on temporally dependent functional data. The basic tool is self-normalization (SN), a new studentizing technique developed recently in the univariate time series setting. The PI proposes to advance new SN-based methods in the functional setup and to develop (i) a class of SN-based test statistics to test for a change point in the mean function and dependence structure of weakly dependent functional data; (ii) a class of SN-based test statistics to test for white noise in Hilbert space, along with effective diagnostic checking tools for the AR(1) model in functional space; and (iii) new SN-based tests in the two-sample setup, which can be used to check whether two possibly dependent functional time series have the same mean and/or autocovariance structure. In this proposal, SN is the foundation on which the body of connected and systematic inference methods for temporally dependent functional data is built.
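In the univariate setting where it originated, the self-normalization idea can be sketched in a few lines: the usual studentizer (a long-run variance estimate requiring a bandwidth choice) is replaced by a functional of recursive partial sums. The toy Lobato-type SN statistic for the mean below is only meant to convey the flavor of the approach under that assumption; it is not the functional-data methodology proposed here.

```python
import numpy as np

def sn_mean_stat(x, mu0):
    """Self-normalized (Lobato-type) statistic for H0: E[x_t] = mu0.
    T = n * (xbar - mu0)^2 / V_n with V_n = n^{-2} sum_k (S_k - (k/n) S_n)^2,
    where S_k are partial sums of x_t - mu0.  The partial-sum normalizer
    replaces a long-run variance estimate, so no bandwidth or tuning
    parameter is needed."""
    n = len(x)
    d = np.asarray(x) - mu0
    S = np.cumsum(d)
    k = np.arange(1, n + 1)
    V = np.sum((S - (k / n) * S[-1]) ** 2) / n**2
    return n * d.mean() ** 2 / V

rng = np.random.default_rng(1)
x = rng.normal(size=512)          # white noise with true mean 0
print(sn_mean_stat(x, 0.0))       # moderate under H0
print(sn_mean_stat(x, 2.0))       # much larger under a badly false H0
```

Note that the normalizer V is invariant to the hypothesized mean (a constant shift cancels in S_k - (k/n) S_n), so the statistic grows quadratically in the distance between the sample mean and mu0.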
The proposal is motivated by ongoing collaboration with atmospheric scientists on the statistical assessment of properties of numerical model outputs as compared to real observations. To study climate change, which is one of the most urgent problems facing the world this century, scientists have relied primarily on climate projections from numerical climate models. There is currently major interest in studying how different the numerical model outputs are from real observations and in characterizing their differences. Analyzing these data is quite challenging because they are massive and highly complex, with intricate spatial-temporal dependence. The SN-based inference methods that the PI develops in this proposal address these issues. With the assistance of functional principal component analysis, the SN-based methods are able to handle massive data sets with dependence, because the methods automatically take the unknown weak dependence into account, do not involve the choice of any tuning parameters (and so are quite efficient computationally), and are very straightforward to implement, with asymptotically pivotal limiting distributions. A direct application of the SN-based methods to climate data is expected to help atmospheric scientists gain a better understanding of the ability of numerical models to mimic real observations. In addition, the proposed methods will have broad direct applications to data obtained from very precise measurements at fine temporal scales, which frequently arise in engineering, physical science, and finance. On the educational front, the PI will develop new advanced topics courses, mentor undergraduate and graduate students, and expose them to the state-of-the-art research in this project.

2014 — 2017 
Shao, Xiaofeng 
Statistical Modeling, Adjustment and Inference For Seasonal Time Series @ University of Illinois At Urbana-Champaign
This project studies novel inference procedures and models for seasonal time series. The results of this research will have direct impact on the diagnostics of seasonal adjustment procedures that are currently implemented at the U.S. Census Bureau and other domestic or foreign agencies where seasonal adjustments are routinely published. The "Visual Significance" method used at the Census Bureau lacks a rigorous statistical justification and the new spectral peak detection methods will help to quantify type I and II errors in a disciplined fashion for a wide class of processes. Although motivated by research problems at Census, the new methodology and models are expected to be useful in the analysis of time series from various disciplines, including economics, astronomy, environmental science, and atmospheric sciences, among others.
Specifically, the project consists of three interrelated parts. In the first part, the PI will develop two new methods of spectral peak detection, which are intended to provide more principled alternatives to the "Visual Significance" method used at the U.S. Census Bureau. In the second part, the PI will address band-limited goodness-of-fit testing using the integral of the square of the normalized periodogram. Instead of imposing the strong Gaussian-like assumptions made in the literature, the PI will use a new studentizer, so that the limiting distribution of the self-normalization-based test statistic is pivotal under less stringent assumptions. In the third part, the PI will study a new parametric class of spectral densities, which can be used in model-based seasonal adjustment to improve the quality of model fitting and seasonal adjustment. The new parametric models and the related model-based seasonal adjustment, if successfully developed, may offer a more effective means of modeling and adjusting time series. The research will promote teaching and training through the mentoring of undergraduate and graduate students and through the development of related lecture notes.
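To make the second part concrete: the normalized periodogram is the ratio of the periodogram to a candidate spectral density, and the test statistic integrates its square over a frequency band. The sketch below is a hypothetical, non-studentized version over the full band; the proposal's actual studentizer and band restriction are not reproduced here.

```python
import numpy as np

def normalized_periodogram(x, f):
    """Ratio I(lambda_j) / f(lambda_j) at Fourier frequencies
    lambda_j = 2 pi j / n, j = 1..n-1, where I is the periodogram of the
    demeaned series and f is a candidate spectral density function."""
    n = len(x)
    I = (np.abs(np.fft.fft(np.asarray(x) - np.mean(x))) ** 2
         / (2 * np.pi * n))[1:]
    lam = 2 * np.pi * np.arange(1, n) / n
    return I / f(lam)

def gof_stat(x, f):
    """Riemann-sum approximation to the integral of the squared normalized
    periodogram over (0, 2 pi); large values signal a poor fit of f."""
    r = normalized_periodogram(x, f)
    return (2 * np.pi / len(x)) * np.sum(r ** 2)

rng = np.random.default_rng(2)
x = rng.normal(size=2048)
# white-noise candidate: f(lambda) = variance / (2 pi), constant in lambda
flat = lambda lam: np.full_like(lam, np.var(x) / (2 * np.pi))
print(gof_stat(x, flat))
```

A useful sanity check: with this candidate density, the normalized periodogram averages to n/(n-1) by Parseval's identity, regardless of the data.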

2016 — 2019 
Shao, Xiaofeng 
Collaborative Research: Statistical Inference For Functional and High Dimensional Data With New Dependence Metrics @ University of Illinois At Urbana-Champaign
Due to the rapid development of information technologies and their applications in many scientific fields such as climate science, medical imaging, and finance, statistical analysis of high-dimensional data and infinite-dimensional functional data has become increasingly important. A key challenge associated with the analysis of such big data is how to measure and infer complex dependence structure, which is a fundamental step in statistics and becomes more difficult owing to the data's high dimensionality and huge size. The main goal of this research project is to develop new dependence measures for quantifying the dependence of large-scale data sets such as temporally dependent functional data and high-dimensional data, and to utilize these new measures to develop novel statistical tools for conducting sparse principal component analysis, dimension reduction, and simultaneous hypothesis testing. Building on the new dependence metrics, which can capture nonlinear and non-monotonic dependence, the methodologies under development are expected to lead to more accurate prediction and inference, as well as more effective dimension reduction, in the analysis of functional and high-dimensional data.
The research consists of three projects addressing different challenges in the analysis of functional and high-dimensional data. In Project 1, the investigators introduce a new operator-valued quantity to characterize the conditional mean (in)dependence of one function-valued random element given another, and apply the newly developed dependence metrics to perform dimension reduction for functional time series under a new framework of finite-dimensional functional data. In Project 2, the investigators explore a new dimension reduction framework for regression models with high-dimensional responses, which requires less stringent linear model assumptions and is more flexible in terms of capturing possible nonlinear dependence between the response and the covariates. In Project 3, the investigators develop new tests for the mutual independence of high-dimensional data via distance covariance and rank distance covariance, using both sum-of-squares-type and maximum-type test statistics. Overall, the three lines of research are all related to big data, and they touch upon various aspects of modern statistics; the project aims to push the current frontiers in areas including sparse principal component analysis, inference for dependent functional data, and high-dimensional multivariate analysis to another level.
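Distance covariance, the building block of Project 3's tests, has a short sample form: double-center each pairwise distance matrix and average their elementwise product. The sketch below is the standard Székely-Rizzo V-statistic, shown for intuition only; the proposal's tests combine many such pairwise statistics across coordinates.

```python
import numpy as np

def dcov2(x, y):
    """Squared sample distance covariance (V-statistic form): double-center
    the pairwise Euclidean distance matrices of x and y, then average their
    elementwise product.  The population version is zero if and only if x
    and y are independent, so it captures nonlinear and non-monotonic
    dependence that ordinary covariance misses."""
    x = np.asarray(x, float).reshape(len(x), -1)
    y = np.asarray(y, float).reshape(len(y), -1)
    a = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    b = np.linalg.norm(y[:, None, :] - y[None, :, :], axis=-1)
    A = a - a.mean(axis=0) - a.mean(axis=1)[:, None] + a.mean()
    B = b - b.mean(axis=0) - b.mean(axis=1)[:, None] + b.mean()
    return (A * B).mean()

rng = np.random.default_rng(3)
u = rng.normal(size=300)
print(dcov2(u, u ** 2))               # nonlinear dependence: clearly positive
print(dcov2(u, rng.normal(size=300))) # independence: near zero
```

The first example is the classic case where the ordinary covariance of u and u squared is near zero yet the variables are clearly dependent.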

2018 — 2021 
Shao, Xiaofeng 
Statistical Inference For High-Dimensional Time Series @ University of Illinois At Urbana-Champaign
Due to the rapid development of information technologies and their applications in many scientific fields, high-dimensional time series (HDTS) are routinely collected nowadays. The methods and theory developed for the inference of low- and fixed-dimensional time series may not be applicable when the dimension is comparable to or exceeds the length of the series, and there is an urgent need to develop new statistical methods that can accommodate both high dimensionality and temporal dependence. Statistical inference for HDTS is fundamentally important and has many applications in disciplines ranging from climate science to medical imaging and finance, among others.
This project aims to develop innovative theory and methodologies to address several important inference problems in the analysis of HDTS. The research is built on the self-normalized approach, which has found great success in dealing with low-dimensional problems. Its advance to the high-dimensional context is challenging both methodologically and theoretically, and it requires a new methodological formulation and new theory. The project covers inference for the mean, covariance matrix, and autocovariance matrix of HDTS, and the tests developed can be used to detect change points and certain structures of the covariance matrix, and to target dense alternatives. On the theoretical front, the weak convergence of sequential U-statistic-based processes will be investigated and is of independent interest.
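For intuition about what "targeting dense alternatives" means, the sketch below shows a classical sum-of-squares-type U-statistic for a high-dimensional mean (a Chen-Qin-style unbiased estimate of the squared mean norm from i.i.d. rows), which aggregates weak signal spread across all coordinates. This is an illustrative baseline under an independence assumption, not the project's self-normalized, dependence-robust construction.

```python
import numpy as np

def dense_mean_stat(X):
    """Unbiased sum-of-squares-type estimate of ||mu||^2 from the n x p
    matrix X: (1/(n(n-1))) * sum over i != k of <X_i, X_k>.  Under
    H0: mu = 0 it centers at zero; under a dense alternative (many small
    nonzero coordinates) it grows with ||mu||^2, pooling signal across
    all p coordinates."""
    n = X.shape[0]
    G = X @ X.T                          # Gram matrix of pairwise inner products
    return (G.sum() - np.trace(G)) / (n * (n - 1))

rng = np.random.default_rng(6)
n, p = 100, 50
X0 = rng.normal(size=(n, p))             # H0: mean zero
X1 = X0 + 0.3                            # dense shift: mu_j = 0.3 for every j
print(dense_mean_stat(X0))               # near 0
print(dense_mean_stat(X1))               # near ||mu||^2 = 50 * 0.09 = 4.5
```

Excluding the diagonal of the Gram matrix is what removes the trace-of-covariance bias and makes the estimate unbiased for the squared mean norm.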
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

2020 — 2023 
Shao, Xiaofeng 
Collaborative Research: Segmentation of Time Series Via Self-Normalization @ University of Illinois At Urbana-Champaign
This project aims to develop new statistical methodology and theory for change-point analysis of time series data. Change-point models have wide applications in many scientific areas, including modeling the daily volatility of the U.S. financial market and the weekly growth rate of an infectious disease such as the coronavirus, among others. Compared with existing methodologies, this research will provide inference for a flexible range of change-point models, which will remain valid under the complex dependence relationships exhibited by real datasets. The methodologies ensuing from the project will be disseminated to the relevant scientific communities via publications, conference and seminar presentations, and the development of open-source software. The Principal Investigators (PIs) will jointly mentor a Ph.D. student, involve undergraduate students in the research, and offer advanced topics courses to introduce state-of-the-art techniques in time series analysis.
Time series segmentation, also known as change-point estimation, is one of the fundamental problems in statistics, where a time series is partitioned into piecewise homogeneous segments such that each piece shares the same behavior. There is a vast body of literature devoted to change-point estimation for independent observations; however, robust methodology and rigorous theory that can accommodate temporal dependence are still scarce. Motivated by the recent success of the self-normalization (SN) method, which was developed by one of the PIs for structural break testing and other inference problems in time series, the PIs will advance the self-normalization technique to time series segmentation. Specifically, the PIs will develop a systematic and unified SN-based change-point estimation methodology and the associated theory for (i) segmenting a piecewise stationary time series into homogeneous pieces so that within each piece a finite-dimensional parameter is constant; and (ii) segmenting a linear trend model with stationary and weakly dependent errors into periods of constant slope. The segmentation algorithms to be developed are broadly applicable to fixed-dimensional time series data and can be further extended to cover high-dimensional and locally stationary time series with proper modification of the self-normalized test statistics.
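As a baseline for what segmentation methods estimate, the sketch below locates a single mean shift with a plain standardized CUSUM statistic. This is a textbook device under an independence assumption, not the SN-based procedure proposed here, which replaces the variance normalization to handle temporal dependence.

```python
import numpy as np

def cusum_changepoint(x):
    """Locate a single mean shift by maximizing the standardized CUSUM
    |S_k - (k/n) S_n| * sqrt(n / (k (n - k))) over k = 1..n-1, where S_k is
    the k-th partial sum.  Returns the argmax k (estimated last index of the
    first segment) and the statistic's value there."""
    n = len(x)
    S = np.cumsum(x)
    k = np.arange(1, n)
    stat = np.abs(S[:-1] - (k / n) * S[-1]) * np.sqrt(n / (k * (n - k)))
    khat = int(np.argmax(stat)) + 1
    return khat, float(stat[khat - 1])

rng = np.random.default_rng(5)
x = np.concatenate([rng.normal(0.0, 1.0, 200),
                    rng.normal(5.0, 1.0, 200)])   # one large mean shift
print(cusum_changepoint(x)[0])                    # close to the true break at k = 200
```

Applied recursively to the segments it produces (binary segmentation), the same statistic extends to multiple change points, which is the regime the proposal's piecewise-stationary methodology addresses.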
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
