Ever since the fundamental recognition of the potential role of the computer in modern statistics, the bootstrap and other computer-intensive statistical methods have been developed extensively for inference with independent data. Such methods are even more important in the context of dependent data, where the distribution theory for estimators and test statistics may be difficult or impractical to obtain. Furthermore, the recent information explosion has resulted in datasets of unprecedented size that call for flexible, and by necessity computer-intensive, methods of data analysis. Time series analysis in particular is vital in many diverse scientific disciplines. As a consequence of the development of efficient and robust methods for the statistical analysis of dependent data, more accurate and reliable inferences may be drawn from datasets of practical importance, resulting in appreciable benefits to society. Examples include data from meteorology/atmospheric science (e.g. climate data), economics (e.g. stock market returns), biostatistics (e.g. fMRI data), and bioinformatics (e.g. genetics and microarray data). The project also involves developing curriculum, mentoring undergraduate research, supervising graduate students, developing open-source software, and organizing workshops.

The project focuses on the development of methods of inference for the analysis of dependent and otherwise complex data without relying on unrealistic and/or unverifiable model assumptions. In particular:
(a) Subsampling and resampling for big data will be studied, including the notion of scalable subagging applied to deep learning to improve both the speed and the accuracy of estimation;
(b) A central limit theorem for the median of a triangular array of dependent data will be proved, with application to median-of-means and robust scalable subagging;
(c) Model-free bootstrap will be studied and compared to conformal prediction in nonparametric regression;
(d) A novel class of nonstationary dependent errors will be introduced, with application to fitting large autoregressive (AR) models to nonstationary time series;
(e) Markov resampling and linear process bootstrap will be developed for stationary random fields;
(f) Skip-sampling of discrete Fourier transform ordinates will be introduced and compared to the traditional frequency domain bootstrap for stationary time series;
(g) Smoothing estimators of time-varying covariance matrices will be constructed for locally stationary multivariate time series;
(h) Bootstrap for time series with a seasonal component will be further developed; and
(i) Multi-step ahead point and interval predictors will be constructed for nonlinear autoregressions.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

NSF Award ID: 2413718 | Program: 01002425DB NSF RESEARCH & RELATED ACTIVIT | Principal Investigator: Dimitris Politis | Institution: University of California-San Diego, LA JOLLA, CA | Award Amount: $300,000
View on NSF Award Search: https://www.nsf.gov/awardsearch/showAward?AWD_ID=2413718
View on Research.gov: https://www.research.gov/awardapi-service/v1/awards/2413718.html

Computer-intensive methods for dependent and complex data



