Advances in data-collection technology have made it possible to measure variables continuously over time and space. Such data can be characterized as multivariate spatially varying functional data. Functional Data Analysis (FDA) offers a range of models for functional data. However, many existing functional regression models ignore the spatial component, which can degrade prediction when location-specific effects are present. The main emphasis of this dissertation is the development of a novel two-step procedure for prediction in Generalized Spatially Varying Functional Models (GSVFM).
Functional data are inherently infinite-dimensional because they represent continuous functions. As a result, the GSVFM suffers from the curse of dimensionality and cannot be estimated directly. To address both the infinite dimensionality and the spatially varying components of the data, a novel two-step procedure is introduced. The first step reduces the dimension of the GSVFM through Functional Principal Components Analysis (FPCA). This reduces the GSVFM to a Generalized Spatially Varying Coefficient Model (GSVCM), whose estimation is the second step of the procedure. The GSVCM accounts for the spatial locations in the data. The proposed two-step procedure is therefore able to capture location-specific effects that previous functional regression models cannot.
This research is motivated by a crop-yield prediction application in agriculture. The agricultural data are collected at the county level from five Midwestern states: Kansas, Iowa, Illinois, Indiana, and Missouri. For each county, we observe daily minimum and maximum temperature time series. These time series can be viewed as functions, with temperature indexed by day. Because the temperature curves vary across the Midwest counties, they constitute multivariate spatially varying functional data. Precipitation, irrigated land, and crop yield are also collected at the county level. The goal is to apply the GSVFM to predict the spatially varying crop yield from the scalar predictor variables and the multivariate spatially varying functional data. The performance of the GSVFM is compared against that of existing functional models.
The dissertation consists of two projects that use the novel two-step procedure to estimate the GSVFM and the Spatially Varying Functional Quantile Model (SVFQM). The first project aims to predict the conditional mean; the second extends the GSVFM to the SVFQM, which predicts the conditional quantile. This research addresses a current gap in the literature: functional models that do not account for the spatial component.
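To make the distinction between the two targets concrete, the sketch below uses the standard quantile (check) loss, whose minimizer is a quantile in the same way that the minimizer of squared error is a mean. This is a generic illustration under assumptions: the abstract does not state the estimation criterion used for the SVFQM, and the function name `check_loss`, the simulated responses, and the quantile level 0.9 are hypothetical.

```python
import numpy as np

def check_loss(residuals, tau):
    """Average quantile (check) loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    residuals = np.asarray(residuals, dtype=float)
    return np.mean(residuals * (tau - (residuals < 0)))

# Hypothetical skewed responses, so that mean and upper quantile differ clearly.
rng = np.random.default_rng(1)
y = rng.exponential(scale=2.0, size=5000)

tau = 0.9
candidates = np.linspace(0.0, 10.0, 1001)

# Constant predictor minimizing the check loss: close to the 0.9 sample quantile.
quantile_losses = [check_loss(y - c, tau) for c in candidates]
best_quantile = candidates[int(np.argmin(quantile_losses))]

# Constant predictor minimizing squared error: close to the sample mean.
squared_losses = [np.mean((y - c) ** 2) for c in candidates]
best_mean = candidates[int(np.argmin(squared_losses))]

print(best_quantile, np.quantile(y, tau))
print(best_mean, y.mean())
```

For skewed responses the two minimizers can differ substantially, which is one reason a quantile model can complement a mean model.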