Characterizing the spatial heterogeneity of aquifer properties, particularly hydraulic conductivity, is paramount in groundwater modeling when the transport and fate of contaminants need to be predicted. The field of geostatistics has focused on describing this heterogeneity with spatial random functions. The field of stochastic hydrogeology uses these functions to incorporate uncertainty about the subsurface in groundwater modeling predictions. Bayesian inference can update prior knowledge about the spatial patterns of the subsurface (e.g. plausible ranges of values) with a variety of information (e.g. direct measurements of hydraulic conductivity as well as indirect measurements such as water table drawdown at an observation well) in order to yield posterior knowledge. This dissertation focuses on expanding the tools for Bayesian inference of these spatial random functions.
First, the development of open-source software tools that guide users through the Bayesian inference process is described. MAD# is a desktop application that implements the Method of Anchored Distributions. It is built in a modular fashion so that it can be coupled with any geostatistical software and any numerical modeling software. This modularity allows a wide variety of spatial random functions and subsurface processes to be incorporated in the Bayesian inference process. An R package, anchoredDistr, supplements the MAD# software. While MAD# handles the communication between the geostatistical software and the numerical modeling software, the anchoredDistr package provides more flexibility in analyzing the results from MAD#. Because R is an open-source statistical computing language, the anchoredDistr package lets users take advantage of the many statistical tools in R to calculate the posterior knowledge in the Bayesian process. Although MAD# provides a post-processing module to calculate this posterior knowledge, it does not offer the range of options for modification that the R community can provide.
Second, the expansion of the kinds of data and knowledge that can be incorporated into the Bayesian process is explored. Incorporating time series (e.g. the drawdown of a water table from pumping over time) as indirect data in Bayesian inference poses a computational problem referred to as the 'curse of dimensionality'. Since each additional measurement in time is correlated with the measurements before and after it, the measurements cannot be treated independently, and the calculation of the probability distributions of these data becomes multi-dimensional. A synthetic case study incorporating drawdown time series in the Bayesian inference process is explored. A second form of information, conceptual models of geology, is also explored with a synthetic case study. Conceptual models of geology (e.g. a graphical representation of assumed geologic layering) can be described with images, and a geostatistical technique called Multipoint Statistics uses images as its input. The synthetic case study provides a proof-of-concept example in which the Bayesian inference process infers conceptual models of geology using Multipoint Statistics.
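To make the dimensionality problem for time series concrete, the following is a minimal sketch and not the method used in this dissertation. It assumes a simple histogram-style density estimate and counts how many cells a joint distribution over several correlated time steps would require, compared with a fixed budget of simulated realizations. The function names and the numbers (10 bins per measurement, 10,000 realizations) are illustrative assumptions.

```python
def joint_histogram_cells(n_times: int, bins_per_axis: int = 10) -> int:
    """Cells in a joint histogram over n_times correlated measurements.

    Each measurement in time adds one axis, so the cell count grows
    exponentially with the length of the time series.
    """
    if n_times < 1 or bins_per_axis < 1:
        raise ValueError("n_times and bins_per_axis must be positive")
    return bins_per_axis ** n_times


def samples_per_cell(n_samples: int, n_times: int, bins_per_axis: int = 10) -> float:
    """Average number of realizations available per histogram cell."""
    return n_samples / joint_histogram_cells(n_times, bins_per_axis)


if __name__ == "__main__":
    n_realizations = 10_000  # hypothetical simulation budget
    for n_times in (1, 2, 4, 8):
        cells = joint_histogram_cells(n_times)
        density = samples_per_cell(n_realizations, n_times)
        print(f"{n_times} time steps: {cells} cells, {density:g} realizations per cell")
```

Under these assumed numbers, a single drawdown measurement leaves on average 1,000 realizations per cell, while four time steps already leave only one.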
Third, the issue of devising spatial models with realistic geology while constraining the complexity of the model is explored. An aquifer analog is used as the basis for an example. An aquifer analog is a data set of hydraulic properties at high spatial resolution, i.e. much higher than expected for ordinary field measurements. The aquifer analog used in this dissertation has ten soil types distributed in three-dimensional space. The objective posed is to predict the early arrival time of a contaminant traveling through the analog. Given this prediction goal, the task is to reduce the analog to a simpler structure without changing the prediction outcome. The purpose of this exercise is to take a goal-oriented approach to defining a parsimonious spatial model of this complex aquifer analog, so that a geostatistical model can be inferred for this kind of geology in a computationally efficient manner.
Ultimately, any uncertainty quantification regarding the spatial heterogeneity of subsurface properties aims to improve groundwater modeling predictions. With the addition of freely available software tools, the ability to integrate more forms of information, and a methodology for translating complex geological structures into parsimonious spatial models, the characterization of our groundwater resources improves.