In the modern world, we are witnessing a rapid acceleration in the adoption of complex and poorly understood systems, such as neural networks, and in their use for processing high-dimensional sensor data such as camera images and LiDAR point clouds. While it is hard to deny the effectiveness and impact of these systems, their theoretical understanding remains elusive. Unfortunately, this makes their use in closed-loop control systems hazardous: many probabilistic characterizations of their errors fail to hold in this regime, and the tools used to prove properties such as safety and stability (e.g., Input-to-State Stability theory) are not applicable. Furthermore, while substantial work, mostly empirical in nature, exists on applying machine learning to control problems, far less has explored the application of control-theoretic tools to the analysis of machine learning systems.
The primary aim of this work is to bridge these gaps by providing rigorous worst-case error, safety, and stability guarantees for control systems with neural network or black-box components in the loop, and by improving the theoretical analysis of these systems through control-theoretic tools. Specifically, this work focuses on: (1) developing a neural network architecture that enjoys deterministic error bounds; (2) deriving worst-case pose estimate error guarantees for LiDAR localization; and (3) performing a formal analysis of the long-term dynamics of generative models trained on their own synthetic data.
The first part of this dissertation builds on prior work showing that arbitrary-depth residual networks (as opposed to the classical result involving arbitrary width) enjoy universal approximation capabilities in the uniform norm sense. However, these results do not provide a training procedure. We address this gap by developing an architecture and a training algorithm that exploit monotonicity to produce residual neural networks with deterministic error bounds by construction. Such deterministic error bounds, in turn, enable formal safety and stability guarantees to be proven when these networks are used in control loops. We therefore develop a framework based on Input-to-State Stability (ISS) that exploits these deterministic error bounds when neural networks are used as state observers or feedback controllers.
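To make the role of monotonicity concrete, the following Python sketch shows, on a toy scalar example, why a monotone structure yields deterministic rather than probabilistic error bounds: exact samples of a nondecreasing function bracket its value everywhere in between. The function name and the midpoint predictor are illustrative assumptions; this is not the architecture or training algorithm developed in Part I.

```python
import numpy as np

# Toy illustration (not the dissertation's architecture): if the target
# function f is known to be nondecreasing and we have exact samples
# f(x_0) <= ... <= f(x_n) on a grid, then for any query x in [x_i, x_{i+1}]
# monotonicity pins f(x) inside [f(x_i), f(x_{i+1})]. Any predictor returning
# a value in that interval therefore has a deterministic worst-case error.

def deterministic_error_bound(xs, fs, x_query):
    """Worst-case error bound at x_query for the midpoint predictor,
    assuming the true function is nondecreasing and interpolates (xs, fs)."""
    i = np.searchsorted(xs, x_query) - 1
    i = np.clip(i, 0, len(xs) - 2)
    lo, hi = fs[i], fs[i + 1]
    prediction = 0.5 * (lo + hi)   # any value in [lo, hi] is admissible
    bound = 0.5 * (hi - lo)        # guaranteed worst-case deviation from the truth
    return prediction, bound

xs = np.linspace(0.0, 1.0, 11)
fs = np.tanh(3.0 * xs)             # a monotone target, sampled exactly
pred, err = deterministic_error_bound(xs, fs, 0.37)
print(f"prediction = {pred:.4f}, guaranteed |error| <= {err:.4f}")
```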
The second part considers the problem of LiDAR-based localization. In particular, it studies the so-called point cloud registration problem, a core routine of most localization and Simultaneous Localization And Mapping (SLAM) algorithms. The literature offers a wide variety of algorithms for this problem, from the classical Iterative Closest Point (ICP) to feature-based and learning-based approaches. However, nearly all existing methods are unable to provide bounds on their pose estimation error. In Part II we present a simple and fast point cloud registration algorithm called PASTA (Provably Accurate Simple Transformation Alignment), provide an extensive formal analysis of its worst-case estimation error, and experimentally verify its effectiveness. Owing to its formal error bounds and fast execution time, such an algorithm can serve as a supervisor for other localization algorithms that do not themselves enjoy worst-case error bounds.
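For context, the sketch below shows the registration problem in its simplest form: a closed-form rigid alignment of two point clouds with known correspondences, using the classical Kabsch (orthogonal Procrustes) solution. It is included only as an assumed baseline to illustrate the problem; it is not the PASTA algorithm and carries no worst-case error guarantee.

```python
import numpy as np

# Illustrative closed-form rigid alignment of two point clouds with known
# correspondences (classical Kabsch / orthogonal Procrustes). This is a
# sketch of the registration problem only, not the PASTA algorithm.

def rigid_align(source, target):
    """Return rotation R and translation t minimizing ||(source @ R.T + t) - target||_F."""
    mu_s, mu_t = source.mean(axis=0), target.mean(axis=0)
    H = (source - mu_s).T @ (target - mu_t)   # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))    # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_t - R @ mu_s
    return R, t

rng = np.random.default_rng(0)
src = rng.normal(size=(100, 3))
R_true = np.linalg.qr(rng.normal(size=(3, 3)))[0]
if np.linalg.det(R_true) < 0:                 # ensure a proper rotation
    R_true[:, 0] *= -1
tgt = src @ R_true.T + np.array([0.5, -0.2, 1.0])
R_est, t_est = rigid_align(src, tgt)
print("rotation error:", np.linalg.norm(R_est - R_true))
```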
Finally, in the third part, we consider a topic highly relevant to recent developments: generative models. As these models rapidly grow in popularity and use, the synthetic data they generate enters the internet and, in turn, becomes part of the datasets used to train the next generation of generative models. There are rising concerns, supported by empirical observations, about the long-term consequences of this process, with fears that it may cause both the internet and the models themselves to "degenerate" over time. In this work we analyze the learning dynamics of generative models that are retrained on their own generated content in addition to their original training dataset, with particular focus on the effect of "temperature", a parameter commonly used to modulate the sampling of these models. Using tools from control theory, we show that, unless a sufficient amount of external data is introduced at each iteration, any non-trivial temperature causes the model to degenerate asymptotically: the generative distribution either collapses to a small set of outputs or becomes uniform over a large set of outputs.
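As a purely illustrative toy model of this feedback loop (an assumption of this summary, not the analysis in Part III), the Python sketch below repeatedly refits a categorical distribution on its own temperature-scaled samples with no external data, and exhibits the two degenerate outcomes described above: sharpening temperatures drive collapse, while flattening temperatures push the distribution toward uniform.

```python
import numpy as np

# Toy self-consuming loop for a categorical "generative model": at each
# generation the distribution is re-estimated purely from its own
# temperature-scaled samples (no external data). The update rule and
# parameter names are illustrative assumptions, not the dissertation's model.

def self_consuming_loop(p0, temperature, generations=200, n_samples=10_000, seed=0):
    rng = np.random.default_rng(seed)
    p = np.array(p0, dtype=float)
    for _ in range(generations):
        logits = np.log(np.clip(p, 1e-12, None)) / temperature  # temperature scaling
        q = np.exp(logits - logits.max())
        q /= q.sum()
        counts = rng.multinomial(n_samples, q)                   # model samples itself
        p = counts / n_samples                                   # refit on its own output
    return p

p0 = np.array([0.4, 0.3, 0.2, 0.1])
print("T = 0.8 (sharpening):", np.round(self_consuming_loop(p0, 0.8), 3))  # tends to collapse
print("T = 1.2 (flattening):", np.round(self_consuming_loop(p0, 1.2), 3))  # tends toward uniform
```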