A multi-agent system is defined as a collection of intelligent agents which are able to interact with each other or with their environments to solve problems that are beyond the individual capacities or knowledge of each problem solver. Distributed optimization in a multi-agent system allows for acting the agents in a distributed manner (using only local information from their neighbors) to achieve global optimization objectives cooperatively so as to increase flexibility and robustness. Examples of cooperative tasks include optimization of network flows, big-data analysis, design of sensor networks, multi-robot teams, and resource allocation. In this dissertation, two distributed optimization problems are investigated, where two different methods of convex optimization and reinforcement learning are employed to solve these problems.
In our first problem, a distributed time-varying convex optimization problem is studied for continuous-time multi-agent systems. The objective is to minimize the sum of local time-varying objective functions, each of which is known to only an individual agent, through local interaction. Here, the optimal point is time varying and creates an optimal trajectory, which renders extra challenge to well-studied distributed time-invariant optimization problems. To begin with, we study distributed time-varying optimization problems with convex objective functions for undirected topologies. We propose multiple optimization algorithms for different application scenarios: 1) optimization problems that do not have explicit bounds on any information about the local objective functions, 2) optimization problems with linear equality constraints, 3) optimization problems with nonlinear inequality constraints, and 4) optimization problems with both linear equality and nonlinear inequality constraints. Furthermore, we aim to seek a design methodology for distributed time-varying optimization under possibly weight-unbalanced directed networks---the most general and thus most challenging case from the network topology perspective. Particularly, we study distributed time-varying optimization problems with quadratic objective functions, which are also called distributed average tracking problems.
In the last part of this dissertation, the packet routing (path optimization) problem is addressed for distributed communication network systems that are consist of multiple routers and multiple links. Here, the goal is that all the routers work cooperatively to get all the packets delivered to their destinations through available paths with minimum time overall. Specifically, we aim to find optimal paths for packets in the presence of link failures, which is a crucial challenge faced by communication networks. We propose to leverage deep policy optimization (reinforcement learning) algorithms for enabling distributed model-free control in communication networks and present a novel meta-learning-based framework for enabling quick adaptation to topology changes.