Minimax Optimization for Games and for Fine-Tuning of Language Models
- Yuan, Huizhuo
- Advisor(s): Gu, Quanquan
Abstract
This thesis advances the field of convex-concave minimax optimization by introducing novel algorithms that outperform traditional methods in a range of stochastic optimization settings, including separable strongly convex-strongly concave (SC-SC) problems, bilinearly coupled strongly convex-strongly concave (bi-SC-SC) problems, and bilinear games. Notably, we propose the Accelerated Gradient-Optimistic Gradient (AG-OG) Descent Ascent method and the stochastic Accelerated Gradient-Extragradient (AG-EG) method. AG-OG achieves optimal convergence rates for separable minimax optimization and AG-EG for strongly monotone variational inequalities, and both match known lower bounds in the bilinear game setting. Furthermore, we explore the application of minimax optimization to large language model fine-tuning through a novel self-play mechanism, demonstrating significant performance improvements across multiple benchmarks. Our contributions provide powerful tools for both theoretical research and practical applications in AI and ML, pushing the boundaries of optimization techniques in these fields.
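To make the problem classes named above concrete, the following is a minimal illustrative sketch in assumed notation (f, g, B, step size \eta, and the noisy oracle \widetilde{F} are not taken from the thesis): a bilinearly coupled SC-SC objective and a generic stochastic extragradient step, not the exact AG-OG or AG-EG recursion.

% Illustrative bi-SC-SC objective: f and g strongly convex, B the bilinear coupling matrix.
\[
  \min_{x \in \mathbb{R}^{d_1}} \; \max_{y \in \mathbb{R}^{d_2}} \; f(x) + x^{\top} B y - g(y)
\]
% Generic stochastic extragradient step on the saddle operator
% F(z) = (\nabla f(x) + B y, \; \nabla g(y) - B^{\top} x) with z = (x, y),
% step size \eta, and stochastic estimate \widetilde{F}:
\[
  z_{t+1/2} = z_t - \eta \, \widetilde{F}(z_t), \qquad
  z_{t+1}   = z_t - \eta \, \widetilde{F}\!\left(z_{t+1/2}\right).
\]

Roughly speaking, the AG-OG and AG-EG methods studied in the thesis build on optimistic-gradient and extragradient steps of this kind, combined with accelerated gradient updates that exploit the separable strongly convex structure.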