Result: Federated Learning for Reinforcement Learning and Control
More Information
Federated learning (FL), a novel distributed learning paradigm, has attracted significant attention in the past few years. Federated algorithms adopt a client/server computation model and make it possible to train large-scale machine learning models over an edge-based distributed computing architecture. In FL, models are trained collaboratively under the coordination of a central server while the data remain stored locally on the edge devices (clients). This thesis addresses critical challenges in FL, focusing on supervised learning, reinforcement learning (RL), control systems, and personalized system identification. By developing robust, efficient algorithms, our research enhances FL’s applicability across diverse, real-world environments characterized by data heterogeneity and communication constraints. In the first part, we introduce an algorithm for supervised FL that addresses the challenges posed by heterogeneous client data, ensuring stable convergence and effective learning even with partial client participation. In the federated reinforcement learning (FRL) part, we develop algorithms that leverage similarities across heterogeneous environments to improve sample efficiency and accelerate policy learning. Our setup involves N agents interacting with environments that share the same state and action spaces but differ in their reward functions and state transition kernels. Through rigorous theoretical analysis, we show that information exchange via FL can expedite both policy evaluation and optimization in decentralized, multi-agent settings, enabling faster, more efficient, and robust learning. Extending FL into control systems, we propose the FedLQR algorithm, which enables agents with unknown but similar dynamics to collaboratively learn stabilizing policies, addressing the unique demands of closed-loop stability in federated control.
Our method overcomes numerous technical challenges, such as heterogeneity in the agents’ dynamics, multiple local updates, and stability concerns. We show that our proposed algorithm ...
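The client/server pattern described in the abstract (local training on private data, server-side aggregation, partial client participation, multiple local updates) can be illustrated with a minimal sketch of generic federated averaging. This is not the thesis's algorithm and not FedLQR. The function names, the scalar least-squares model, the client slopes, and all hyperparameters are assumptions chosen only to keep the example small and deterministic.

```python
import random

def local_sgd(w, data, lr, steps):
    """Run a few full-batch gradient steps on one client's private data."""
    for _ in range(steps):
        # Gradient of 0.5 * mean((w * x - y)^2) with respect to w.
        grad = sum((w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def federated_averaging(clients, rounds=50, lr=0.5, local_steps=5, frac=0.5, seed=0):
    """Generic federated averaging on a scalar model (illustrative only)."""
    rng = random.Random(seed)
    w = 0.0                                  # global model held by the server
    m = max(1, int(frac * len(clients)))     # clients sampled per round
    for _ in range(rounds):
        sampled = rng.sample(clients, m)     # partial client participation
        # Each sampled client starts from the global model and trains locally;
        # only the updated model, never the raw data, goes back to the server.
        local_models = [local_sgd(w, data, lr, local_steps) for data in sampled]
        w = sum(local_models) / len(local_models)   # server-side averaging
    return w

def make_client(slope, n, rng):
    """Hypothetical client whose data follow y = slope * x."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, slope * x) for x in xs]

# Heterogeneous clients: similar but not identical underlying slopes (assumed values).
data_rng = random.Random(1)
slopes = [1.8, 1.9, 2.0, 2.1, 2.2]
clients = [make_client(s, 20, data_rng) for s in slopes]

w_fed = federated_averaging(clients)
print(f"global model after training: {w_fed:.3f}")
```

Because every client has a different optimum, the global model does not settle on any single client's slope. It stays within the range spanned by the client slopes and keeps fluctuating slightly, depending on which clients are sampled in each round. This is a toy picture of the heterogeneity effect that the abstract lists among its challenges.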