Policy optimization (with neural networks as actor and critic) is the workhorse behind the success of deep reinforcement learning. However, its global convergence remains poorly understood, even in classical settings with linear function approximation. In this talk, I will show that, coupled with neural networks, a variant of proximal/trust-region policy optimization (PPO/TRPO) globally converges to the optimal policy. In particular, I will illustrate how the overparametrization of neural networks enables us to establish strong guarantees. (Joint work with Qi Cai, Jason Lee, Boyi Liu, and Zhuoran Yang.)