On the theory of policy gradient methods: Optimality, approximation, and distribution shift
Authors
Alekh Agarwal, Sham M Kakade, Jason D Lee, Gaurav Mahajan
Publication date
2021
Journal
Journal of Machine Learning Research
Volume
22
Issue
98
Pages
1-76
Total citations
Chart: citations per year, 2019-2025