
The 37 Implementation Details of Proximal Policy Optimization

Huang, Shengyi and Dossa, Rousslan Fernand Julien and Raffin, Antonin and Kanervisto, Anssi and Wang, Weixun (2022) The 37 Implementation Details of Proximal Policy Optimization. In: The ICLR Blog Track 2023. ICLR 2022, Virtual.

Full text not available from this repository.

Official URL: https://iclr-blog-track.github.io/2022/03/25/ppo-implementation-details/

Abstract

Proximal policy optimization (PPO) has become one of the most popular deep reinforcement learning (DRL) algorithms. Yet reproducing PPO's results has been challenging for the community. While recent works conducted ablation studies to provide insight into PPO's implementation details, these works are not structured as tutorials and focus only on details concerning robotics tasks. As a result, reproducing PPO from scratch can become a daunting experience. Instead of introducing additional improvements or doing further ablation studies, this blog post takes a step back and focuses on delivering a thorough reproduction of PPO on all accounts, as well as aggregating, documenting, and cataloging its most salient implementation details. This blog post also points out software engineering challenges in PPO and further efficiency improvements via accelerated vectorized environments. With these, we believe this blog post will help people understand PPO faster and better, facilitating customization and research upon this versatile RL algorithm.

Item URL in elib: https://elib.dlr.de/191986/
Document Type: Conference or Workshop Item (Other)
Title: The 37 Implementation Details of Proximal Policy Optimization
Authors:
Authors | Institution or Email of Authors | Author's ORCID iD | ORCID Put Code
Huang, Shengyi | UNSPECIFIED | UNSPECIFIED | UNSPECIFIED
Dossa, Rousslan Fernand Julien | UNSPECIFIED | UNSPECIFIED | UNSPECIFIED
Raffin, Antonin | UNSPECIFIED | https://orcid.org/0000-0001-6036-6950 | UNSPECIFIED
Kanervisto, Anssi | UNSPECIFIED | UNSPECIFIED | UNSPECIFIED
Wang, Weixun | UNSPECIFIED | UNSPECIFIED | UNSPECIFIED
Date: March 2022
Journal or Publication Title: The ICLR Blog Track 2023
Refereed publication: Yes
Open Access: No
Gold Open Access: No
In SCOPUS: No
In ISI Web of Science: No
Status: Published
Keywords: ppo, reinforcement learning, implementation, policy optimization
Event Title: ICLR 2022
Event Location: Virtual
Event Type: International Conference
HGF - Research field: Aeronautics, Space and Transport
HGF - Program: Space
HGF - Program Themes: Robotics
DLR - Research area: Space
DLR - Program: R RO - Robotics
DLR - Research theme (Project): R - Autonomous learning robots [RO]
Location: Oberpfaffenhofen
Institutes and Institutions: Institute of Robotics and Mechatronics (since 2013)
Institute of Robotics and Mechatronics (since 2013) > Cognitive Robotics
Deposited By: Raffin, Antonin
Deposited On: 08 Dec 2022 16:12
Last Modified: 29 Mar 2023 00:53
