Yan, Yashuai and Egle, Tobias and Ott, Christian and Lee, Dongheui (2026) Efficiently Learning Robust Torque-Based Locomotion Through Reinforcement With Model-Based Supervision. IEEE Robotics and Automation Letters, 11 (4), pp. 4155-4162. IEEE - Institute of Electrical and Electronics Engineers. doi: 10.1109/LRA.2026.3664534. ISSN 2377-3766.
This archive cannot provide the full text.
Official URL: https://ieeexplore.ieee.org/document/11395590
Abstract
We propose a control framework that integrates model-based bipedal locomotion with residual reinforcement learning (RL) to achieve robust and adaptive walking in the presence of real-world uncertainties. Our approach leverages a model-based controller--comprising a Divergent Component of Motion (DCM) trajectory planner and a whole-body controller--as a reliable base policy. To address the uncertainties of inaccurate dynamics modeling and sensor noise, we introduce a residual policy trained through RL with domain randomization. Crucially, we employ a model-based oracle policy, which has privileged access to ground-truth dynamics during training, to supervise the residual policy via a novel supervised loss. This supervision enables the policy to efficiently learn corrective behaviors that compensate for unmodeled effects without extensive reward shaping. Our method demonstrates improved robustness and generalization across a range of randomized conditions, offering a scalable solution for sim-to-real transfer in bipedal locomotion.
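The abstract does not give the form of the supervised loss, so the following is only a minimal sketch of one plausible reading: the residual is added to the base controller's joint torques, and a squared-error term pulls the corrected torque toward the oracle's torque. All function names, the squared-error choice, and the `weight` parameter are assumptions for illustration, not taken from the paper.

```python
from typing import List, Sequence


def combined_torque(base: Sequence[float], residual: Sequence[float]) -> List[float]:
    """Add the learned residual to the model-based base torque, joint by joint."""
    if len(base) != len(residual):
        raise ValueError("base and residual must have the same length")
    return [b + r for b, r in zip(base, residual)]


def supervised_loss(
    base: Sequence[float], residual: Sequence[float], oracle: Sequence[float]
) -> float:
    """Mean squared error between the corrected torque and the oracle torque.

    Assumption: the oracle (with privileged ground-truth dynamics) outputs a
    torque of the same dimension as the base controller.
    """
    corrected = combined_torque(base, residual)
    if len(corrected) != len(oracle):
        raise ValueError("oracle must have the same length as the torque vector")
    return sum((c - o) ** 2 for c, o in zip(corrected, oracle)) / len(oracle)


def total_loss(rl_loss: float, sup_loss: float, weight: float = 1.0) -> float:
    """Hypothetical combined objective: RL loss plus a weighted supervision term."""
    return rl_loss + weight * sup_loss
```

In this reading, a residual that exactly closes the gap between base and oracle torques yields zero supervised loss, so the supervision term gives the policy a dense training signal alongside the RL objective.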
| elib URL of this entry: | https://elib.dlr.de/224123/ |
|---|---|
| Document type: | Journal article |
| Title: | Efficiently Learning Robust Torque-Based Locomotion Through Reinforcement With Model-Based Supervision |
| Authors: | Yan, Yashuai; Egle, Tobias; Ott, Christian; Lee, Dongheui |
| Date: | 13 February 2026 |
| Published in: | IEEE Robotics and Automation Letters |
| Refereed publication: | Yes |
| Open Access: | No |
| Gold Open Access: | No |
| In SCOPUS: | Yes |
| In ISI Web of Science: | Yes |
| Volume: | 11 |
| DOI: | 10.1109/LRA.2026.3664534 |
| Page range: | pp. 4155-4162 |
| Publisher: | IEEE - Institute of Electrical and Electronics Engineers |
| ISSN: | 2377-3766 |
| Status: | Published |
| Keywords: | Locomotion |
| HGF - Research field: | Aeronautics, Space and Transport |
| HGF - Program: | Space |
| HGF - Program theme: | Robotics |
| DLR - Research area: | Space |
| DLR - Program: | R RO - Robotics |
| DLR - Research theme (Project): | R - Walking robots/Locomotion [RO], R - Basic technologies [RO] |
| Location: | Oberpfaffenhofen |
| Institutes and Institutions: | Institute of Robotics and Mechatronics (since 2013); Institute of Robotics and Mechatronics (since 2013) > Analysis and Control of Advanced Robotic Systems |
| Deposited by: | Strobl, Dr.-Ing. Klaus H. |
| Deposited on: | 29 Apr 2026 14:26 |
| Last modified: | 29 Apr 2026 14:26 |