A survey on policy search algorithms for learning robot controllers in a handful of trials

Chatzilygeroudis, Konstantinos and Vassiliades, Vassilis and Stulp, Freek and Calinon, Sylvain and Mouret, Baptiste (2019) A survey on policy search algorithms for learning robot controllers in a handful of trials. IEEE Transactions on Robotics, 36 (2), pp. 328-347. IEEE - Institute of Electrical and Electronics Engineers. doi: 10.1109/TRO.2019.2958211. ISSN 1552-3098.

This archive cannot provide the full text.

Official URL: https://ieeexplore.ieee.org/abstract/document/8944013

Abstract

Most policy search (PS) algorithms require thousands of training episodes to find an effective policy, which is often infeasible with a physical robot. This survey article focuses on the extreme other end of the spectrum: how can a robot adapt with only a handful of trials (a dozen) and a few minutes? By analogy with the word "big-data", we refer to this challenge as "micro-data reinforcement learning". In this article, we show that a first strategy is to leverage prior knowledge on the policy structure (e.g., dynamic movement primitives), on the policy parameters (e.g., demonstrations), or on the dynamics (e.g., simulators). A second strategy is to create data-driven surrogate models of the expected reward (e.g., Bayesian optimization) or the dynamical model (e.g., model-based PS), so that the policy optimizer queries the model instead of the real system. Overall, all successful micro-data algorithms combine these two strategies by varying the kind of model and prior knowledge. The current scientific challenges essentially revolve around scaling up to complex robots, designing generic priors, and optimizing the computing time.

elib URL of this record:

https://elib.dlr.de/136058/

Document type:

Journal article

Title:

A survey on policy search algorithms for learning robot controllers in a handful of trials

Authors:

Authors	Institution or email address	Author's ORCID iD	ORCID Put Code
Chatzilygeroudis, Konstantinos	Not specified	Not specified	Not specified
Vassiliades, Vassilis	Not specified	Not specified	Not specified
Stulp, Freek	Freek.Stulp (at) dlr.de	https://orcid.org/0000-0001-9555-9517	Not specified
Calinon, Sylvain	IDIAP, Switzerland	Not specified	Not specified
Mouret, Baptiste	Not specified	Not specified	Not specified

Date:

27 December 2019

Published in:

IEEE Transactions on Robotics

Refereed publication:

Open Access:

Gold Open Access:

No

In SCOPUS:

In ISI Web of Science:

Volume:

DOI:

10.1109/TRO.2019.2958211

Page range:

pp. 328-347

Publisher:

IEEE - Institute of Electrical and Electronics Engineers

ISSN:

1552-3098

Status:

Published

Keywords:

robotics, reinforcement learning

HGF - Research field:

Aeronautics, Space and Transport

HGF - Program:

Space

HGF - Program topic:

Space System Technology

DLR - Research focus:

Space

DLR - Research area:

R SY - Space System Technology

DLR - Sub-area (project, task):

R - Intelligent Mobility project (old)

Location:

Oberpfaffenhofen

Institutes & Facilities:

Institute of Robotics and Mechatronics (since 2013) > Cognitive Robotics

Deposited by:

Stulp, Freek

Deposited on:

14 Sep 2020 10:10

Last modified:

27 Jun 2023 09:33
