DisTop: Discovering a Topological representation to learn diverse and rewarding skills - Laboratoire d'InfoRmatique en Image et Systèmes d'information
Preprint, Working Paper. Year: 2021

DisTop: Discovering a Topological representation to learn diverse and rewarding skills

French title: DisTop: découvrir une représentation topologique pour apprendre des compétences diverses et récompensantes

Abstract

One efficient way for a deep reinforcement learning (DRL) agent to explore is to learn a set of skills that achieves a uniform distribution over states. Building on this idea, we introduce DisTop, a new model that simultaneously learns diverse skills and focuses on improving rewarding skills. DisTop progressively builds a discrete topology of the environment using an unsupervised contrastive loss, a growing network and a goal-conditioned policy. Using this topology, a state-independent hierarchical policy can select where in the state space the agent has to keep discovering skills, and can explicitly forget skills unrelated to tasks. In turn, the new set of visited states improves the learnt representation, and the learning loop continues. Our experiments emphasize that DisTop is agnostic to the ground state representation and that the agent can discover the topology of its environment whether the states are high-dimensional binary data, images, or proprioceptive inputs. We demonstrate that this paradigm is competitive with state-of-the-art algorithms on MuJoCo benchmarks, both on single-task dense rewards and on diverse skill discovery. By combining these two aspects, we show that DisTop outperforms a state-of-the-art hierarchical reinforcement learning (HRL) algorithm when rewards are sparse. We believe DisTop opens new perspectives by showing that bottom-up skill discovery combined with representation learning can tackle different hard reward settings.
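To make the idea of a growing discrete topology more concrete, here is a minimal Python sketch. It is not the authors' implementation: the class `GrowingTopology`, the `radius` threshold, the running-mean update and the inverse-count sampling rule are all assumptions chosen for illustration, and the contrastive representation and goal-conditioned policy mentioned in the abstract are left out entirely. The sketch only shows how a set of nodes could grow as new embedded states arrive, and how favouring rarely visited nodes pushes exploration toward a more uniform coverage of states.

```python
import numpy as np


class GrowingTopology:
    """Toy growing set of nodes over embedded states (illustrative assumption)."""

    def __init__(self, radius):
        self.radius = radius  # hypothetical distance threshold for creating a node
        self.nodes = []       # node centers in the embedding space
        self.counts = []      # number of states assigned to each node

    def update(self, z):
        """Assign embedding z to its nearest node, or create a new node."""
        z = np.asarray(z, dtype=float)
        if self.nodes:
            dists = np.linalg.norm(np.stack(self.nodes) - z, axis=1)
            i = int(np.argmin(dists))
            if dists[i] <= self.radius:
                self.counts[i] += 1
                # Move the center toward z (running mean of assigned states).
                self.nodes[i] += (z - self.nodes[i]) / self.counts[i]
                return i
        # No node is close enough: grow the topology with a new node.
        self.nodes.append(z.copy())
        self.counts.append(1)
        return len(self.nodes) - 1

    def sampling_probs(self):
        """Favor rarely visited nodes: probability proportional to 1 / count."""
        inv = 1.0 / np.asarray(self.counts, dtype=float)
        return inv / inv.sum()

    def sample_goal_node(self, rng):
        """Pick the index of the node where the agent should explore next."""
        return int(rng.choice(len(self.nodes), p=self.sampling_probs()))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    topo = GrowingTopology(radius=1.0)
    for z in rng.normal(size=(200, 2)):
        topo.update(z)
    print(len(topo.nodes), topo.sample_goal_node(rng))
```

In this sketch a node visited three times is sampled a third as often as a node visited once, which is one simple way to bias goal selection toward under-explored regions. A reward-aware variant would additionally weight nodes by the task reward observed there.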
Main file
DisTop.pdf (2.07 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-03352684, version 1 (23-09-2021)

Identifiers

  • HAL Id: hal-03352684, version 1

Cite

Arthur Aubret, Laetitia Matignon, Salima Hassas. DisTop: Discovering a Topological representation to learn diverse and rewarding skills. 2021. ⟨hal-03352684⟩
