DisTop: Discovering a Topological representation to learn diverse and rewarding skills

Arthur Aubret; Laetitia Matignon; Salima Hassas

Preprint, Working Paper. Year: 2021



Arthur Aubret

Role: Author

Laboratoire d'InfoRmatique en Image et Systèmes d'information

Systèmes Cognitifs et Systèmes Multi-Agents

Laetitia Matignon

Role: Author

Laboratoire d'InfoRmatique en Image et Systèmes d'information

Systèmes Cognitifs et Systèmes Multi-Agents

Salima Hassas

Role: Author

Laboratoire d'InfoRmatique en Image et Systèmes d'information

Systèmes Cognitifs et Systèmes Multi-Agents

Abstract

One efficient way for a deep reinforcement learning (DRL) agent to explore is to learn a set of skills that achieves a uniform distribution over states. Building on this idea, we introduce DisTop, a new model that simultaneously learns diverse skills and focuses on improving rewarding skills. DisTop progressively builds a discrete topology of the environment using an unsupervised contrastive loss, a growing network and a goal-conditioned policy. Using this topology, a state-independent hierarchical policy can select where the agent has to keep discovering skills in the state space and explicitly forget skills unrelated to tasks. In turn, the newly visited states improve the learned representation, and the learning loop continues. Our experiments indicate that DisTop is agnostic to the ground state representation and that the agent can discover the topology of its environment whether the states are high-dimensional binary data, images, or proprioceptive inputs. We demonstrate that this paradigm is competitive with state-of-the-art algorithms on MuJoCo benchmarks, both for single-task dense rewards and for diverse skill discovery. By combining these two aspects, we show that DisTop outperforms a state-of-the-art hierarchical reinforcement learning (HRL) algorithm when rewards are sparse. We believe DisTop opens new perspectives by showing that bottom-up skill discovery combined with representation learning can tackle different hard reward settings.
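As a rough illustration of the idea of a discrete topology that grows with the visited states, the toy sketch below clusters already-embedded states into nodes and samples under-visited nodes as exploration targets. It is not the authors' implementation: the class name `GrowingTopology`, the distance threshold, the node update step and the inverse-visit-count sampling rule are all assumptions made for this example, and the embeddings are taken as given, whereas the abstract describes learning them with a contrastive loss.

```python
import math
import random


class GrowingTopology:
    """Toy growing set of cluster nodes over embedded states (illustrative only)."""

    def __init__(self, new_node_threshold=1.0, learning_rate=0.1):
        self.new_node_threshold = new_node_threshold  # hypothetical distance threshold
        self.learning_rate = learning_rate            # hypothetical node update step
        self.nodes = []   # node centers in the embedding space
        self.visits = []  # number of embedded states assigned to each node

    def nearest(self, embedding):
        """Return (index, distance) of the closest node, or (None, inf) if empty."""
        best_index, best_distance = None, math.inf
        for index, center in enumerate(self.nodes):
            distance = math.dist(center, embedding)
            if distance < best_distance:
                best_index, best_distance = index, distance
        return best_index, best_distance

    def update(self, embedding):
        """Assign an embedding to a node; create a new node if it is far from all."""
        index, distance = self.nearest(embedding)
        if index is None or distance > self.new_node_threshold:
            self.nodes.append(list(embedding))
            self.visits.append(1)
            return len(self.nodes) - 1
        # Move the winning node slightly toward the new embedding.
        center = self.nodes[index]
        for d in range(len(center)):
            center[d] += self.learning_rate * (embedding[d] - center[d])
        self.visits[index] += 1
        return index

    def sample_goal_node(self, rng):
        """Sample a node with probability inversely proportional to its visit count."""
        weights = [1.0 / count for count in self.visits]
        return rng.choices(range(len(self.nodes)), weights=weights, k=1)[0]


# Usage: three nearby embeddings share one node, a distant one creates a second node.
topology = GrowingTopology(new_node_threshold=1.0)
for point in [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]:
    topology.update(point)

rng = random.Random(0)
goal_node = topology.sample_goal_node(rng)  # the rarely visited node is favored
```

Sampling in inverse proportion to visit counts is only one simple way to push exploration toward a more uniform coverage of nodes; the selection rule used in the paper may differ.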

Domains

Artificial intelligence [cs.AI]; Neural networks [cs.NE]; Machine learning [cs.LG]

Main file

DisTop.pdf (2.07 MB)

Origin: Files produced by the author(s)


https://hal.science/hal-03352684

Submitted on: Thursday, September 23, 2021, 14:06:44

Last modified on: Thursday, April 18, 2024, 16:44:45

Long-term archiving on: Friday, December 24, 2021, 20:49:26

Dates and versions

hal-03352684, version 1 (23-09-2021)

Identifiers

HAL Id: hal-03352684, version 1

Cite

Arthur Aubret, Laetitia Matignon, Salima Hassas. DisTop: Discovering a Topological representation to learn diverse and rewarding skills. 2021. ⟨hal-03352684⟩


Collections

CNRS UNIV-LYON1 UNIV-LYON2 INSA-LYON EC-LYON LIRIS INSA-GROUPE UDL

