

Poster in Workshop: Exploration in AI Today (EXAIT)

Exploration by Exploitation: Curriculum Learning for Reinforcement Learning Agents through Competence-Based Curriculum Policy Search

Tabitha Edith Lee · Rosemary Nan Ke · Sarvesh Patil · Annya Dahmani · Eunice Yiu · Esra'a Saleh · Alison Gopnik · Oliver Kroemer · Glen Berseth

Keywords: [ reinforcement learning ] [ exploration by exploitation ] [ curriculum learning ]


Abstract:

We present CURATE, an algorithm for automatic curriculum learning that enables reinforcement learning agents to solve a difficult target task distribution with sparse rewards. Without informed priors or specialized exploration algorithms, agents initially may be unable to receive rewards consistently, leading to inefficient learning. Through "exploration by exploitation," CURATE dynamically scales the task difficulty to match the agent's current competence. By exploiting capabilities learned on easier tasks, the agent improves its exploration of more difficult ones. While training the agent, CURATE conducts policy search in the curriculum space to learn a task distribution concentrated on the easiest tasks the agent has not yet solved. As the agent's mastery grows, the learned curriculum adapts correspondingly in an approximately easiest-to-hardest fashion, efficiently culminating in an agent that can solve the target tasks. Our experiments demonstrate that the curricula learned by CURATE achieve greater sample efficiency on the target tasks than state-of-the-art algorithms and most baselines. Although a hand-designed curriculum performed better for one-dimensional curricula, CURATE excels in two-dimensional curricula, where the optimal task sequencing is not obvious.
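A minimal sketch of the core idea the abstract describes, that is, maintaining a curriculum distribution whose probability mass tracks the easiest tasks the agent has not yet solved. All names, constants, and the toy competence model below are hypothetical illustrations, not the authors' CURATE implementation:

```python
# Hypothetical sketch of competence-based curriculum sampling
# (illustrative only; not the authors' CURATE implementation).
import numpy as np

rng = np.random.default_rng(0)

N_LEVELS = 10          # discretized task difficulties, easy -> hard (assumed)
TARGET_SUCCESS = 0.5   # "frontier": tasks the agent solves about half the time
TEMPERATURE = 5.0      # sharpness of the curriculum distribution

success_rate = np.zeros(N_LEVELS)   # running success estimate per difficulty
counts = np.zeros(N_LEVELS)

def curriculum_distribution():
    """Weight each difficulty by how close its success rate is to the
    frontier, so probability mass sits on the easiest unsolved tasks."""
    score = -TEMPERATURE * np.abs(success_rate - TARGET_SUCCESS)
    probs = np.exp(score - score.max())
    return probs / probs.sum()

def train_episode(level):
    """Stand-in for an RL rollout: success is more likely on easy tasks
    and improves as a level is practiced (a toy competence model)."""
    competence = counts[level] / (counts[level] + 10.0)
    p_success = np.clip(0.9 - 0.08 * level + 0.5 * competence, 0.0, 1.0)
    return rng.random() < p_success

for step in range(2000):
    probs = curriculum_distribution()
    level = rng.choice(N_LEVELS, p=probs)   # sample a task from the curriculum
    solved = train_episode(level)
    counts[level] += 1
    # Exponential moving average of per-level success (the competence signal).
    success_rate[level] += 0.05 * (float(solved) - success_rate[level])

print("final success rates:", np.round(success_rate, 2))
```

Under these assumptions, the sampled difficulty drifts from easy to hard as per-level success rates rise, mirroring the approximately easiest-to-hardest ordering described above; the paper's method instead learns the distribution via policy search rather than this fixed frontier heuristic.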
