ICML Poster Beyond The Rainbow: High Performance Deep Reinforcement Learning on a Desktop PC

Tyler Clark · Mark Towers · Christine Evers · Jonathon Hare

West Exhibition Hall B2-B3 #W-608

Thu 17 Jul 11 a.m. PDT — 1:30 p.m. PDT

Abstract:

Rainbow Deep Q-Network (DQN) demonstrated that combining multiple independent enhancements could significantly boost a reinforcement learning (RL) agent’s performance. In this paper, we present “Beyond The Rainbow” (BTR), a novel algorithm that integrates six improvements from across the RL literature into Rainbow DQN, establishing a new state-of-the-art for RL using a desktop PC, with a human-normalized interquartile mean (IQM) of 7.6 on Atari-60. Beyond Atari, we demonstrate BTR’s capability to handle complex 3D games, successfully training agents to play Super Mario Galaxy, Mario Kart, and Mortal Kombat with minimal algorithmic changes. Because BTR is designed with computational efficiency in mind, agents can be trained on 200 million Atari frames within 12 hours using a high-end desktop PC. Additionally, we conduct detailed ablation studies of each component, analyzing its performance and impact using numerous measures.
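For readers unfamiliar with the headline metric, the sketch below shows how a human-normalized interquartile mean is commonly computed. It is an illustration under assumptions, not the authors' evaluation code: the function names are hypothetical, the normalization uses the usual (agent - random) / (human - random) form, and the trimming rule (drop the lowest and highest 25% of scores, average the rest) follows the common definition of IQM. The exact protocol behind the 7.6 figure is not described on this page.

```python
import numpy as np


def human_normalized_score(agent, random, human):
    """Rescale raw game scores so that 0 = random play and 1 = human play.

    All three arguments are raw scores (scalars or arrays of equal shape).
    The reference values are supplied by the caller; none are built in.
    """
    agent, random, human = (
        np.asarray(x, dtype=float) for x in (agent, random, human)
    )
    return (agent - random) / (human - random)


def interquartile_mean(scores):
    """Mean of the middle 50% of scores (a 25% trimmed mean).

    Scores are flattened, sorted, and the lowest and highest quarter
    (rounded down) are discarded before averaging.
    """
    flat = np.sort(np.asarray(scores, dtype=float).ravel())
    if flat.size == 0:
        raise ValueError("scores must not be empty")
    cut = int(0.25 * flat.size)  # number of entries dropped at each end
    return float(flat[cut:flat.size - cut].mean())
```

In use, one would normalize each game's score with that game's random and human reference values, collect the normalized scores across games and seeds, and pass them to `interquartile_mean`. Because the extremes are trimmed, a single game with a very large normalized score does not dominate the aggregate the way it would with a plain mean.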

Lay Summary:

Reinforcement Learning (RL) can teach computers to make decisions by trial and error, achieving superhuman performance in many domains such as video games and robotics. However, state-of-the-art RL methods often require days or weeks of training on expensive GPU clusters, placing them out of reach for most researchers, students, and small companies. In response, we introduce Beyond The Rainbow (BTR), an algorithm that combines six recent advancements from the field. On a modern desktop PC, BTR reaches near–state-of-the-art performance on 60 Atari games in under 12 hours. We further demonstrate that, with minimal changes, BTR can learn to play complex 3D titles such as Super Mario Galaxy and Mario Kart. By reducing both training time and hardware cost, BTR makes high-performance RL accessible to anyone with a consumer-grade machine. This democratization promises to broaden participation in RL research and accelerate innovation.
