Poster

EARL-BO: Reinforcement Learning for Multi-Step Lookahead, High-Dimensional Bayesian Optimization

Mujin Cheon · Jay Lee · Dong-Yeun Koh · Calvin Tsay

West Exhibition Hall B2-B3 #W-603
Wed 16 Jul 4:30 p.m. PDT — 7 p.m. PDT

Abstract:

To avoid myopic behavior, multi-step lookahead Bayesian optimization (BO) algorithms consider the sequential nature of BO and have demonstrated promising results in recent years. However, owing to the curse of dimensionality, most of these methods make significant approximations or suffer from scalability issues. This paper presents a novel reinforcement learning (RL)-based framework for multi-step lookahead BO in high-dimensional black-box optimization problems. The proposed method enhances the scalability and decision-making quality of multi-step lookahead BO by using RL to solve the sequential dynamic program of the BO process efficiently and near-optimally. We first introduce an Attention-DeepSets encoder to represent the state of knowledge to the RL agent, and subsequently propose a multi-task fine-tuning procedure based on end-to-end (encoder-RL) on-policy learning. We evaluate the proposed method, EARL-BO (Encoder Augmented RL for BO), on synthetic benchmark functions and hyperparameter tuning problems, finding significantly improved performance compared to existing multi-step lookahead and high-dimensional BO methods.
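The abstract names two components: a set encoder that summarizes the observed data and an RL agent trained on the encoded state. As a minimal sketch of the first component, the PyTorch module below encodes the history of evaluations {(x_i, y_i)} into a fixed-size, permutation-invariant state vector by composing a per-point embedding, set-wise self-attention, and mean pooling. The layer sizes, the pooling choice, and all names here are illustrative assumptions, not the paper's actual specification.

```python
import torch
import torch.nn as nn

class AttentionDeepSetsEncoder(nn.Module):
    """Hypothetical sketch of an Attention-DeepSets state encoder.

    Maps the set of observed points {(x_i, y_i)} to a fixed-size vector
    that is invariant to the order of observations. The composition
    (per-point MLP -> self-attention -> mean pooling -> MLP) is an
    assumption based on standard DeepSets/attention designs.
    """

    def __init__(self, dim_x: int, dim_hidden: int = 64, num_heads: int = 4):
        super().__init__()
        # phi: per-point embedding of the concatenated (x, y) pair
        self.phi = nn.Sequential(
            nn.Linear(dim_x + 1, dim_hidden),
            nn.ReLU(),
            nn.Linear(dim_hidden, dim_hidden),
        )
        # Self-attention lets each observation attend to the others,
        # capturing interactions that plain DeepSets pooling would miss.
        self.attn = nn.MultiheadAttention(dim_hidden, num_heads, batch_first=True)
        # rho: maps the pooled set summary to the final state vector
        self.rho = nn.Sequential(
            nn.Linear(dim_hidden, dim_hidden),
            nn.ReLU(),
            nn.Linear(dim_hidden, dim_hidden),
        )

    def forward(self, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        # x: (batch, n_obs, dim_x); y: (batch, n_obs, 1)
        h = self.phi(torch.cat([x, y], dim=-1))  # (batch, n_obs, dim_hidden)
        h, _ = self.attn(h, h, h)                # set-wise self-attention
        pooled = h.mean(dim=1)                   # permutation-invariant pooling
        return self.rho(pooled)                  # (batch, dim_hidden) state


# Usage: encode 10 observations of a 6-dimensional black-box function
encoder = AttentionDeepSetsEncoder(dim_x=6)
x = torch.rand(1, 10, 6)
y = torch.randn(1, 10, 1)
state = encoder(x, y)  # fixed-size input for the RL policy network
```

Mean pooling keeps the representation invariant to the indexing of the evaluation history, so the downstream RL policy sees the same state regardless of the order in which points were queried.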

Lay Summary:

Optimizing complex systems, such as tuning training settings for a machine learning model, often involves trial and error, which can be expensive and time-consuming. A smart approach called Bayesian optimization (BO) helps choose the best experiments to run, but most methods only think one step ahead. Looking multiple steps ahead can lead to better decisions, but this quickly becomes computationally overwhelming, especially when many variables are involved. We introduce a method called EARL-BO, which uses reinforcement learning (RL) to make smarter, multi-step decisions efficiently, even in high-dimensional problems. Key contributions include a framework for encoding the current state of information and a procedure for training the RL algorithm. Our experiments show that EARL-BO outperforms existing methods on both synthetic tasks and real-world scenarios such as tuning machine learning models.
