Poster
Armijo Line-search Can Make (Stochastic) Gradient Descent Provably Faster
Sharan Vaswani · Reza Babanezhad
West Exhibition Hall B2-B3 #W-501
Gradient descent (GD) is the standard optimization method for training machine learning (ML) models. The performance of GD is sensitive to the choice of its step-size parameter. Armijo line-search is a common technique to "search" for a good step-size in each GD step. Armijo line-search not only makes GD more robust, but also makes the method faster in practice. In this paper, we theoretically characterize just how fast it can go. We show that for common ML objectives such as logistic regression, GD with Armijo line-search (GDLS) can be exponentially faster than GD with a fixed, pre-determined step-size. Moreover, for specific problems in supervised learning and reinforcement learning, we prove that GDLS can theoretically match or outperform algorithms explicitly designed for these problems. Our results thus demonstrate the universal effectiveness of GDLS, and show that this classic algorithm is all you need!
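To make the method concrete, below is a minimal sketch of GD with Armijo backtracking line-search applied to logistic regression. This is an illustrative implementation, not the paper's code: the step-size schedule, the sufficient-decrease constant `c`, the backtracking factor `beta`, and the initial trial step `eta0` are all assumed defaults chosen for the example, and `gd_armijo`, `logistic_loss`, and `logistic_grad` are hypothetical helper names.

```python
import numpy as np

def logistic_loss(w, X, y):
    # Average logistic loss with labels y in {-1, +1}
    z = y * (X @ w)
    return np.mean(np.log1p(np.exp(-z)))

def logistic_grad(w, X, y):
    # Gradient of the average logistic loss
    z = y * (X @ w)
    s = -y / (1.0 + np.exp(z))
    return X.T @ s / len(y)

def gd_armijo(w0, X, y, c=0.5, beta=0.8, eta0=1.0, iters=200):
    # GD where each step-size is found by Armijo backtracking:
    # shrink eta until f(w - eta*g) <= f(w) - c * eta * ||g||^2
    w = w0.copy()
    for _ in range(iters):
        g = logistic_grad(w, X, y)
        f = logistic_loss(w, X, y)
        eta = eta0
        while logistic_loss(w - eta * g, X, y) > f - c * eta * (g @ g):
            eta *= beta  # backtrack until sufficient decrease holds
        w = w - eta * g
    return w
```

A typical call would pass a zero initialization, e.g. `w = gd_armijo(np.zeros(X.shape[1]), X, y)`; because the step-size is re-searched at every iteration, no Lipschitz constant or manual tuning is required, which is the robustness property the abstract highlights.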