Poster in Affinity Workshop: New In ML
Hessian Matrix’s Key Eigenvalue Estimation for Accelerated Convergence
Abstract:
Related studies show that among various zeroth-order optimization algorithms in machine learning, the most effective approach is to build a quadratic model of the objective function from its observed values. Such a model relies on information from the Hessian matrix. Once the Hessian is obtained, its inverse enables Newton's method, which converges much faster than stochastic gradient descent (SGD) and adaptive moment estimation (Adam). However, because its computational complexity grows as $\mathrm{O}(n^3)$ with the dimensionality $n$, Newton's method is impractical for high-dimensional optimization tasks such as training deep learning models. To address this, we propose Hessian matrix's key eigenvalue estimation (HKEE), an algorithm with $\mathrm{O}(1)$ complexity. HKEE estimates the key eigenvalue of the Hessian matrix along the gradient direction to approximate the step size of Newton's method. The result is an optimizer that keeps SGD-level overhead while converging significantly faster. Our method combines the strengths of Newton's method and SGD, accelerating the training of deep learning models with relatively low memory usage. Code will be released after peer review.
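The abstract does not include an implementation, but the core idea it describes, estimating the curvature of the loss along the current gradient direction and using its reciprocal as a Newton-like step size, can be sketched with a finite-difference Hessian-vector product. The following is a minimal, self-contained illustration on a quadratic objective; the function names, the finite-difference scheme, and all parameters here are assumptions made for exposition, not the authors' released HKEE code.

```python
import numpy as np

def loss_grad(theta, A, b):
    # Gradient of the quadratic loss 0.5 * theta^T A theta - b^T theta.
    return A @ theta - b

def curvature_scaled_step(theta, A, b, eps=1e-4, lam_min=1e-8):
    # Hypothetical sketch of the idea in the abstract: estimate the
    # curvature (key eigenvalue) of the Hessian along the gradient
    # direction, then take a Newton-like step scaled by 1/lambda.
    g = loss_grad(theta, A, b)
    d = g / (np.linalg.norm(g) + 1e-12)           # unit gradient direction
    # Finite-difference Hessian-vector product:
    # (grad(theta + eps*d) - grad(theta)) / eps ~= H d
    hvp = (loss_grad(theta + eps * d, A, b) - g) / eps
    lam = max(d @ hvp, lam_min)                   # curvature estimate d^T H d
    return theta - g / lam                        # step size 1/lambda

rng = np.random.default_rng(0)
Q = rng.standard_normal((5, 5))
A = Q @ Q.T + 5 * np.eye(5)                       # symmetric positive-definite Hessian
b = rng.standard_normal(5)
theta = np.zeros(5)
for _ in range(20):
    theta = curvature_scaled_step(theta, A, b)
print("residual:", np.linalg.norm(A @ theta - b))
```

On this strictly convex quadratic the estimate $d^\top H d$ is exact up to floating-point error, and each step costs only one extra gradient evaluation rather than forming or inverting the full Hessian, which is consistent with the abstract's claim of keeping SGD-level overhead.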