

Poster in Workshop: 3rd Workshop on High-dimensional Learning Dynamics (HiLD)

Generalisation and Safety Critical Evaluations at Sharp Minima: A Geometric Reappraisal

Israel Mason-Williams · Gabryel Mason-Williams · Helen Yannakoudakis


Abstract:

The geometric flatness of neural network minima has long been associated with desirable generalisation properties. In this paper, we extensively test the hypothesis that robust, calibrated and functionally similar models sit at flatter minima, in line with the prevailing understanding of the relationship between flatness and generalisation. Contrary to common assertions in the literature, we find a relationship between increased sharpness and generalisation, calibration, robustness and functional representation in neural networks across architectures when using Sharpness-Aware Minimisation, augmentation and weight decay as regulariser controls. Our findings suggest that the role of increased sharpness should be considered independently for individual models when reasoning about the geometric properties of neural networks. We show that sharpness can be related to generalisation and safety-relevant properties relative to the flatter minima found without the use of our regularisation controls. Understanding these properties calls for a re-evaluation of the role of sharpness in geometric loss landscapes.
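For context on one of the regulariser controls named above: Sharpness-Aware Minimisation (SAM) optimises the worst-case loss in a small neighbourhood of the current weights via a two-pass update. The sketch below is a minimal, illustrative PyTorch implementation of that update, not the authors' code; the function name `sam_step` and the default radius `rho=0.05` are assumptions chosen for the example.

```python
import torch

def sam_step(model, loss_fn, x, y, base_optimizer, rho=0.05):
    """One SAM update: ascend to a nearby sharp point, then descend from there."""
    # First pass: gradient of the loss at the current weights.
    loss = loss_fn(model(x), y)
    loss.backward()

    # Perturb weights along the normalised gradient direction (inner ascent step).
    with torch.no_grad():
        grad_norm = torch.norm(torch.stack([
            p.grad.norm(p=2) for p in model.parameters() if p.grad is not None
        ]), p=2)
        perturbations = {}
        for p in model.parameters():
            if p.grad is None:
                continue
            e = rho * p.grad / (grad_norm + 1e-12)
            p.add_(e)
            perturbations[p] = e
    model.zero_grad()

    # Second pass: gradient at the perturbed weights drives the actual update.
    loss_fn(model(x), y).backward()
    with torch.no_grad():
        for p, e in perturbations.items():
            p.sub_(e)  # restore the original weights before stepping
    base_optimizer.step()
    base_optimizer.zero_grad()
    return loss.item()
```

Here `base_optimizer` would be any standard optimiser built over the same parameters, e.g. `torch.optim.SGD(model.parameters(), lr=0.1)`; `rho` controls the neighbourhood radius over which sharpness is penalised.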
