Poster in Workshop: Methods and Opportunities at Small Scale (MOSS)
An Empirical Investigation of Initialization Strategies for Kolmogorov–Arnold Networks
Spyros Rigas · Dhruv Verma · Georgios Alexandridis · Yixuan Wang
Keywords: [initialization] [power law] [deep learning] [Kolmogorov–Arnold networks]
Kolmogorov–Arnold Networks (KANs) are a recently introduced class of neural architectures that replace fixed activation functions with trainable ones, offering greater flexibility and interpretability. Although KANs have shown promising results across various tasks, little attention has been paid to how they should be initialized. In this work, we explore alternative initialization strategies, including two variance-preserving methods based on classical ideas and an empirical power-law approach with tunable exponents. Using function fitting as a small-scale testbed, we run a large grid search over architectures and initialization settings. We find that power-law configurations consistently outperform the standard baseline initialization across all architectures. The variance-preserving methods tend to underperform on smaller models but outperform the baseline as networks grow deeper and wider, though they still do not match the performance of power-law initialization. Overall, our results highlight initialization as an important yet underexplored aspect of KANs and point to several directions for future work.
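The abstract does not spell out the exact form of these initializers, so the following minimal NumPy sketch illustrates one plausible reading: KAN spline coefficients drawn from a zero-mean Gaussian whose scale either decays as a tunable power law of the layer's fan-in or follows a classical LeCun-style 1/fan-in variance rule. The function names, the specific scaling formulas, and the parameters `alpha` and `beta` are illustrative assumptions, not the paper's definitions.

```python
import numpy as np

def power_law_init(shape, fan_in, alpha=1.0, beta=0.5, rng=None):
    """Sample spline coefficients from N(0, sigma^2) with a power-law scale.

    sigma = beta * fan_in ** (-alpha) is an assumed illustrative form;
    alpha and beta play the role of the tunable exponents/prefactors
    mentioned in the abstract.
    """
    rng = np.random.default_rng() if rng is None else rng
    sigma = beta * fan_in ** (-alpha)
    return rng.normal(loc=0.0, scale=sigma, size=shape)

def variance_preserving_init(shape, fan_in, rng=None):
    """LeCun-style scaling (variance 1/fan_in), a classical variance-preserving
    heuristic adapted here to KAN spline coefficients for comparison."""
    rng = np.random.default_rng() if rng is None else rng
    return rng.normal(loc=0.0, scale=np.sqrt(1.0 / fan_in), size=shape)

# Example: initialize the coefficient tensor of a hypothetical KAN layer
# mapping n_in inputs to n_out outputs with G basis functions per edge.
n_in, n_out, G = 8, 16, 12
coef_power = power_law_init((n_out, n_in, G), fan_in=n_in, alpha=1.0, beta=0.5)
coef_lecun = variance_preserving_init((n_out, n_in, G), fan_in=n_in)
print(coef_power.std(), coef_lecun.std())
```

In practice, such initializers would be swept over a grid of exponent values alongside architecture choices, mirroring the grid search described in the abstract.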