Poster
Maximum Update Parametrization and Zero-Shot Hyperparameter Transfer for Fourier Neural Operators
Shanda Li · Shinjae Yoo · Yiming Yang
East Exhibition Hall A-B #E-1607
Scientists use computer programs called neural networks to solve complex physics equations that describe phenomena like fluid flow, heat transfer, and wave propagation. To handle harder problems, these networks must grow much larger, but finding the right training settings for a large model becomes prohibitively expensive, sometimes requiring months of computing time. Our research addresses this with a mathematical result: if the model's internal scales (its weight initialization and learning rates) are adjusted in a specific, principled way as the model grows, the optimal training settings remain constant across model sizes. This means we can find the best settings using a small, inexpensive version of the model, then apply those same settings to train a much larger one, much as a recipe for a small cake can guide the baking of a wedding cake. Testing our approach on fluid dynamics problems, we successfully train models with nearly one billion parameters while using only about 30% of the traditional computational cost, making powerful physics simulators accessible to researchers without breaking their computing budgets.
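For readers who want a concrete picture, here is a minimal sketch of the kind of width-dependent scaling rule that Maximum Update Parametrization (muP) prescribes for a fully connected weight matrix trained with Adam. The base width, base learning rate, and function name are hypothetical placeholders for illustration, not values or code from the paper:

```python
import math

BASE_WIDTH = 64   # hypothetical width at which hyperparameters are tuned (the cheap model)
BASE_LR = 3e-4    # hypothetical learning rate found by sweeping the small model

def mup_layer_settings(fan_in, is_output=False):
    """Return (init_std, adam_lr, output_multiplier) for one weight matrix
    under muP-style scaling, so a learning rate tuned at BASE_WIDTH can be
    reused unchanged as the model grows wider."""
    width_ratio = fan_in / BASE_WIDTH
    init_std = 1.0 / math.sqrt(fan_in)              # initialization shrinks with fan-in
    adam_lr = BASE_LR / width_ratio                 # per-layer Adam LR scales ~ 1/width
    out_mult = 1.0 / width_ratio if is_output else 1.0  # damp the readout layer
    return init_std, adam_lr, out_mult

# The same BASE_LR, tuned once at width 64, is reused at every width;
# only the per-layer scales change mechanically with the width ratio.
for width in (64, 256, 1024):
    std, lr, mult = mup_layer_settings(fan_in=width)
    print(f"width={width:5d}  init_std={std:.4f}  adam_lr={lr:.2e}  out_mult={mult:.4f}")
```

The key point the sketch illustrates is that no new hyperparameter search happens at the large width: the scaling rules convert the small model's tuned settings into the large model's settings automatically, which is what makes the transfer "zero-shot."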