Poster
HiRemate: Hierarchical Approach for Efficient Re-materialization of Neural Networks
Julia Gusak · Xunyi Zhao · Théotime Le Hellard · Zhe LI · Lionel Eyraud-Dubois · Olivier Beaumont
East Exhibition Hall A-B #E-1803
Training deep neural networks requires storing many intermediate results (activations) during the forward pass so they can be reused during backpropagation. Even when a model's weights fit on a single GPU, the total memory needed for training can exceed the device's capacity, largely because of these intermediate values. Re-materialization reduces memory usage by selectively recomputing some activations instead of storing all of them. For large models, however, deciding which activations to recompute is a challenging optimization problem.

We introduce HiRemate, a framework that tackles this problem hierarchically. The computation graph of the neural network is first divided into parts small enough for the problem to be solved efficiently; our algorithm then merges these partial solutions, repeatedly if necessary, until a complete solution for the entire graph is obtained. HiRemate is designed for models whose weights fit in GPU memory and focuses on reducing activation memory during training. It also supports re-materialization strategies from the literature, making it easy to combine different methods within a single framework.

We evaluated HiRemate on a range of common neural networks and consistently observed large memory savings with only a small increase in training time, making it easier to train modern deep learning models on limited hardware.
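To make the memory/compute trade-off behind re-materialization concrete, here is a minimal sketch using plain PyTorch activation checkpointing (this illustrates the underlying mechanism, not the HiRemate API): a uniform segmentation of a sequential model that stores only segment-boundary activations and recomputes the rest during the backward pass. HiRemate's contribution is to replace such a fixed, uniform policy with schedules obtained by hierarchically solving and merging subproblems on the full computation graph.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint_sequential

# 16 identical blocks stand in for a deep network.
model = nn.Sequential(
    *[nn.Sequential(nn.Linear(1024, 1024), nn.ReLU()) for _ in range(16)]
)
x = torch.randn(8, 1024, requires_grad=True)

# Uniformly checkpoint the chain in 4 segments: only the activations at
# segment boundaries are kept during the forward pass; the interior ones
# are recomputed on the fly during backpropagation, trading extra compute
# for lower peak activation memory.
y = checkpoint_sequential(model, 4, x, use_reentrant=False)
y.sum().backward()
```

Here the recomputation plan is fixed in advance (every fourth boundary); a solver such as the one described above instead searches over which activations to keep, subject to a memory budget.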