Poster in Workshop: CODEML: Championing Open-source DEvelopment in Machine Learning
Gundam: A PyTorch Native Platform for Training Generative AI Models
Tianyu Liu · Wanchao Liang
Gundam is a PyTorch-native, open-source platform (anonymized GitHub: https://anonymous.4open.science/r/gundam-F883) designed for scalable and flexible training of generative AI models. It integrates tightly with PyTorch's distributed stack, offers efficient optimizations and modular configurations, and showcases elastic training of LLMs with composable 4-D parallelism. Gundam also provides extensible abstractions for experimenting with new model architectures (e.g., diffusion models) and infrastructure techniques (e.g., a compiler-first FSDP implementation), while favoring a clean, minimal codebase. This paper presents the motivation, system architecture, and demonstrated impact of Gundam, underscoring its alignment with the CODEML mission to advance open, sustainable machine learning development.