Workshop
2nd AI for Math Workshop @ ICML 2025
Yinya Huang · Zhicheng Yang · Xiaodan Liang · Zhengying Liu · Zhijiang Guo · Swaroop Mishra · Mateja Jamnik · Kun Zhang · Isabelle Guyon · Yilun Zhou · Marina Vinyes · Mert Unsal · Jing Tang · Haiming Wang · Wenda Li · Bin Dong · Alex Gu · Baran Hashemi · Zhizhen Qin · Soonho Kong · Leni Aniva · Xiaohan Lin · Kun Xiang
West Ballroom C
Fri 18 Jul, 8:30 a.m. PDT
Mathematical reasoning stands as a pinnacle of human intelligence. The rapid advancements in artificial intelligence, particularly in large language models (LLMs), have opened new frontiers at the intersection of AI and mathematical reasoning. This workshop aims to explore the potential of AI in comprehending and advancing mathematical reasoning, with a focus on fostering collaboration between humans and machines to push the boundaries of mathematical discovery. The central theme revolves around the question: "How can we leverage and advance the mathematical reasoning abilities of machine learning models, and drive innovation across scientific and practical domains?"

Our workshop will bring together researchers from diverse backgrounds, institutions, and disciplines to discuss the progress and future of AI technologies in mathematics. Specifically, we will delve into the following areas:
* Automated Theorem Proving: How can we build consistent theorem-proving systems? How can theorem-proving systems assist humans through human-computer interaction?
* Automated Theorem Generation: Can neural models generate new and practically meaningful theorems that have not yet been discovered? How can we utilize these newly generated theorems?
* Autoformalization and Verification: How can we improve the precision of translating natural language proofs into formal proofs, and vice versa?
* Problem Solving: How can we develop AI models to solve complex mathematical computational problems across various domains? How can AI models improve themselves during the learning process?
* Applications of AI in Mathematics: What are the practical applications of AI-driven mathematical reasoning in various fields such as sciences, engineering, finance, and education?

The intended outcome is to identify new ideas, open problems, and interdisciplinary areas for future research related to mathematical reasoning.
To this end, we welcome papers on topics including, but not limited to:
* Algorithm: How can we develop effective algorithms (e.g., reinforcement learning, self-improvement, self-evolution) to improve reasoning ability? What are the key principles for developing algorithms that minimize resource consumption (e.g., time, memory) while maintaining or improving reasoning performance?
* Data Generation: Can AI models generate questions that they cannot answer correctly? Can AI models achieve self-improvement through self-generated data?
* Tool Utilization: How can AI systems leverage existing tools, such as code and software, to solve practical mathematical problems more effectively?
* Limitation Analysis: What are the drawbacks or limitations of current models in mathematical reasoning (e.g., robustness, generalization, and reasoning boundaries)? How can these limitations be quantitatively analyzed?
Schedule
Fri 8:30 a.m. - 8:35 a.m. | Opening Remarks
Fri 8:35 a.m. - 9:05 a.m. | Invited Talk - Is Mathematics Obsolete? | Jeremy Avigad
Fri 9:05 a.m. - 9:35 a.m. | Invited Talk - Formalizing the Future: Lean’s Impact on Mathematics, Programming, and AI | Leonardo de Moura
Fri 9:35 a.m. - 9:40 a.m. | Challenge Remarks
Fri 9:40 a.m. - 9:55 a.m. | Contributed Talk - Challenge Track 1 Winner
Fri 9:55 a.m. - 10:10 a.m. | Contributed Talk - Challenge Track 2 Winner | Ruitao Wu
Fri 10:10 a.m. - 10:25 a.m. | Contributed Talk: Best Paper Award I - Token Hidden Reward: Steering Exploration-Exploitation in GRPO Training | wenlong deng
Fri 10:25 a.m. - 10:40 a.m. | Contributed Talk: Best Paper Award II - Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model? | Yang Yue
Fri 10:40 a.m. - 10:50 a.m. | Award Ceremony
Fri 10:50 a.m. - 12:20 p.m. | Poster Session
Fri 12:20 p.m. - 12:45 p.m. | Lunch Break
Fri 12:45 p.m. - 1:00 p.m. | Goedel-Prover-V2: The Strongest Open-Source Theorem Prover to Date | Chi Jin
Fri 1:00 p.m. - 1:30 p.m. | Invited Talk - Learning Efficient Recursive Numeral Systems via Reinforcement Learning | Moa Johansson
Fri 1:30 p.m. - 2:00 p.m. | Invited Talk - Large Language Models for Math (Education): From Problem-Solving to Teaching Problem-Solving | Mrinmaya Sachan
Fri 2:00 p.m. - 2:30 p.m. | Invited Talk - AI and the Verified Software Grand Challenge | Swarat Chaudhuri
Fri 2:30 p.m. - 3:00 p.m. | AIMO Remarks - AI4Math in 2025: Closing Gaps, Exposing Fault Lines | Simon Frieder
Fri 3:00 p.m. - 3:30 p.m. | Invited Talk - Open Training Recipes for Mathematical Reasoning in Language Models | Valentina Pyatkin
Fri 3:30 p.m. - 3:45 p.m. | Coffee Break
Fri 3:45 p.m. - 4:15 p.m. | Invited Talk - How can Machine Learning Help Mathematicians? | Amaury Hayat
Fri 4:15 p.m. - 4:45 p.m. | Invited Talk - Beyond Theorem Proving: Towards Proof Engineering at Scale | Huajian Xin
Fri 4:45 p.m. - 5:55 p.m. | Panel Discussion | Jeremy Avigad · Leonardo de Moura · Moa Johansson · Amaury Hayat · Sergei Gukov · Shubho Sengupta
Fri 5:55 p.m. - 6:00 p.m. | Closing Remarks
IntegralBench: Benchmarking LLMs with Definite Integral Problems (Poster) | Bintao Tang · Xin Yang · Yuhao Wang · Zixuan Qiu · Zimo Ji · Wenyuan Jiang
Inequality Ranking and Inference System ($\texttt{\textbf{IRIS}}$): Giving Mathematical Conjectures Numerical Value (Poster) | Jillian Eddy · Randy Davila · Jesus De Loera · Junwei Lu · Ethan Fang · Zini Yang
Not All Votes Count! Translated Program for Verification Improves Self-Consistency of Language Models for Math Reasoning (Poster) | Vernon Yan Han Toh · Deepanway Ghosal · Soujanya Poria
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model? (Oral) | Yang Yue · Zhiqi Chen · Rui Lu · Andrew Zhao · Zhaokai Wang · Yang Yue · Shiji Song · Gao Huang
Putting the Value Back in RL: Better Test-Time Scaling by Unifying LLM Reasoners With Verifiers (Poster) | Kusha Sareen · Morgane Moss · Alessandro Sordoni · Rishabh Agarwal · Seyedarian Hosseini
Beyond Single-Task: Robust Multi-Task Length Generalization for LLMs (Poster) | Yi Hu · Shijia Kang · Haotong Yang · xu · Muhan Zhang
NeSyGeo: A Neuro-Symbolic Framework for Multimodal Geometric Reasoning Data Generation (Poster) | Weiming Wu · Zi-Kang Wang · Jin Ye · Zhi Zhou · Yu-Feng Li · Lan-Zhe Guo
Lean Meets Theoretical Computer Science: Scalable Synthesis of Theorem Proving Challenges in Formal-Informal Pairs (Poster) | Terry Jingchen Zhang · Wenyuan Jiang · Rongchuan Liu · Yisong Wang · Ning Wang · Junran Yang · Yinya Huang · Mrinmaya Sachan
TinyV: Reducing False Negatives in Verification Improves RL for LLM Reasoning (Poster) | Zhangchen Xu · Yuetai Li · Fengqing Jiang · Bhaskar Ramasubramanian · Luyao Niu · Yuchen Lin · Radha Poovendran
Prover Agent: An Agent-based Framework for Formal Mathematical Proofs (Poster) | Kaito Baba · Chaoran Liu · Shuhei Kurita · Akiyoshi Sannai
Putnam-AXIOM: A Functional and Static Benchmark (Poster) | Aryan Gulati · Brando Miranda · Eric Chen · Emily Xia · Kai Fronsdal · Bruno de Moraes Dumont · Sanmi Koyejo
SoS1: O1 and R1-Like Reasoning LLMs are Sum-of-Square Solvers (Poster) | Kechen Li · Wenqi Zhu · Coralia Cartis · Tianbo Ji · Shiwei Liu
VeriBench: End-to-End Formal Verification Benchmark for AI Code Generation in Lean 4 (Poster) | Brando Miranda · Zhanke Zhou · Allen Nie · Elyas Obbad · Leni Aniva · Kai Fronsdal · Weston Kirk · Dilara Soylu · Andrea Yu · Ying Li · Sanmi Koyejo
Learning to Discover Abstractions for LLM Reasoning (Poster) | Yuxiao Qu · Anikait Singh · Yoonho Lee · Amrith Setlur · Russ Salakhutdinov · Chelsea Finn · Aviral Kumar
e3: Learning to Explore Enables Extrapolation of Test-Time Compute for LLMs (Poster) | Amrith Setlur · Matthew Yang · Charlie Snell · Jeremiah Greer · Ian Wu · Virginia Smith · Max Simchowitz · Aviral Kumar
InternLM2.5-StepProver: Advancing Automated Theorem Proving via Critic-Guided Search (Poster) | Zijian Wu · Suozhi Huang · Zhejian Zhou · Huaiyuan Ying · Zheng Yuan · Wenwei Zhang · Dahua Lin · Kai Chen
When To Solve, When To Verify: Compute-Optimal Problem Solving and Generative Verification for LLM Reasoning (Poster) | Nishad Singhi · Hritik Bansal · Seyedarian Hosseini · Aditya Grover · Kai-Wei Chang · Marcus Rohrbach · Anna Rohrbach
VerifierQ: Enhancing LLM Test Time Compute with Q-Learning-based Verifiers (Poster) | Jianing Qi · Hao Tang · Zhigang Zhu
Entropy-Based Adaptive Weighting for Self-Training (Poster) | Xiaoxuan Wang · Yihe Deng · Mingyu Ma · Wei Wang
CLEVER: A Curated Benchmark for Formally Verified Code Generation (Poster) | Amitayush Thakur · Jasper Lee · George Tsoukalas · Meghana Sistla · Matthew Zhao · Stefan Zetzsche · Greg Durrett · Yisong Yue · Swarat Chaudhuri
LeanTree: Accelerating White-Box Proof Search with Factorized States in Lean 4 (Poster) | Matěj Kripner · Michal Sustr · Milan Straka
Widening the Mathematical Search Space with Abstraction‑Encouraging Prompts (Poster) | Shervin Ardeshir
Scalable Best-of-N Selection for Large Language Models via Self-Certainty (Poster) | Zhewei Kang · Xuandong Zhao · Dawn Song
Ada-R1: Hybrid CoT via Bi-Level Adaptive Reasoning Optimization (Poster) | Haotian Luo · Haiying He · Yibo Wang · Jinluan Yang · Rui Liu · Naiqiang Tan · Xiaochun Cao · Dacheng Tao · Li Shen
$Q\sharp$: Provably Optimal Distributional RL for LLM Post-Training (Poster) | Jin Zhou · Kaiwen Wang · Jonathan Chang · Zhaolin Gao · Nathan Kallus · Kilian Weinberger · Kianté Brantley · Wen Sun
Reward Inside the Model: A Lightweight Hidden‑State Reward Model for LLM's Best-of-N sampling (Poster) | Jizhou Guo · Zhaomin Wu · Philip Yu
Optimizing Anytime Reasoning via Budget Relative Policy Optimization (Poster) | Penghui Qi · Zichen Liu · Tianyu Pang · Chao Du · Wee Sun Lee · Min Lin
Direct Induction Proof Challenge: Evaluating Large Language Models on Deeply Nested Mathematical Induction (Poster) | Risako Ando · Koji Mineshima · Mitsuhiro Okada
RealMath: A Continuous Benchmark for Evaluating Language Models on Research-Level Mathematics (Poster) | Jie Zhang · Cezara Petrui · Kristina Nikolić · Florian Tramer
Understanding R1-Zero-Like Training: A Critical Perspective (Poster) | Zichen Liu · Changyu Chen · Wenjun Li · Penghui Qi · Tianyu Pang · Chao Du · Wee Sun Lee · Min Lin
O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning (Poster) | Haotian Luo · Li Shen · Haiying He · Yibo Wang · Shiwei Liu · Wei Li · Naiqiang Tan · Xiaochun Cao · Dacheng Tao
DSR-Bench: Evaluating the Structural Reasoning Abilities of LLMs via Data Structures (Poster) | Yu He · Yingxi Li · Colin White · Ellen Vitercik
A Survey on Large Language Model Reasoning Failures (Poster) | Peiyang Song · Pengrui Han · Noah Goodman
PADRE: Pseudo-Likelihood based Alignment of Diffusion Language Models (Poster) | Shiv Shankar
ProofWala: Multilingual Proof Data Synthesis and Theorem-Proving (Poster) | Amitayush Thakur · George Tsoukalas · Greg Durrett · Swarat Chaudhuri
Reinforcement Learning Teachers of Test Time Scaling (Poster) | Edoardo Cetin · Tianyu Zhao · Yujin Tang
Is Human-Written Data Enough? The Challenge of Teaching Reasoning to LLMs Without RL or Distillation (Poster) | Wei Du · Branislav Kisacanin · George Armstrong · Shubham Toshniwal · Ivan Moshkov · Alexan Ayrapetyan · Sadegh Mahdavi · Dan Zhao · Shizhe Diao · Dragan Mašulović · Advaith Avadhanam · Max Wang · Shitij Govil · Sri Yanamandra · Mihir Tandon · Sriram Ananthakrishnan · Vedant Rathi · David Zhang · Joonseok Kang · Leon Luo · Titu Andreescu · Ashmit Dutta · Boris Ginsburg · Igor Gitman
Token Hidden Reward: Steering Exploration-Exploitation in GRPO Training (Oral) | wenlong deng · YI REN · Danica J Sutherland · Christos Thrampoulidis · Xiaoxiao Li
Boosting LLM Reasoning via Spontaneous Self-Correction (Poster) | Xutong Zhao · Tengyu Xu · Xuewei Wang · Zhengxing Chen · Di Jin · Liang Tan · Yen-Ting Lin · Zishun Yu · Zhuokai Zhao · Yun He · Sinong Wang · Han Fang · Sarath Chandar · Chen Zhu
COAST: Intelligent Time-Adaptive Neural Operators (Poster) | Zhikai Wu · Shiyang Zhang · Sizhuang He · Sifan Wang · Min Zhu · Anran Jiao · Lu Lu · David van Dijk
OctoThinker: Mid-Training Incentivizes Reinforcement Learning Scaling (Poster) | Zengzhi Wang · Fan Zhou · Xuefeng Li · Pengfei Liu
The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning (Poster) | Xinyu Zhu · Mengzhou Xia · Zhepei Wei · Wei-Lin Chen · Danqi Chen · Yu Meng
Solving Inequality Proofs with Large Language Models (Poster) | Jiayi Sheng · Luna Lyu · Jikai Jin · Tony Xia · Alex Gu · James Zou · Pan Lu
Simple, Scalable Reasoning via Iterated Summarization (Poster) | Vivek Vajipey · Aditya Tadimeti · Justin Shen · Ben Prystawski · Michael Li · Noah Goodman
Landscape of Thoughts: Visualizing the Reasoning Process of Large Language Models (Poster) | Zhanke Zhou · Zhaocheng Zhu · Xuan Li · Mikhail Galkin · Xiao Feng · Sanmi Koyejo · Jian Tang · Bo Han
On the Effect of Negative Gradient in Group Relative Deep Reinforcement Optimization (Poster) | wenlong deng · YI REN · Muchen Li · Danica J Sutherland · Xiaoxiao Li · Christos Thrampoulidis
Instilling Parallel Reasoning into Language Models (Poster) | Matthew Macfarlane · Minseon Kim · Nebojsa Jojic · Weijia Xu · Lucas Caccia · Xingdi Yuan · Wanru Zhao · Zhengyan Shi · Alessandro Sordoni
Majority of the Bests: Improving Best-of-N via Bootstrapping (Poster) | Amin Rakhsha · Tianyu Zhang · Kanika Madan · Amir-massoud Farahmand · Amir Khasahmadi
Target-Based Automated Conjecturing for Neural Theorem Proving (Poster) | Marco Dos Santos · Albert Jiang · Wenda Li · Mateja Jamnik
Inferring Loop Invariants for Program Verification: an Abductive Learning Perspective (Poster) | Luan Dai-Yang · Ming Li
Scaling Natural-Language Graph-Based Test Time Compute for Automated Theorem Proving (Poster) | Tim Knappe · Vincent Li · Yule Fu · Kevin Han · Kevin Zhu
SPEED-RL: Faster Training of Reasoning Models via Online Curriculum Learning (Poster) | Ruiqi Zhang · Daman Arora · Song Mei · Andrea Zanette
Training Language Models to Reason Efficiently (Poster) | Daman Arora · Andrea Zanette
The Open Proof Corpus: A Large-Scale Study of LLM-Generated Mathematical Proofs (Poster) | Jasper Dekoninck · Ivo Petrov · Kristian Minchev · Miroslav Marinov · Maria Drencheva · Lyuba Konova · Milen Shumanov · Kaloyan Tsvetkov · Nikolay Drenchev · Lazar Todorov · Kalina Nikolova · Nikolay Georgiev · Vanesa Kalinkova · Margulan Ismoldayev · Mislav Balunovic · Martin Vechev
KELPS: A Framework for Verified Multi-Language Autoformalization via Semantic-Syntactic Alignment (Poster) | Jiyao Zhang · Chengli Zhong · Hui Xu · Qige Li · Jiajia Tian · Yi Zhou
Let’s Try Again: Eliciting Multi-Turn Reasoning in Language Models via Simplistic Feedback (Poster) | Licheng Liu · Zihan Wang · Linjie Li · Chenwei Xu · Yiping Lu · Han Liu · Avirup Sil · Manling Li
EquivaMap: Leveraging LLMs for Automatic Equivalence Checking of Optimization Formulations (Poster) | Haotian Zhai · Connor Lawless · Ellen Vitercik · Liu Leqi
Plane Geometry Diagram Formalization via Vision-Language Models (Poster) | Xiaoteng Cui · Yi Liu
MM-Agent: LLM as Agents for Real-world Mathematical Modeling Problem (Poster) | Fan Liu · Zhe-Rui Yang · Cancheng Liu · Tianrui Song · Xiaofeng Gao · Hao Liu
Lemmanaid: Neuro-Symbolic Lemma Conjecturing (Poster) | Yousef Alhessi · Sólrún Einarsdóttir · George Granberry · Emily First · Moa Johansson · Sorin Lerner · Nicholas Smallbone
Learning-Guided Local Search for Asymmetric Traveling Salesman Problem (Poster) | Lejun Zhou · Yi Ju · Scott Moura
Forget Less, Solve More: Sequential Fine-Tuning with Adapter Shrinking for Math Word Problems (Poster) | Gauri Toshniwal · S Balasundaram
Learning an Effective Premise Retrieval Model for Efficient Mathematical Formalization (Poster) | Yicheng Tao · Haotian Liu · Shanwen Wang · Hongteng Xu
Scaling Mathematical Reasoning through Data, Tools, and Generative Selection (Poster) | Ivan Moshkov · Darragh Hanley · Ivan Sorokin · Shubham Toshniwal · Christof Henkel · Benedikt Schifferer · Wei Du · Igor Gitman
CoDaPO: Confidence and Difficulty-Adaptive Policy Optimization for Post-Training Language Models (Poster) | Zhanke Zhou · Xiangyu Lu · Chentao Cao · Brando Miranda · Tongliang Liu · Bo Han · Sanmi Koyejo
Vision Language Models are Biased: Counting legs of an animal is surprisingly hard (Poster) | An Vo · Khai-Nguyen Nguyen · Mohammad Reza Taesiri · Vy Dang · Anh Nguyen · Daeyoung Kim
Discovering Hidden Algebraic Structures via Transformers with Rank-Aware Beam GRPO (Poster) | Jaeha Lee · Gio Huh · Ning Su · Tony YU
LeanTutor: A Lean-Verified Tutor for Mathematical Proofs (Poster) | Manooshree Patel · Rayna Bhattacharyya · Thomas Lu · Arnav Mehta · Niels Voss · Narges Norouzi · Gireeja Ranade
Which Data Attributes Stimulate Math and Code Reasoning? An Investigation via Influence Functions (Poster) | Siqi Kou · Qingyuan Tian · Hanwen Xu · Zihao Zeng · Zhijie Deng
ProofCompass: Enhancing Specialized Provers with LLM Guidance (Poster) | Nicolas Wischermann · Claudio Mayrink Verdun · Gabriel Poesia · Francesco Noseda
Done Is Better than Perfect: Unlocking Efficient Reasoning by Structured Multi-Turn Decomposition (Poster) | Zihao Zeng · Xuyao Huang · Boxiu Li · Hao Zhang · Zhijie Deng
Machine Learning and LLM-Boost Symbolic Regression for Predicting $\mathbb{Q}$-Gonality of Modular Curves (Poster) | Xu Zhuang · Yuxiang Yao · Po-Chu Hsu · Xiaokang Wang · Peikai Qi
Graph Neural Networks for Tensor Product Decompositions of Lie Algebra Representations (Poster) | Max Vargas · Helen Jenne · Davis Brown · Henry Kvinge
A Markov Categorical Framework for Language Modeling (Poster) | Yifan Zhang
On the Limits of RLVR: Support, Entropy, and the Illusion of Reasoning (Poster) | Fang Wu · Yejin Choi
Learning to Solve Complex Problems via Dataset Decomposition (Poster) | Wanru Zhao · Lucas Caccia · Zhengyan Shi · Minseon Kim · Xingdi Yuan · Weijia Xu · Marc-Alexandre Côté · Alessandro Sordoni
DeduCE: Deductive Consistency as a Framework to Evaluate LLM Reasoning (Poster) | Atharva Pandey · Kshitij Dubey · Rahul Sharma · Amit Sharma
Boolformer: Symbolic Regression of Logic Functions with Transformers (Poster) | Stéphane d'Ascoli · Arthur Renard · Emmanuel Abbe · Vassilis Papadopoulos · Samy Bengio · Joshua M Susskind
Spectral Policy Optimization: Coloring your Incorrect Reasoning in GRPO (Poster) | Chen · Xiaopeng Li · Ziniu Li · Xi Chen · Tianyi Lin
Reward Under Attack: Evaluating the Sensitivity of Process Reward Models (Poster) | Udbhav Bamba · Rishabh Tiwari · Heng Yang · Kurt Keutzer · Amir Gholaminejad
NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation (Poster) | Xiangyan Liu · Jinjie Ni · Zijian Wu · Chao Du · Longxu Dou · Haonan Wang · Tianyu Pang · Michael Shieh
Chain-of-Thought Reasoning for Math: Theoretical Foundation and Applications (Poster) | Jessica Liang
Enhancing Graph Neural Network for Boolean Satisfiability Solving via Data Augmentation (Poster) | Yi Fu · Anthony Tompkins · Yang Song · Maurice Pagnucco
Learning to Reason without External Rewards (Poster) | Xuandong Zhao · Zhewei Kang · Aosong Feng · Sergey Levine · Dawn Song
Do Not Let Low-Probability Tokens Over-Dominate in RL for LLMs (Poster) | Zhihe Yang · Xufang Luo · Zilong Wang · Dongqi Han · Zhiyuan He · Dongsheng Li · Yunjian Xu
Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks (Poster) | Taishi Nakamura · Satoki Ishikawa · Masaki Kawamura · Takumi Okamoto · Daisuke Nohara · Jun Suzuki · Rio Yokota
Beyond Accuracy: A Policy Gradient Reweighting Approach for Pass@K Maximization in LLMs (Poster) | Sadegh Mahdavi · Muchen Li · Kaiwen Liu · Renjie Liao · Christos Thrampoulidis
Discrete Feynman-Kac Correctors (Poster) | Mohsin Hasan · Marta Skreta · Alan Aspuru-Guzik · Yoshua Bengio · Kirill Neklyudov
A Comprehensive Evaluation of Contemporary Machine-Learning-Based Solvers for CO (Poster) | Shengyu Feng · Weiwei Sun · Shanda Li · Ameet Talwalkar · Yiming Yang
How Is LLM Reasoning Distracted by Irrelevant Context? An Analysis Using a Controlled Benchmark (Poster) | Minglai Yang · Ethan Huang · Liang Zhang · Mihai Surdeanu · William Wang · Liangming Pan
From Narrative to Formalism: A Case Study in the Origin of Molecular Translation System (Poster) | Dmitry Zubarev
Towards Geometry Problem Solving in the Large Model Era: A Survey (Poster) | Yurui Zhao · Xiang Wang · Jiahong Liu · Irwin King · Zhitao Huang
A Practical Two-Stage Recipe for Mathematical LLMs: Maximizing Accuracy with SFT and Efficiency with Reinforcement Learning (Poster) | Hiroshi Yoshihara · Taiki Yamaguchi · Yuichi Inoue
Omni-Think: Scaling Multi-Task Learning in LLMs via Reinforcement Learning (Poster) | Derek Li · Jiaming Zhou · Amirreza Kazemi · Qianyi Sun · Abbas Ghaddar · Liheng Ma · Yu Luo · Dong Li · Jianye Hao · Yingxue Zhang
GenSelect: A Generative Approach to Best-of-N (Poster) | Shubham Toshniwal · Ivan Sorokin · Aleksander Ficek · Ivan Moshkov · Igor Gitman
Generalized Tree Edit Distance (GTED): A Faithful Evaluation Metric for Statement Autoformalization (Poster) | Yuntian Liu · Tao Zhu · Xiaoyang LIU · Yu Chen · Liu ZhaoXuan · Guo qingfeng · Jiashuo Zhang · Kangjie Bao · Tao Luo
FMC: Formalization of Natural Language Mathematical Competition Problems (Poster) | Jiaxuan Xie · Chengwu Liu · Ye Yuan · Siqi Li · Zhiping Xiao · Ming Zhang
A Compute-Matched Re-Evaluation of TroVE on MATH (Poster) | Tobias Sesterhenn · Ian Berlot-Attwell · Janis Zenkner · Christian Bartelt
Lean Finder: Semantic Search for Mathlib That Understands User Intents (Poster) | Jialin Lu · Kye Emond · Weiran Sun · Wuyang Chen
RL‑QESA: Reinforcement‑Learning Quasi‑Equilibrium Simulated Annealing (Poster) | Ruichen Xu · Kai Li · Haochun Wang · Georgios Kementzidis · Wei Zhu · Yuefan Deng
Verina: Benchmarking Verifiable Code Generation (Poster) | Zhe Ye · Zhengxu Yan · Jingxuan He · Timothe Kasriel · Kaiyu Yang · Dawn Song
Proof or Bluff? Evaluating LLMs on 2025 USA Math Olympiad (Poster) | Ivo Petrov · Jasper Dekoninck · Lyuben Baltadzhiev · Maria Drencheva · Kristian Minchev · Mislav Balunovic · Nikola Jovanović · Martin Vechev
Physics-Constrained Symbolic Regression from Imagery (Poster) | Zhenyu Yu · MOHD IDRIS · Pei Wang
Small Models Struggle to Learn from Strong Reasoners (Poster) | Yuetai Li · Xiang Yue · Zhangchen Xu · Fengqing Jiang · Luyao Niu · Yuchen Lin · Bhaskar Ramasubramanian · Radha Poovendran
Value-Guided Search for Efficient Chain-of-Thought Reasoning (Poster) | Kaiwen Wang · Jin Zhou · Jonathan Chang · Zhaolin Gao · Nathan Kallus · Kianté Brantley · Wen Sun
Chain of Thought in Order: Discovering Learning-Friendly Orders for Arithmetic (Poster) | Yuta Sato · Kazuhiko Kawamoto · Hiroshi Kera
Governing Equation Discovery from Data Based on Differential Invariants (Poster) | Lexiang Hu · Yikang Li · Zhouchen Lin
Temporal Sampling for Forgotten Reasoning in LLMs (Poster) | Yuetai Li · Zhangchen Xu · Fengqing Jiang · Bhaskar Ramasubramanian · Luyao Niu · Yuchen Lin · Xiang Yue · Radha Poovendran
POD-KAN-NO: a physically interpretable neural operator (Poster) | Yanyu Ke
MIRB: Mathematical Information Retrieval Benchmark (Poster) | Haocheng Ju · Bin Dong
README: Rapid Equation Discovery with Multimodal Encoders (Poster) | Gregory Kang Ruey Lau · Yue Kang · Zi-Yu Khoo · Apivich Hemachandra · Ruth Wan Theng Chew · Bryan Kian Hsiang Low
Learning Moderately Input-Sensitive Functions: A Case Study in QR Code Decoding (Poster) | Kazuki Yoda · Kazuhiko Kawamoto · Hiroshi Kera