ICML 2025 Papers

The Limits of Tractable Marginalization

One-Pass Feature Evolvable Learning with Theoretical Guarantees

Portable Reward Tuning: Towards Reusable Fine-Tuning across Different Pretrained Models

Momentum-Driven Adaptivity: Towards Tuning-Free Asynchronous Federated Learning

Continuous Bayesian Model Selection for Multivariate Causal Discovery

Incentivize without Bonus: Provably Efficient Model-based Online Multi-agent RL for Markov Games

Online Learning in Risk Sensitive constrained MDP

AdaWorld: Learning Adaptable World Models with Latent Actions

EVOLvE: Evaluating and Optimizing LLMs For In-Context Exploration

Fine-Grained Captioning of Long Videos through Scene Graph Consolidation

NeuroTree: Hierarchical Functional Brain Pathway Decoding for Mental Health Disorders

CommVQ: Commutative Vector Quantization for KV Cache Compression

Investigating Non-Transitivity in LLM-as-a-Judge

Should Decision-Makers Reveal Classifiers in Online Strategic Classification?

Position: You Can't Manufacture a NeRF

Blink of an eye: a simple theory for feature localization in generative models

Unsupervised Learning for Class Distribution Mismatch

Chip Placement with Diffusion Models

Otter: Generating Tests from Issues to Validate SWE Patches

The Impact of On-Policy Parallelized Data Collection on Deep Reinforcement Learning Networks

Reinforcement Learning with Random Time Horizons

Bridging Fairness and Efficiency in Conformal Inference: A Surrogate-Assisted Group-Clustered Approach

On the Provable Separation of Scales in Maximal Update Parameterization

A Near-Optimal Single-Loop Stochastic Algorithm for Convex Finite-Sum Coupled Compositional Optimization

Faster Stochastic Optimization with Arbitrary Delays via Adaptive Asynchronous Mini-Batching

Gradient Aligned Regression via Pairwise Losses

InfoCons: Identifying Interpretable Critical Concepts in Point Clouds via Information Theory

Discovering Global False Negatives On the Fly for Self-supervised Contrastive Learning

Discriminative Finetuning of Generative Large Language Models without Reward Models and Human Preference Data

NeuronTune: Towards Self-Guided Spurious Bias Mitigation

Sample Efficient Demonstration Selection for In-Context Learning

Neural Guided Diffusion Bridges

Hierarchical Reinforcement Learning with Targeted Causal Interventions

CERTAIN: Context Uncertainty-aware One-Shot Adaptation for Context-based Offline Meta Reinforcement Learning

Boosting Adversarial Robustness with CLAT: Criticality Leveraged Adversarial Training

Sanity Checking Causal Representation Learning on a Simple Real-World System

ELMO: Efficiency via Low-precision and Peak Memory Optimization in Large Output Spaces

Position: Not All Explanations for Deep Learning Phenomena Are Equally Valuable

Can Transformers Learn Full Bayesian Inference in Context?

Online Episodic Convex Reinforcement Learning

Revisiting Unbiased Implicit Variational Inference

Position: The Future of Bayesian Prediction Is Prior-Fitted

Streamline Without Sacrifice - Squeeze out Computation Redundancy in LMM

Learning to Trust Bellman Updates: Selective State-Adaptive Regularization for Offline RL

Circumventing Backdoor Space via Weight Symmetry

Computing Voting Rules with Improvement Feedback

Towards flexible perception with visual memory

Nonlinear transformers can perform inference-time feature learning

Concept Reachability in Diffusion Models: Beyond Dataset Constraints

Revisiting Diffusion Models: From Generative Pre-training to One-Step Generation

Spatial Reasoning with Denoising Models

FlexiReID: Adaptive Mixture of Expert for Multi-Modal Person Re-Identification

Bayesian Basis Function Approximation for Scalable Gaussian Process Priors in Deep Generative Models

Contrastive Visual Data Augmentation

Position: Scaling LLM Agents Requires Asymptotic Analysis with LLM Primitives

Discovering Symbolic Cognitive Models from Human and Animal Behavior

An Asymptotically Optimal Approximation Algorithm for Multiobjective Submodular Maximization at Scale

Upweighting Easy Samples in Fine-Tuning Mitigates Forgetting

Testing Conditional Mean Independence Using Generative Neural Networks

Low-Rank Tensor Transitions (LoRT) for Transferable Tensor Regression

Explicit Preference Optimization: No Need for an Implicit Reward Model

StealthInk: A Multi-bit and Stealthy Watermark for Large Language Models

Physics-Informed Weakly Supervised Learning For Interatomic Potentials

Position: Explainable AI Cannot Advance Without Better User Studies

Online Differentially Private Conformal Prediction for Uncertainty Quantification

Griffin: Towards a Graph-Centric Relational Database Foundation Model

ResQ: Mixed-Precision Quantization of Large Language Models with Low-Rank Residuals

Inverse Optimization via Learning Feasible Regions

RealRAG: Retrieval-augmented Realistic Image Generation via Self-reflective Contrastive Learning

Adaptive Message Passing: A General Framework to Mitigate Oversmoothing, Oversquashing, and Underreaching

GRAIL: Graph Edit Distance and Node Alignment using LLM-Generated Code

Multi-Marginal Stochastic Flow Matching for High-Dimensional Snapshot Data at Irregular Time Points

Temporal Distance-aware Transition Augmentation for Offline Model-based Reinforcement Learning

Rank-One Modified Value Iteration

Scaling Collapse Reveals Universal Dynamics in Compute-Optimally Trained Neural Networks

Preserving AUC Fairness in Learning with Noisy Protected Groups

Understanding the Forgetting of (Replay-based) Continual Learning via Feature Learning: Angle Matters

Modularized Self-Reflected Video Reasoner for Multimodal LLM with Application to Video Question Answering

ResearchTown: Simulator of Human Research Community

Accurate Identification of Communication Between Multiple Interacting Neural Populations

FisherSFT: Data-Efficient Supervised Fine-Tuning of Language Models Using Information Gain

Neural Solver Selection for Combinatorial Optimization

Embedding Safety into RL: A New Take on Trust Region Methods

An Online Adaptive Sampling Algorithm for Stochastic Difference-of-convex Optimization with Time-varying Distributions

Temporal Query Network for Efficient Multivariate Time Series Forecasting

Contextual Optimization Under Model Misspecification: A Tractable and Generalizable Approach

Scaling Laws for Task-Optimized Models of the Primate Visual Ventral Stream

System-Aware Unlearning Algorithms: Use Lesser, Forget Faster

VideoJAM: Joint Appearance-Motion Representations for Enhanced Motion Generation in Video Models

The Sample Complexity of Online Strategic Decision Making with Information Asymmetry and Knowledge Transportability

Learning Extrapolative Sequence Transformations from Markov Chains

SketchDNN: Joint Continuous-Discrete Diffusion for CAD Sketch Generation

Identifying Metric Structures of Deep Latent Variable Models

Leveraging Randomness in Model and Data Partitioning for Privacy Amplification

ExpProof: Operationalizing Explanations for Confidential Models with ZKPs

From Theory to Practice: Rethinking Green and Martin Kernels for Unleashing Graph Transformers

Benign Overfitting in Token Selection of Attention Mechanism

Provably Efficient Exploration in Inverse Constrained Reinforcement Learning

DEALing with Image Reconstruction: Deep Attentive Least Squares

An Effective and Secure Federated Multi-View Clustering Method with Information-Theoretic Perspective

Differentially Private Boxplots

Aligning LLMs by Predicting Preferences from User Writing Samples

Maximum Coverage in Turnstile Streams with Applications to Fingerprinting Measures

How Contaminated Is Your Benchmark? Measuring Dataset Leakage in Large Language Models with Kernel Divergence

Square$\chi$PO: Differentially Private and Robust $\chi^2$-Preference Optimization in Offline Direct Alignment

Determinant Estimation under Memory Constraints and Neural Scaling Laws

Position: Current Model Licensing Practices are Dragging Us into a Quagmire of Legal Noncompliance

Editable Noise Map Inversion: Encoding Target-image into Noise For High-Fidelity Image Manipulation

Improving Multi-Class Calibration through Normalization-Aware Isotonic Techniques

All-Purpose Mean Estimation over R: Optimal Sub-Gaussianity with Outlier Robustness and Low Moments Performance

MoH: Multi-Head Attention as Mixture-of-Head Attention

Capturing Temporal Dynamics in Large-Scale Canopy Tree Height Estimation

Understanding the Emergence of Multimodal Representation Alignment

A Non-isotropic Time Series Diffusion Model with Moving Average Transitions

Conditional Diffusion Model with Nonlinear Data Transformation for Time Series Forecasting

Adaptive Partitioning Schemes for Optimistic Optimization

Nonparametric Identification of Latent Concepts

Global Convergence and Rich Feature Learning in $L$-Layer Infinite-Width Neural Networks under $\mu$ Parametrization

The Role of Randomness in Stability

M+: Extending MemoryLLM with Scalable Long-Term Memory

Instance-Optimal Pure Exploration for Linear Bandits on Continuous Arms

Plan-and-Act: Improving Planning of Agents for Long-Horizon Tasks

Latent Variable Causal Discovery under Selection Bias

EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM

COGNATE: Acceleration of Sparse Tensor Programs on Emerging Hardware using Transfer Learning

GTR: A General, Multi-View, and Dynamic Framework for Trajectory Representation Learning

A Sample Efficient Conditional Independence Test in the Presence of Discretization

DataDecide: How to Predict Best Pretraining Data with Small Experiments

Robust Autonomy Emerges from Self-Play

Identification of Latent Confounders via Investigating the Tensor Ranks of the Nonlinear Observations

Extracting Rare Dependence Patterns via Adaptive Sample Reweighting

FG-CLIP: Fine-Grained Visual and Textual Alignment

LangTime: A Language-Guided Unified Model for Time Series Forecasting with Proximal Policy Optimization

Fairness on Principal Stratum: A New Perspective on Counterfactual Fairness

Unpaired Point Cloud Completion via Unbalanced Optimal Transport

Online Curvature-Aware Replay: Leveraging $\mathbf{2^{nd}}$ Order Information for Online Continual Learning

Adversarial Inputs for Linear Algebra Backends

Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in Multimodal Large Language Models

Guided Search Strategies in Non-Serializable Environments with Applications to Software Engineering Agents

Shifting Time: Time-series Forecasting with Khatri-Rao Neural Operators

Permutation-based Rank Test in the Presence of Discretization and Application in Causal Discovery with Mixed Data

Text-to-CAD Generation Through Infusing Visual Feedback in Large Language Models

iN2V: Bringing Transductive Node Embeddings to Inductive Graphs

DiffusionVLA: Scaling Robot Foundation Models via Unified Diffusion and Autoregression

Offline Model-based Optimization for Real-World Molecular Discovery

Task-Gated Multi-Expert Collaboration Network for Degraded Multi-Modal Image Fusion

ICLShield: Exploring and Mitigating In-Context Learning Backdoor Attacks

A Sub-Problem Quantum Alternating Operator Ansatz for Correlation Clustering

Beyond Communication Overhead: A Multilevel Monte Carlo Approach for Mitigating Compression Bias in Distributed Learning

Where is the Truth? The Risk of Getting Confounded in a Continual World

CodeSteer: Symbolic-Augmented Language Models via Code/Text Guidance

MODA: MOdular Duplex Attention for Multimodal Perception, Cognition, and Emotion Understanding

Provable Policy Gradient for Robust Average-Reward MDPs Beyond Rectangularity

Categorical Distributional Reinforcement Learning with Kullback-Leibler Divergence: Convergence and Asymptotics

A Checks-and-Balances Framework for Context-Aware Ethical AI Alignment

What can large language models do for sustainable food?

Exact risk curves of signSGD in High-Dimensions: quantifying preconditioning and noise-compression effects

Sample-Optimal Agnostic Boosting with Unlabeled Data

Measuring Diversity: Axioms and Challenges

BaxBench: Can LLMs Generate Correct and Secure Backends?

Discrete Neural Algorithmic Reasoning

Improving the Effective Receptive Field of Message-Passing Neural Networks

TreeLoRA: Efficient Continual Learning via Layer-Wise LoRAs Guided by a Hierarchical Gradient-Similarity Tree

TimePoint: Accelerated Time Series Alignment via Self-Supervised Keypoint and Descriptor Learning

QLASS: Boosting Language Agent Inference via Q-Guided Stepwise Search

Textural or Textual: How Vision-Language Models Read Text in Images

Exact Recovery of Sparse Binary Vectors from Generalized Linear Measurements

Learning the RoPEs: Better 2D and 3D Position Encodings with STRING

Inverse Bridge Matching Distillation

LaMAGIC2: Advanced Circuit Formulations for Language Model-Based Analog Topology Generation

Efficient Generative Modeling with Residual Vector Quantization-Based Tokens

DragSolver: A Multi-Scale Transformer for Real-World Automotive Drag Coefficient Estimation

Efficient Multivariate Robust Mean Estimation Under Mean-Shift Contamination

Towards scientific discovery with dictionary learning: Extracting biological concepts from microscopy foundation models

LineFlow: A Framework to Learn Active Control of Production Lines

Revealing Weaknesses in Text Watermarking Through Self-Information Rewrite Attacks

An Optimistic Algorithm for online CMDPS with Anytime Adversarial Constraints

Enhancing Ligand Validity and Affinity in Structure-Based Drug Design with Multi-Reward Optimization

Active Learning for Efficient Discovery of Optimal Combinatorial Perturbations

Censor Dependent Variational Inference

KinDEL: DNA-Encoded Library Dataset for Kinase Inhibitors

Position: Graph Learning Will Lose Relevance Due To Poor Benchmarks

A Variational Perspective on Generative Protein Fitness Optimization

Model Immunization from a Condition Number Perspective

Primphormer: Efficient Graph Transformers with Primal Representations

Towards a Formal Theory of Representational Compositionality

Multi-band Frequency Reconstruction for Neural Psychoacoustic Coding

Peripheral Memory for LLMs: Integration of Sequential Memory Banks with Adaptive Querying

Reducing Confounding Bias without Data Splitting for Causal Inference via Optimal Transport

Reinforcement Learning with Adaptive Reward Modeling for Expensive-to-Evaluate Systems

CodeIO: Condensing Reasoning Patterns via Code Input-Output Prediction

QMamba: On First Exploration of Vision Mamba for Image Quality Assessment

A Causal World Model Underlying Next Token Prediction: Exploring GPT in a Controlled Environment

Cover learning for large-scale topology representation

Visual Graph Arena: Evaluating Visual Conceptualization of Vision and Multimodal Large Language Models

Falcon: Fast Visuomotor Policies via Partial Denoising

VCT: Training Consistency Models with Variational Noise Coupling

BDC-CLIP: Brownian Distance Covariance for Adapting CLIP to Action Recognition

Quadratic Upper Bound for Boosting Robustness

BSemiFL: Semi-supervised Federated Learning via a Bayesian Approach

MathConstruct: Challenging LLM Reasoning with Constructive Proofs

Compute or Load KV Cache? Why Not Both?

Contour Integration Underlies Human-Like Vision

IMPACT: Iterative Mask-based Parallel Decoding for Text-to-Audio Generation with Diffusion Modeling

Generative Audio Language Modeling with Continuous-valued Tokens and Masked Next-Token Prediction

HEAP: Hyper Extended A-PDHG Operator for Constrained High-dim PDEs

Prediction-Powered Adaptive Shrinkage Estimation

Predicting High-precision Depth on Low-Precision Devices Using 2D Hilbert Curves

CVE-Bench: A Benchmark for AI Agents’ Ability to Exploit Real-World Web Application Vulnerabilities

Learning Gaussian DAG Models without Condition Number Bounds

Backdoor Attacks in Token Selection of Attention Mechanism

Differentiable Structure Learning with Ancestral Constraints

MONA: Myopic Optimization with Non-myopic Approval Can Mitigate Multi-step Reward Hacking

Explainable Concept Generation through Vision-Language Preference Learning for Understanding Neural Networks' Internal Representations

WildChat-50M: A Deep Dive Into the Role of Synthetic Data in Post-Training

Independence Tests for Language Models

QoS-Efficient Serving of Multiple Mixture-of-Expert LLMs Using Partial Runtime Reconfiguration

Dialogue Without Limits: Constant-Sized KV Caches for Extended Response in LLMs

Best of Both Worlds: Advantages of Hybrid Graph Sequence Models

Beyond Sensor Data: Foundation Models of Behavioral Data from Wearables Improve Health Predictions

Stability and Generalization Analysis of Decentralized SGD: Sharper Bounds Beyond Lipschitzness and Smoothness

Tensor-Var: Efficient Four-Dimensional Variational Data Assimilation

Contextual Bandits for Unbounded Context Distributions

Calibrated Physics-Informed Uncertainty Quantification

Joint Metric Space Embedding by Unbalanced Optimal Transport with Gromov–Wasserstein Marginal Penalization

Pre-Training Graph Contrastive Masked Autoencoders are Strong Distillers for EEG

Position: Rethinking LLM Bias Probing Using Lessons from the Social Sciences

Feature Learning beyond the Lazy-Rich Dichotomy: Insights from Representational Geometry

CROW: Eliminating Backdoors from Large Language Models via Internal Consistency Regularization

PPDiff: Diffusing in Hybrid Sequence-Structure Space for Protein-Protein Complex Design

BECAME: Bayesian Continual Learning with Adaptive Model Merging

Provably Efficient Algorithm for Best Scoring Rule Identification in Online Principal-Agent Information Acquisition

Beyond Log-Concavity and Score Regularity: Improved Convergence Bounds for Score-Based Generative Models in W2-distance

ProxSparse: Regularized Learning of Semi-Structured Sparsity Masks for Pretrained LLMs

NEAR: Neural Electromagnetic Array Response

Robust Secure Swap: Responsible Face Swap With Persons of Interest Redaction and Provenance Traceability

Over-Tokenized Transformer: Vocabulary is Generally Worth Scaling

Stronger Neyman Regret Guarantees for Adaptive Experimental Design

GS-Bias: Global-Spatial Bias Learner for Single-Image Test-Time Adaptation of Vision-Language Models

Adaptive Self-improvement LLM Agentic System for ML Library Development

When, Where and Why to Average Weights?

QuanONet: Quantum Neural Operator with Application to Differential Equation

Cost-efficient Collaboration between On-device and Cloud Language Models

Neural Discovery in Mathematics: Do Machines Dream of Colored Planes?

CLIMB: Data Foundations for Large Scale Multimodal Clinical Foundation Models

Variational Counterfactual Intervention Planning to Achieve Target Outcomes

BanditSpec: Adaptive Speculative Decoding via Bandit Algorithms

Generalized Interpolating Discrete Diffusion

Vintix: Action Model via In-Context Reinforcement Learning

Minimum Width for Universal Approximation using Squashable Activation Functions

Permutation-Free High-Order Interaction Tests

FactTest: Factuality Testing in Large Language Models with Finite-Sample and Distribution-Free Guarantees

MMedPO: Aligning Medical Vision-Language Models with Clinical-Aware Multimodal Preference Optimization

Flow Matching for Denoised Social Recommendation

WOMD-Reasoning: A Large-Scale Dataset for Interaction Reasoning in Driving

Galileo: Learning Global & Local Features of Many Remote Sensing Modalities

Addressing Misspecification in Simulation-based Inference through Data-driven Calibration

Learning from Sample Stability for Deep Clustering

Multi-Timescale Dynamics Model Bayesian Optimization for Plasma Stabilization in Tokamaks

A Square Peg in a Square Hole: Meta-Expert for Long-Tailed Semi-Supervised Learning

Implicit Riemannian Optimism with Applications to Min-Max Problems

On the Learnability of Distribution Classes with Adaptive Adversaries

Efficient and Privacy-Preserving Soft Prompt Transfer for LLMs

Evaluating LLMs Across Multi-Cognitive Levels: From Medical Knowledge Mastery to Scenario-Based Problem Solving

RoSTE: An Efficient Quantization-Aware Supervised Fine-Tuning Approach for Large Language Models

Learning curves theory for hierarchically compositional data with power-law distributed features

Scaling Inference-Efficient Language Models

Pre-training Auto-regressive Robotic Models with 4D Representations

How Compositional Generalization and Creativity Improve as Diffusion Models are Trained

CoastalBench: A Decade-Long High-Resolution Dataset to Emulate Complex Coastal Processes

Masked Generative Nested Transformers with Decode Time Scaling

Towards Attributions of Input Variables in a Coalition

B-score: Detecting biases in large language models using response history

NTK-DFL: Enhancing Decentralized Federated Learning in Heterogeneous Settings via Neural Tangent Kernel

Improving Multimodal Learning Balance and Sufficiency through Data Remixing

The Logical Implication Steering Method for Conditional Interventions on Transformer Generation

R.I.P.: Better Models by Survival of the Fittest Prompts

From Kernels to Features: A Multi-Scale Adaptive Theory of Feature Learning

Diversified Flow Matching with Translation Identifiability

Data-Driven Selection of Instrumental Variables for Additive Nonlinear, Constant Effects Models

Redundancy Undermines the Trustworthiness of Self-Interpretable GNNs

The Panaceas for Improving Low-Rank Decomposition in Communication-Efficient Federated Learning

Fully Dynamic Embedding into $\ell_p$ Spaces

Emergent Response Planning in LLMs

Learning Time-Aware Causal Representation for Model Generalization in Evolving Domains

Adapting While Learning: Grounding LLMs for Scientific Problems with Tool Usage Adaptation

LaCache: Ladder-Shaped KV Caching for Efficient Long-Context Modeling of Large Language Models

Discovering Spoofing Attempts on Language Model Watermarks

Stay Hungry, Keep Learning: Sustainable Plasticity for Deep Reinforcement Learning

Integration-free Kernels for Equivariant Gaussian Process Modelling

iDPA: Instance Decoupled Prompt Attention for Incremental Medical Object Detection

Copilot Arena: A Platform for Code LLM Evaluation in the Wild

C-3PO: Compact Plug-and-Play Proxy Optimization to Achieve Human-like Retrieval-Augmented Generation

LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language Models

Q-Supervised Contrastive Representation: A State Decoupling Framework for Safe Offline Reinforcement Learning

Craftium: Bridging Flexibility and Efficiency for Rich 3D Single- and Multi-Agent Environments

ERICT: Enhancing Robustness by Identifying Concept Tokens in Zero-Shot Vision Language Models

Curriculum Learning for Biological Sequence Prediction: The Case of De Novo Peptide Sequencing

Universal Biological Sequence Reranking for Improved De Novo Peptide Sequencing

Instance Correlation Graph-based Naive Bayes

Understanding and Improving Length Generalization in Recurrent Models

Graph Minimum Factor Distance and Its Application to Large-Scale Graph Data Clustering

The impact of uncertainty on regularized learning in games

A Cross Modal Knowledge Distillation & Data Augmentation Recipe for Improving Transcriptomics Representations through Morphological Features

Discovering Physics Laws of Dynamical Systems via Invariant Function Learning

Learning Joint Interventional Effects from Single-Variable Interventions in Additive Models

Human-Aligned Image Models Improve Visual Decoding from the Brain

Understanding Multimodal LLMs Under Distribution Shifts: An Information-Theoretic Approach

Primal-Dual Neural Algorithmic Reasoning

Test-Time Learning for Large Language Models

High Dynamic Range Novel View Synthesis with Single Exposure

HealthGPT: A Medical Large Vision-Language Model for Unifying Comprehension and Generation via Heterogeneous Knowledge Adaptation

RePaViT: Scalable Vision Transformer Acceleration via Structural Reparameterization on Feedforward Network Layers

ARS: Adaptive Reward Scaling for Multi-Task Reinforcement Learning

ZeroFlow: Overcoming Catastrophic Forgetting is Easier than You Think

On the Training Convergence of Transformers for In-Context Classification of Gaussian Mixtures

Statistical Collusion by Collectives on Learning Platforms

Annealing Flow Generative Models Towards Sampling High-Dimensional and Multi-Modal Distributions

Automated Hypothesis Validation with Agentic Sequential Falsifications

The Batch Complexity of Bandit Pure Exploration

Optimal Decision Tree Pruning Revisited: Algorithms and Complexity

MAS-GPT: Training LLMs to Build LLM-based Multi-Agent Systems

Task Generalization with Autoregressive Compositional Structure: Can Learning from $D$ Tasks Generalize to $D^T$ Tasks?

WorldSimBench: Towards Video Generation Models as World Simulators

Synonymous Variational Inference for Perceptual Image Compression

ITBench: Evaluating AI Agents across Diverse Real-World IT Automation Tasks

Minerva: A Programmable Memory Test Benchmark for Language Models

Fraud-Proof Revenue Division on Subscription Platforms

Beyond CVaR: Leveraging Static Spectral Risk Measures for Enhanced Decision-Making in Distributional Reinforcement Learning

Radio: Rate–Distortion Optimization for Large Language Model Compression

GenMol: A Drug Discovery Generalist with Discrete Diffusion

Be Confident: Uncovering Overfitting in MLLM Multi-Task Tuning

Visual Attention Never Fades: Selective Progressive Attention ReCalibration for Detailed Image Captioning in Multimodal Large Language Models

Learning Survival Distributions with the Asymmetric Laplace Distribution

How to Synthesize Text Data without Model Collapse?

DeepCrossAttention: Supercharging Transformer Residual Connections

NeuralCohort: Cohort-aware Neural Representation Learning for Healthcare Analytics

Bipartite Ranking From Multiple Labels: On Loss Versus Label Aggregation

Lean and Mean Adaptive Optimization via Subset-Norm and Subspace-Momentum with Convergence Guarantees

Interaction-Aware Gaussian Weighting for Clustered Federated Learning

Since Faithfulness Fails: The Performance Limits of Neural Causal Discovery

Private Federated Learning using Preference-Optimized Synthetic Data

History-Guided Video Diffusion

Diffusion models for Gaussian distributions: Exact solutions and Wasserstein errors

Geometric Hyena Networks for Large-scale Equivariant Learning

Black-Box Adversarial Attacks on LLM-Based Code Completion

Testing the Limits of Fine-Tuning for Improving Visual Cognition in Vision Language Models

LoRA Training Provably Converges to a Low-Rank Global Minimum Or It Fails Loudly (But it Probably Won't Fail)

PDE-Transformer: Efficient and Versatile Transformers for Physics Simulations

Evolving Prompts In-Context: An Open-ended, Self-replicating Perspective

Aggregation of Dependent Expert Distributions in Multimodal Variational Autoencoders

EquivaMap: Leveraging LLMs for Automatic Equivalence Checking of Optimization Formulations

Three-Dimensional Trajectory Prediction with 3DMoTraj Dataset

Alberta Wells Dataset: Pinpointing Oil and Gas Wells from Satellite Imagery

AutoAL: Automated Active Learning with Differentiable Query Strategy Search

Emergent Symbolic Mechanisms Support Abstract Reasoning in Large Language Models

A First-order Generative Bilevel Optimization Framework for Diffusion Models

Locate-then-edit for Multi-hop Factual Recall under Knowledge Editing

MATH-Perturb: Benchmarking LLMs' Math Reasoning Abilities against Hard Perturbations

How to Evaluate and Mitigate IP Infringement in Visual Generative AI?

A Closer Look at Transformers for Time Series Forecasting: Understanding Why They Work and Where They Struggle

Falsification of Unconfoundedness by Testing Independence of Causal Mechanisms

BAnG: Bidirectional Anchored Generation for Conditional RNA Design

What Limits Bidirectional Model's Generative Capabilities? A Uni-Bi-Directional Mixture-of-Expert Method For Bidirectional Fine-tuning

Mastering Board Games by External and Internal Planning with Language Models

Guardians of Image Quality: Benchmarking Defenses Against Adversarial Attacks on Image Quality Metrics

Hardware and Software Platform Inference

Exponential Family Variational Flow Matching for Tabular Data Generation

Solving Zero-Sum Convex Markov Games

UI-Vision: A Desktop-centric GUI Benchmark for Visual Perception and Interaction

LIVS: A Pluralistic Alignment Dataset for Inclusive Public Spaces

Improved Learning via k-DTW: A Novel Dissimilarity Measure for Curves

SCENIR: Visual Semantic Clarity through Unsupervised Scene Graph Retrieval

Optimizing Robustness and Accuracy in Mixture of Experts: A Dual-Model Approach

A Generic Family of Graphical Models: Diversity, Efficiency, and Heterogeneity

Correlated Errors in Large Language Models

Towards Global-level Mechanistic Interpretability: A Perspective of Modular Circuits of Large Language Models

unMORE: Unsupervised Multi-Object Segmentation via Center-Boundary Reasoning

A Reductions Approach to Risk-Sensitive Reinforcement Learning with Optimized Certainty Equivalents

MapEval: A Map-Based Evaluation of Geo-Spatial Reasoning in Foundation Models

EvoPress: Accurate Dynamic Model Compression via Evolutionary Search

Towards Understanding Gradient Dynamics of the Sliced-Wasserstein Distance via Critical Point Analysis

Training High Performance Spiking Neural Network by Temporal Model Calibration

Understanding the Statistical Accuracy-Communication Trade-off in Personalized Federated Learning with Minimax Guarantees

Scaling Test-Time Compute Without Verification or RL is Suboptimal

Flowing Datasets with Wasserstein over Wasserstein Gradient Flows

Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models

Rethinking External Slow-Thinking: From Snowball Errors to Probability of Correct Reasoning

The Butterfly Effect: Neural Network Training Trajectories Are Highly Sensitive to Initial Conditions

Test-time Adaptation on Graphs via Adaptive Subgraph-based Selection and Regularized Prototypes

Density Ratio Estimation with Conditional Probability Paths

Sample-specific Noise Injection for Diffusion-based Adversarial Purification

Product of Experts with LLMs: Boosting Performance on ARC Is a Matter of Perspective

What Do Learning Dynamics Reveal About Generalization in LLM Mathematical Reasoning?

Optimizing Test-Time Compute via Meta Reinforcement Finetuning

Cooperation of Experts: Fusing Heterogeneous Information with Large Margin

Multimodal Medical Code Tokenizer

LensLLM: Unveiling Fine-Tuning Dynamics for LLM Selection

Learning with Exact Invariances in Polynomial Time

Disentangling and Integrating Relational and Sensory Information in Transformer Architectures

FeatSharp: Your Vision Model Features, Sharper

Generative Social Choice: The Next Generation

UltraTWD: Optimizing Ultrametric Trees for Tree-Wasserstein Distance

TabFlex: Scaling Tabular Learning to Millions with Linear Attention

On the Clean Generalization and Robust Overfitting in Adversarial Training from Two Theoretical Views: Representation Complexity and Training Dynamics

CAN: Leveraging Clients As Navigators for Generative Replay in Federated Continual Learning

Grokking in the Wild: Data Augmentation for Real-World Multi-Hop Reasoning with Transformers

Online Robust Reinforcement Learning Through Monte-Carlo Planning

FACTER: Fairness-Aware Conformal Thresholding and Prompt Engineering for Enabling Fair LLM-Based Recommender Systems

On the Robustness of Reward Models for Language Model Alignment

Orthus: Autoregressive Interleaved Image-Text Generation with Modality-Specific Heads

Not All Wrong is Bad: Using Adversarial Examples for Unlearning

In-Context Adaptation to Concept Drift for Learned Database Operations

Self-Supervised Transformers as Iterative Solution Improvers for Constraint Satisfaction

FicGCN: Unveiling the Homomorphic Encryption Efficiency from Irregular Graph Convolutional Networks

Skrr: Skip and Re-use Text Encoder Layers for Memory Efficient Text-to-Image Generation

One Stone, Two Birds: Enhancing Adversarial Defense Through the Lens of Distributional Discrepancy

Learning without Isolation: Pathway Protection for Continual Learning

Editable Concept Bottleneck Models

Efficiently Vectorized MCMC on Modern Accelerators

BOOD: Boundary-based Out-Of-Distribution Data Generation

Optimization for Neural Operators can Benefit from Width

TTFSFormer: A TTFS-based Lossless Conversion of Spiking Transformer

Reasoning-as-Logic-Units: Scaling Test-Time Reasoning in Large Language Models Through Logic Unit Alignment

AEQA-NAT: Adaptive End-to-end Quantization Alignment Training Framework for Non-autoregressive Machine Translation

MATS: An Audio Language Model under Text-only Supervision

Proto Successor Measure: Representing the Behavior Space of an RL Agent

Inverse Problem Sampling in Latent Space Using Sequential Monte Carlo

SKOLR: Structured Koopman Operator Linear RNN for Time-Series Forecasting

Beyond Topological Self-Explainable GNNs: A Formal Explainability Perspective

Continual Generalized Category Discovery: Learning and Forgetting from a Bayesian Perspective

An Online Learning Approach to Prompt-based Selection of Generative Models and LLMs

Fusing Reward and Dueling Feedback in Stochastic Bandits

Invariant Deep Uplift Modeling for Incentive Assignment in Online Marketing via Probability of Necessity and Sufficiency

PIGDreamer: Privileged Information Guided World Models for Safe Partially Observable Reinforcement Learning

Pareto-Optimal Fronts for Benchmarking Symbolic Regression Algorithms

Making Hard Problems Easier with Custom Data Distributions and Loss Regularization: A Case Study in Modular Arithmetic

Towards Cost-Effective Reward Guided Text Generation

Diffusion Models are Secretly Exchangeable: Parallelizing DDPMs via Auto Speculation

EasyInv: Toward Fast and Better DDIM Inversion

SparseVLM: Visual Token Sparsification for Efficient Vision-Language Model Inference

Identifying Neural Dynamics Using Interventional State Space Models

Stochastic Poisson Surface Reconstruction with One Solve using Geometric Gaussian Processes

Adaptive Data Collection for Robust Learning Across Multiple Distributions

SageAttention2: Efficient Attention with Thorough Outlier Smoothing and Per-thread INT4 Quantization

The Best of Both Worlds: Bridging Quality and Diversity in Data Selection with Bipartite Graph

The Price of Freedom: Exploring Expressivity and Runtime Tradeoffs in Equivariant Tensor Products

Flow Matching for Few-Trial Neural Adaptation with Stable Latent Dynamics

SafeAuto: Knowledge-Enhanced Safe Autonomous Driving with Multimodal Foundation Models

Rethink GraphODE Generalization within Coupled Dynamical System

Nemotron-CORTEXA: Enhancing LLM Agents for Software Engineering Tasks via Improved Localization and Solution Diversity

Larger or Smaller Reward Margins to Select Preferences for LLM Alignment?

One Wave To Explain Them All: A Unifying Perspective On Feature Attribution

CAT: Contrastive Adversarial Training for Evaluating the Robustness of Protective Perturbations in Latent Diffusion Models

TINED: GNNs-to-MLPs by Teacher Injection and Dirichlet Energy Distillation

Exploring Large Action Sets with Hyperspherical Embeddings using von Mises-Fisher Sampling

Leveraging Sparsity for Sample-Efficient Preference Learning: A Theoretical Perspective

Sample Complexity of Correlation Detection in the Gaussian Wigner Model

TS-SNN: Temporal Shift Module for Spiking Neural Networks

Whitened CLIP as a Likelihood Surrogate of Images and Captions

CoreMatching: A Co-adaptive Sparse Inference Framework with Token and Neuron Pruning for Comprehensive Acceleration of Vision-Language Models

Why Has Predicting Downstream Capabilities of Frontier AI Models with Scale Remained Elusive?

Compositional Condition Question Answering in Tabular Understanding

Privacy Amplification Through Synthetic Data: Insights from Linear Regression

Inference-Time Alignment of Diffusion Models with Direct Noise Optimization

EncryptedLLM: Privacy-Preserving Large Language Model Inference via GPU-Accelerated Fully Homomorphic Encryption

Universal Neural Optimal Transport

Geometric Resampling in Nearly Linear Time for Follow-the-Perturbed-Leader with Best-of-Both-Worlds Guarantee in Bandit Problems

An analytic theory of creativity in convolutional diffusion models

Measuring In-Context Computation Complexity via Hidden State Prediction

Can Transformers Reason Logically? A Study in SAT Solving

Teaching Transformers Causal Reasoning through Axiomatic Training

Position: A Theory of Deep Learning Must Include Compositional Sparsity

AutoStep: Locally adaptive involutive MCMC

SCENT: Robust Spatiotemporal Learning for Continuous Scientific Data via Scalable Conditioned Neural Fields

Wasserstein Policy Optimization

Rethinking Point Cloud Data Augmentation: Topologically Consistent Deformation

Learning Safe Strategies for Value Maximizing Buyers in Uniform Price Auctions

Data-driven Design of Randomized Control Trials with Guaranteed Treatment Effects

FedBEns: One-Shot Federated Learning based on Bayesian Ensemble

On the Private Estimation of Smooth Transport Maps

When Every Millisecond Counts: Real-Time Anomaly Detection via the Multimodal Asynchronous Hybrid Network

LoRA-Gen: Specializing Large Language Model via Online LoRA Generation

ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference

Efficient Curvature-Aware Hypergradient Approximation for Bilevel Optimization

DMOSpeech: Direct Metric Optimization via Distilled Diffusion Model in Zero-Shot Speech Synthesis

POQD: Performance-Oriented Query Decomposer for Multi-vector retrieval

Empowering World Models with Reflection for Embodied Video Prediction

Equivariant Neural Tangent Kernels

R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts

Dynamic Similarity Graph Construction with Kernel Density Estimation

Determining Layer-wise Sparsity for Large Language Models Through a Theoretical Perspective

Enforcing Latent Euclidean Geometry in Single-Cell VAEs for Manifold Interpolation

AutoML-Agent: A Multi-Agent LLM Framework for Full-Pipeline AutoML

A Tale of Two Structures: Do LLMs Capture the Fractal Complexity of Language?

Grammar-Forced Translation of Natural Language to Temporal Logic using LLMs

The Brain's Bitter Lesson: Scaling Speech Decoding With Self-Supervised Learning

Federated Learning for Feature Generalization with Convex Constraints

Outsourced Diffusion Sampling: Efficient Posterior Inference in Latent Spaces of Generative Models

Generative Intervention Models for Causal Perturbation Modeling

Preconditioned Riemannian Gradient Descent Algorithm for Low-Multilinear-Rank Tensor Completion

CEGA: A Cost-Effective Approach for Graph-Based Model Extraction and Acquisition

Effective and Efficient Masked Image Generation Models

Revisiting Convergence: Shuffling Complexity Beyond Lipschitz Smoothness

Generalized Category Discovery via Reciprocal Learning and Class-Wise Distribution Regularization

Divide and Conquer: Grounding LLMs as Efficient Decision-Making Agents via Offline Hierarchical Reinforcement Learning

Learning Representations of Instruments for Partial Identification of Treatment Effects

Modular Duality in Deep Learning

CFPT: Empowering Time Series Forecasting through Cross-Frequency Interaction and Periodic-Aware Timestamp Modeling

On-the-Fly Adaptive Distillation of Transformer to Dual-State Linear Attention for Long-Context LLM Serving

Deep Reinforcement Learning from Hierarchical Preference Design

Think Smarter not Harder: Adaptive Reasoning with Inference Aware Optimization

DIS-CO: Discovering Copyrighted Content in VLMs Training Data

BalancEdit: Dynamically Balancing the Generality-Locality Trade-off in Multi-modal Model Editing

OmniBal: Towards Fast Instruction-Tuning for Vision-Language Models via Omniverse Computation Balance

Self-Play $Q$-Learners Can Provably Collude in the Iterated Prisoner's Dilemma

WikiBigEdit: Understanding the Limits of Lifelong Knowledge Editing in LLMs

Simplicity Bias and Optimization Threshold in Two-Layer ReLU Networks

INRFlow: Flow Matching for INRs in Ambient Space

Distributionally Robust Multi-Agent Reinforcement Learning for Dynamic Chute Mapping

Reward-Augmented Data Enhances Direct Preference Alignment of LLMs

MELON: Provable Defense Against Indirect Prompt Injection Attacks in AI Agents

Toward Data-centric Directed Graph Learning: An Entropy-driven Approach

TIMING: Temporality-Aware Integrated Gradients for Time Series Explanation

Provably Near-Optimal Federated Ensemble Distillation with Negligible Overhead

Secant Line Search for Frank-Wolfe Algorithms

DOLPHIN: A Programmable Framework for Scalable Neurosymbolic Learning

Principled Data Selection for Alignment: The Hidden Risks of Difficult Examples

Unisolver: PDE-Conditional Transformers Towards Universal Neural PDE Solvers

Voronoi-grid-based Pareto Front Learning and Its Application to Collaborative Federated Learning

Time-Aware World Model for Adaptive Prediction and Control

Minimalist Concept Erasure in Generative Models

PARQ: Piecewise-Affine Regularized Quantization

Improved Coresets for Vertical Federated Learning: Regularized Linear and Logistic Regressions

Toward Efficient Kernel-Based Solvers for Nonlinear PDEs

Large Displacement Motion Transfer with Unsupervised Anytime Interpolation

Bayesian Weight Enhancement with Steady-State Adaptation for Test-time Adaptation in Dynamic Environments

ViTally Consistent: Scaling Biological Representation Learning for Cell Microscopy

N2GON: Neural Networks for Graph-of-Net with Position Awareness

Validating Mechanistic Interpretations: An Axiomatic Approach

Action-Constrained Imitation Learning

Highly Compressed Tokenizer Can Generate Without Training

Provable Length Generalization in Sequence Prediction via Spectral Filtering

Can Compressed LLMs Truly Act? An Empirical Evaluation of Agentic Capabilities in LLM Compression

Improved Algorithm for Deep Active Learning under Imbalance via Optimal Separation

Calibrated Language Models and How to Find Them with Label Smoothing

Efficient Long Context Fine-tuning with Chunk Flow

Evaluating Neuron Explanations: A Unified Framework with Sanity Checks

Strengthen Out-of-Distribution Detection Capability with Progressive Self-Knowledge Distillation

TimePro: Efficient Multivariate Long-term Time Series Forecasting with Variable- and Time-Aware Hyper-state

Addressing Concept Mislabeling in Concept Bottleneck Models Through Preference Optimization

From Language Models over Tokens to Language Models over Characters

ZebraLogic: On the Scaling Limits of LLMs for Logical Reasoning

TraceGrad: a Framework Learning Expressive SO(3)-equivariant Non-linear Representations for Electronic-Structure Hamiltonian Prediction

Diversifying Robot Locomotion Behaviors with Extrinsic Behavioral Curiosity

Strong and Weak Identifiability of Optimization-based Causal Discovery in Non-linear Additive Noise Models

Task-Agnostic Pre-training and Task-Guided Fine-tuning for Versatile Diffusion Planner

Understanding Generalization in Quantum Machine Learning with Margins

Resolving Lexical Bias in Model Editing

Nearly Optimal Sample Complexity for Learning with Label Proportions

Delay-DSGN: A Dynamic Spiking Graph Neural Network with Delay Mechanisms for Evolving Graph

Scalable Private Partition Selection via Adaptive Weighting

FlashTP: Fused, Sparsity-Aware Tensor Product for Machine Learning Interatomic Potentials

GSM-$\infty$: How Do your LLMs Behave over Infinitely Increasing Reasoning Complexity and Context Length?

Position: The Artificial Intelligence and Machine Learning Community Should Adopt a More Transparent and Regulated Peer Review Process

A Sharper Global Convergence Analysis for Average Reward Reinforcement Learning via an Actor-Critic Approach

The Ripple Effect: On Unforeseen Complications of Backdoor Attacks

RLEF: Grounding Code LLMs in Execution Feedback with Reinforcement Learning

Optimizing Language Models for Inference Time Objectives using Reinforcement Learning

On the Local Complexity of Linear Regions in Deep ReLU Networks

PILAF: Optimal Human Preference Sampling for Reward Modeling

Improving Diversity in Language Models: When Temperature Fails, Change the Loss

Concept-Centric Token Interpretation for Vector-Quantized Generative Models

Hierarchical Masked Autoregressive Models with Low-Resolution Token Pivots

Safely Learning Optimal Auctions: A Testable Learning Framework for Mechanism Design

Efficient Federated Incomplete Multi-View Clustering

PertEval-scFM: Benchmarking Single-Cell Foundation Models for Perturbation Effect Prediction

Learn Beneficial Noise as Graph Augmentation

Perceptual-GS: Scene-adaptive Perceptual Densification for Gaussian Splatting

Ensemble Learned Bloom Filters: Two Oracles are Better than One

From Logits to Hierarchies: Hierarchical Clustering made Simple

Multi-agent Architecture Search via Agentic Supernet

MENTOR: Mixture-of-Experts Network with Task-Oriented Perturbation for Visual Reinforcement Learning

ML$^2$-GCL: Manifold Learning Inspired Lightweight Graph Contrastive Learning

Deep Unsupervised Hashing via External Guidance

An All-Atom Generative Model for Designing Protein Complexes

Geometric Algebra Planes: Convex Implicit Neural Volumes

Designing Cyclic Peptides via Harmonic SDE with Atom-Bond Modeling

Temporal Difference Flows

SecEmb: Sparsity-Aware Secure Federated Learning of On-Device Recommender System with Large Embedding

The Canary’s Echo: Auditing Privacy Risks of LLM-Generated Synthetic Text

Causal Discovery from Conditionally Stationary Time Series

Scaling Trends in Language Model Robustness

Double-Filter: Efficient Fine-tuning of Pre-trained Vision-Language Models via Patch&Layer Filtering

Predictive Data Selection: The Data That Predicts Is the Data That Teaches

Graph-Based Algorithms for Diverse Similarity Search

GaussMark: A Practical Approach for Structural Watermarking of Language Models

Is Best-of-N the Best of Them? Coverage, Scaling, and Optimality in Inference-Time Alignment

Distillation of Discrete Diffusion through Dimensional Correlations

Counterfactual Effect Decomposition in Multi-Agent Sequential Decision Making

SKIM: Any-bit Quantization Pushing The Limits of Post-Training Quantization

BoxLM: Unifying Structures and Semantics of Medical Concepts for Diagnosis Prediction in Healthcare

PIPA: Preference Alignment as Prior-Informed Statistical Estimation

Behavior-agnostic Task Inference for Robust Offline In-context Reinforcement Learning

Leveraging Offline Data in Linear Latent Contextual Bandits

Avoiding Leakage Poisoning: Concept Interventions Under Distribution Shifts

General agents need world models

Position: AI Competitions Provide the Gold Standard for Empirical Rigor in GenAI Evaluation

Global Context-aware Representation Learning for Spatially Resolved Transcriptomics

Understanding and Mitigating Miscalibration in Prompt Tuning for Vision-Language Models

Token Cleaning: Fine-Grained Data Selection for LLM Supervised Fine-Tuning

LieRE: Lie Rotational Positional Encodings

The Power of Random Features and the Limits of Distribution-Free Gradient Descent

HGOT: Self-supervised Heterogeneous Graph Neural Network with Optimal Transport

EcoMapper: Generative Modeling for Climate-Aware Satellite Imagery

scSSL-Bench: Benchmarking Self-Supervised Learning for Single-Cell Data

Sparse Causal Discovery with Generative Intervention for Unsupervised Graph Domain Adaptation

SEMU: Singular Value Decomposition for Efficient Machine Unlearning

SEFE: Superficial and Essential Forgetting Eliminator for Multimodal Continual Instruction Tuning

Instruction-Following Pruning for Large Language Models

MemFreezing: A Novel Adversarial Attack on Temporal Graph Neural Networks under Limited Future Knowledge

Large Language Model-driven Large Neighborhood Search for Large-Scale MILP Problems

CMoS: Rethinking Time Series Prediction Through the Lens of Chunk-wise Spatial Correlations

Statistical and Computational Guarantees of Kernel Max-Sliced Wasserstein Distances

Noisy SIGNSGD Is More Differentially Private Than You (Might) Think

I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models

From Individual Experience to Collective Evidence: A Reporting-Based Framework for Identifying Systemic Harms

NTPP: Generative Speech Language Modeling for Dual-Channel Spoken Dialogue via Next-Token-Pair Prediction

The Disparate Benefits of Deep Ensembles

Unnatural Languages Are Not Bugs but Features for LLMs

TokenSwift: Lossless Acceleration of Ultra Long Sequence Generation

Robot-Gated Interactive Imitation Learning with Adaptive Intervention Mechanism

Explicit Exploration for High-Welfare Equilibria in Game-Theoretic Multiagent Reinforcement Learning

Efficient First-Order Optimization on the Pareto Set for Multi-Objective Learning under Preference Guidance

Behavioral Exploration: Learning to Explore via In-Context Adaptation

T1: Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling

Quantum Algorithms for Finite-horizon Markov Decision Processes

Dynamical Modeling of Behaviorally Relevant Spatiotemporal Patterns in Neural Imaging Data

LangDAug: Langevin Data Augmentation for Multi-Source Domain Generalization in Medical Image Segmentation

One Leaf Reveals the Season: Occlusion-Based Contrastive Learning with Semantic-Aware Views for Efficient Visual Representation

Efficient Source-free Unlearning via Energy-Guided Data Synthesis and Discrimination-Aware Multitask Optimization

Does learning the right latent variables necessarily improve in-context learning?

Feedforward Few-shot Species Range Estimation

Theoretical guarantees on the best-of-n alignment policy

EvFocus: Learning to Reconstruct Sharp Images from Out-of-Focus Event Streams

Constrain Alignment with Sparse Autoencoders

Discovering a Zero (Zero-Vector Class of Machine Learning)

AI for Global Climate Cooperation: Modeling Global Climate Negotiations, Agreements, and Long-Term Cooperation in RICE-N

TopInG: Topologically Interpretable Graph Learning via Persistent Rationale Filtration

GMAIL: Generative Modality Alignment for generated Image Learning

CateKV: On Sequential Consistency for Long-Context LLM Inference Acceleration

On Efficient Estimation of Distributional Treatment Effects under Covariate-Adaptive Randomization

Position: Medical Large Language Model Benchmarks Should Prioritize Construct Validity

Neural Genetic Search in Discrete Spaces

How Do Images Align and Complement LiDAR? Towards a Harmonized Multi-modal 3D Panoptic Segmentation

Proposer-Agent-Evaluator (PAE): Autonomous Skill Discovery For Foundation Model Internet Agents

Hgformer: Hyperbolic Graph Transformer for Collaborative Filtering

Unifying Knowledge from Diverse Datasets to Enhance Spatial-Temporal Modeling: A Granularity-Adaptive Geographical Embedding Approach

BRIDGE: Bootstrapping Text to Control Time-Series Generation via Multi-Agent Iterative Optimization and Diffusion Modeling

Balancing Interference and Correlation in Spatial Experimental Designs: A Causal Graph Cut Approach

FourierMamba: Fourier Learning Integration with State Space Models for Image Deraining

A Reasoning-Based Approach to Cryptic Crossword Clue Solving

SIMPLEMIX: Frustratingly Simple Mixing of Off- and On-policy Data in Language Model Preference Learning

Adapter Naturally Serves as Decoupler for Cross-Domain Few-Shot Semantic Segmentation

High-Dimensional Tensor Regression With Oracle Properties

Non-asymptotic Error Bounds in $\mathcal{W}_2$-Distance with Sqrt(d) Dimension Dependence and First Order Convergence for Langevin Monte Carlo beyond Log-Concavity

Fully Dynamic Euclidean Bi-Chromatic Matching in Sublinear Update Time

Optimal transport-based conformal prediction

What Makes In-context Learning Effective for Mathematical Reasoning

Global-Local Dirichlet Processes for Clustering Grouped Data in the Presence of Group-Specific Idiosyncratic Variables

A-PSRO: A Unified Strategy Learning Method with Advantage Metric for Normal-form Games

Alpha-SQL: Zero-Shot Text-to-SQL using Monte Carlo Tree Search

Quantifying Treatment Effects: Estimating Risk Ratios via Observational Studies

Quantum Optimization via Gradient-Based Hamiltonian Descent

Domain2Vec: Vectorizing Datasets to Find the Optimal Data Mixture without Training

Generalization in Federated Learning: A Conditional Mutual Information Framework

TimeDART: A Diffusion Autoregressive Transformer for Self-Supervised Time Series Representation

Geometric Contact Flows: Contactomorphisms for Dynamics and Control

Knowledge-Guided Wasserstein Distributionally Robust Optimization

Hessian Geometry of Latent Space in Generative Models

Convergence of Mean-Field Langevin Stochastic Descent-Ascent for Distributional Minimax Optimization

QT-DoG: Quantization-Aware Training for Domain Generalization

Outlier Gradient Analysis: Efficiently Identifying Detrimental Training Samples for Deep Learning Models

Online Sparsification of Bipartite-Like Clusters in Graphs

Efficient Graph Continual Learning via Lightweight Graph Neural Tangent Kernels-based Dataset Distillation

Understanding the Unfairness in Network Quantization

Zero Shot Generalization of Vision-Based RL Without Data Augmentation

DISCO: learning to DISCover an evolution Operator for multi-physics-agnostic prediction

DCTdiff: Intriguing Properties of Image Generative Modeling in the DCT Space

Regression for the Mean: Auto-Evaluation and Inference with Few Labels through Post-hoc Regression

FastCAV: Efficient Computation of Concept Activation Vectors for Explaining Deep Neural Networks

The Elicitation Game: Evaluating Capability Elicitation Techniques

Doubly Protected Estimation for Survival Outcomes Utilizing External Controls for Randomized Clinical Trials

CLARIFY: Contrastive Preference Reinforcement Learning for Untangling Ambiguous Queries

Nonlinearly Preconditioned Gradient Methods under Generalized Smoothness

Learning Single Index Models with Diffusion Priors

Neural Encoding and Decoding at Scale

Wyckoff Transformer: Generation of Symmetric Crystals

MetaAgent: Automatically Constructing Multi-Agent Systems Based on Finite State Machines

Generalization Analysis for Supervised Contrastive Representation Learning under Non-IID Settings

Broadband Ground Motion Synthesis by Diffusion Model with Minimal Condition

Sampling from Binary Quadratic Distributions via Stochastic Localization

Accelerating LLM Inference with Lossless Speculative Decoding Algorithms for Heterogeneous Vocabularies

Reidentify: Context-Aware Identity Generation for Contextual Multi-Agent Reinforcement Learning

Online Detection of LLM-Generated Texts via Sequential Hypothesis Testing by Betting

LIMEFLDL: A Local Interpretable Model-Agnostic Explanations Approach for Label Distribution Learning

Efficient Skill Discovery via Regret-Aware Optimization

On Linear Convergence in Smooth Convex-Concave Bilinearly-Coupled Saddle-Point Optimization: Lower Bounds and Optimal Algorithms

Scaling Laws for Forgetting during Finetuning with Pretraining Data Injection

Putnam-AXIOM: A Functional & Static Benchmark for Measuring Higher Level Mathematical Reasoning in LLMs

Avoiding spurious sharpness minimization broadens applicability of SAM

Mechanisms of Projective Composition of Diffusion Models

Explicit Discovery of Nonlinear Symmetries from Dynamic Data

Efficiently Serving Large Multimodal Models Using EPD Disaggregation

Model Steering: Learning with a Reference Model Improves Generalization Bounds and Scaling Laws

Recommendations with Sparse Comparison Data: Provably Fast Convergence for Nonconvex Matrix Factorization

Canonical Rank Adaptation: An Efficient Fine-Tuning Strategy for Vision Transformers

Hierarchical Planning for Complex Tasks with Knowledge Graph-RAG and Symbolic Verification

How to Train Your Multi-Exit Model? Analyzing the Impact of Training Strategies

VTGaussian-SLAM: RGBD SLAM for Large Scale Scenes with Splatting View-Tied 3D Gaussians

Sliding Puzzles Gym: A Scalable Benchmark for State Representation in Visual Reinforcement Learning

Ladder-Residual: Parallelism-Aware Architecture for Accelerating Large Model Inference with Communication Overlapping

Convergence of Consistency Model with Multistep Sampling under General Data Assumptions

Mahalanobis++: Improving OOD Detection via Feature Normalization

Boosting Masked ECG-Text Auto-Encoders as Discriminative Learners

Bounded Rationality for LLMs: Satisficing Alignment at Inference-Time

Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities

No-Regret is not enough! Bandits with General Constraints through Adaptive Regret Minimization

Mitigating Object Hallucination in Large Vision-Language Models via Image-Grounded Guidance

MTL-UE: Learning to Learn Nothing for Multi-Task Learning

Promoting Ensemble Diversity with Interactive Bayesian Distributional Robustness for Fine-tuning Foundation Models

Rethinking Confidence Scores and Thresholds in Pseudolabeling-based SSL

ReverB-SNN: Reversing Bit of the Weight and Activation for Spiking Neural Networks

Outlier-Aware Post-Training Quantization for Discrete Graph Diffusion Models

What Makes a Good Feedforward Computational Graph?

Hi Robot: Open-Ended Instruction Following with Hierarchical Vision-Language-Action Models

Adapting to Linear Separable Subsets with Large-Margin in Differentially Private Learning

FAB-PPI: Frequentist, Assisted by Bayes, Prediction-Powered Inference

SDMG: Smoothing Your Diffusion Models for Powerful Graph Representation Learning

Optimizing Noise Distributions for Differential Privacy

Partition First, Embed Later: Laplacian-Based Feature Partitioning for Refined Embedding and Visualization of High-Dimensional Data

Branches: Efficiently Seeking Optimal Sparse Decision Trees via AO*

Training Diffusion-based Generative Models with Limited Data

The Hidden Dimensions of LLM Alignment: A Multi-Dimensional Analysis of Orthogonal Safety Directions

Benefits of Early Stopping in Gradient Descent for Overparameterized Logistic Regression

Clipped SGD Algorithms for Performative Prediction: Tight Bounds for Stochastic Bias and Remedies

Collapse-Proof Non-Contrastive Self-Supervised Learning

CRANE: Reasoning with constrained LLM generation

Ensemble Distribution Distillation via Flow Matching

Unisoma: A Unified Transformer-based Solver for Multi-Solid Systems

Private Lossless Multiple Release

Bayesian Active Learning for Bivariate Causal Discovery

Consensus Is All You Get: The Role of Attention in Transformers

Gradient Descent Converges Arbitrarily Fast for Logistic Regression via Large and Adaptive Stepsizes

WILTing Trees: Interpreting the Distance Between MPNN Embeddings

PhantomWiki: On-Demand Datasets for Reasoning and Retrieval Evaluation

LBI-FL: Low-Bit Integerized Federated Learning with Temporally Dynamic Bit-Width Allocation

Rethinking Addressing in Language Models via Contextualized Equivariant Positional Encoding

OrcaLoca: An LLM Agent Framework for Software Issue Localization

Sample, Scrutinize and Scale: Effective Inference-Time Search by Scaling Verification

Less is More: Federated Graph Learning with Alleviating Topology Heterogeneity from A Causal Perspective

Mixed-curvature decision trees and random forests

Sparse Autoencoders, Again?

Safety-Polarized and Prioritized Reinforcement Learning

An Adaptive Orthogonal Convolution Scheme for Efficient and Flexible CNN Architectures

Approximate Differential Privacy of the $\ell_2$ Mechanism

Competitively Consistent Clustering

Code-Generated Graph Representations Using Multiple LLM Agents for Material Properties Prediction

MOGIC: Metadata-infused Oracle Guidance for Improved Extreme Classification

Learning Multi-Level Features with Matryoshka Sparse Autoencoders

Phase transitions for the existence of unregularized M-estimators in single index models

On the Adversarial Robustness of Multi-Kernel Clustering

Diffusion on Language Model Encodings for Protein Sequence Generation

Unconstrained Robust Online Convex Optimization

Low-Rank Adapting Models for Sparse Autoencoders

Relational Invariant Learning for Robust Solvation Free Energy Prediction

Scalable Model Merging with Progressive Layer-wise Distillation

SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering?

Continuous Semi-Implicit Models

Ultra Lowrate Image Compression with Semantic Residual Coding and Compression-aware Diffusion

Graph World Model

Learning Distribution-wise Control in Representation Space for Language Models

TypyBench: Evaluating LLM Type Inference for Untyped Python Repositories

Exploring Criteria of Loss Reweighting to Enhance LLM Unlearning

Synthetic Text Generation for Training Large Language Models via Gradient Matching

EditLord: Learning Code Transformation Rules for Code Editing

Network Sparsity Unlocks the Scaling Potential of Deep Reinforcement Learning

AffectGPT: A New Dataset, Model, and Benchmark for Emotion Understanding with Multimodal Large Language Models

Fluctuations of the largest eigenvalues of transformed spiked Wigner matrices

Covered Forest: Fine-grained generalization analysis of graph neural networks

EvoControl: Multi-Frequency Bi-Level Control for High-Frequency Continuous Control

Gradient Flow Provably Learns Robust Classifiers for Orthonormal GMMs

Robust Noise Attenuation via Adaptive Pooling of Transformer Outputs

Understanding and Mitigating Memorization in Generative Models via Sharpness of Probability Landscapes

Continual Reinforcement Learning by Planning with Online World Models

Breaking Barriers: Combinatorial Algorithms for Non-Monotone Submodular Maximization with Sublinear Adaptivity and $1/e$ Approximation

The Surprising Effectiveness of Test-Time Training for Few-Shot Learning

NextCoder: Robust Adaptation of Code LMs to Diverse Code Edits

TRACE Back from the Future: A Probabilistic Reasoning Approach to Controllable Language Generation

MMInference: Accelerating Pre-filling for Long-Context Visual Language Models via Modality-Aware Permutation Sparse Attention

HYGMA: Hypergraph Coordination Networks with Dynamic Grouping for Multi-Agent Reinforcement Learning

Conformal Prediction as Bayesian Quadrature

Better to Teach than to Give: Domain Generalized Semantic Segmentation via Agent Queries with Diffusion Model Guidance

An Improved Clique-Picking Algorithm for Counting Markov Equivalent DAGs via Super Cliques Transfer

Rethinking the Stability-Plasticity Trade-off in Continual Learning from an Architectural Perspective

Tensorized Multi-View Multi-Label Classification via Laplace Tensor Rank

Latent Variable Estimation in Bayesian Black-Litterman Models

A General Graph Spectral Wavelet Convolution via Chebyshev Order Decomposition

Improving the Statistical Efficiency of Cross-Conformal Prediction

Diffusion Sampling Correction via Approximately 10 Parameters

Ferret: Federated Full-Parameter Tuning at Scale for Large Language Models

Hyper: Hyperparameter Robust Efficient Exploration in Reinforcement Learning

Representation Preserving Multiclass Agnostic to Realizable Reduction

Inverse Reinforcement Learning with Switching Rewards and History Dependency for Characterizing Animal Behaviors

PAC Learning with Improvements

Hierarchical Equivariant Policy via Frame Transfer

P(all-atom) Is Unlocking New Path For Protein Design

The Number of Trials Matters in Infinite-Horizon General-Utility Markov Decision Processes

TextCenGen: Attention-Guided Text-Centric Background Adaptation for Text-to-Image Generation

Dynamical phases of short-term memory mechanisms in RNNs

Thermalizer: Stable autoregressive neural emulation of spatiotemporal chaos

Online Conformal Prediction via Online Optimization

The Berkeley Function Calling Leaderboard (BFCL): From Tool Use to Agentic Evaluation of Large Language Models

Improved Regret Analysis in Gaussian Process Bandits: Optimality for Noiseless Reward, RKHS norm, and Non-Stationary Variance

Eliciting Language Model Behaviors with Investigator Agents

MAGELLAN: Metacognitive predictions of learning progress guide autotelic LLM agents in large goal spaces

AdaPTS: Adapting Univariate Foundation Models to Probabilistic Multivariate Time Series Forecasting

Ad Hoc Teamwork via Offline Goal-Based Decision Transformers

Policy Gradient with Tree Expansion

Auditing Prompt Caching in Language Model APIs

Representations Shape Weak-to-Strong Generalization: Theoretical Insights and Empirical Predictions

Limitations of measure-first protocols in quantum machine learning

EFDTR: Learnable Elliptical Fourier Descriptor Transformer for Instance Segmentation

A Stronger Mixture of Low-Rank Experts for Fine-Tuning Foundation Models

TANGO: Clustering with Typicality-Aware Nonlocal Mode-Seeking and Graph-Cut Optimization

SGD Jittering: A Training Strategy for Robust and Accurate Model-Based Architectures

Learning to (Learn at Test Time): RNNs with Expressive Hidden States

Double Machine Learning for Causal Inference under Shared-State Interference

SAeUron: Interpretable Concept Unlearning in Diffusion Models with Sparse Autoencoders

On the Importance of Gaussianizing Representations

Explaining, Fast and Slow: Abstraction and Refinement of Provable Explanations

Adaptive Elicitation of Latent Information Using Natural Language

Self-Improving Language Models for Evolutionary Program Synthesis: A Case Study on ARC-AGI

Empirical Design in Reinforcement Learning

Goal-Space Planning with Subgoal Models

Breaking the Curse of Multiagency in Robust Multi-Agent Reinforcement Learning

RULEBREAKERS: Challenging LLMs at the Crossroads between Formal Logic and Human-like Reasoning

Neural Graph Matching Improves Retrieval Augmented Generation in Molecular Machine Learning

Position: Lifetime tuning is incompatible with continual reinforcement learning

SparseLoRA: Accelerating LLM Fine-Tuning with Contextual Sparsity

No Free Lunch from Random Feature Ensembles: Scaling Laws and Near-Optimality Conditions

Diversity By Design: Leveraging Distribution Matching for Offline Model-Based Optimization

Efficient Molecular Conformer Generation with SO(3)-Averaged Flow Matching and Reflow

Survival Analysis via Density Estimation

Fishers for Free? Approximating the Fisher Information Matrix by Recycling the Squared Gradient Accumulator

Can Large Language Models Understand Intermediate Representations in Compilers?

KVTuner: Sensitivity-Aware Layer-Wise Mixed-Precision KV Cache Quantization for Efficient and Nearly Lossless LLM Inference

SafeArena: Evaluating the Safety of Autonomous Web Agents

How to Move Your Dragon: Text-to-Motion Synthesis for Large-Vocabulary Objects

Data-Juicer Sandbox: A Feedback-Driven Suite for Multimodal Data-Model Co-development

Feature learning from non-Gaussian inputs: the case of Independent Component Analysis in high dimensions

Heterogeneous Data Game: Characterizing the Model Competition Across Multiple Data Sources

On the Tension between Byzantine Robustness and No-Attack Accuracy in Distributed Learning

Hybrid Batch Normalisation: Resolving the Dilemma of Batch Normalisation in Federated Learning

Protein Structure Tokenization: Benchmarking and New Recipe

Text-to-LoRA: Instant Transformer Adaption

Navigating the Social Welfare Frontier: Portfolios for Multi-objective Reinforcement Learning

Learnable Spatial-Temporal Positional Encoding for Link Prediction

Tilted Sharpness-Aware Minimization

MoEQuant: Enhancing Quantization for Mixture-of-Experts Large Language Models via Expert-Balanced Sampling and Affinity Guidance

LARM: Large Auto-Regressive Model for Long-Horizon Embodied Intelligence

Convergence of Policy Mirror Descent Beyond Compatible Function Approximation

DynaMind: Reasoning over Abstract Video Dynamics for Embodied Decision-Making

Graph4MM: Weaving Multimodal Learning with Structural Information

Delta Decompression for MoE-based LLMs Compression

Point-Level Topological Representation Learning on Point Clouds

Investigating the Overlooked Hessian Structure: From CNNs to LLMs

Simple and Critical Iterative Denoising: A Recasting of Discrete Diffusion in Graph Generation

Dequantified Diffusion-Schrödinger Bridge for Density Ratio Estimation

Dimension-Independent Rates for Structured Neural Density Estimation

Pretraining Generative Flow Networks with Inexpensive Rewards for Molecular Graph Generation

Counterfactual Graphical Models: Constraints and Inference

Scaffold with Stochastic Gradients: New Analysis with Linear Speed-Up

CAT Merging: A Training-Free Approach for Resolving Conflicts in Model Merging

Massive Values in Self-Attention Modules are the Key to Contextual Knowledge Understanding

Solving Probabilistic Verification Problems of Neural Networks using Branch and Bound

Sortformer: A Novel Approach for Permutation-Resolved Speaker Supervision in Speech-to-Text Systems

A Mixture-Based Framework for Guiding Diffusion Models

Differential Privacy Guarantees of Markov Chain Monte Carlo Algorithms

Position: Evaluating Generative AI Systems Is a Social Science Measurement Challenge

Prediction-Aware Learning in Multi-Agent Systems

Position: Constants are Critical in Regret Bounds for Reinforcement Learning

Position: When Incentives Backfire, Data Stops Being Human

Discrete Markov Probabilistic Models: An Improved Discrete Score-Based Framework with sharp convergence bounds under minimal assumptions

Position: AI Agents Need Authenticated Delegation

Position: Theory of Mind Benchmarks are Broken for Large Language Models

Position: An Empirically Grounded Identifiability Theory Will Accelerate Self Supervised Learning Research

Position: Societal Impacts Research Requires Benchmarks for Creative Composition Tasks

Position: Certified Robustness Does Not (Yet) Imply Model Security

Position: Political Neutrality in AI Is Impossible — But Here Is How to Approximate It

Empirical Privacy Variance

PENCIL: Long Thoughts with Short Memory

Position: Formal Mathematical Reasoning—A New Frontier in AI

Position: Spectral GNNs Rely Less on Graph Fourier Basis than Conceived

Weak-to-Strong Generalization Even in Random Feature Networks, Provably

Position: Uncertainty Quantification Needs Reassessment for Large Language Model Agents

Physics-Informed DeepONets for drift-diffusion on metric graphs: simulation and parameter identification

Machine Learning meets Algebraic Combinatorics: A Suite of Datasets Capturing Research-level Conjecturing Ability in Pure Mathematics

Position: AI Evaluation Should Learn from How We Test Humans

Position: Editing Large Language Models Poses Serious Safety Risks

Dynamic Sparse Training of Diagonally Sparse Networks

Position: General Intelligence Requires Reward-based Pretraining

Position: Future Research and Challenges Remain Towards AI for Software Engineering

Deliberation in Latent Space via Differentiable Cache Augmentation

Position: AI Scaling: From Up to Down and Out

Position: AI Safety Must Embrace an Antifragile Perspective

Machines and Mathematical Mutations: Using GNNs to Characterize Quiver Mutation Classes

Position: We Can’t Understand AI Using our Existing Vocabulary

Position: LLM Social Simulations Are a Promising Research Method

Position: The Categorization of Race in ML is a Flawed Premise

Position: We Need An Algorithmic Understanding of Generative AI

Trajectory World Models for Heterogeneous Environments

Position: Principles of Animal Cognition to Improve LLM Evaluations

A New Approach to Backtracking Counterfactual Explanations: A Unified Causal Framework for Efficient Model Interpretability

Position: Beyond Assistance – Reimagining LLMs as Ethical and Adaptive Co-Creators in Mental Health Care

Position: Strong Consumer Protection is an Inalienable Defense for AI Safety in the United States

Position: Solve Layerwise Linear Models First to Understand Neural Dynamical Phenomena (Neural Collapse, Emergence, Lazy/Rich Regime, and Grokking)

Position: Trustworthy AI Agents Require the Integration of Large Language Models and Formal Methods

Position: Democratic AI is Possible. The Democracy Levels Framework Shows How It Might Work.

Training Flexible Models of Genetic Variant Effects from Functional Annotations using Accelerated Linear Algebra

LLM Data Selection and Utilization via Dynamic Bi-level Optimization

EARL-BO: Reinforcement Learning for Multi-Step Lookahead, High-Dimensional Bayesian Optimization

Reliable Algorithm Selection for Machine Learning-Guided Design

In-Context Fine-Tuning for Time-Series Foundation Models

Zero-Shot Adaptation of Parameter-Efficient Fine-Tuning in Diffusion Models

On Fine-Grained Distinct Element Estimation

Learning-Augmented Hierarchical Clustering

General framework for online-to-nonconvex conversion: Schedule-free SGD is also effective for nonconvex optimization

Faster Approximation Algorithms for k-Center via Data Reduction

Low-distortion and GPU-compatible Tree Embeddings in Hyperbolic Space

Generalization Bounds via Meta-Learned Model Representations: PAC-Bayes and Sample Compression Hypernetworks

The Missing Alignment Link of In-context Learning on Sequences

DEFAME: Dynamic Evidence-based FAct-checking with Multimodal Experts

Accelerating PDE-Constrained Optimization by the Derivative of Neural Operators

S4S: Solving for a Fast Diffusion Model Solver

NICE Data Selection for Instruction Tuning in LLMs with Non-differentiable Evaluation Metric

Revolve: Optimizing AI Systems by Tracking Response Evolution in Textual Optimization

Reward-free World Models for Online Imitation Learning

Clustering Properties of Self-Supervised Learning

Extractive Structures Learned in Pretraining Enable Generalization on Finetuned Facts

Uniform Mean Estimation for Heavy-Tailed Distributions via Median-of-Means

Sharp Generalization for Nonparametric Regression by Over-Parameterized Neural Networks: A Distribution-Free Analysis in Spherical Covariate

FSTLLM: Spatio-Temporal LLM for Few Shot Time Series Forecasting

Mixture of Experts Made Intrinsically Interpretable

Optimizing Temperature for Language Models with Multi-Sample Inference

Causal Abstraction Inference under Lossy Representations

Joint MoE Scaling Laws: Mixture of Experts Can Be Memory Efficient

Leveraging Predictive Equivalence in Decision Trees

Simple Policy Optimization

Exploiting Presentative Feature Distributions for Parameter-Efficient Continual Learning of Large Language Models

Fast and Provable Algorithms for Sparse PCA with Improved Sample Complexity

ArrayDPS: Unsupervised Blind Speech Separation with a Diffusion Prior

From RAG to Memory: Non-Parametric Continual Learning for Large Language Models

DRAG: Data Reconstruction Attack using Guided Diffusion

Adjoint Sampling: Highly Scalable Diffusion Samplers via Adjoint Matching

Differentiable Quadratic Optimization For the Maximum Independent Set Problem

Reflect-then-Plan: Offline Model-Based Planning through a Doubly Bayesian Lens

FSL-SAGE: Accelerating Federated Split Learning via Smashed Activation Gradient Estimation

Finite-Time Global Optimality Convergence in Deep Neural Actor-Critic Methods for Decentralized Multi-Agent Reinforcement Learning

The Jailbreak Tax: How Useful are Your Jailbreak Outputs?

Efficient Heterogeneity-Aware Federated Active Data Selection

LETS Forecast: Learning Embedology for Time Series Forecasting

Learning Efficient Robotic Garment Manipulation with Standardization

Behavior-Regularized Diffusion Policy Optimization for Offline Reinforcement Learning

Causality Inspired Federated Learning for OOD Generalization

Improved Lower Bounds for First-order Stochastic Non-convex Optimization under Markov Sampling

MDDM: Practical Message-Driven Generative Image Steganography Based on Diffusion Models

On the Query Complexity of Verifier-Assisted Language Generation

BRiTE: Bootstrapping Reinforced Thinking Process to Enhance Language Model Reasoning

Gravity-Bench-v1: A Benchmark on Gravitational Physics Discovery for Agents

Predicting mutational effects on protein binding from folding energy

Inductive Gradient Adjustment for Spectral Bias in Implicit Neural Representations

ELEMENTAL: Interactive Learning from Demonstrations and Vision-Language Models for Reward Design in Robotics

Bayesian Optimization from Human Feedback: Near-Optimal Regret Bounds

Gridded Transformer Neural Processes for Spatio-Temporal Data

Latent Mamba Operator for Partial Differential Equations

Simple Randomized Rounding for Max-Min Eigenvalue Augmentation

Just Enough Shifts: Mitigating Over-Refusal in Aligned Language Models with Targeted Representation Fine-Tuning

Accelerating Unbiased LLM Evaluation via Synthetic Feedback

Efficient Online Reinforcement Learning for Diffusion Policy

ELoRA: Low-Rank Adaptation for Equivariant GNNs

Mixture of Hidden-Dimensions: Not All Hidden-States’ Dimensions are Needed in Transformer

A New Concentration Inequality for Sampling Without Replacement and Its Application for Transductive Learning

FunBO: Discovering Acquisition Functions for Bayesian Optimization with FunSearch

High-Dimensional Prediction for Sequential Decision Making

Rethinking Time Encoding via Learnable Transformation Functions

Overcoming Spurious Solutions in Semi-Dual Neural Optimal Transport: A Smoothing Approach for Learning the Optimal Transport Plan

Steering Protein Language Models

A General Representation-Based Approach to Multi-Source Domain Adaptation

Cross-City Latent Space Alignment for Consistency Region Embedding

Learning Vision and Language Concepts for Controllable Image Generation

A General Framework for Inference-time Scaling and Steering of Diffusion Models

The Relationship Between No-Regret Learning and Online Conformal Prediction

Reflection-Window Decoding: Text Generation with Selective Refinement

Position: Algebra Unveils Deep Learning - An Invitation to Neuroalgebraic Geometry

Accelerating Linear Recurrent Neural Networks for the Edge with Unstructured Sparsity

Linear Bandits with Partially Observable Features

Diffusion Adversarial Post-Training for One-Step Video Generation

Adaptive Sensitivity Analysis for Robust Augmentation against Natural Corruptions in Image Segmentation

Closed-form Solutions: A New Perspective on Solving Differential Equations

David and Goliath: Small One-step Model Beats Large Diffusion with Score Post-training

Simplifying DINO via Coding Rate Regularization

AlphaVerus: Bootstrapping Formally Verified Code Generation through Self-Improving Translation and Treefinement

Graph Generative Pre-trained Transformer

A Reduction Framework for Distributionally Robust Reinforcement Learning under Average Reward

SPEX: Scaling Feature Interaction Explanations for LLMs

Beyond Self-Interest: How Group Strategies Reshape Content Creation in Recommendation Platforms?

Statistical Test for Feature Selection Pipelines by Selective Inference

Position: AI Safety should prioritize the Future of Work

Distributionally Robust Active Learning for Gaussian Process Regression

ParallelComp: Parallel Long-Context Compressor for Length Extrapolation

Structure-Guided Large Language Models for Text-to-SQL Generation

Soft Reasoning: Navigating Solution Spaces in Large Language Models through Controlled Embedding Exploration

Geometry-Informed Neural Networks

Active Treatment Effect Estimation via Limited Samples

SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models

Volume Optimality in Conformal Prediction with Structured Prediction Sets

Safe Delta: Consistently Preserving Safety when Fine-Tuning LLMs on Diverse Datasets

DiffAdvMAP: Flexible Diffusion-Based Framework for Generating Natural Unrestricted Adversarial Examples

Generalization and Robustness of the Tilted Empirical Risk

Log-Sum-Exponential Estimator for Off-Policy Evaluation and Learning

Multiaccuracy and Multicalibration via Proxy Groups

A Model of Place Field Reorganization During Reward Maximization

MIB: A Mechanistic Interpretability Benchmark

Improved Last-Iterate Convergence of Shuffling Gradient Methods for Nonsmooth Convex Optimization

Representative Language Generation

Language Models over Canonical Byte-Pair Encodings

Context Matters: Query-aware Dynamic Long Sequence Modeling of Gigapixel Images

EmoGrowth: Incremental Multi-label Emotion Decoding with Augmented Emotional Relation Graph

Detecting Strategic Deception with Linear Probes

SAEBench: A Comprehensive Benchmark for Sparse Autoencoders in Language Model Interpretability

Projection Optimization: A General Framework for Multi-Objective and Multi-Group RLHF

LOB-Bench: Benchmarking Generative AI for Finance - an Application to Limit Order Book Data

Curse of High Dimensionality Issue in Transformer for Long Context Modeling

Hyperbolic-PDE GNN: Spectral Graph Neural Networks in the Perspective of A System of Hyperbolic Partial Differential Equations

Goal-Oriented Skill Abstraction for Offline Multi-Task Reinforcement Learning

Robust Reward Alignment via Hypothesis Space Batch Cutting

Score-of-Mixture Training: One-Step Generative Model Training Made Simple via Score Estimation of Mixture Distributions

Training Software Engineering Agents and Verifiers with SWE-Gym

Fairness Overfitting in Machine Learning: An Information-Theoretic Perspective

Geometric Feature Embedding for Effective 3D Few-Shot Class Incremental Learning

A Geometric Approach to Personalized Recommendation with Set-Theoretic Constraints Using Box Embeddings

SE(3)-Equivariant Diffusion Policy in Spherical Fourier Space

Visual Abstraction: A Plug-and-Play Approach for Text-Visual Retrieval

Sundial: A Family of Highly Capable Time Series Foundation Models

Hypo3D: Exploring Hypothetical Reasoning in 3D

SPD: Sync-Point Drop for Efficient Tensor Parallelism of Large Language Models

Maximum Entropy Reinforcement Learning with Diffusion Policy

Relating Misfit to Gain in Weak-to-Strong Generalization Beyond the Squared Loss

Constant Stepsize Local GD for Logistic Regression: Acceleration by Instability

Return Capping: Sample Efficient CVaR Policy Gradient Optimisation

Position: Supervised Classifiers Answer the Wrong Questions for OOD Detection

Inverse Flow and Consistency Models

Activation by Interval-wise Dropout: A Simple Way to Prevent Neural Networks from Plasticity Loss

Self-cross Feature based Spiking Neural Networks for Efficient Few-shot Learning

CALM: Consensus-Aware Localized Merging for Multi-Task Learning

Flopping for FLOPs: Leveraging Equivariance for Computational Efficiency

Sample Complexity of Distributionally Robust Off-Dynamics Reinforcement Learning with Online Interaction

Unifews: You Need Fewer Operations for Efficient Graph Neural Networks

STAMP Your Content: Proving Dataset Membership via Watermarked Rephrasings

On Expressive Power of Looped Transformers: Theoretical Analysis and Enhancement via Timestep Encoding

Exploring Representations and Interventions in Time Series Foundation Models

MERGE$^3$: Efficient Evolutionary Merging on Consumer-grade GPUs

BARNN: A Bayesian Autoregressive and Recurrent Neural Network

Can Diffusion Models Learn Hidden Inter-Feature Rules Behind Images?

Lower Bounds for Chain-of-Thought Reasoning in Hard-Attention Transformers

Action Dubber: Timing Audible Actions via Inflectional Flow

Exploiting Curvature in Online Convex Optimization with Delayed Feedback

Runtime Analysis of Evolutionary NAS for Multiclass Classification

Emergence in non-neural models: grokking modular arithmetic via average gradient outer product

FLAM: Frame-Wise Language-Audio Modeling

Retrieval Augmented Zero-Shot Enzyme Generation for Specified Substrate

Robust Sparsification via Sensitivity

Batch List-Decodable Linear Regression via Higher Moments

Active Evaluation Acquisition for Efficient LLM Benchmarking

SBGD: Improving Graph Diffusion Generative Model via Stochastic Block Diffusion

Rényi Neural Processes

AlphaPO: Reward Shape Matters for LLM Alignment

PepTune: De Novo Generation of Therapeutic Peptides with Multi-Objective-Guided Discrete Diffusion

Adaptive Sample Sharing for Multi Agent Linear Bandits

Semantic Shift Estimation via Dual-Projection and Classifier Reconstruction for Exemplar-Free Class-Incremental Learning

Unlocking the Power of Rehearsal in Continual Learning: A Theoretical Perspective

L3A: Label-Augmented Analytic Adaptation for Multi-Label Class Incremental Learning

Beyond Cropped Regions: New Benchmark and Corresponding Baseline for Chinese Scene Text Retrieval in Diverse Layouts

Active Fine-Tuning of Multi-Task Policies

Zero-Shot Offline Imitation Learning via Optimal Transport

Graph-constrained Reasoning: Faithful Reasoning on Knowledge Graphs with Large Language Models

Identifying and Understanding Cross-Class Features in Adversarial Training

ReQFlow: Rectified Quaternion Flow for Efficient and High-Quality Protein Backbone Generation

FOUNDER: Grounding Foundation Models in World Models for Open-Ended Embodied Decision Making

Suitability Filter: A Statistical Framework for Classifier Evaluation in Real-World Deployment Settings

Anytime-Constrained Equilibria in Polynomial Time

Symmetry-Driven Discovery of Dynamical Variables in Molecular Simulations

Mastering Massive Multi-Task Reinforcement Learning via Mixture-of-Expert Decision Transformer

Diverging Preferences: When do Annotators Disagree and do Models Know?

Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search

Raising the Bar: Investigating the Values of Large Language Models via Generative Evolving Testing

Mastering Multiple-Expert Routing: Realizable $H$-Consistency and Strong Guarantees for Learning to Defer

Principled Algorithms for Optimizing Generalized Metrics in Binary Classification

Telling Peer Direct Effects from Indirect Effects in Observational Network Data

Balancing the Scales: A Theoretical and Algorithmic Framework for Learning from Imbalanced Data

OR-Bench: An Over-Refusal Benchmark for Large Language Models

Partially Observable Reinforcement Learning with Memory Traces

AutoElicit: Using Large Language Models for Expert Prior Elicitation in Predictive Modelling

Exactly Tight Information-theoretic Generalization Bounds via Binary Jensen-Shannon Divergence

Policy Regularization on Globally Accessible States in Cross-Dynamics Reinforcement Learning

IL-SOAR: Imitation Learning with Soft Optimistic Actor cRitic

Chameleon: A Flexible Data-mixing Framework for Language Model Pretraining and Finetuning

Improving Soft Unification with Knowledge Graph Embedding Methods

Spurious Correlations in High Dimensional Regression: The Roles of Regularization, Simplicity Bias and Over-Parameterization

Accelerating Spectral Clustering under Fairness Constraints

Best of Both Worlds: Regret Minimization versus Minimax Play

EPIC: Efficient Position-Independent Caching for Serving Large Language Models

Continuous-Time Analysis of Heavy Ball Momentum in Min-Max Games

MUDDFormer: Breaking Residual Bottlenecks in Transformers via Multiway Dynamic Dense Connections

Training Deep Learning Models with Norm-Constrained LMOs

Layer-wise Quantization for Quantized Optimistic Dual Averaging

Sparse Training from Random Initialization: Aligning Lottery Ticket Masks using Weight Symmetry

Prices, Bids, Values: One ML-Powered Combinatorial Auction to Rule Them All

Multivariate Conformal Selection

Dissecting Submission Limit in Desk-Rejections: A Mathematical Analysis of Fairness in AI Conference Policies

Janus: Dual-Server Multi-Round Secure Aggregation with Verifiability for Federated Learning

Compositional Causal Reasoning Evaluation in Language Models

Schwarz–Schur Involution: Lightspeed Differentiable Sparse Linear Solvers

Beyond Confidence: Exploiting Homogeneous Pattern for Semi-Supervised Semantic Segmentation

PCEvolve: Private Contrastive Evolution for Synthetic Dataset Generation via Few-Shot Private Data and Generative APIs

Implicit Bias of Gradient Descent for Non-Homogeneous Deep Networks

Direct Motion Models for Assessing Generated Videos

How Effective Can Dropout Be in Multiple Instance Learning?

LLMs on the Line: Data Determines Loss-to-Loss Scaling Laws

Understanding Synthetic Context Extension via Retrieval Heads

On Teacher Hacking in Language Model Distillation

VersaPRM: Multi-Domain Process Reward Model via Synthetic Reasoning Data

KernelBench: Can LLMs Write Efficient GPU Kernels?

Depth Degeneracy in Neural Networks: Vanishing Angles in Fully Connected ReLU Networks on Initialization

MARS: Unleashing the Power of Variance Reduction for Training Large Models

Theoretically Unmasking Inference Attacks Against LDP-Protected Clients in Federated Vision Models

The Importance of Being Lazy: Scaling Limits of Continual Learning

ADIOS: Antibody Development via Opponent Shaping

Towards Practical Defect-Focused Automated Code Review

The Sharpness Disparity Principle in Transformers for Accelerating Language Model Pre-Training

Reinforcement Learning for Quantum Control under Physical Constraints

SAFER: A Calibrated Risk-Aware Multimodal Recommendation Model for Dynamic Treatment Regimes

Automated Benchmark Generation for Repository-Level Coding Tasks

Tree-Sliced Wasserstein Distance with Nonlinear Projection

Tree-Sliced Wasserstein Distance: A Geometric Perspective

Score as Action: Fine Tuning Diffusion Generative Models by Continuous-time Reinforcement Learning

SkipGPT: Each Token is One of a Kind

Equivariant Polynomial Functional Networks

Concurrent Reinforcement Learning with Aggregated States via Randomized Least Squares Value Iteration

Novelty Detection in Reinforcement Learning with World Models

Active Learning of Deep Neural Networks via Gradient-Free Cutting Planes

Rethinking Latent Redundancy in Behavior Cloning: An Information Bottleneck Approach for Robot Manipulation

Bootstrapping Self-Improvement of Language Model Programs for Zero-Shot Schema Matching

Preference Learning for AI Alignment: a Causal Perspective

Aligning with Logic: Measuring, Evaluating and Improving Logical Preference Consistency in Large Language Models

Attention-Level Speculation

Position: All Current Generative Fidelity and Diversity Metrics are Flawed

GuardAgent: Safeguard LLM Agents via Knowledge-Enabled Reasoning

AUTOCIRCUIT-RL: Reinforcement Learning-Driven LLM for Automated Circuit Topology Generation

Unified Screening for Multiple Diseases

Statistical Hypothesis Testing for Auditing Robustness in Language Models

Skip the Equations: Learning Behavior of Personalized Dynamical Systems Directly From Data

Understanding Model Reprogramming for CLIP via Decoupling Visual Prompts

AutoCATE: End-to-End, Automated Treatment Effect Estimation

Continuously Updating Digital Twins using Large Language Models

Causal Invariance-aware Augmentation for Brain Graph Contrastive Learning

Elucidating the Design Space of Multimodal Protein Language Models

Models of Heavy-Tailed Mechanistic Universality

Stochastic Encodings for Active Feature Acquisition

G-Sim: Generative Simulations with Large Language Models and Gradient-Free Calibration

Sum-of-Parts: Self-Attributing Neural Networks with End-to-End Learning of Feature Groups

Balancing Preservation and Modification: A Region and Semantic Aware Metric for Instruction-Based Image Editing

The Synergy of LLMs & RL Unlocks Offline Learning of Generalizable Language-Conditioned Policies with Low-fidelity Data

Autoformulation of Mathematical Optimization Models Using LLMs

On the Similarities of Embeddings in Contrastive Learning

MedRAX: Medical Reasoning Agent for Chest X-ray

SafeMap: Robust HD Map Construction from Incomplete Observations

On Mitigating Affinity Bias through Bandits with Evolving Biased Feedback

On the Interplay between Graph Structure and Learning Algorithms in Graph Neural Networks

High-Fidelity Simultaneous Speech-To-Speech Translation

SAH-Drive: A Scenario-Aware Hybrid Planner for Closed-Loop Vehicle Trajectory Generation

Lightspeed Geometric Dataset Distance via Sliced Optimal Transport

Latent Thought Models with Variational Bayes Inference-Time Computation

Features are fate: a theory of transfer learning in high-dimensional regression

LEVIS: Large Exact Verifiable Input Spaces for Neural Networks

Taming Diffusion for Dataset Distillation with High Representativeness

Symmetry-Robust 3D Orientation Estimation

PlaySlot: Learning Inverse Latent Dynamics for Controllable Object-Centric Video Prediction and Planning

Windows Agent Arena: Evaluating Multi-Modal OS Agents at Scale

Leveraging Skills from Unlabeled Prior Data for Efficient Online Exploration

Layer by Layer: Uncovering Hidden Representations in Language Models

Learning Safety Constraints for Large Language Models

PieClam: A Universal Graph Autoencoder Based on Overlapping Inclusive and Exclusive Communities

Projection Pursuit Density Ratio Estimation

On The Concurrence of Layer-wise Preconditioning Methods and Provable Feature Learning

World Model Implanting for Test-time Adaptation of Embodied Agents

Ringmaster ASGD: The First Asynchronous SGD with Optimal Time Complexity

STD-FD: Spatio-Temporal Distribution Fitting Deviation for AIGC Forgery Identification

The Case for Learned Provenance-based System Behavior Baseline

M³HF: Multi-agent Reinforcement Learning from Multi-phase Human Feedback of Mixed Quality

Towards the Efficient Inference by Incorporating Automated Computational Phenotypes under Covariate Shift

Parametric Scaling Law of Tuning Bias in Conformal Prediction

Contextual Linear Bandits with Delay as Payoff

Learning to Steer Learners in Games

Joint Localization and Activation Editing for Low-Resource Fine-Tuning

MoRAgent: Parameter Efficient Agent Tuning with Mixture-of-Roles

Do Bayesian Neural Networks Actually Behave Like Bayesian Models?

Graph Adaptive Autoregressive Moving Average Models

RepoAudit: An Autonomous LLM-Agent for Repository-Level Code Auditing

CodeSync: Synchronizing Large Language Models with Dynamic Code Evolution at Scale

Decoupled SGDA for Games with Intermittent Strategy Communication

DeFoG: Discrete Flow Matching for Graph Generation

One Diffusion Step to Real-World Super-Resolution via Flow Trajectory Distillation

FloE: On-the-Fly MoE Inference on Memory-constrained GPU

Observation Interference in Partially Observable Assistance Games

Fleet of Agents: Coordinated Problem Solving with Large Language Models

The underlying structures of self-attention: symmetry, directionality, and emergent dynamics in Transformer training

Robust Spatio-Temporal Centralized Interaction for OOD Learning

Communicating Activations Between Language Model Agents

Cross-regularization: Adaptive Model Complexity through Validation Gradients

Learning Dynamics in Continual Pre-Training for Large Language Models

How Expressive are Knowledge Graph Foundation Models?

CACTI: Leveraging Copy Masking and Contextual Information to Improve Tabular Data Imputation

Raptor: Scalable Train-Free Embeddings for 3D Medical Volumes Leveraging Pretrained 2D Foundation Models

MVA: Linear Attention with High-order Query-Keys Integration and Multi-level Vocabulary Decomposition

Sort Before You Prune: Improved Worst-Case Guarantees of the DiskANN Family of Graphs

Learning Mean Field Control on Sparse Graphs

Causal Attribution Analysis for Continuous Outcomes

Robust and Conjugate Spatio-Temporal Gaussian Processes

Relative Error Fair Clustering in the Weak-Strong Oracle Model

Learning Attribute-Aware Hash Codes for Fine-Grained Image Retrieval via Query Optimization

Token Signature: Predicting Chain-of-Thought Gains with Token Decoding Feature in Large Language Models

SEAD: Unsupervised Ensemble of Streaming Anomaly Detectors

BaWA: Automatic Optimizing Pruning Metric for Large Language Models with Balanced Weight and Activation

Exploring Invariance in Images through One-way Wave Equations

Linear Contextual Bandits With Interference

Agent-Centric Actor-Critic for Asynchronous Multi-Agent Reinforcement Learning

Penalizing Infeasible Actions and Reward Scaling in Reinforcement Learning with Offline Data

Online Pre-Training for Offline-to-Online Reinforcement Learning

Understanding Input Selectivity in Mamba: Impact on Approximation Power, Memorization, and Associative Recall Capacity

Improved Sample Complexity for Private Nonsmooth Nonconvex Optimization

Peri-LN: Revisiting Normalization Layer in the Transformer Architecture

Latent Preference Coding: Aligning Large Language Models via Discrete Latent Codes

Fourier Position Embedding: Enhancing Attention’s Periodic Extension for Length Generalization

Quamba2: A Robust and Scalable Post-training Quantization Framework for Selective State Space Models

Analytical Lyapunov Function Discovery: An RL-based Generative Approach

Leveraging Per-Instance Privacy for Machine Unlearning

Implicit Regularization for Tubal Tensor Factorizations via Gradient Descent

Lightweight-Mark: Rethinking Deep Learning-Based Watermarking

Improved and Oracle-Efficient Online $\ell_1$-Multicalibration

On the Resilience of LLM-Based Multi-Agent Collaboration with Faulty Agents

Right Now, Wrong Then: Non-Stationary Direct Preference Optimization under Preference Drift

Enhancing Visual Localization with Cross-Domain Image Generation

Optimal and Practical Batched Linear Bandit Algorithm

Learning Adversarial MDPs with Stochastic Hard Constraints

Causal Logistic Bandits with Counterfactual Fairness Constraints

Function-Space Learning Rates

Learning to Keep a Promise: Scaling Language Model Decoding Parallelism with Learned Asynchronous Decoding

Functional Alignment Can Mislead: Examining Model Stitching

Arbitrarily-Conditioned Multi-Functional Diffusion for Multi-Physics Emulation

Do Not Mimic My Voice: Speaker Identity Unlearning for Zero-Shot Text-to-Speech

PAC-Bayes Analysis for Recalibration in Classification

Piloting Structure-Based Drug Design via Modality-Specific Optimal Schedule

Empower Structure-Based Molecule Optimization with Gradient Guided Bayesian Flow Networks

ExtPose: Robust and Coherent Pose Estimation by Extending ViTs

Self-Improving Transformers Overcome Easy-to-Hard and Length Generalization Challenges

DTZO: Distributed Trilevel Zeroth Order Learning with Provable Non-Asymptotic Convergence

An Analysis of Quantile Temporal-Difference Learning

MixBridge: Heterogeneous Image-to-Image Backdoor Attack through Mixture of Schrödinger Bridges

Conditioning Diffusions Using Malliavin Calculus

TransPL: VQ-Code Transition Matrices for Pseudo-Labeling of Time Series Unsupervised Domain Adaptation

Unbiased Recommender Learning from Implicit Feedback via Weakly Supervised Learning

Positional Attention: Expressivity and Learnability of Algorithmic Computation

Learning multivariate Gaussians with imperfect advice

Gumiho: A Hybrid Architecture to Prioritize Early Tokens in Speculative Decoding

NoLiMa: Long-Context Evaluation Beyond Literal Matching

PTTA: Purifying Malicious Samples for Test-Time Model Adaptation

Fully Heteroscedastic Count Regression with Deep Double Poisson Networks

Linear convergence of Sinkhorn's algorithm for generalized static Schrödinger bridge

sciLaMA: A Single-Cell Representation Learning Framework to Leverage Prior Knowledge from Large Language Models

Gap-Dependent Bounds for Federated $Q$-Learning

Controlling Large Language Model with Latent Action

Emergence and Effectiveness of Task Vectors in In-Context Learning: An Encoder Decoder Perspective

Revisiting Differentially Private Algorithms for Decentralized Online Learning

Optimal Task Order for Continual Learning of Multiple Tasks

Attention Mechanisms Perspective: Exploring LLM Processing of Graph-Structured Data

A Likelihood Based Approach to Distribution Regression Using Conditional Deep Generative Models

Liger: Linearizing Large Language Models to Gated Recurrent Structures

Reinforcement Learning with Segment Feedback

Rethinking Score Distilling Sampling for 3D Editing and Generation

Reinforced Lifelong Editing for Language Models

Actor-Critics Can Achieve Optimal Sample Efficiency

Positional Encoding meets Persistent Homology on Graphs

PoisonBench: Assessing Language Model Vulnerability to Poisoned Preference Data

Zebra: In-Context Generative Pretraining for Solving Parametric PDEs

SITCOM: Step-wise Triple-Consistent Diffusion Sampling For Inverse Problems

Commute Graph Neural Networks

Polynomial Time Learning Augmented Algorithms for NP-hard Permutation Problems

Supervised Contrastive Learning from Weakly-Labeled Audio Segments for Musical Version Matching

Reward Modeling with Ordinal Feedback: Wisdom of the Crowd

Distributed Differentially Private Data Analytics via Secure Sketching

NMA-tune: Generating Highly Designable and Dynamics Aware Protein Backbones

Latent Action Learning Requires Supervision in the Presence of Distractors

Understanding the Kronecker Matrix-Vector Complexity of Linear Algebra

Multilayer Matrix Factorization via Dimension-Reducing Diffusion Variational Inference

WGFormer: An SE(3)-Transformer Driven by Wasserstein Gradient Flows for Molecular Ground-State Conformation Prediction

On the Statistical Mechanisms of Distributional Compositional Generalization

TinyMIG: Transferring Generalization from Vision Foundation Models to Single-Domain Medical Imaging

Constrained Pareto Set Identification with Bandit Feedback

ActionPiece: Contextually Tokenizing Action Sequences for Generative Recommendation

Flat-LoRA: Low-Rank Adaptation over a Flat Loss Landscape

Multi-Armed Bandits with Interference: Bridging Causal Inference and Adversarial Bandits

Principal-Agent Bandit Games with Self-Interested and Exploratory Learning Agents

Floating-Point Neural Networks Can Represent Almost All Floating-Point Functions

Certified Unlearning for Neural Networks

Deep Neural Cellular Potts Models

A Dynamical Systems-Inspired Pruning Strategy for Addressing Oversmoothing in Graph Attention Networks

Speak Easy: Eliciting Harmful Jailbreaks from LLMs with Simple Interactions

Contextures: Representations from Contexts

NestQuant: nested lattice quantization for matrix products and LLMs

Understanding the Logic of Direct Preference Alignment through Logic

Banyan: Improved Representation Learning with Explicit Structure

A Trichotomy for List Transductive Online Learning

FedClean: A General Robust Label Noise Correction for Federated Learning

Learning Likelihood-Free Reference Priors

Ultra-Resolution Adaptation with Ease

Distillation Scaling Laws

Geometry Informed Tokenization of Molecules for Language Model Generation

Adversarial Perturbations Are Formed by Iteratively Learning Linear Combinations of the Right Singular Vectors of the Adversarial Jacobian

Enhancing Diversity In Parallel Agents: A Maximum State Entropy Exploration Story

WeGeFT: Weight-Generative Fine-Tuning for Multi-Faceted Efficient Adaptation of Large Models

A Theoretical Study of (Hyper) Self-Attention through the Lens of Interactions: Representation, Training, Generalization

Implicit Language Models are RNNs: Balancing Parallelization and Expressivity

Understanding the Limits of Deep Tabular Methods with Temporal Shift

AsymRnR: Video Diffusion Transformers Acceleration with Asymmetric Reduction and Restoration

Prior Knowledge Guided Neural Architecture Generation

PyTDC: A multimodal machine learning training, evaluation, and inference platform for biomedical foundation models

TimeStacker: A Novel Framework with Multilevel Observation for Capturing Nonstationary Patterns in Time Series Forecasting

Fast and Robust: Task Sampling with Posterior and Diversity Synergies for Adaptive Decision-Makers in Randomized Environments

Scalable Approximation Algorithms for $p$-Wasserstein Distance and Its Variants

Sorbet: A Neuromorphic Hardware-Compatible Transformer-Based Spiking Language Model

Conservative Offline Goal-Conditioned Implicit V-Learning

MF-LAL: Drug Compound Generation Using Multi-Fidelity Latent Space Active Learning

A Simple Model of Inference Scaling Laws

To Each Metric Its Decoding: Post-Hoc Optimal Decision Rules of Probabilistic Hierarchical Classifiers

CaDA: Cross-Problem Routing Solver with Constraint-Aware Dual-Attention

Transfer Learning for Nonparametric Contextual Dynamic Pricing

Approximately Correct Label Distribution Learning

On the Duality between Gradient Transformations and Adapters

Be a Goldfish: Forgetting Bad Conditioning in Sparse Linear Regression via Variational Autoencoders

PoisonedEye: Knowledge Poisoning Attack on Retrieval-Augmented Generation based Large Vision-Language Models

Large Language Models to Diffusion Finetuning

FDGen: A Fairness-Aware Graph Generation Model

Earley-Driven Dynamic Pruning for Efficient Structured Decoding

Reinforce LLM Reasoning through Multi-Agent Reflection

The Emperor's New Clothes in Benchmarking? A Rigorous Examination of Mitigation Strategies for LLM Benchmark Data Contamination

An Instrumental Value for Data Production and its Application to Data Pricing

$K^2$VAE: A Koopman-Kalman Enhanced Variational AutoEncoder for Probabilistic Time Series Forecasting

Algorithmic Recourse for Long-Term Improvement

The Hidden Life of Tokens: Reducing Hallucination of Large Vision-Language Models Via Visual Information Steering

On Understanding Attention-Based In-Context Learning for Categorical Data

GPEN: Global Position Encoding Network for Enhanced Subgraph Representation Learning

SeedLoRA: A Fusion Approach to Efficient LLM Fine-Tuning

Incremental Gradient Descent with Small Epoch Counts is Surprisingly Slow on Ill-Conditioned Problems

Random Policy Evaluation Uncovers Policies of Generative Flow Networks

Offline Learning for Combinatorial Multi-armed Bandits

Meta-Black-Box-Optimization through Offline Q-function Learning

Federated Incomplete Multi-view Clustering with Globally Fused Graph Guidance

The Courage to Stop: Overcoming Sunk Cost Fallacy in Deep Reinforcement Learning

Risk-Sensitive Theory of Mind: Coordinating with Agents of Unknown Bias using Cumulative Prospect Theory

PiD: Generalized AI-Generated Images Detection with Pixelwise Decomposition Residuals

SAND: One-Shot Feature Selection with Additive Noise Distortion

Reinforcement Learning Control of a Physical Robot Device for Assisted Human Walking without a Simulator

Scaling Laws for Floating-Point Quantization Training

Disentangled Graph Spectral Domain Adaptation

Controlling Neural Collapse Enhances Out-of-Distribution Detection and Transfer Learning

BILBO: BILevel Bayesian Optimization

Topology-aware Neural Flux Prediction Guided by Physics

Beyond Zero Initialization: Investigating the Impact of Non-Zero Initialization on LoRA Fine-Tuning Dynamics

Zero-shot Meta-learning for Tabular Prediction Tasks with Adversarially Pre-trained Transformer

EffiCoder: Enhancing Code Generation in Large Language Models through Efficiency-Aware Fine-tuning

Compute Optimal Inference and Provable Amortisation Gap in Sparse Autoencoders

NBDI: A Simple and Effective Termination Condition for Skill Extraction from Task-Agnostic Demonstrations

Progressively Label Enhancement for Large Language Model Alignment

Temporal Misalignment in ANN-SNN Conversion and its Mitigation via Probabilistic Spiking Neurons

All-atom inverse protein folding through discrete flow matching

X-Transfer Attacks: Towards Super Transferable Adversarial Attacks on CLIP

C2IQL: Constraint-Conditioned Implicit Q-learning for Safe Offline Reinforcement Learning

Balanced Learning for Domain Adaptive Semantic Segmentation

SERENA: A Unified Stochastic Recursive Variance Reduced Gradient Framework for Riemannian Non-Convex Optimization

Safe-EF: Error Feedback for Non-smooth Constrained Optimization

OrthoRank: Token Selection via Sink Token Orthogonality for Efficient LLM inference

Flow-based Domain Randomization for Learning and Sequencing Robotic Skills

Learning from Suboptimal Data in Continuous Control via Auto-Regressive Soft Q-Network

The Diffusion Duality

Pareto-frontier Entropy Search with Variational Lower Bound Maximization

Expert Race: A Flexible Routing Strategy for Scaling Diffusion Transformer with Mixture of Experts

GIVE: Structured Reasoning of Large Language Models with Knowledge Graph Inspired Veracity Extrapolation

Segment Anyword: Mask Prompt Inversion for Open-Set Grounded Segmentation

ProDiff: Prototype-Guided Diffusion for Minimal Information Trajectory Imputation

Phase and Amplitude-aware Prompting for Enhancing Adversarial Robustness

Robust Consensus Anchor Learning for Efficient Multi-view Subspace Clustering

Adjusting Model Size in Continual Gaussian Processes: How Big is Big Enough?

Test-Time Adaptation with Binary Feedback

TMetaNet: Topological Meta-Learning Framework for Dynamic Link Prediction

Harmonizing Geometry and Uncertainty: Diffusion with Hyperspheres

Private Model Personalization Revisited

How Transformers Learn Structured Data: Insights From Hierarchical Filtering

KABB: Knowledge-Aware Bayesian Bandits for Dynamic Expert Coordination in Multi-Agent Systems

Efficient Noise Calculation in Deep Learning-based MRI Reconstructions

Compress then Serve: Serving Thousands of LoRA Adapters with Little Overhead

Improving Continual Learning Performance and Efficiency with Auxiliary Classifiers

No Task Left Behind: Isotropic Model Merging with Common and Task-Specific Subspaces

Aligning Multimodal Representations through an Information Bottleneck

Domain-Adapted Diffusion Model for PROTAC Linker Design Through the Lens of Density Ratio in Chemical Space

Stochastic Forward–Backward Deconvolution: Training Diffusion Models with Finite Noisy Datasets

Progressive Tempering Sampler with Diffusion

SAN: Hypothesizing Long-Term Synaptic Development and Neural Engram Mechanism in Scalable Model's Parameter-Efficient Fine-Tuning

Unveiling Markov heads in Pretrained Language Models for Offline Reinforcement Learning

What Has a Foundation Model Found? Inductive Bias Reveals World Models

Decomposition of Graphic Design with Unified Multimodal Model

Self-supervised Adversarial Purification for Graph Neural Networks

Clustering via Self-Supervised Diffusion

Are Sparse Autoencoders Useful? A Case Study in Sparse Probing

SAM2Act: Integrating Visual Foundation Model with A Memory Architecture for Robotic Manipulation

Quantum Speedup for Hypergraph Sparsification

QuRe: Query-Relevant Retrieval through Hard Negative Sampling in Composed Image Retrieval

Synthetic Face Datasets Generation via Latent Space Exploration from Brownian Identity Diffusion

Ad-Hoc Human-AI Coordination Challenge

Contract Design Under Approximate Best Responses

Tensor Decomposition Based Memory-Efficient Incremental Learning

QuEST: Stable Training of LLMs with 1-Bit Weights and Activations

ShieldAgent: Shielding Agents via Verifiable Safety Policy Reasoning

Topological Signatures of Adversaries in Multimodal Alignments

Unifying 2D and 3D Vision-Language Understanding

LOCATE 3D: Real-World Object Localization via Self-Supervised Learning in 3D

From Thousands to Billions: 3D Visual Language Grounding via Render-Supervised Distillation from 2D VLMs

Reliable and Efficient Amortized Model-based Evaluation

The Lock-in Hypothesis: Stagnation by Algorithm

LMAct: A Benchmark for In-Context Imitation Learning with Long Multimodal Demonstrations

Guarantees of a Preconditioned Subgradient Algorithm for Overparameterized Asymmetric Low-rank Matrix Recovery

Enhancing Cooperative Multi-Agent Reinforcement Learning with State Modelling and Adversarial Exploration

Learning with Selectively Labeled Data from Multiple Decision-makers

Score Matching with Missing Data

Global Optimization with a Power-Transformed Objective and Gaussian Smoothing

Train for the Worst, Plan for the Best: Understanding Token Ordering in Masked Diffusions

The Value of Prediction in Identifying the Worst-Off

Mind Your Step (by Step): Chain-of-Thought can Reduce Performance on Tasks where Thinking Makes Humans Worse

Action-Dependent Optimality-Preserving Reward Shaping

Position: The AI Conference Peer Review Crisis Demands Author Feedback and Reviewer Rewards

GRU: Mitigating the Trade-off between Unlearning and Retention for LLMs

Learnware Specification via Dual Alignment

Premise-Augmented Reasoning Chains Improve Error Identification in Math reasoning with LLMs

Near Optimal Non-asymptotic Sample Complexity of 1-Identification

Sassha: Sharpness-aware Adaptive Second-order Optimization with Stable Hessian Approximation

LGDM: Latent Guidance in Diffusion Models for Perceptual Evaluations

Self-Disentanglement and Re-Composition for Cross-Domain Few-Shot Segmentation

Bellman Unbiasedness: Toward Provably Efficient Distributional Reinforcement Learning with General Value Function Approximation

Fragments to Facts: Partial-Information Fragment Inference from LLMs

Policy-labeled Preference Learning: Is Preference Enough for RLHF?

Scalable Meta-Learning via Mixed-Mode Differentiation

Uncertainty Estimation for Heterophilic Graphs Through the Lens of Information Theory

Learning-Order Autoregressive Models with Application to Molecular Graph Generation

Learning Time-Varying Multi-Region Brain Communications via Scalable Markovian Gaussian Processes

Neurosymbolic World Models for Sequential Decision Making

Adversarial Inception Backdoor Attacks against Reinforcement Learning

Model-Based Exploration in Monitored Markov Decision Processes

Compression via Pre-trained Transformers: A Study on Byte-Level Multimodal Data

Stacey: Promoting Stochastic Steepest Descent via Accelerated $\ell_p$-Smooth Nonconvex Optimization

Robust Multi-bit Text Watermark with LLM-based Paraphrasers

Invariance Makes LLM Unlearning Resilient Even to Unanticipated Downstream Fine-Tuning

Improving the Variance of Differentially Private Randomized Experiments through Clustering

Overcoming the Curse of Dimensionality in Reinforcement Learning Through Approximate Factorization

Integer Programming for Generalized Causal Bootstrap Designs

Retraining with Predicted Hard Labels Provably Increases Model Accuracy

Off-Policy Evaluation under Nonignorable Missing Data

Self-Consistency Preference Optimization

Scalable Equilibrium Sampling with Sequential Boltzmann Generators

Theoretical Limitations of Ensembles in the Age of Overparameterization

LAuReL: Learned Augmented Residual Layer

Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction

Structured Preconditioners in Adaptive Optimization: A Unified Analysis

Binary Hypothesis Testing for Softmax Models and Leverage Score Models

"Why Is There a Tumor?": Tell Me the Reason, Show Me the Evidence

Fundamental Limits of Visual Autoregressive Transformers: Universal Approximation Abilities

Antidote: Post-fine-tuning Safety Alignment for Large Language Models against Harmful Fine-tuning Attack

Reducing Variance of Stochastic Optimization for Approximating Nash Equilibria in Normal-Form Games

Structure-informed Risk Minimization for Robust Ensemble Learning

Learn to Vaccinate: Combining Structure Learning and Effective Vaccination for Epidemic and Outbreak Control

SDP-CROWN: Efficient Bound Propagation for Neural Network Verification with Tightness of Semidefinite Programming

Offline-to-Online Reinforcement Learning with Classifier-Free Diffusion Generation

Is Your Model Fairly Certain? Uncertainty-Aware Fairness Evaluation for LLMs

AutoEval Done Right: Using Synthetic Data for Model Evaluation

Separating Knowledge and Perception with Procedural Data

SADA: Stability-guided Adaptive Diffusion Acceleration

Two Tickets are Better than One: Fair and Accurate Hiring Under Strategic LLM Manipulations

Ca2-VDM: Efficient Autoregressive Video Diffusion Model with Causal Generation and Cache Sharing

Mind the Gap: a Spectral Analysis of Rank Collapse and Signal Propagation in Attention Layers

Latent Score-Based Reweighting for Robust Classification on Imbalanced Tabular Data

Learning from True-False Labels via Multi-modal Prompt Retrieving

On the Power of Learning-Augmented Search Trees

Sparse Video-Gen: Accelerating Video Diffusion Transformers with Spatial-Temporal Sparsity

QuantSpec: Self-Speculative Decoding with Hierarchical Quantized KV Cache

Info-Coevolution: An Efficient Framework for Data Model Coevolution

Directed Graph Grammars for Sequence-based Learning

Feynman-Kac Correctors in Diffusion: Annealing, Guidance, and Product of Experts

In-Context Deep Learning via Transformer Models

Learning dynamics in linear recurrent neural networks

Enhancing Spectral GNNs: From Topology and Perturbation Perspectives

Structure Is All You Need: Structural Representation Learning on Hyper-Relational Knowledge Graphs

VerbalTS: Generating Time Series from Texts

Are LLMs Prescient? A Continuous Evaluation using Daily News as the Oracle

Aligning Protein Conformation Ensemble Generation with Physical Feedback

Stability and Generalization Capability of Subgraph Reasoning Models for Inductive Knowledge Graph Completion

Imitation Learning from a Single Temporally Misaligned Video

Inverse problems with experiment-guided AlphaFold

Multi-Turn Code Generation Through Single-Step Rewards

CTBench: A Library and Benchmark for Certified Training

Average Certified Radius is a Poor Metric for Randomized Smoothing

Generalization Principles for Inference over Text-Attributed Graphs with Large Language Models

Preference Adaptive and Sequential Text-to-Image Generation

Discrepancy Minimization in Input-Sparsity Time

Discovering Latent Causal Graphs from Spatiotemporal Data

Multidimensional Adaptive Coefficient for Inference Trajectory Optimization in Flow and Diffusion

Activation Space Interventions Can Be Transferred Between Large Language Models

On Differential Privacy for Adaptively Solving Search Problems via Sketching

Introducing 3D Representation for Dense Volume-to-Volume Translation via Score Fusion

AnalogGenie-Lite: Enhancing Scalability and Precision in Circuit Topology Discovery through Lightweight Graph Modeling

Optimal Survey Design for Private Mean Estimation

Deterministic Sparse Fourier Transform for Continuous Signals with Frequency Gap

MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency

Adaptive Learn-then-Test: Statistically Valid and Efficient Hyperparameter Selection

OWLS: Scaling Laws for Multilingual Speech Recognition and Translation Models

any4: Learned 4-bit Numeric Representation for LLMs

Universal Length Generalization with Turing Programs

Distribution-aware Fairness Learning in Medical Image Segmentation From A Control-Theoretic Perspective

Cavia: Camera-controllable Multi-view Video Diffusion with View-Integrated Attention

LDMol: A Text-to-Molecule Diffusion Model with Structurally Informative Latent Space Surpasses AR Models

Learning Invariant Causal Mechanism from Vision-Language Models

Measuring Representational Shifts in Continual Learning: A Linear Transformation Perspective

Finding Wasserstein Ball Center: Efficient Algorithm and The Applications in Fairness

Self-Organizing Visual Prototypes for Non-Parametric Representation Learning

Prompt-based Depth Pruning of Large Language Models

Everything Everywhere All at Once: LLMs can In-Context Learn Multiple Tasks in Superposition

RAGGED: Towards Informed Design of Scalable and Stable RAG Systems

Fast Min-$\epsilon$ Segmented Regression using Constant-Time Segment Merging

Local Pan-privacy for Federated Analytics

Overcoming Vocabulary Mismatch: Vocabulary-agnostic Teacher Guided Language Modeling

Scalable First-order Method for Certifying Optimal k-Sparse GLMs

Revisiting Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model

No Metric to Rule Them All: Toward Principled Evaluations of Graph-Learning Datasets

Visual and Domain Knowledge for Professional-level Graph-of-Thought Medical Reasoning

A Bregman Proximal Viewpoint on Neural Operators

LotteryCodec: Searching the Implicit Representation in a Random Network for Low-Complexity Image Compression

Autoencoder-Based Hybrid Replay for Class-Incremental Learning

Internal Causal Mechanisms Robustly Predict Language Model Out-of-Distribution Behaviors

Geometric Representation Condition Improves Equivariant Molecule Generation

From Weight-Based to State-Based Fine-Tuning: Further Memory Reduction on LoRA with Parallel Control

Learning Soft Sparse Shapes for Efficient Time-Series Classification

HyperIMTS: Hypergraph Neural Network for Irregular Multivariate Time Series Forecasting

Rejecting Hallucinated State Targets during Planning

Policy Filtration for RLHF to Mitigate Noise in Reward Models

DPO Meets PPO: Reinforced Token Optimization for RLHF

Revisiting Chain-of-Thought in Code Generation: Do Language Models Need to Learn Reasoning before Coding?

Byzantine-Resilient Federated Alternating Gradient Descent and Minimization for Partly-Decoupled Low Rank Matrix Learning

A Multi-Region Brain Model to Elucidate the Role of Hippocampus in Spatially Embedded Decision-Making

Robust Conformal Outlier Detection under Contaminated Reference Data

Scaling Large Motion Models with Million-Level Human Motions

Optimal Algorithm for Max-Min Fair Bandit

Learning Minimum-Size BDDs: Towards Efficient Exact Algorithms

Sable: a Performant, Efficient and Scalable Sequence Model for MARL

Do Multiple Instance Learning Models Transfer?

Homophily Enhanced Graph Domain Adaptation

An Interpretable N-gram Perplexity Threat Model for Large Language Model Jailbreaks

"Who experiences large model decay and why?" A Hierarchical Framework for Diagnosing Heterogeneous Performance Drift

Great Models Think Alike and this Undermines AI Oversight

Importance Corrected Neural JKO Sampling

Learn Singularly Perturbed Solutions via Homotopy Dynamics

Value-Based Deep RL Scales Predictably

Predicting the Susceptibility of Examples to Catastrophic Forgetting

Scalable Gaussian Processes with Latent Kronecker Structure

CAD-Editor: A Locate-then-Infill Framework with Automated Training Data Synthesis for Text-Based CAD Editing

AKRMap: Adaptive Kernel Regression for Trustworthy Visualization of Cross-Modal Embeddings

Learning to Stop: Deep Learning for Mean Field Optimal Stopping

Finite-Sample Convergence Bounds for Trust Region Policy Optimization in Mean Field Games

On the Impact of Hard Adversarial Instances on Overfitting in Adversarial Training

Catoni Contextual Bandits are Robust to Heavy-tailed Rewards

Logarithmic Regret for Online KL-Regularized Reinforcement Learning

Understanding Overadaptation in Supervised Fine-Tuning: The Role of Ensemble Methods

MA-LoT: Model-Collaboration Lean-based Long Chain-of-Thought Reasoning enhances Formal Theorem Proving

EmbodiedBench: Comprehensive Benchmarking Multi-modal Large Language Models for Vision-Driven Embodied Agents

Probabilistic Factorial Experimental Design for Combinatorial Interventions

The Global Convergence Time of Stochastic Gradient Descent in Non-Convex Landscapes: Sharp Estimates via Large Deviations

Softmax is not Enough (for Sharp Size Generalisation)

Towards Black-Box Membership Inference Attack for Diffusion Models

Learning Imperfect Information Extensive-form Games with Last-iterate Convergence under Bandit Feedback

Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction

Implicit degree bias in the link prediction task

Conformal Prediction with Cellwise Outliers: A Detect-then-Impute Approach

Breaking Silos: Adaptive Model Fusion Unlocks Better Time Series Forecasting

Aequa: Fair Model Rewards in Collaborative Learning via Slimmable Networks

Not all solutions are created equal: An analytical dissociation of functional and representational similarity in deep linear neural networks

Improving Out-of-Distribution Detection with Markov Logic Networks

TRUST-VLM: Thorough Red-Teaming for Uncovering Safety Threats in Vision-Language Models

TOPLOC: A Locality Sensitive Hashing Scheme for Trustless Verifiable Inference

Understanding Complexity in VideoQA via Visual Program Generation

Why Is Spatial Reasoning Hard for VLMs? An Attention Mechanism Perspective on Focus Areas

SPHINX: Structural Prediction using Hypergraph Inference Network

Regress, Don't Guess: A Regression-like Loss on Number Tokens for Language Models

DiLQR: Differentiable Iterative Linear Quadratic Regulator via Implicit Differentiation

Scalable Non-Equivariant 3D Molecule Generation via Rotational Alignment

Federated Oriented Learning: A Practical One-Shot Personalized Federated Learning Framework

Procurement Auctions via Approximately Optimal Submodular Optimization

Slimming the Fat-Tail: Morphing-Flow for Adaptive Time Series Modeling

Context is Key: A Benchmark for Forecasting with Essential Textual Information

Geometric Median (GM) Matching for Robust k-Subset Selection from Noisy Data

Moirai-MoE: Empowering Time Series Foundation Models with Sparse Mixture of Experts

Generalizing Causal Effects from Randomized Controlled Trials to Target Populations across Diverse Environments

The Harder Path: Last Iterate Convergence for Uncoupled Learning in Zero-Sum Games with Bandit Feedback

Position: Stop treating 'AGI' as the north-star goal of AI research

COSDA: Counterfactual-based Susceptibility Risk Framework for Open-Set Domain Adaptation

Beyond the Permutation Symmetry of Transformers: The Role of Rotation for Model Fusion

Oracle-MoE: Locality-preserving Routing in the Oracle Space for Memory-constrained Large Language Model Inference

Objective drives the consistency of representational similarity across datasets

Linear $Q$-Learning Does Not Diverge in $L^2$: Convergence Rates to a Bounded Set

PokéChamp: an Expert-level Minimax Language Agent

Securing Equal Share: A Principled Approach for Learning Multiplayer Symmetric Games

Unlocking the Capabilities of Large Vision-Language Models for Generalizable and Explainable Deepfake Detection

How Do Large Language Monkeys Get Their Power (Laws)?

An Architecture Search Framework for Inference-Time Techniques

FlowDrag: 3D-aware Drag-based Image Editing with Mesh-guided Deformation Vector Flow Fields

FairICP: Encouraging Equalized Odds via Inverse Conditional Permutation

Position: Human Baselines in Model Evaluations Need Rigor and Transparency (With Recommendations & Reporting Checklist)

Modified K-means Algorithm with Local Optimality Guarantees

Archetypal SAE: Adaptive and Stable Dictionary Learning for Concept Extraction in Large Vision Models

Reward-Guided Prompt Evolving in Reinforcement Learning for LLMs

Focus On This, Not That! Steering LLMs with Adaptive Feature Specification

The Polynomial Stein Discrepancy for Assessing Moment Convergence

Active feature acquisition via explainability-driven ranking

REG: Rectified Gradient Guidance for Conditional Diffusion Models

Learning with Expected Signatures: Theory and Applications

Improving Compositional Generation with Diffusion Models Using Lift Scores

Hyper-Transforming Latent Diffusion Models

From Jack of All Trades to Master of One: Specializing LLM-based Autoraters to a Test Set

SpikF: Spiking Fourier Network for Efficient Long-term Prediction

Safety Certificate against Latent Variables with Partially Unidentifiable Dynamics

ZipAR: Parallel Autoregressive Image Generation through Spatial Locality

Uncertainty Quantification for LLM-Based Survey Simulations

An Entropy-Based Model for Hierarchical Learning

Compressed and distributed least-squares regression: convergence rates with applications to federated learning

MindCustomer: Multi-Context Image Generation Blended with Brain Signal

SDE Matching: Scalable and Simulation-Free Training of Latent Stochastic Differential Equations

Understanding Nonlinear Implicit Bias via Region Counts in Input Space

CPCF: A Cross-Prompt Contrastive Framework for Referring Multimodal Large Language Models

Improved Off-policy Reinforcement Learning in Biological Sequence Design

MimicMotion: High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance

Taming Knowledge Conflicts in Language Models

Kernel-based Unsupervised Embedding Alignment for Enhanced Visual Representation in Vision-language Models

Exploring Vision Semantic Prompt for Efficient Point Cloud Understanding

Density Ratio Estimation-based Bayesian Optimization with Semi-Supervised Learning

Beyond Self-Repellent Kernels: History-Driven Target Towards Efficient Nonlinear MCMC on General Graphs

SAFE: Finding Sparse and Flat Minima to Improve Pruning

From Mechanistic Interpretability to Mechanistic Biology: Training, Evaluating, and Interpreting Sparse Autoencoders on Protein Language Models

MindAligner: Explicit Brain Functional Alignment for Cross-Subject Visual Decoding from Limited fMRI Data

Hierarchical Reinforcement Learning with Uncertainty-Guided Diffusional Subgoals

Mitigating Local Cohesion and Global Sparseness in Graph Contrastive Learning with Fuzzy Boundaries

Towards a Unified Framework of Clustering-based Anomaly Detection

Test-Time Multimodal Backdoor Detection by Contrastive Prompting

Looking Beyond the Top-1: Transformers Determine Top Tokens in Order

Doubly Robust Conformalized Survival Analysis with Right-Censored Data

Does One-shot Give the Best Shot? Mitigating Model Inconsistency in One-shot Federated Learning

Score-Based Diffusion Policy Compatible with Reinforcement Learning via Optimal Transport

On Measuring Long-Range Interactions in Graph Neural Networks

Weight matrices compression based on PDB model in deep neural networks

IMTS is Worth Time $\times$ Channel Patches: Visual Masked Autoencoders for Irregular Multivariate Time Series Prediction

L-Diffusion: Laplace Diffusion for Efficient Pathology Image Segmentation

Geometric Generative Modeling with Noise-Conditioned Graph Networks

DyCodeEval: Dynamic Benchmarking of Reasoning Capabilities in Code Large Language Models Under Data Contamination

A Peer-review Look on Multi-modal Clustering: An Information Bottleneck Realization Method

Improving LLM Video Understanding with 16 Frames Per Second

Generalized additive models via direct optimization of regularized decision stump forests

Algorithm Development in Neural Networks: Insights from the Streaming Parity Task

Wolfpack Adversarial Attack for Robust Multi-Agent Reinforcement Learning

Improving Parallel Program Performance with LLM Optimizers via Agent-System Interfaces

HaploVL: A Single-Transformer Baseline for Multi-Modal Understanding

Off-Policy Actor-Critic for Adversarial Observation Robustness: Virtual Alternative Training via Symmetric Policy Evaluation

Towards Understanding Fine-Tuning Mechanisms of LLMs via Circuit Analysis

Origin Identification for Text-Guided Image-to-Image Diffusion Models

Preference Controllable Reinforcement Learning with Advanced Multi-Objective Optimization

Long-Form Speech Generation with Spoken Language Models

Sampling Binary Data by Denoising through Score Functions

Sparse Autoencoders for Hypothesis Generation

DVI: A Derivative-based Vision Network for INR

Random Registers for Cross-Domain Few-Shot Learning

Demystifying Singular Defects in Large Language Models

ToMA: Token Merge with Attention for Diffusion Models

Restoring Calibration for Aligned Large Language Models: A Calibration-Aware Fine-Tuning Approach

ELITE: Enhanced Language-Image Toxicity Evaluation for Safety

Cowpox: Towards the Immunity of VLM-based Multi-Agent Systems

Interpreting CLIP with Hierarchical Sparse Autoencoders

Learning Optimal Multimodal Information Bottleneck Representations

ROPO: Robust Preference Optimization for Large Language Models

Graph Inverse Style Transfer for Counterfactual Explainability

De-mark: Watermark Removal in Large Language Models

Differentially Private Analysis for Binary Response Models: Optimality, Estimation, and Inference

Parameter-Efficient Fine-Tuning of State Space Models

Controlling Underestimation Bias in Constrained Reinforcement Learning for Safe Exploration

Average Sensitivity of Hierarchical $k$-Median Clustering

SHIELD: Multi-task Multi-distribution Vehicle Routing Solver with Sparsity and Hierarchy

Efficient Length-Generalizable Attention via Causal Retrieval for Long-Context Language Modeling

Towards a General Time Series Forecasting Model with Unified Representation and Adaptive Transfer

Sounding that Object: Interactive Object-Aware Image to Audio Generation

Efficient Multi-modal Long Context Learning for Training-free Adaptation

Enhancing Foundation Models with Federated Domain Knowledge Infusion

SyncMind: Measuring Agent Out-of-Sync Recovery in Collaborative Software Engineering

Geometric and Physical Constraints Synergistically Enhance Neural PDE Surrogates

HybridGS: High-Efficiency Gaussian Splatting Data Compression using Dual-Channel Sparse Representation and Point Cloud Encoder

CostFilter-AD: Enhancing Anomaly Detection through Matching Cost Filtering

Cross-Modal Alignment via Variational Copula Modelling

Polynomial-Delay MAG Listing with Novel Locally Complete Orientation Rules

Graph-Assisted Stitching for Offline Hierarchical Reinforcement Learning

RAPID: Long-Context Inference with Retrieval-Augmented Speculative Decoding

A Manifold Perspective on the Statistical Generalization of Graph Neural Networks

Weak-to-Strong Jailbreaking on Large Language Models

Kinetic Langevin Diffusion for Crystalline Materials Generation

Privacy Attacks on Image AutoRegressive Models

Prompt-to-Leaderboard: Prompt-Adaptive LLM Evaluations

Tool Unlearning for Tool-Augmented LLMs

CSG-ODE: ControlSynth Graph ODE For Modeling Complex Evolution of Dynamic Graphs

Extreme Value Policy Optimization for Safe Reinforcement Learning

Mechanistic PDE Networks for Discovery of Governing Equations

Confounder-Free Continual Learning via Recursive Feature Normalization

One Arrow, Two Hawks: Sharpness-aware Minimization for Federated Learning via Global Model Trajectory

The Energy Loss Phenomenon in RLHF: A New Perspective on Mitigating Reward Hacking

Flexibility-conditioned protein structure design with flow matching

MPO: An Efficient Post-Processing Framework for Mixing Diverse Preference Alignment

Decision-aware Training of Spatiotemporal Forecasting Models to Select a Top-K Subset of Sites for Intervention

GRADEO: Towards Human-Like Evaluation for Text-to-Video Generation via Multi-Step Reasoning

Label Distribution Propagation-based Label Completion for Crowdsourcing

DiTAR: Diffusion Transformer Autoregressive Modeling for Speech Generation

Mechanistic Unlearning: Robust Knowledge Unlearning and Editing via Mechanistic Localization

Heavy-Tailed Linear Bandits: Huber Regression with One-Pass Update

Compressing tree ensembles through Level-wise Optimization and Pruning

Kernel Quantile Embeddings and Associated Probability Metrics

TeLoGraF: Temporal Logic Planning via Graph-encoded Flow Matching

A Generalizable Physics-Enhanced State Space Model for Long-Term Dynamics Forecasting in Complex Environments

OptMATH: A Scalable Bidirectional Data Synthesis Framework for Optimization Modeling

Cut out and Replay: A Simple yet Versatile Strategy for Multi-Label Online Continual Learning

Improving Memory Efficiency for Training KANs via Meta Learning

Towards the Causal Complete Cause of Multi-Modal Representation Learning

Field Matching: an Electrostatic Paradigm to Generate and Transfer Data

Ab Initio Nonparametric Variable Selection for Scalable Symbolic Regression with Large $p$

Lightweight Dataset Pruning without Full Training via Example Difficulty and Prediction Uncertainty

Can We Predict Performance of Large Models across Vision-Language Tasks?

A Unified Approach to Routing and Cascading for LLMs

Towards characterizing the value of edge embeddings in Graph Neural Networks

Score-based Pullback Riemannian Geometry: Extracting the Data Manifold Geometry using Anisotropic Flows

IBCircuit: Towards Holistic Circuit Discovery with Information Bottleneck

Memory Layers at Scale

DexScale: Automating Data Scaling for Sim2Real Generalizable Robot Control

Reward-Guided Speculative Decoding for Efficient LLM Reasoning

Quantifying Prediction Consistency Under Fine-tuning Multiplicity in Tabular LLMs

Learning Imbalanced Data with Beneficial Label Noise

3D-LMVIC: Learning-based Multi-View Image Compression with 3D Gaussian Geometric Priors

Multi-View Graph Clustering via Node-Guided Contrastive Encoding

MoMa: Modulating Mamba for Adapting Image Foundation Models to Video Recognition

Near-Optimal Consistency-Robustness Trade-Offs for Learning-Augmented Online Knapsack Problems

Robust ML Auditing using Prior Knowledge

Diffuse Everything: Multimodal Diffusion Models on Arbitrary State Spaces

Exogenous Isomorphism for Counterfactual Identifiability

FEAT-KD: Learning Concise Representations for Single and Multi-Target Regression via TabNet Knowledge Distillation

Optimizing Adaptive Attacks against Watermarks for Language Models

Distributed Nonparametric Estimation: from Sparse to Dense Samples per Terminal

Teaching Physical Awareness to LLMs through Sounds

Optimization over Sparse Support-Preserving Sets: Two-Step Projection with Global Optimality Guarantees

Knowledge Swapping via Learning and Unlearning

FRUGAL: Memory-Efficient Optimization by Reducing State Overhead for Scalable Training

Enhancing Foundation Models for Time Series Forecasting via Wavelet-based Tokenization

Learning from others' mistakes: Finetuning machine translation models with span-level error annotations

Quantum Speedups in Regret Analysis of Infinite Horizon Average-Reward Markov Decision Processes

Predictive Performance of Deep Quantum Data Re-uploading Models

Generative Data Mining with Longtail-Guided Diffusion

A Selective Learning Method for Temporal Graph Continual Learning

Forest-of-Thought: Scaling Test-Time Compute for Enhancing LLM Reasoning

CoSER: Coordinating LLM-Based Persona Simulation of Established Roles

Multi-Session Budget Optimization for Forward Auction-based Federated Learning

Wait-Less Offline Tuning and Re-solving for Online Decision Making

Accelerated Diffusion Models via Speculative Sampling

DCBM: Data-Efficient Visual Concept Bottleneck Models

Diffusion-based Adversarial Purification from the Perspective of the Frequency Domain

Demystifying the Paradox of Importance Sampling with an Estimated History-Dependent Behavior Policy in Off-Policy Evaluation

MetricEmbedding: Accelerate Metric Nearness by Tropical Inner Product

Strategic A/B testing via Maximum Probability-driven Two-armed Bandit

Devil is in the Details: Density Guidance for Detail-Aware Generation with Flow Models

Best Subset Selection: Optimal Pursuit for Feature Selection and Elimination

Optimal Sensor Scheduling and Selection for Continuous-Discrete Kalman Filtering with Auxiliary Dynamics

The Illusion of Role Separation: Hidden Shortcuts in LLM Role Learning (and How to Fix Them)

Playmate: Flexible Control of Portrait Animation via 3D-Implicit Space Guided Diffusion

LaRA: Benchmarking Retrieval-Augmented Generation and Long-Context LLMs – No Silver Bullet for LC or RAG Routing

Cache Me If You Must: Adaptive Key-Value Quantization for Large Language Models

SWAN: SGD with Normalization and Whitening Enables Stateless LLM Training

More Than Meets the Eye: Enhancing Multi-Object Tracking Even with Prolonged Occlusions

Generalization Performance of Ensemble Clustering: From Theory to Algorithm

LapSum - One Method to Differentiate Them All: Ranking, Sorting and Top-k Selection

Near-Optimal Sample Complexity for MDPs via Anchoring

Rethinking Aleatoric and Epistemic Uncertainty

Sample Complexity of Branch-length Estimation by Maximum Likelihood

Comparing Comparisons: Informative and Easy Human Feedback with Distinguishability Queries

Tensor Product Neural Networks for Functional ANOVA Model

A Comprehensive Framework for Analyzing the Convergence of Adam: Bridging the Gap with SGD

EARTH: Epidemiology-Aware Neural ODE with Continuous Disease Transmission Graph

Conformal Anomaly Detection in Event Sequences

Not All Tokens Matter All The Time: Dynamic Token Aggregation Towards Efficient Detection Transformers

Deep Sturm–Liouville: From Sample-Based to 1D Regularization with Learnable Orthogonal Basis Functions

DINO-WM: World Models on Pre-trained Visual Features enable Zero-shot Planning

Token Assorted: Mixing Latent and Text Tokens for Improved Language Model Reasoning

Robust Automatic Modulation Classification with Fuzzy Regularization

Fixing the Loose Brake: Exponential-Tailed Stopping Time in Best Arm Identification

AdvPrompter: Fast Adaptive Adversarial Prompting for LLMs

Efficient Robotic Policy Learning via Latent Space Backward Planning

Measuring Diversity in Synthetic Datasets

An Empirical Study on Configuring In-Context Learning Demonstrations for Unleashing MLLMs' Sentimental Perception Capability

Monte-Carlo Tree Search with Uncertainty Propagation via Optimal Transport

Diffusion Instruction Tuning

From Low Rank Gradient Subspace Stabilization to Low-Rank Weights: Observations, Theories, and Applications

A Closer Look at Backdoor Attacks on CLIP

Agent-as-a-Judge: Evaluate Agents with Agents

RobustZero: Enhancing MuZero Reinforcement Learning Robustness to State Perturbations

Metadata Conditioning Accelerates Language Model Pre-training

Auto-reconfiguration for Latency Minimization in CPU-based DNN Serving

Angle Domain Guidance: Latent Diffusion Requires Rotation Rather Than Extrapolation

Stochastic Control for Fine-tuning Diffusion Models: Optimality, Regularity, and Convergence

Monte Carlo Tree Search for Comprehensive Exploration in LLM-Based Automatic Heuristic Design

Provable Maximum Entropy Manifold Exploration via Diffusion Models

FIC-TSC: Learning Time Series Classification with Fisher Information Constraint

Perceptually Constrained Precipitation Nowcasting Model

One-Step Diffusion Policy: Fast Visuomotor Policies via Diffusion Distillation

Understanding Mode Connectivity via Parameter Space Symmetry

Hierarchical Refinement: Optimal Transport to Infinity and Beyond

Differentially Private Federated $k$-Means Clustering with Server-Side Data

Causality-Aware Contrastive Learning for Robust Multivariate Time-Series Anomaly Detection

QUTE: Quantifying Uncertainty in TinyML models with Early-exit-assisted ensembles for model-monitoring

Connecting Thompson Sampling and UCB: Towards More Efficient Trade-offs Between Privacy and Regret

Explaining the role of Intrinsic Dimensionality in Adversarial Training

Vulnerability-Aware Alignment: Mitigating Uneven Forgetting in Harmful Fine-Tuning

AlphaDPO: Adaptive Reward Margin for Direct Preference Optimization

Uncertainty-Based Extensible Codebook for Discrete Federated Learning in Heterogeneous Data Silos

SK-VQA: Synthetic Knowledge Generation at Scale for Training Context-Augmented Multimodal LLMs

Towards an Explainable Comparison and Alignment of Feature Embeddings

Discrete and Continuous Difference of Submodular Minimization

Statistical Query Hardness of Multiclass Linear Classification with Random Classification Noise

Automatically Interpreting Millions of Features in Large Language Models

Universal Approximation Theorem of Deep Q-Networks

Wrapped Gaussian on the manifold of Symmetric Positive Definite Matrices

Flow-of-Options: Diversified and Improved LLM Reasoning by Thinking Through Options

Integrating Intermediate Layer Optimization and Projected Gradient Descent for Solving Inverse Problems with Diffusion Models

Mitigating Plasticity Loss in Continual Reinforcement Learning by Reducing Churn

HashAttention: Semantic Sparsity for Faster Inference

On-Device Collaborative Language Modeling via a Mixture of Generalists and Specialists

Modeling Multi-Task Model Merging as Adaptive Projective Gradient Descent

TAROT: Targeted Data Selection via Optimal Transport

Benchmarking Quantum Reinforcement Learning

Energy-Based Preference Model Offers Better Offline Alignment than the Bradley-Terry Preference Model

Demeaned Sparse: Efficient Anomaly Detection by Residual Estimate

Layer-wise Alignment: Examining Safety Alignment Across Image Encoder Layers in Vision Language Models

Eigen Analysis of Conjugate Kernel and Neural Tangent Kernel

Is Complex Query Answering Really Complex?

CoDy: Counterfactual Explainers for Dynamic Graphs

Physics-Informed Generative Modeling of Wireless Channels

On Path to Multimodal Generalist: General-Level and General-Bench

Learn from Downstream and Be Yourself in Multimodal Large Language Models Fine-Tuning

BOPO: Neural Combinatorial Optimization via Best-anchored and Objective-guided Preference Optimization

Grokking Beyond the Euclidean Norm of Model Parameters

CombiMOTS: Combinatorial Multi-Objective Tree Search for Dual-Target Molecule Generation

Improved Online Confidence Bounds for Multinomial Logistic Bandits

From Pixels to Perception: Interpretable Predictions via Instance-wise Grouped Feature Selection

Representation Shattering in Transformers: A Synthetic Study with Knowledge Editing

Modalities Contribute Unequally: Enhancing Medical Multi-modal Learning through Adaptive Modality Token Re-balancing

Variance as a Catalyst: Efficient and Transferable Semantic Erasure Adversarial Attack for Customized Diffusion Models

I$^2$MoE: Interpretable Multimodal Interaction-aware Mixture-of-Experts

Occult: Optimizing Collaborative Communications across Experts for Accelerated Parallel MoE Training and Inference

LowRA: Accurate and Efficient LoRA Fine-Tuning of LLMs under 2 Bits

Near-optimal Regret Using Policy Optimization in Online MDPs with Aggregate Bandit Feedback

Stabilizing Sample Similarity in Representation via Mitigating Random Consistency

On Explaining Equivariant Graph Networks via Improved Relevance Propagation

Smoothed Preference Optimization via ReNoise Inversion for Aligning Diffusion Models with Varied Human Preferences

Normalizing Flows are Capable Generative Models

LRA-QViT: Integrating Low-Rank Approximation and Quantization for Robust and Efficient Vision Transformers

Towards Rationale-Answer Alignment of LVLMs via Self-Rationale Calibration

Tightening Causal Bounds via Covariate-Aware Optimal Transport

Learning Classifiers That Induce Markets

A Parametric Contextual Online Learning Theory of Brokerage

Diss-l-ECT: Dissecting Graph Data with Local Euler Characteristic Transforms

Learning Smooth and Expressive Interatomic Potentials for Physical Property Prediction

Adversaries Can Misuse Combinations of Safe Models

Generalized Random Forests Using Fixed-Point Trees

Orthogonal Subspace Decomposition for Generalizable AI-Generated Image Detection

Benign Samples Matter! Fine-tuning On Outlier Benign Samples Severely Breaks Safety

Does Generation Require Memorization? Creative Diffusion Models using Ambient Diffusion

Ergodic Generative Flows

KGMark: A Diffusion Watermark for Knowledge Graphs

Improving Consistency Models with Generator-Augmented Flows

WATCH: Adaptive Monitoring for AI Deployments via Weighted-Conformal Martingales

Risk and cross validation in ridge regression with correlated samples

Lego Sketch: A Scalable Memory-augmented Neural Network for Sketching Data Streams

Active Reward Modeling: Adaptive Preference Labeling for Large Language Model Alignment

Upcycling Text-to-Image Diffusion Models for Multi-Task Capabilities

TimeBase: The Power of Minimalism in Efficient Long-term Time Series Forecasting

Deep Streaming View Clustering

AuPair: Golden Example Pairs for Code Repair

Towards Robustness and Explainability of Automatic Algorithm Selection

Feasible Action Search for Bandit Linear Programs via Thompson Sampling

KoNODE: Koopman-Driven Neural Ordinary Differential Equations with Evolving Parameters for Time Series Analysis

GLGENN: A Novel Parameter-Light Equivariant Neural Networks Architecture Based on Clifford Geometric Algebras

MetaOptimize: A Framework for Optimizing Step Sizes and Other Meta-parameters

Conformal Tail Risk Control for Large Language Model Alignment

Leveraging Model Guidance to Extract Training Data from Personalized Diffusion Models

Scaling Laws for Pre-training Agents and World Models

A Variational Framework for Improving Naturalness in Generative Spoken Language Models

Laplace Transform Based Low-Complexity Learning of Continuous Markov Semigroups

Positive-unlabeled AUC Maximization under Covariate Shift

Efficient Bisection Projection to Ensure Neural-Network Solution Feasibility for Optimization over General Set

Learning Robust Neural Processes with Risk-Averse Stochastic Optimization

LAION-C: An Out-of-Distribution Benchmark for Web-Scale Vision Models

Overcoming Non-monotonicity in Transducer-based Streaming Generation

Measuring Variable Importance in Heterogeneous Treatment Effects with Confidence

Position: Language model developers should report train-test overlap

Position: In-House Evaluation Is Not Enough. Towards Robust Third-Party Evaluation and Flaw Disclosure for General-Purpose AI

LEAPS: A discrete neural sampler via locally equivariant networks

Towards a Mechanistic Explanation of Diffusion Model Generalization

CoMemo: LVLMs Need Image Context with Image Memory

FairPFN: A Tabular Foundation Model for Causal Fairness

DiMa: Understanding the Hardness of Online Matching Problems via Diffusion Models

AffinityFlow: Guided Flows for Antibody Affinity Maturation

Foundation Molecular Grammar: Multi-Modal Foundation Models Induce Interpretable Molecular Graph Languages

Learngene Tells You How to Customize: Task-Aware Parameter Initialization at Flexible Scales

Attention-Only Transformers via Unrolled Subspace Denoising

Catch Your Emotion: Sharpening Emotion Perception in Multimodal Large Language Models

Improving Transformer World Models for Data-Efficient RL

Feature Importance Metrics in the Presence of Missing Data

Understanding Chain-of-Thought in LLMs through Information Theory

Causal Abstraction Learning based on the Semantic Embedding Principle

A Large Recurrent Action Model: xLSTM enables Fast Inference for Robotics Tasks

Are Large Brainwave Foundation Models Capable Yet? Insights from Fine-Tuning

Enhancing Decision-Making of Large Language Models via Actor-Critic

Graph Neural Network Generalization With Gaussian Mixture Model Based Augmentation

Enhancing Target-unspecific Tasks through a Features Matrix

Symmetry-Aware GFlowNets

FireFlow: Fast Inversion of Rectified Flow for Image Semantic Editing

Deep Ridgelet Transform and Unified Universality Theorem for Deep and Shallow Joint-Group-Equivariant Machines

Direct Prediction Set Minimization via Bilevel Conformal Classifier Training

Test-Time Canonicalization by Foundation Models for Robust Perception

CLOVER: Cross-Layer Orthogonal Vectors Pruning

Visual Autoregressive Modeling for Image Super-Resolution

How Distributed Collaboration Influences the Diffusion Model Training? A Theoretical Perspective

Revisiting Cooperative Off-Policy Multi-Agent Reinforcement Learning

Trustworthy Machine Learning through Data-Specific Indistinguishability

Fast Estimation of Partial Dependence Functions using Trees

PDUDT: Provable Decentralized Unlearning under Dynamic Topologies

A Parameter-Free and Near-Optimal Zeroth-Order Algorithm for Stochastic Convex Optimization

DUNIA: Pixel-Sized Embeddings via Cross-Modal Alignment for Earth Observation Applications

Tackling Dimensional Collapse toward Comprehensive Universal Domain Adaptation

Disparate Conditional Prediction in Multiclass Classifiers

Adaptive Estimation and Learning under Temporal Distribution Shift

Perception in Reflection

Reconstructing Cell Lineage Trees from Phenotypic Features with Metric Learning

Optimization Proxies using Limited Labeled Data and Training Time – A Semi-Supervised Bayesian Neural Network Approach

SHE: Streaming-media Hashing Retrieval

Polybasic Speculative Decoding Through a Theoretical Perspective

Neural Interpretable PDEs: Harmonizing Fourier Insights with Attention for Scalable and Interpretable Physics Discovery

QuEst: Enhancing Estimates of Quantile-Based Distributional Measures Using Model Predictions

ABKD: Pursuing a Proper Allocation of the Probability Mass in Knowledge Distillation via $\alpha$-$\beta$-Divergence

Preference Optimization for Combinatorial Optimization Problems

LLM Alignment as Retriever Optimization: An Information Retrieval Perspective

Test-Time Adaptation for Online Vision-Language Navigation with Feedback-based Reinforcement Learning

Improving Model Alignment Through Collective Intelligence of Open-Source Models

Graph-Supported Dynamic Algorithm Configuration for Multi-Objective Combinatorial Optimization

GradPS: Resolving Futile Neurons in Parameter Sharing Network for Multi-Agent Reinforcement Learning

COExpander: Adaptive Solution Expansion for Combinatorial Optimization

Towards Understanding Parametric Generalized Category Discovery on Graphs

Do We Really Need Message Passing in Brain Network Modeling?

SMART-PC: Skeletal Model Adaptation for Robust Test-Time Training in Point Clouds

An Expressive and Self-Adaptive Dynamical System for Efficient Function Learning

UniMoMo: Unified Generative Modeling of 3D Molecules for De Novo Binder Design

Flow Q-Learning

Hybrid Quantum-Classical Multi-Agent Pathfinding

FuseUNet: A Multi-Scale Feature Fusion Method for U-like Networks

Towards Memorization Estimation: Fast, Formal and Free

LipsNet++: Unifying Filter and Controller into a Policy Network

Counting in Small Transformers: The Delicate Interplay between Attention and Feed-Forward Layers

How Far Is Video Generation from World Model: A Physical Law Perspective

Improving LLM Safety Alignment with Dual-Objective Optimization

SPMC: Self-Purifying Federated Backdoor Defense via Margin Contribution

Stochastic Deep Restoration Priors for Imaging Inverse Problems

Optimal Transfer Learning for Missing Not-at-Random Matrix Completion

Self-Supervised Learning of Intertwined Content and Positional Features for Object Detection

Provably Improving Generalization of Few-shot models with Synthetic Data

On the Importance of Embedding Norms in Self-Supervised Learning

FreeMesh: Boosting Mesh Generation with Coordinates Merging

From Passive to Active Reasoning: Can Large Language Models Ask the Right Questions under Incomplete Information?

Expected Variational Inequalities

Armijo Line-search Can Make (Stochastic) Gradient Descent Provably Faster

InfAlign: Inference-aware language model alignment

Power Mean Estimation in Stochastic Continuous Monte-Carlo Tree Search

HALoS: Hierarchical Asynchronous Local SGD over Slow Networks for Geo-Distributed Large Language Model Training

Offline Opponent Modeling with Truncated Q-driven Instant Policy Refinement

On the Power of Context-Enhanced Learning in LLMs

Context-Informed Neural ODEs Unexpectedly Identify Broken Symmetries: Insights from the Poincaré–Hopf Theorem

Enhancing Treatment Effect Estimation via Active Learning: A Counterfactual Covering Perspective

Oscillation-Reduced MXFP4 Training for Vision Transformers

xLSTM 7B: A Recurrent LLM for Fast and Efficient Inference

KAN-AD: Time Series Anomaly Detection with Kolmogorov–Arnold Networks

The Underlying Universal Statistical Structure of Natural Datasets

A Unified Framework for Entropy Search and Expected Improvement in Bayesian Optimization

Generalizing from SIMPLE to HARD Visual Reasoning: Can We Mitigate Modality Imbalance in VLMs?

Deep Bayesian Filter for Bayes-Faithful Data Assimilation

Re-ranking Reasoning Context with Tree Search Makes Large Vision-Language Models Stronger

Learning to Generate Projections for Reducing Dimensionality of Heterogeneous Linear Programming Problems

Robust Offline Reinforcement Learning with Linearly Structured $f$-Divergence Regularization

Fundamental Bias in Inverting Random Sampling Matrices with Application to Sub-sampled Newton

Does Data Scaling Lead to Visual Compositional Generalization?

Navigating Semantic Drift in Task-Agnostic Class-Incremental Learning

Federated Generalised Variational Inference: A Robust Probabilistic Federated Learning Framework

When and How Does CLIP Enable Domain and Compositional Generalization?

Core Context Aware Transformers for Long Context Language Modeling

Soup-of-Experts: Pretraining Specialist Models via Parameters Averaging

IT$^3$: Idempotent Test-Time Training

Mirror, Mirror of the Flow: How Does Regularization Shape Implicit Bias?

Impossible Videos

Learning to Incentivize in Repeated Principal-Agent Problems with Adversarial Agent Arrivals

MAPLE: Many-Shot Adaptive Pseudo-Labeling for In-Context Learning

BackSlash: Rate Constrained Optimized Training of Large Language Models

How Much Can Transfer? BRIDGE: Bounded Multi-Domain Graph Foundation Model with Generalization Guarantees

Wasserstein Flow Matching: Generative Modeling Over Families of Distributions

Do NOT Think That Much for 2+3=? On the Overthinking of Long Reasoning Models

Supercharging Graph Transformers with Advective Diffusion

Improved Theoretically-Grounded Evolutionary Algorithms for Subset Selection with a Linear Cost Constraint

Towards Better-than-2 Approximation for Constrained Correlation Clustering

Scaling Sparse Feature Circuits For Studying In-Context Learning

Overestimation in LLM Evaluation: A Controlled Large-Scale Study on Data Contamination’s Impact on Machine Translation

Erwin: A Tree-based Hierarchical Transformer for Large-scale Physical Systems

Learnings from Scaling Visual Tokenizers for Reconstruction and Generation

Confidential Guardian: Cryptographically Prohibiting the Abuse of Model Abstention

VinePPO: Refining Credit Assignment in RL Training of LLMs

Learning Strategic Language Agents in the Werewolf Game with Iterative Latent Space Policy Optimization

Distributional Diffusion Models with Scoring Rules

ABNet: Adaptive explicit-Barrier Net for Safe and Scalable Robot Learning

Identifying biological perturbation targets through causal differential networks

DA-KD: Difficulty-Aware Knowledge Distillation for Efficient Large Language Models

ADDQ: Adaptive distributional double Q-learning

DS-VLM: Diffusion Supervision Vision Language Model

MissScore: High-Order Score Estimation in the Presence of Missing Data

From Feature Interaction to Feature Generation: A Generative Paradigm of CTR Prediction Models

MGD$^3$: Mode-Guided Dataset Distillation using Diffusion Models

Adaptive kernel predictors from feature-learning infinite limits of neural networks

Learning State-Based Node Representations from a Class Hierarchy for Fine-Grained Open-Set Detection

COMRECGC: Global Graph Counterfactual Explainer through Common Recourse

MERIT: Maximum-normalized Element-wise Ratio for Language Model Large-batch Training

Online Laplacian-Based Representation Learning in Reinforcement Learning

RestoreGrad: Signal Restoration Using Conditional Denoising Diffusion Models with Jointly Learned Prior

Triple-Optimistic Learning for Stochastic Contextual Bandits with General Constraints

Provable and Practical Online Learning Rate Adaptation with Hypergradient Descent

Pixel-level Certified Explanations via Randomized Smoothing

Refining Adaptive Zeroth-Order Optimization at Ease

Improving Value Estimation Critically Enhances Vanilla Policy Gradient

DMM: Distributed Matrix Mechanism for Differentially-Private Federated Learning Based on Constant-Overhead Linear Secret Resharing

An Efficient Matrix Multiplication Algorithm for Accelerating Inference in Binary and Ternary Neural Networks

Differentially Private Space-Efficient Algorithms for Counting Distinct Elements in the Turnstile Model

When Do LLMs Help With Node Classification? A Comprehensive Analysis

The dark side of the forces: assessing non-conservative force models for atomistic machine learning

Steer LLM Latents for Hallucination Detection

WyckoffDiff -- A Generative Diffusion Model for Crystal Symmetry

Direct Discriminative Optimization: Your Likelihood-Based Visual Generative Model is Secretly a GAN Discriminator

Fundamental limits of learning in sequence multi-index models and deep attention networks: high-dimensional asymptotics and sharp thresholds

Nested Expectations with Kernel Quadrature

Demystifying Long Chain-of-Thought Reasoning

Unified Breakdown Analysis for Byzantine Robust Gossip

Aligned Multi Objective Optimization

Collaborative Mean Estimation Among Heterogeneous Strategic Agents: Individual Rationality, Fairness, and Truthful Contribution

OW-VAP: Visual Attribute Parsing for Open World Object Detection

Smooth Interpolation for Improved Discrete Graph Generative Models

One Image is Worth a Thousand Words: A Usability Preservable Text-Image Collaborative Erasing Framework

Algorithms with Calibrated Machine Learning Predictions

Reinforced Learning Explicit Circuit Representations for Quantum State Characterization from Local Measurements

Hypothesis Testing for Generalized Thurstone Models

Fast Inference with Kronecker-Sparse Matrices

Multi-Objective Causal Bayesian Optimization

An Efficient Private GPT Never Autoregressively Decodes

A Theory for Conditional Generative Modeling on Multiple Data Sources

Heterogeneous Treatment Effect in Time-to-Event Outcomes: Harnessing Censored Data with Recursively Imputed Trees

Scalable Generation of Spatial Transcriptomics from Histology Images via Whole-Slide Flow Matching

DistiLLM-2: A Contrastive Approach Boosts the Distillation of LLMs

Staged and Physics-Grounded Learning Framework with Hyperintensity Prior for Pre-Contrast MRI Synthesis

Differential Coding for Training-Free ANN-to-SNN Conversion

Correlation Clustering Beyond the Pivot Algorithm

Subgroups Matter for Robust Bias Mitigation

Synthesizing Software Engineering Data in a Test-Driven Manner

From Complex to Atomic: Enhancing Augmented Generation via Knowledge-Aware Dual Rewriting and Reasoning

Demystifying Catastrophic Forgetting in Two-Stage Incremental Object Detector

SSHR: More Secure Generative Steganography with High-Quality Revealed Secret Images

Inducing, Detecting and Characterising Neural Modules: A Pipeline for Functional Interpretability in Reinforcement Learning

SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models

SpeCache: Speculative Key-Value Caching for Efficient Generation of LLMs

MiraGe: Editable 2D Images using Gaussian Splatting

SToFM: a Multi-scale Foundation Model for Spatial Transcriptomics

FlexControl: Computation-Aware Conditional Control with Differentiable Router for Text-to-Image Generation

Adaptive Median Smoothing: Adversarial Defense for Unlearned Text-to-Image Diffusion Models at Inference Time

Explanatory Instructions: Towards Unified Vision Tasks Understanding and Zero-shot Generalization

Tuning Sequential Monte Carlo Samplers via Greedy Incremental Divergence Minimization

Towards Trustworthy Federated Learning with Untrusted Participants

PROTOCOL: Partial Optimal Transport-enhanced Contrastive Learning for Imbalanced Multi-view Clustering

New Bounds for Sparse Variational Gaussian Processes

Channel Normalization for Time Series Channel Identification

Conformity Score Averaging for Classification

Zero-Shot Cyclic Peptide Design via Composable Geometric Constraints

Emoji Attack: Enhancing Jailbreak Attacks Against Judge LLM Detection

GraphCL: Graph-based Clustering for Semi-Supervised Medical Image Segmentation

Optimizing Social Network Interventions via Hypergradient-Based Recommender System Design

Revisiting the Predictability of Performative, Social Events

NExtLong: Toward Effective Long-Context Training without Long Documents

Edge-Colored Clustering in Hypergraphs: Beyond Minimizing Unsatisfied Edges

Provable Zero-Shot Generalization in Offline Reinforcement Learning

The Double-Ellipsoid Geometry of CLIP

Handling Imbalanced Pseudolabels for Vision-Language Models with Concept Alignment and Confusion-Aware Calibrated Margin

Nonconvex Theory of $M$-estimators with Decomposable Regularizers

Online Clustering of Dueling Bandits

RE-IMAGINE: Symbolic Benchmark Synthesis for Reasoning Evaluation

Evolving Minds: Logic-Informed Inference from Temporal Action Patterns

Federated In-Context Learning: Iterative Refinement for Improved Answer Quality

Star Attention: Efficient LLM Inference over Long Sequences

Boosting Protein Graph Representations through Static-Dynamic Fusion

GPTAQ: Efficient Finetuning-Free Quantization for Asymmetric Calibration

Hidden No More: Attacking and Defending Private Third-Party LLM Inference

Causal-PIK: Causality-based Physical Reasoning with a Physics-Informed Kernel

Federated Causal Structure Learning with Non-identical Variable Sets

Rapid Overfitting of Multi-Pass SGD in Stochastic Convex Optimization

NETS: A Non-equilibrium Transport Sampler

KBQA-o1: Agentic Knowledge Base Question Answering with Monte Carlo Tree Search

FedOne: Query-Efficient Federated Learning for Black-box Discrete Prompt Learning

Action-Minimization Meets Generative Modeling: Efficient Transition Path Sampling with the Onsager-Machlup Functional

Beyond Entropy: Region Confidence Proxy for Wild Test-Time Adaptation

Scalable Attribute-Missing Graph Clustering via Neighborhood Differentiation

GEFA: A General Feature Attribution Framework Using Proxy Gradient Estimation

Generalizable Multi-Camera 3D Object Detection from a Single Source via Fourier Cross-View Learning

Cape: Context-Aware Prompt Perturbation Mechanism with Differential Privacy

Zero-Shot Generalization of GNNs over Distinct Attribute Domains

How to set AdamW's weight decay as you scale model and dataset size

Toward Robust Hyper-Detailed Image Captioning: A Multiagent Approach and Dual Evaluation Metrics for Factuality and Coverage

PISA Experiments: Exploring Physics Post-Training for Video Diffusion Models by Watching Stuff Drop

AAAR-1.0: Assessing AI’s Potential to Assist Research

Loss Functions and Operators Generated by f-Divergences

H-Tuning: Toward Low-Cost and Efficient ECG-based Cardiovascular Disease Detection with Pre-Trained Models

Joint Learning of Energy-based Models and their Partition Function

PINNsAgent: Automated PDE Surrogation with Large Language Models

A Market for Accuracy: Classification Under Competition

From Debate to Equilibrium: Belief-Driven Multi-Agent LLM Reasoning via Bayesian Nash Equilibrium

Residual Matrix Transformers: Scaling the Size of the Residual Stream

BEST-Route: Adaptive LLM Routing with Test-Time Optimal Compute

Puzzle: Distillation-Based NAS for Inference-Optimized LLMs

Which Agent Causes Task Failures and When? On Automated Failure Attribution of LLM Multi-Agent Systems

Learning Compact Semantic Information for Incomplete Multi-View Missing Multi-Label Classification

ConceptAttention: Diffusion Transformers Learn Highly Interpretable Features

Haste Makes Waste: A Simple Approach for Scaling Graph Neural Networks

Efficient Network Automatic Relevance Determination

An Efficient Search-and-Score Algorithm for Ancestral Graphs using Multivariate Information Scores for Complex Non-linear and Categorical Data

SCISSOR: Mitigating Semantic Bias through Cluster-Aware Siamese Networks for Robust Classification

Almost Optimal Fully Dynamic $k$-Center Clustering with Recourse

Customizing the Inductive Biases of Softmax Attention using Structured Matrices

DSBRouter: End-to-end Global Routing via Diffusion Schrödinger Bridge

Chaos Meets Attention: Transformers for Large-Scale Dynamical Prediction

RocketKV: Accelerating Long-Context LLM Inference via Two-Stage KV Cache Compression

Selective Preference Aggregation

Enhancing Certified Robustness via Block Reflector Orthogonal Layers and Logit Annealing Loss

The Role of Sparsity for Length Generalization in LLMs

Directly Forecasting Belief for Reinforcement Learning with Delays

Provable Efficiency of Guidance in Diffusion Models for General Data Distribution

A Theoretical Framework For Overfitting In Energy-based Modeling

Deep Linear Network Training Dynamics from Random Initialization: Data, Width, Depth, and Hyperparameter Transfer

Fast Large Language Model Collaborative Decoding via Speculation

KIND: Knowledge Integration and Diversion for Training Decomposable Models

On the Vulnerability of Applying Retrieval-Augmented Generation within Knowledge-Intensive Application Domains

Learning Distances from Data with Normalizing Flows and Score Matching

Rectifying Conformity Scores for Better Conditional Coverage

DocKS-RAG: Optimizing Document-Level Relation Extraction through LLM-Enhanced Hybrid Prompt Tuning

Synthesizing Images on Perceptual Boundaries of ANNs for Uncovering and Manipulating Human Perceptual Variability

The Perils of Optimizing Learned Reward Functions: Low Training Error Does Not Guarantee Low Regret

Simultaneous Multi-Robot Motion Planning with Projected Diffusion Models

Regret-Free Reinforcement Learning for Temporal Logic Specifications

Towards Theoretical Understanding of Sequential Decision Making with Preference Feedback

What makes an Ensemble (Un) Interpretable?

When Bad Data Leads to Good Models

Counterfactual Voting Adjustment for Quality Assessment and Fairer Voting in Online Platforms with Helpfulness Evaluation

ResKoopNet: Learning Koopman Representations for Complex Dynamics with Spectral Residuals

LLM-SRBench: A New Benchmark for Scientific Equation Discovery with Large Language Models

A Rescaling-Invariant Lipschitz Bound Based on Path-Metrics for Modern ReLU Network Parameterizations

Optimal Fair Learning Robust to Adversarial Distribution Shift

Retrieval-Augmented Language Model for Knowledge-aware Protein Encoding

ROS: A GNN-based Relax-Optimize-and-Sample Framework for Max-$k$-Cut Problems

Learning Latent Graph Structures and their Uncertainty

Locality Preserving Markovian Transition for Instance Retrieval

Jacobian Sparse Autoencoders: Sparsify Computations, Not Just Activations

Unlocking the Power of SAM 2 for Few-Shot Segmentation

Mind the Gap: A Practical Attack on GGUF Quantization

Exploiting Similarity for Computation and Communication-Efficient Decentralized Optimization

Non-stationary Online Learning for Curved Losses: Improved Dynamic Regret via Mixability

Proxy-FDA: Proxy-based Feature Distribution Alignment for Fine-tuning Vision Foundation Models without Forgetting

Adapting Precomputed Features for Efficient Graph Condensation

TSP: A Two-Sided Smoothed Primal-Dual Method for Nonconvex Bilevel Optimization

HyperNear: Unnoticeable Node Injection Attacks on Hypergraph Neural Networks

LAST SToP for Modeling Asynchronous Time Series

One-Step Generalization Ratio Guided Optimization for Domain Generalization

Fast Exact Unlearning for In-Context Learning Data for LLMs

Understanding the difficulties of posterior predictive estimation

Learning to Route LLMs with Confidence Tokens

Trusted Multi-View Classification with Expert Knowledge Constraints

Fast Video Generation with Sliding Tile Attention

Sleeping Reinforcement Learning

UnHiPPO: Uncertainty-aware Initialization for State Space Models

Preference learning made easy: Everything should be understood through win rate

Super Deep Contrastive Information Bottleneck for Multi-modal Clustering

Probably Approximately Global Robustness Certification

Achieving Linear Speedup and Near-Optimal Complexity for Decentralized Optimization over Row-stochastic Networks

Can DBNNs Robust to Environmental Noise for Resource-constrained Scenarios?

Natural Perturbations for Black-box Training of Neural Networks by Zeroth-Order Optimization

Gradient Boosting Reinforcement Learning

Teaching Language Models to Critique via Reinforcement Learning

LLM Enhancers for GNNs: An Analysis from the Perspective of Causal Mechanism Identification

Self-Consuming Generative Models with Adversarially Curated Data

EQ-VAE: Equivariance Regularized Latent Space for Improved Generative Image Modeling

Rethinking Benign Overfitting in Two-Layer Neural Networks

Distributed Retraction-Free and Communication-Efficient Optimization on the Stiefel Manifold

Training a Generally Curious Agent

Efficient Core-set Selection for Deep Learning Through Squared Loss Minimization

Beyond Bradley-Terry Models: A General Preference Model for Language Model Alignment

Improving Out-of-Distribution Detection via Dynamic Covariance Calibration

Decision Theoretic Foundations for Conformal Prediction: Optimal Uncertainty Quantification for Risk-Averse Agents

OmniArch: Building Foundation Model for Scientific Computing

Towards Robust Influence Functions with Flat Validation Minima

Universal Sparse Autoencoders: Interpretable Cross-Model Concept Alignment

ESPFormer: Doubly-Stochastic Attention with Expected Sliced Transport Plans

Point Cloud Dataset Distillation

TabSDS: a Lightweight, Fully Non-Parametric, and Model Free Approach for Generating Synthetic Tabular Data

ConText: Driving In-context Learning for Text Removal and Segmentation

BoA: Attention-aware Post-training Quantization without Backpropagation

Reaction Graph: Towards Reaction-Level Modeling for Chemical Reactions with 3D Structures

UP-VLA: A Unified Understanding and Prediction Model for Embodied Agent

Finite-Time Analysis of Discrete-Time Stochastic Interpolants

Online Linear Classification with Massart Noise

The Four Color Theorem for Cell Instance Segmentation

Stay-Positive: A Case for Ignoring Real Image Features in Fake Image Detection

Learning Monotonic Probabilities with a Generative Cost Model

A Closer Look at Multimodal Representation Collapse

Expressive Score-Based Priors for Distribution Matching with Geometry-Preserving Regularization

Scalable Sobolev IPM for Probability Measures on a Graph

LLM-Assisted Semantically Diverse Teammate Generation for Efficient Multi-agent Coordination

PF3plat: Pose-Free Feed-Forward 3D Gaussian Splatting for Novel View Synthesis

AdaDecode: Accelerating LLM Decoding with Adaptive Layer Parallelism

InfoSAM: Fine-Tuning the Segment Anything Model from An Information-Theoretic Perspective

UniSim: A Unified Simulator for Time-Coarsened Dynamics of Biomolecules

Advancing Personalized Learning with Neural Collapse for Long-Tail Challenge

When Diffusion Models Memorize: Inductive Biases in Probability Flow of Minimum-Norm Shallow Neural Nets

Learning the Electronic Hamiltonian of Large Atomic Structures

Online Learning in the Random-Order Model

Dueling Convex Optimization with General Preferences

Computing Optimal Transport Maps and Wasserstein Barycenters Using Conditional Normalizing Flows

Improving Zero-Shot Adversarial Robustness in Vision-Language Models by Closed-form Alignment of Adversarial Path Simplices

D-Fusion: Direct Preference Optimization for Aligning Diffusion Models with Visually Consistent Samples

Interpreting the Repeated Token Phenomenon in Large Language Models

Approximating Latent Manifolds in Neural Networks via Vanishing Ideals

Hybrid Spiking Vision Transformer for Object Detection with Event Cameras

Generating Hypotheses of Dynamic Causal Graphs in Neuroscience: Leveraging Generative Factor Models of Observed Time Series

Beyond Atoms: Enhancing Molecular Pretrained Representations with 3D Space Modeling

Reducing Tool Hallucination via Reliability Alignment

Task-Aware Virtual Training: Enhancing Generalization in Meta-Reinforcement Learning for Out-of-Distribution Tasks

Diffusion Counterfactual Generation with Semantic Abduction

Towards Escaping from Class Dependency Modeling for Multi-Dimensional Classification

Learning Initial Basis Selection for Linear Programming via Duality-Inspired Tripartite Graph Representation and Comprehensive Supervision

Right Time to Learn: Promoting Generalization via Bio-inspired Spacing Effect in Knowledge Distillation

A Unified View on Learning Unnormalized Distributions via Noise-Contrastive Estimation

Diving into Self-Evolving Training for Multimodal Reasoning

Feature Shift Localization Network

Learning Safe Control via On-the-Fly Bandit Exploration

When to retrain a machine learning model

Retrieval-Augmented Perception: High-resolution Image Perception Meets Visual RAG

Non-Asymptotic and Non-Lipschitzian Bounds on Optimal Values in Stochastic Optimization Under Heavy Tails

BAME: Block-Aware Mask Evolution for Efficient N:M Sparse Training

Shielded Diffusion: Generating Novel and Diverse Images using Sparse Repellency

DeepLayout: Learning Neural Representations of Circuit Placement Layout

Speculate, then Collaborate: Fusing Knowledge of Language Models during Decoding

FOCoOp: Enhancing Out-of-Distribution Robustness in Federated Prompt Learning for Vision-Language Models

Convergence Analysis of Policy Gradient Methods with Dynamic Stochasticity

A Unified Theoretical Analysis of Private and Robust Offline Alignment: from RLHF to DPO

A Non-Asymptotic Convergent Analysis for Scored-Based Graph Generative Model via a System of Stochastic Differential Equations

LIFT the Veil for the Truth: Principal Weights Emerge after Rank Reduction for Reasoning-Focused Supervised Fine-Tuning

Habitizing Diffusion Planning for Efficient and Effective Decision Making

Fast Tensor Completion via Approximate Richardson Iteration

K$^2$IE: Kernel Method-based Kernel Intensity Estimators for Inhomogeneous Poisson Processes

Large Continual Instruction Assistant

When Dynamic Data Selection Meets Data Augmentation: Achieving Enhanced Training Acceleration

RelGNN: Composite Message Passing for Relational Deep Learning

Non-Stationary Predictions May Be More Informative: Exploring Pseudo-Labels with a Two-Phase Pattern of Training Dynamics

Understanding Sharpness Dynamics in NN Training with a Minimalist Example: The Effects of Dataset Difficulty, Depth, Stochasticity, and More

Quadruple Attention in Many-body Systems for Accurate Molecular Property Predictions

Adversarial Cooperative Rationalization: The Risk of Spurious Correlations in Even Clean Datasets

Neighbour-Driven Gaussian Process Variational Autoencoders for Scalable Structured Latent Modelling

Monte Carlo Tree Diffusion for System 2 Planning

Bayesian Neural Scaling Law Extrapolation with Prior-Data Fitted Networks

Beyond Induction Heads: In-Context Meta Learning Induces Multi-Phase Circuit Emergence

BlockDialect: Block-wise Fine-grained Mixed Format Quantization for Energy-Efficient LLM Inference

Weakly-Supervised Contrastive Learning for Imprecise Class Labels

Maintaining Proportional Committees with Dynamic Candidate Sets

OV-MER: Towards Open-Vocabulary Multimodal Emotion Recognition

Solving Satisfiability Modulo Counting Exactly with Probabilistic Circuits

Adaptive Flow Matching for Resolving Small-Scale Physics

Exact Upper and Lower Bounds for the Output Distribution of Neural Networks with Random Inputs

Reward Translation via Reward Machine in Semi-Alignable MDPs

Controlled Generation with Equivariant Variational Flow Matching

Transformer-Based Spatial-Temporal Counterfactual Outcomes Estimation

Beyond Low-rank Decomposition: A Shortcut Approach for Efficient On-Device Learning

Thickness-aware E(3)-Equivariant 3D Mesh Neural Networks

Disentangling Invariant Subgraph via Variance Contrastive Estimation under Distribution Shifts

TUMTraf VideoQA: Dataset and Benchmark for Unified Spatio-Temporal Video Understanding in Traffic Scenes

Lexico: Extreme KV Cache Compression via Sparse Coding over Universal Dictionaries

Symmetric Reinforcement Learning Loss for Robust Learning on Diverse Tasks and Model Scales

ProSec: Fortifying Code LLMs with Proactive Security Alignment

Catching Two Birds with One Stone: Reward Shaping with Dual Random Networks for Balancing Exploration and Exploitation

PASS: Private Attributes Protection with Stochastic Data Substitution

Variational Control for Guidance in Diffusion Models

Settling the Maximin Share Fairness for Scheduling among Groups of Machines

LightGTS: A Lightweight General Time Series Forecasting Model

ETTA: Elucidating the Design Space of Text-to-Audio Models

Equivalence is All: A Unified View for Self-supervised Graph Learning

Scaling Laws for Upcycling Mixture-of-Experts Language Models

SENSEI: Semantic Exploration Guided by Foundation Models to Learn Versatile World Models

Can Classic GNNs Be Strong Baselines for Graph-level Tasks? Simple Architectures Meet Excellence

Refined generalization analysis of the Deep Ritz Method and Physics-Informed Neural Networks

Flow-field inference from neural data using deep recurrent networks

MARGE: Improving Math Reasoning with Guided Exploration

Do Vision-Language Models Really Understand Visual Language?

Physics Aware Neural Networks for Unsupervised Binding Energy Prediction

Don't Restart, Just Reuse: Reoptimizing MILPs with Dynamic Parameters

Low-Dimension-to-High-Dimension Generalization and Its Implications for Length Generalization

Concept-Based Unsupervised Domain Adaptation

CABS: Conflict-Aware and Balanced Sparsification for Enhancing Model Merging

NegMerge: Sign-Consensual Weight Merging for Machine Unlearning

You Get What You Give: Reciprocally Fair Federated Learning

On the Out-of-Distribution Generalization of Self-Supervised Learning

MoHAVE: Mixture of Hierarchical Audio-Visual Experts for Robust Speech Recognition

Leveraging Diffusion Model as Pseudo-Anomalous Graph Generator for Graph-Level Anomaly Detection

Neural Collapse Beyond the Unconstrained Features Model: Landscape, Dynamics, and Generalization in the Mean-Field Regime

Stealing That Free Lunch: Exposing the Limits of Dyna-Style Reinforcement Learning

In-Context Learning as Conditioned Associative Memory Retrieval

Efficient Logit-based Knowledge Distillation of Deep Spiking Neural Networks for Full-Range Timestep Deployment

Unraveling the Interplay between Carryover Effects and Reward Autocorrelations in Switchback Experiments

Representation Surgery in Model Merging with Probabilistic Modeling

The Limits of Predicting Agents from Behaviour

Stochastic Smoothed Primal-Dual Algorithms for Nonconvex Optimization with Linear Inequality Constraints

Provable Benefits of Unsupervised Pre-training and Transfer Learning via Single-Index Models

Verification Learning: Make Unsupervised Neuro-Symbolic System Feasible

Merge-Friendly Post-Training Quantization for Multi-Target Domain Adaptation

Efficient Motion Prompt Learning for Robust Visual Tracking

Selective Prompt Anchoring for Code Generation

Training Dynamics of In-Context Learning in Linear Attention

STAIR: Improving Safety Alignment with Introspective Reasoning

Long-Term TalkingFace Generation via Motion-Prior Conditional Diffusion Model

AnyEdit: Edit Any Knowledge Encoded in Language Models

AtlasD: Automatic Local Symmetry Discovery

Gradual Transition from Bellman Optimality Operator to Bellman Operator in Online Reinforcement Learning

Unveiling AI's Blind Spots: An Oracle for In-Domain, Out-of-Domain, and Adversarial Errors

Adversarial Reasoning at Jailbreaking Time

Calibrating Video Watch-time Predictions with Credible Prototype Alignment

TabFSBench: Tabular Benchmark for Feature Shifts in Open Environments

MoE-SVD: Structured Mixture-of-Experts LLMs Compression via Singular Value Decomposition

$\infty$-Video: A Training-Free Approach to Long Video Understanding via Continuous-Time Memory Consolidation

Enhancing the Influence of Labels on Unlabeled Nodes in Graph Convolutional Networks

Non-stationary Diffusion For Probabilistic Time Series Forecasting

ROME is Forged in Adversity: Robust Distilled Datasets via Information Bottleneck

Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale

Learning-Augmented Algorithms for MTS with Bandit Access to Multiple Predictors

Compositional Risk Minimization

The Devil Is in the Details: Tackling Unimodal Spurious Correlations for Generalizable Multimodal Reward Models

Splitting & Integrating: Out-of-Distribution Detection via Adversarial Gradient Attribution

IRBridge: Solving Image Restoration Bridge with Pre-trained Generative Diffusion Models

Efficiently Access Diffusion Fisher: Within the Outer Product Span Space

The Surprising Agreement Between Convex Optimization Theory and Learning-Rate Scheduling for Large Model Training

AssistanceZero: Scalably Solving Assistance Games

Automated Red Teaming with GOAT: the Generative Offensive Agent Tester

Language Models as Implicit Tree Search

The Price of Linear Time: Error Analysis of Structured Kernel Interpolation

Language Models May Verbatim Complete Text They Were Not Explicitly Trained On

InfoSEM: A Deep Generative Model with Informative Priors for Gene Regulatory Network Inference

A Machine Learning Approach to Duality in Statistical Physics

Regularized Langevin Dynamics for Combinatorial Optimization

Polynomial-Time Approximability of Constrained Reinforcement Learning

DSP: Dynamic Sequence Parallelism for Multi-Dimensional Transformers

Asymmetric Decision-Making in Online Knowledge Distillation: Unifying Consensus and Divergence

Benchmarking Abstract and Reasoning Abilities Through A Theoretical Perspective

On the Dynamic Regret of Following the Regularized Leader: Optimism with History Pruning

Privacy Amplification by Structured Subsampling for Deep Differentially Private Time Series Forecasting

Test-Time Training Provably Improves Transformers as In-context Learners

Organize the Web: Constructing Domains Enhances Pre-Training Data Curation

Vision-Language Model Selection and Reuse for Downstream Adaptation

A Meta-learner for Heterogeneous Effects in Difference-in-Differences

Video Prediction Policy: A Generalist Robot Policy with Predictive Visual Representations

No Soundness in the Real World: On the Challenges of the Verification of Deployed Neural Networks

ReVISE: Learning to Refine at Test-Time via Intrinsic Self-Verification

A Computationally Efficient Algorithm for Infinite-Horizon Average-Reward Linear MDPs

Efficient and Separate Authentication Image Steganography Network

Learning Fused State Representations for Control from Multi-View Observations

An Augmentation-Aware Theory for Self-Supervised Contrastive Learning

Riemann Tensor Neural Networks: Learning Conservative Systems with Physics-Constrained Networks

Compressed Image Generation with Denoising Diffusion Codebook Models

ConfPO: Exploiting Policy Model Confidence for Critical Token Selection in Preference Optimization

CFP-Gen: Combinatorial Functional Protein Generation via Diffusion Language Models

Large Language Models are Demonstration Pre-Selectors for Themselves

WAVE: Weighted Autoregressive Varying Gate for Time Series Forecasting

Provably Cost-Sensitive Adversarial Defense via Randomized Smoothing

Large Language-Geometry Model: When LLM meets Equivariance

Closed-Loop Long-Horizon Robotic Planning via Equilibrium Sequence Modeling

AdaSplash: Adaptive Sparse Flash Attention

Open Materials Generation with Stochastic Interpolants

LV-XAttn: Distributed Cross-Attention for Long Visual Inputs in Multimodal Large Language Models

Adversarial Robustness via Deformable Convolution with Stochasticity

MuseControlLite: Multifunctional Music Generation with Lightweight Conditioners

KV Shifting Attention Enhances Language Modeling

Contrastive Localized Language-Image Pre-Training

Universal Approximation of Mean-Field Models via Transformers

The Noisy Laplacian: a Threshold Phenomenon for Non-Linear Dimension Reduction

µnit Scaling: Simple and Scalable FP8 LLM Training

AGAV-Rater: Adapting Large Multimodal Model for AI-Generated Audio-Visual Quality Assessment

RATE: Causal Explainability of Reward Models with Imperfect Counterfactuals

Efficient Diffusion Models for Symmetric Manifolds

Human Body Restoration with One-Step Diffusion Model and A New Benchmark

GuidedQuant: Large Language Model Quantization via Exploiting End Loss Guidance

HyperIV: Real-time Implied Volatility Smoothing

Attributes Shape the Embedding Space of Face Recognition Models

Weakly Supervised Anomaly Detection via Dual-Tailed Kernel

MindLLM: A Subject-Agnostic and Versatile Model for fMRI-to-text Decoding

Generative Human Trajectory Recovery via Embedding-Space Conditional Diffusion

Optimal Auction Design in the Joint Advertising

LASER: Attention with Exponential Transformation

Collapse or Thrive: Perils and Promises of Synthetic Data in a Self-Generating World

Linear Transformers as VAR Models: Aligning Autoregressive Attention Mechanisms with Autoregressive Forecasting

Textual Unlearning Gives a False Sense of Unlearning

Provable Benefit of Random Permutations over Uniform Sampling in Stochastic Coordinate Descent

Model Swarms: Collaborative Search to Adapt LLM Experts via Swarm Intelligence

One Example Shown, Many Concepts Known! Counterexample-Driven Conceptual Reasoning in Mathematical LLMs

Probing Visual Language Priors in VLMs

Probabilistic Interactive 3D Segmentation with Hierarchical Neural Processes

HPS: Hard Preference Sampling for Human Preference Alignment

Control and Realism: Best of Both Worlds in Layout-to-Image without Training

The Sparse-Plus-Low-Rank Quasi-Newton Method for Entropic-Regularized Optimal Transport

Does Low Rank Adaptation Lead to Lower Robustness against Training-Time Attacks?

Overtrained Language Models Are Harder to Fine-Tune

SECOND: Mitigating Perceptual Hallucination in Vision-Language Models via Selective and Contrastive Decoding

Boost-and-Skip: A Simple Guidance-Free Diffusion for Minority Generation

AMPO: Active Multi Preference Optimization for Self-play Preference Selection

Sketch to Adapt: Fine-Tunable Sketches for Efficient LLM Adaptation

Eigenspectrum Analysis of Neural Networks without Aspect Ratio Bias

Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs

Distinguishing Cause from Effect with Causal Velocity Models

Isolated Causal Effects of Natural Language

Understanding the Skill Gap in Recurrent Language Models: The Role of the Gather-and-Aggregate Mechanism

Data Mixing Optimization for Supervised Fine-Tuning of Large Language Models

Synthesizing Privacy-Preserving Text Data via Finetuning *without* Finetuning Billion-Scale LLMs

Improving the Diffusability of Autoencoders

Guided Structural Inference: Leveraging Priors with Soft Gating Mechanisms

Beyond Minimax Rates in Group Distributionally Robust Optimization via a Novel Notion of Sparsity

Enhancing Rating-Based Reinforcement Learning to Effectively Leverage Feedback from Large Vision-Language Models

Knowledge Retention in Continual Model-Based Reinforcement Learning

BiMark: Unbiased Multilayer Watermarking for Large Language Models

Ranked from Within: Ranking Large Multimodal Models Without Labels

Improved Approximations for Hard Graph Problems using Predictions

Breaking the $n^{1.5}$ Additive Error Barrier for Private and Efficient Graph Sparsification via Private Expander Decomposition

Randomized Dimensionality Reduction for Euclidean Maximization and Diversity Measures

Residual TPP: A Unified Lightweight Approach for Event Stream Data Analysis

Adversarial Combinatorial Semi-bandits with Graph Feedback

RepLoRA: Reparameterizing Low-rank Adaptation via the Perspective of Mixture of Experts

Tracking Most Significant Shifts in Infinite-Armed Bandits

Improving Generalization with Flat Hilbert Bayesian Inference

Speculative Prefill: Turbocharging TTFT with Lightweight and Training-Free Token Importance Estimation

When to Forget? Complexity Trade-offs in Machine Unlearning

Idiosyncrasies in Large Language Models

R3DM: Enabling Role Discovery and Diversity Through Dynamics Models in Multi-agent Reinforcement Learning

Direct Density Ratio Optimization: A Statistically Consistent Approach to Aligning Large Language Models

Metastable Dynamics of Chain-of-Thought Reasoning: Provable Benefits of Search, RL and Distillation

Mixture of Experts Provably Detect and Learn the Latent Cluster Structure in Gradient-Based Learning

Propagation of Chaos for Mean-Field Langevin Dynamics and its Application to Model Ensemble

Quantifying Memory Utilization with Effective State-Size

Provable In-Context Vector Arithmetic via Retrieving Task Concepts

On the Role of Label Noise in the Feature Learning Process

EgoPrivacy: What Your First-Person Camera Says About You?

TimeStep Master: Asymmetrical Mixture of Timestep LoRA Experts for Versatile and Efficient Diffusion Models in Vision

Improved Discretization Complexity Analysis of Consistency Models: Variance Exploding Forward Process and Decay Discretization Scheme

Assessing Safety Risks and Quantization-aware Safety Patching for Quantized Large Language Models

PipeOffload: Improving Scalability of Pipeline Parallelism with Memory Optimization

LEMoN: Label Error Detection using Multimodal Neighbors

On the Guidance of Flow Matching

M2PDE: Compositional Generative Multiphysics and Multi-component PDE Simulation

From Uncertain to Safe: Conformal Adaptation of Diffusion Models for Safe PDE Control

Near-optimal Sketchy Natural Gradients for Physics-Informed Neural Networks

Defending LVLMs Against Vision Attacks Through Partial-Perception Supervision

Neural Event-Triggered Control with Optimal Scheduling

Diverse Prototypical Ensembles Improve Robustness to Subpopulation Shift

Towards Efficient Online Tuning of VLM Agents via Counterfactual Soft Reinforcement Learning

GraphGPT: Generative Pre-trained Graph Eulerian Transformer

Generalization Analysis for Controllable Learning

Tight and Fast Bounds for Multi-Label Learning

Spherical Rotation Dimension Reduction with Geometric Loss Functions

Semi-Supervised Blind Quality Assessment with Confidence-quantifiable Pseudo-label Learning for Authentic Images

COKE: Core Kernel for More Efficient Approximation of Kernel Weights in Multiple Kernel Clustering

Improving the Scaling Laws of Synthetic Data with Deliberate Practice

Socialized Coevolution: Advancing a Better World through Cross-Task Collaboration

Analytical Construction on Geometric Architectures: Transitioning from Static to Temporal Link Prediction

Test-time Correlation Alignment

Mitigating Over-Exploration in Latent Space Optimization using LES

Widening the Network Mitigates the Impact of Data Heterogeneity on FedAvg

Federated Disentangled Tuning with Textual Prior Decoupling and Visual Dynamic Adaptation

Enhancing Graph Invariant Learning from a Negative Inference Perspective

OOD-Chameleon: Is Algorithm Selection for OOD Generalization Learnable?

Complete-Tree Space Favors Data-Efficient Link Prediction

DLP: Dynamic Layerwise Pruning in Large Language Models

ATA: Adaptive Task Allocation for Efficient Resource Management in Distributed Machine Learning

Scaling Video-Language Models to 10K Frames via Hierarchical Differential Distillation

E-LDA: Toward Interpretable LDA Topic Models with Strong Guarantees in Logarithmic Parallel Time

PEINR: A Physics-enhanced Implicit Neural Representation for High-Fidelity Flow Field Reconstruction

Learning to Reuse Policies in State Evolvable Environments

Optimistic Algorithms for Adaptive Estimation of the Average Treatment Effect

Kona: An Efficient Privacy-Preservation Framework for KNN Classification by Communication Optimization

Understanding Fixed Predictions via Confined Regions

La RoSA: Enhancing LLM Efficiency via Layerwise Rotated Sparse Activation

Balancing Model Efficiency and Performance: Adaptive Pruner for Long-tailed Data

Understanding High-Dimensional Bayesian Optimization

Learning Configurations for Data-Driven Multi-Objective Optimization

Mutual Learning for SAM Adaptation: A Dual Collaborative Network Framework for Source-Free Domain Transfer

End-to-End Learning Framework for Solving Non-Markovian Optimal Control

Iterative Vectors: In-Context Gradient Steering without Backpropagation

Doubly Robust Fusion of Many Treatments for Policy Learning

SANA 1.5: Efficient Scaling of Training-Time and Inference-Time Compute in Linear Diffusion Transformer

KoopSTD: Reliable Similarity Analysis between Dynamical Systems via Approximating Koopman Spectrum with Timescale Decoupling

Improving Reward Model Generalization from Adversarial Process Enhanced Preferences

Interchangeable Token Embeddings for Extendable Vocabulary and Alpha-Equivalence

Going Deeper into Locally Differentially Private Graph Neural Networks

Overcoming Multi-step Complexity in Multimodal Theory-of-Mind Reasoning: A Scalable Bayesian Planner

SlimLLM: Accurate Structured Pruning for Large Language Models

Adaptive Exploration for Multi-Reward Multi-Policy Evaluation

The Empirical Mean is Minimax Optimal for Local Glivenko-Cantelli

Federated Node-Level Clustering Network with Cross-Subgraph Link Mending

Causal Effect Identification in lvLiNGAM from Higher-Order Cumulants

Design Considerations in Offline Preference-based RL

Near Optimal Best Arm Identification for Clustered Bandits

CellFlux: Simulating Cellular Morphology Changes via Flow Matching

In-Context Linear Regression Demystified: Training Dynamics and Mechanistic Interpretability of Multi-Head Softmax Attention

Efficient LiDAR Reflectance Compression via Scanning Serialization

SHARP-Distill: A 68× Faster Recommender System with Hypergraph Neural Networks and Language Models

RE-Bench: Evaluating Frontier AI R&D Capabilities of Language Model Agents against Human Experts

AlphaQCM: Alpha Discovery in Finance with Distributional Reinforcement Learning

Multi-Domain Graph Foundation Models: Robust Knowledge Transfer via Topology Alignment

HyperTree Planning: Enhancing LLM Reasoning via Hierarchical Thinking

TimeFilter: Patch-Specific Spatial-Temporal Graph Filtration for Time Series Forecasting

Topology-Aware Dynamic Reweighting for Distribution Shifts on Graph

Do We Need to Verify Step by Step? Rethinking Process Supervision from a Theoretical Perspective

An Analysis for Reasoning Bias of Language Models with Small Initialization

Winner-takes-all for Multivariate Probabilistic Time Series Forecasting

GAPrompt: Geometry-Aware Point Cloud Prompt for 3D Vision Model

SLiM: One-shot Quantization and Sparsity with Low-rank Approximation for LLM Weight Compression

Inference-Time Decomposition of Activations (ITDA): A Scalable Approach to Interpreting Large Language Models

Linearization Turns Neural Operators into Function-Valued Gaussian Processes

Compositional Flows for 3D Molecule and Synthesis Pathway Co-design

Decision Mixer: Integrating Long-term and Local Dependencies via Dynamic Token Selection for Decision-Making

What Limits Virtual Agent Application? OmniBench: A Scalable Multi-Dimensional Benchmark for Essential Virtual Agent Capabilities

CoPINN: Cognitive Physics-Informed Neural Networks

Adversarial Robust Generalization of Graph Neural Networks

Open Your Eyes: Vision Enhances Message Passing Neural Networks in Link Prediction

Hierarchical Overlapping Clustering on Graphs: Cost Function, Algorithm and Scalability

Efficient Personalized Adaptation for Physiological Signal Foundation Model

TabPFN Unleashed: A Scalable and Effective Solution to Tabular Classification Problems

VisionTS: Visual Masked Autoencoders Are Free-Lunch Zero-Shot Time Series Forecasters

Compelling ReLU Networks to Exhibit Exponentially Many Linear Regions at Initialization and During Training

Pivoting Factorization: A Compact Meta Low-Rank Representation of Sparsity for Efficient Inference in Large Language Models

Hide & Seek: Transformer Symmetries Obscure Sharpness & Riemannian Geometry Finds It

Graph Attention is Not Always Beneficial: A Theoretical Analysis of Graph Attention Mechanisms via Contextual Stochastic Block Models

Improving Your Model Ranking on Chatbot Arena by Vote Rigging

Prune 'n Predict: Optimizing LLM Decision-making with Conformal Prediction

EGPlace: An Efficient Macro Placement Method via Evolutionary Search with Greedy Repositioning Guided Mutation

Riemannian Diffusion Adaptation for Distributed Optimization on Manifolds

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

EduLLM: Leveraging Large Language Models and Framelet-Based Signed Hypergraph Neural Networks for Student Performance Prediction

Pruning for GNNs: Lower Complexity with Comparable Expressiveness

Cradle: Empowering Foundation Agents towards General Computer Control

Test-Time Selective Adaptation for Uni-Modal Distribution Shift in Multi-Modal Data

Cannot See the Forest for the Trees: Invoking Heuristics and Biases to Elicit Irrational Choices of LLMs

Efficient Distributed Optimization under Heavy-Tailed Noise

Targeted Unlearning with Single Layer Unlearning Gradient

Targeted control of fast prototyping through domain-specific interface

Multinoulli Extension: A Lossless Yet Effective Probabilistic Framework for Subset Selection over Partition Constraints

Self-supervised Masked Graph Autoencoder via Structure-aware Curriculum

Flex3D: Feed-Forward 3D Generation with Flexible Reconstruction Model and Input View Curation

Stochastic Layer-Wise Shuffle for Improving Vision Mamba Training

Imagine While Reasoning in Space: Multimodal Visualization-of-Thought

Mask-Enhanced Autoregressive Prediction: Pay Less Attention to Learn More

SpargeAttention: Accurate and Training-free Sparse Attention Accelerating Any Model Inference

Vision-Language Models Create Cross-Modal Task Representations

CSV-Occ: Fusing Multi-frame Alignment for Occupancy Prediction with Temporal Cross State Space Model and Central Voting Mechanism

Weisfeiler and Leman Go Gambling: Why Expressive Lottery Tickets Win

TruthFlow: Truthful LLM Generation via Representation Flow Correction

Omni-Angle Assault: An Invisible and Powerful Physical Adversarial Attack on Face Recognition

Confidence Difference Reflects Various Supervised Signals in Confidence-Difference Classification

Event-Customized Image Generation

A Lens into Interpretable Transformer Mistakes via Semantic Dependency

The Geometry of Refusal in Large Language Models: Concept Cones and Representational Independence

Test-Time Graph Neural Dataset Search With Generative Projection

OmiAD: One-Step Adaptive Masked Diffusion Model for Multi-class Anomaly Detection via Adversarial Distillation

All-atom Diffusion Transformers: Unified generative modelling of molecules and materials

Autonomy-of-Experts Models

HetSSNet: Spatial-Spectral Heterogeneous Graph Learning Network for Panchromatic and Multispectral Images Fusion

(How) Do Language Models Track State?

Free Process Rewards without Process Labels

It's My Data Too: Private ML for Datasets with Multi-User Training Examples

A Certified Unlearning Approach without Access to Source Data

Maximum Total Correlation Reinforcement Learning

Multiple-policy Evaluation via Density Estimation

From Spectrum-free towards Baseline-view-free: Double-track Proximity Driven Multi-view Clustering

Can RLHF be More Efficient with Imperfect Reward Models? A Policy Coverage Perspective

Ehrenfeucht-Haussler Rank and Chain of Thought

BinauralFlow: A Causal and Streamable Approach for High-Quality Binaural Speech Synthesis with Flow Matching Models

Clustering Items through Bandit Feedback: Finding the Right Feature out of Many

Nesterov Method for Asynchronous Pipeline Parallel Optimization

MaskTwins: Dual-form Complementary Masking for Domain-Adaptive Image Segmentation

Efficient Optimization with Orthogonality Constraint: a Randomized Riemannian Submanifold Method

Pairwise Maximum Likelihood For Multi-Class Logistic Regression Model With Multiple Rare Classes

Feature-Mapping Topology Optimization with Neural Heaviside Signed Distance Functions

RZ-NAS: Enhancing LLM-guided Neural Architecture Search via Reflective Zero-Cost Strategy

One-dimensional Path Convolution

TCP-Diffusion: A Multi-modal Diffusion Model for Global Tropical Cyclone Precipitation Forecasting with Change Awareness

FedSSI: Rehearsal-Free Continual Federated Learning with Synergistic Synaptic Intelligence

Efficient ANN-SNN Conversion with Error Compensation Learning

Reward-Guided Iterative Refinement in Diffusion Models at Test-Time with Applications to Protein and DNA Design

An End-to-End Model for Logits-Based Large Language Models Watermarking

OneForecast: A Universal Framework for Global and Regional Weather Forecasting

DPCore: Dynamic Prompt Coreset for Continual Test-Time Adaptation

Near-Optimal Decision Trees in a SPLIT Second

Transolver++: An Accurate Neural Solver for PDEs on Million-Scale Geometries

Nearly Optimal Algorithms for Contextual Dueling Bandits from Adversarial Feedback

RLTHF: Targeted Human Feedback for LLM Alignment

GenZSL: Generative Zero-Shot Learning Via Inductive Variational Autoencoder

Learning Progress Driven Multi-Agent Curriculum

Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback

Generalization of noisy SGD in unbounded non-convex settings

DIME: Diffusion-Based Maximum Entropy Reinforcement Learning

Steerable Transformers for Volumetric Data

DyPolySeg: Taylor Series-Inspired Dynamic Polynomial Fitting Network for Few-shot Point Cloud Semantic Segmentation

Safety Reasoning with Guidelines

When Can Proxies Improve the Sample Complexity of Preference Learning?

Towards Graph Foundation Models: Learning Generalities Across Graphs via Task-Trees

Heterogeneous Label Shift: Theory and Algorithm

X-Hacking: The Threat of Misguided AutoML

EAGLES: Towards Effective, Efficient, and Economical Federated Graph Learning via Unified Sparsification

RBench: Graduate-level Multi-disciplinary Benchmarks for LLM & MLLM Complex Reasoning Evaluation

OpenworldAUC: Towards Unified Evaluation and Optimization for Open-world Prompt Tuning

TLLC: Transfer Learning-based Label Completion for Crowdsourcing

An Online Statistical Framework for Out-of-Distribution Detection

PolyConf: Unlocking Polymer Conformation Generation through Hierarchical Generative Models

Multi-Stage Manipulation with Demonstration-Augmented Reward, Policy, and World Model Learning

Revisiting Noise Resilience Strategies in Gesture Recognition: Short-Term Enhancement in sEMG Analysis

Arrow: Accelerator for Time Series Causal Discovery with Time Weaving

Which Attention Heads Matter for In-Context Learning?

Clients Collaborate: Flexible Differentially Private Federated Learning with Guaranteed Improvement of Utility-Privacy Trade-off

From Black Boxes to Transparent Minds: Evaluating and Enhancing the Theory of Mind in Multimodal Large Language Models

Evaluating Judges as Evaluators: The JETTS Benchmark of LLM-as-Judges as Test-Time Scaling Evaluators

Gaussian Mixture Flow Matching Models

Learning Utilities from Demonstrations in Markov Decision Processes

Learning Input Encodings for Kernel-Optimal Implicit Neural Representations

MultiPDENet: PDE-embedded Learning with Multi-time-stepping for Accelerated Flow Simulation

Adjustment for Confounding using Pre-Trained Representations

CHATS: Combining Human-Aligned Optimization and Test-Time Sampling for Text-to-Image Generation

Pareto Merging: Multi-Objective Optimization for Preference-Aware Model Merging

Distilling the Knowledge in Data Pruning

Scaling Laws for Differentially Private Language Models

Zero-Inflated Bandits

Logits are All We Need to Adapt Closed Models

BARK: A Fully Bayesian Tree Kernel for Black-box Optimization

Transfer Q-Learning with Composite MDP Structures

FlexTok: Resampling Images into 1D Token Sequences of Flexible Length

CollabLLM: From Passive Responders to Active Collaborators

Spectral-Aware Reservoir Computing for Fast and Accurate Time Series Classification

When Will It Fail?: Anomaly to Prompt for Forecasting Future Anomalies in Time Series

The Complexity of Learning Sparse Superposed Features with Feedback

Constrained Online Convex Optimization with Polyak Feasibility Steps

Algorithms and Hardness for Active Learning on Graphs

Sidechain conditioning and modeling for full-atom protein sequence design with FAMPNN

Bifurcate then Alienate: Incomplete Multi-view Clustering via Coupled Distribution Learning with Linear Overhead

Local Manifold Approximation and Projection for Manifold-Aware Diffusion Planning

Core Knowledge Deficits in Multi-Modal Language Models

MuLan: Adapting Multilingual Diffusion Models for Hundreds of Languages with Negligible Cost

Unbiased Evaluation of Large Language Models from a Causal Perspective

am-ELO: A Stable Framework for Arena-based LLM Evaluation

Posterior Inference with Diffusion Models for High-dimensional Black-box Optimization

UGPhysics: A Comprehensive Benchmark for Undergraduate Physics Reasoning with Large Language Models

Emotional Face-to-Speech

DiffMS: Diffusion Generation of Molecules Conditioned on Mass Spectra

In-Context Denoising with One-Layer Transformers: Connections between Attention and Associative Memory Retrieval

Identifying Causal Direction via Variational Bayesian Compression

A Theoretical Justification for Asymmetric Actor-Critic Algorithms

CASE-Bench: Context-Aware SafEty Benchmark for Large Language Models

AutoAdvExBench: Benchmarking Autonomous Exploitation of Adversarial Example Defenses

Generative Modeling Reinvents Supervised Learning: Label Repurposing with Predictive Consistency Learning

Pfeife: Automatic Pipeline Parallelism for PyTorch

O-MAPL: Offline Multi-agent Preference Learning

Improving Flow Matching by Aligning Flow Divergence

Faster Global Minimum Cut with Predictions

PEAKS: Selecting Key Training Examples Incrementally via Prediction Error Anchored by Kernel Similarity

UncertainSAM: Fast and Efficient Uncertainty Quantification of the Segment Anything Model

A Unified Comparative Study with Generalized Conformity Scores for Multi-Output Conformal Regression

ITFormer: Bridging Time Series and Natural Language for Multi-Modal QA with Large-Scale Multitask Dataset

ProofAug: Efficient Neural Theorem Proving via Fine-grained Proof Structure Analysis

A Variational Information Theoretic Approach to Out-of-Distribution Detection

Trajectory Inference with Smooth Schrödinger Bridges

Shortcut-connected Expert Parallelism for Accelerating Mixture of Experts

Retrieval Augmented Time Series Forecasting

Protriever: End-to-End Differentiable Protein Homology Search for Fitness Prediction

Learning Parametric Distributions from Samples and Preferences

LSCD: Lomb-Scargle Conditioned Diffusion for Time series Imputation

Morse: Dual-Sampling for Lossless Acceleration of Diffusion Models

Efficient Quantification of Multimodal Interaction at Sample Level

Adaptive Multi-prompt Contrastive Network for Few-shot Out-of-distribution Detection

Diagonal Symmetrization of Neural Network Solvers for the Many-Electron Schrödinger Equation

Probabilistic Group Mask Guided Discrete Optimization for Incremental Learning

False Coverage Proportion Control for Conformal Prediction

Dendritic Localized Learning: Toward Biologically Plausible Algorithm

Multiobjective distribution matching

Fixed-Confidence Multiple Change Point Identification under Bandit Feedback

ReFrame: Layer Caching for Accelerated Inference in Real-Time Rendering

Efficient Parallel Training Methods for Spiking Neural Networks with Constant Time Complexity

UniMC: Taming Diffusion Transformer for Unified Keypoint-Guided Multi-Class Image Generation

De-AntiFake: Rethinking the Protective Perturbations Against Voice Cloning Attacks

WMarkGPT: Watermarked Image Understanding via Multimodal Large Language Models

What If We Recaption Billions of Web Images with LLaMA-3?

Improving Generalization in Federated Learning with Highly Heterogeneous Data via Momentum-Based Stochastic Controlled Weight Averaging

Automatic Reward Shaping from Confounded Offline Data

Contrastive Learning with Simplicial Convolutional Networks for Short-Text Classification

Kandinsky Conformal Prediction: Beyond Class- and Covariate-Conditional Coverage

FlipAttack: Jailbreak LLMs via Flipping

BSLoRA: Enhancing the Parameter Efficiency of LoRA with Intra-Layer and Inter-Layer Sharing

ExLM: Rethinking the Impact of $\texttt{[MASK]}$ Tokens in Masked Language Models

Stochastic Online Conformal Prediction with Semi-Bandit Feedback

3D Question Answering via only 2D Vision-Language Models

AdvI2I: Adversarial Image Attack on Image-to-Image Diffusion Models

MedXpertQA: Benchmarking Expert-Level Medical Reasoning and Understanding

Diff-MoE: Diffusion Transformer with Time-Aware and Space-Adaptive Experts

Bi-perspective Splitting Defense: Achieving Clean-Seed-Free Backdoor Security

Variational Learning of Fractional Posteriors

Step-DAD: Semi-Amortized Policy-Based Bayesian Experimental Design

Spherical-Nested Diffusion Model for Panoramic Image Outpainting

CSTrack: Enhancing RGB-X Tracking via Compact Spatiotemporal Features

Learning Event Completeness for Weakly Supervised Video Anomaly Detection

FlowAR: Scale-wise Autoregressive Image Generation Meets Flow Matching

Compositional Scene Understanding through Inverse Generative Modeling

A Mixed-Curvature based Pre-training Paradigm for Multi-Task Vehicle Routing Solver

Understanding and Mitigating Memorization in Diffusion Models for Tabular Data

Learning Changes in Graphon Attachment Network Models

IntLoRA: Integral Low-rank Adaptation of Quantized Diffusion Models

Efficient and Scalable Density Functional Theory Hamiltonian Prediction through Adaptive Sparsity

XAttention: Block Sparse Attention with Antidiagonal Scoring

Learning Along the Arrow of Time: Hyperbolic Geometry for Backward-Compatible Representation Learning

From Crowdsourced Data to High-quality Benchmarks: Arena-Hard and Benchbuilder Pipeline

Subgoal-Guided Policy Heuristic Search with Learned Subgoals

A Hitchhiker's Guide to Scaling Law Estimation

LoRA-One: One-Step Full Gradient Could Suffice for Fine-Tuning Large Language Models, Provably and Efficiently

A Recipe for Causal Graph Regression: Confounding Effects Revisited

Decoding Rewards in Competitive Games: Inverse Game Theory with Entropy Regularization

Flexible and Efficient Grammar-Constrained Decoding

On the Diversity of Adversarial Ensemble Learning

A Novel Characterization of the Population Area Under the Risk Coverage Curve (AURC) and Rates of Finite Sample Estimators

A Near Linear Query Lower Bound for Submodular Maximization

Accelerating Quantum Reinforcement Learning with a Quantum Natural Policy Gradient Based Approach

Clipping Improves Adam-Norm and AdaGrad-Norm when the Noise Is Heavy-Tailed

Generalized Smooth Bilevel Optimization with Nonconvex Lower-Level

Self-Bootstrapping for Versatile Test-Time Adaptation

Hyperband-based Bayesian Optimization for Black-box Prompt Selection

G-Designer: Architecting Multi-agent Communication Topologies via Graph Neural Networks

Function Encoders: A Principled Approach to Transfer Learning in Hilbert Spaces

LLMScan: Causal Scan for LLM Misbehavior Detection

Signed Laplacians for Constrained Graph Clustering

Pointwise Information Measures as Confidence Estimators in Deep Neural Networks: A Comparative Study

SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator

ReinboT: Amplifying Robot Visual-Language Manipulation with Reinforcement Learning

Trust-Region Twisted Policy Improvement

Reasoning Limitations of Multimodal Large Language Models. A case study of Bongard Problems

Policy Guided Tree Search for Enhanced LLM Reasoning

Towards Universal Offline Black-Box Optimization via Learning Language Model Embeddings

HarmoniCa: Harmonizing Training and Inference for Better Feature Caching in Diffusion Transformer Acceleration

Agent Workflow Memory

CogReact: A Reinforced Framework to Model Human Cognitive Reaction Modulated by Dynamic Intervention

LLM-Augmented Chemical Synthesis and Design Decision Programs

Mitigating Over-Squashing in Graph Neural Networks by Spectrum-Preserving Sparsification

Sparse Spectral Training and Inference on Euclidean and Hyperbolic Neural Networks

Underestimated Privacy Risks for Minority Populations in Large Language Model Unlearning

When Model Knowledge meets Diffusion Model: Diffusion-assisted Data-free Image Synthesis with Alignment of Domain and Class

Optimal Transport Barycenter via Nonconvex-Concave Minimax Optimization

$\mathcal{V}ista\mathcal{DPO}$: Video Hierarchical Spatial-Temporal Direct Preference Optimization for Large Video Models

CurvGAD: Leveraging Curvature for Enhanced Graph Anomaly Detection

Local Identifying Causal Relations in the Presence of Latent Variables

A Versatile Influence Function for Data Attribution with Non-Decomposable Loss

Physics-informed Temporal Alignment for Auto-regressive PDE Foundation Models

Boosting Virtual Agent Learning and Reasoning: A Step-Wise, Multi-Dimensional, and Generalist Reward Model with Benchmark

Learning In-context $n$-grams with Transformers: Sub-$n$-grams Are Near-Stationary Points

Auditing $f$-differential privacy in one run

Q-VDiT: Towards Accurate Quantization and Distillation of Video-Generation Diffusion Transformers

EnIGMA: Interactive Tools Substantially Assist LM Agents in Finding Security Vulnerabilities

Token Coordinated Prompt Attention is Needed for Visual Prompting

From Local Details to Global Context: Advancing Vision-Language Models with Attention-Based Selection

Revisiting Continuity of Image Tokens for Cross-domain Few-shot Learning

ExPLoRA: Parameter-Efficient Extended Pre-Training to Adapt Vision Transformers under Domain Shifts

Enhancing Adversarial Robustness with Conformal Prediction: A Framework for Guaranteed Model Reliability

BiAssemble: Learning Collaborative Affordance for Bimanual Geometric Assembly

Gradient-based Explanations for Deep Learning Survival Models

A Closer Look at Generalized BH Algorithm for Out-of-Distribution Detection

Meta-Reinforcement Learning with Adaptation from Human Feedback via Preference-Order-Preserving Task Embedding

Model Uncertainty Quantification by Conformal Prediction in Continual Learning

Learning to Plan & Reason for Evaluation with Thinking-LLM-as-a-Judge

Improving Rationality in the Reasoning Process of Language Models through Self-playing Game

QEM-Bench: Benchmarking Learning-based Quantum Error Mitigation and QEMFormer as a Multi-ranged Context Learning Baseline

e-GAI: e-value-based Generalized $\alpha$-Investing for Online False Discovery Rate Control

How Much Can We Forget about Data Contamination?

De-coupled NeuroGF for Shortest Path Distance Approximations on Large Terrain Graphs

Representative Ranking for Deliberation in the Public Sphere

A Generalization Result for Convergence in Learning-to-Optimize

One-Shot Heterogeneous Federated Learning with Local Model-Guided Diffusion Models

Human Cognition-Inspired Hierarchical Fuzzy Learning Machine

SynEVO: A neuro-inspired spatiotemporal evolutional framework for cross-domain adaptation

Solving Linear-Gaussian Bayesian Inverse Problems with Decoupled Diffusion Sequential Monte Carlo

Componential Prompt-Knowledge Alignment for Domain Incremental Learning

REINFORCE Adversarial Attacks on Large Language Models: An Adaptive, Distributional, and Semantic Objective

Propagate and Inject: Revisiting Propagation-Based Feature Imputation for Graphs with Partially Observed Features

Implicit Subgraph Neural Network

ALMTokenizer: A Low-bitrate and Semantic-rich Audio Codec Tokenizer for Audio Language Modeling

Set Valued Predictions For Robust Domain Generalization

RollingQ: Reviving the Cooperation Dynamics in Multimodal Transformer

Bongard in Wonderland: Visual Puzzles that Still Make AI Go Mad?

Sharp Optimality of Simple, Plug-in Estimation of the Fisher Information of a Smoothed Density

Distributed Conformal Prediction via Message Passing

Advancing Constrained Monotonic Neural Networks: Achieving Universal Approximation Beyond Bounded Activations

EpiCoder: Encompassing Diversity and Complexity in Code Generation

Categorical Schrödinger Bridge Matching

Neural Representational Consistency Emerges from Probabilistic Neural-Behavioral Representation Alignment

Adapting to Evolving Adversaries with Regularized Continual Robust Training

Bivariate Causal Discovery with Proxy Variables: Integral Solving and Beyond

Variational Rectified Flow Matching

Instruct2See: Learning to Remove Any Obstructions Across Distributions

Temperature-Annealed Boltzmann Generators

MASS: Mathematical Data Selection via Skill Graphs for Pretraining Large Language Models

Dual Feature Reduction for the Sparse-group Lasso and its Adaptive Variant

Targeted Low-rank Refinement: Enhancing Sparse Language Models with Precision

SAE-V: Interpreting Multimodal Models for Enhanced Alignment

TopoTune: A Framework for Generalized Combinatorial Complex Neural Networks

Sparsing Law: Towards Large Language Models with Greater Activation Sparsity

DriveGPT: Scaling Autoregressive Behavior Models for Driving

Analyze Feature Flow to Enhance Interpretation and Steering in Language Models

Selective Response Strategies for GenAI

Breaking the Barrier of Hard Samples: A Data-Centric Approach to Synthetic Data for Medical Tasks

DANCE: Dual Unbiased Expansion with Group-acquired Alignment for Out-of-distribution Graph Fairness Learning

Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment

Time Series Representations with Hard-Coded Invariances

Differential Privacy Under Class Imbalance: Methods and Empirical Insights

Boosting Multi-Domain Fine-Tuning of Large Language Models through Evolving Interactions between Samples

Does Graph Prompt Work? A Data Operation Perspective with Theoretical Analysis

A Physics-Informed Machine Learning Framework for Safe and Optimal Control of Autonomous Systems

Graph Diffusion for Robust Multi-Agent Coordination

Tokenized Bandit for LLM Decoding and Alignment

TGDPO: Harnessing Token-Level Reward Guidance for Enhancing Direct Preference Optimization

Distributionally Robust Policy Learning under Concept Drifts

A Physics-Augmented Deep Learning Framework for Classifying Single Molecule Force Spectroscopy Data

Return of the Latent Space COWBOYS: Re-thinking the use of VAEs for Bayesian Optimisation of Structured Spaces

Learning Curves of Stochastic Gradient Descent in Kernel Regression

OTTER: A Vision-Language-Action Model with Text-Aware Visual Feature Extraction

MM-RLHF: The Next Step Forward in Multimodal LLM Alignment

Privacy-Preserving Federated Convex Optimization: Balancing Partial-Participation and Efficiency via Noise Cancellation

RWKVQuant: Quantizing the RWKV Family with Proxy Guided Hybrid of Scalar and Vector Quantization

The Generalized Skew Spectrum of Graphs

Noise Conditional Variational Score Distillation

Maximizing Intermediate Checkpoint Value in LLM Pretraining with Bayesian Optimization

FedSMU: Communication-Efficient and Generalization-Enhanced Federated Learning through Symbolic Model Updates

BSO: Binary Spiking Online Optimization Algorithm

Beyond The Rainbow: High Performance Deep Reinforcement Learning on a Desktop PC

Sub-Sequential Physics-Informed Learning with State Space Model

Enhancing Performance of Explainable AI Models with Constrained Concept Refinement

The Hidden Joules: Evaluating the Energy Consumption of Vision Backbones for Progress Towards More Efficient Model Inference

Towards Lifelong Model Editing via Simulating Ideal Editor

A Forget-and-Grow Strategy for Deep Reinforcement Learning Scaling in Continuous Control

AdaptiveStep: Automatically Dividing Reasoning Step through Model Confidence

Counterfactual Contrastive Learning with Normalizing Flows for Robust Treatment Effect Estimation

Contradiction Retrieval via Contrastive Learning with Sparsity

Consensus Based Stochastic Optimal Control

LightningDrag: Lightning Fast and Accurate Drag-based Image Editing Emerging from Videos

DreamDPO: Aligning Text-to-3D Generation with Human Preferences via Direct Preference Optimization

PhySpec: Physically Consistent Spectral Reconstruction via Orthogonal Subspace Decomposition and Self-Supervised Meta-Auxiliary Learning

Fast Incomplete Multi-view Clustering by Flexible Anchor Learning

SafetyAnalyst: Interpretable, Transparent, and Steerable Safety Moderation for AI Behavior

Non-Asymptotic Length Generalization

TabNAT: A Continuous-Discrete Joint Generative Framework for Tabular Data

Open-Det: An Efficient Learning Framework for Open-Ended Detection

Tracking The Best Expert Privately

Towards Understanding Catastrophic Forgetting in Two-layer Convolutional Neural Networks

Enhancing Statistical Validity and Power in Hybrid Controlled Trials: A Randomization Inference Approach with Conformal Selective Borrowing

Faster and Stronger: When ANN-SNN Conversion Meets Parallel Spiking Calculation

UniMate: A Unified Model for Mechanical Metamaterial Generation, Property Prediction, and Condition Confirmation

KEA: Keeping Exploration Alive by Proactively Coordinating Exploration Strategies

SOLD: Slot Object-Centric Latent Dynamics Models for Relational Manipulation Learning from Pixels

SUICA: Learning Super-high Dimensional Sparse Implicit Neural Representations for Spatial Transcriptomics

Speeding up Policy Simulation in Supply Chain RL

A Memory Efficient Randomized Subspace Optimization Method for Training Large Language Models

Test-time Adapted Reinforcement Learning with Action Entropy Regularization

LongVU: Spatiotemporal Adaptive Compression for Long Video-Language Understanding

Efficient Time Series Processing for Transformers and State-Space Models through Token Merging

Gandalf the Red: Adaptive Security for LLMs

Unified K-Means Clustering with Label-Guided Manifold Learning

Rethink the Role of Deep Learning towards Large-scale Quantum Systems

Theoretical Performance Guarantees for Partial Domain Adaptation via Partial Optimal Transport

RobustLight: Improving Robustness via Diffusion Reinforcement Learning for Traffic Signal Control

LlavaGuard: An Open VLM-based Framework for Safeguarding Vision Datasets and Models

When can in-context learning generalize out of task distribution?

Fixing the Double Penalty in Data-Driven Weather Forecasting Through a Modified Spherical Harmonic Loss Function

Incorporating Arbitrary Matrix Group Equivariance into KANs

On the Emergence of Position Bias in Transformers

Scaling Value Iteration Networks to 5000 Layers for Extreme Long-Term Planning

Parrot: Multilingual Visual Instruction Tuning

CursorCore: Assist Programming through Aligning Anything

Flexible Tails for Normalizing Flows

BounDr.E: Predicting Drug-likeness via Biomedical Knowledge Alignment and EM-like One-Class Boundary Optimization

Prediction via Shapley Value Regression

Target Concrete Score Matching: A Holistic Framework for Discrete Diffusion

Drug-TTA: Test-Time Adaptation for Drug Virtual Screening via Multi-task Meta-Auxiliary Learning

Pareto-Optimality, Smoothness, and Stochasticity in Learning-Augmented One-Max-Search

Approximation to Smooth Functions by Low-Rank Swish Networks

EvoMesh: Adaptive Physical Simulation with Hierarchical Graph Evolutions

Foundation Model Insights and a Multi-Model Approach for Superior Fine-Grained One-shot Subset Selection

Come Together, But Not Right Now: A Progressive Strategy to Boost Low-Rank Adaptation

Harnessing Heterogeneous Statistical Strength for Personalized Federated Learning via Hierarchical Bayesian Inference

Deep Fuzzy Multi-view Learning for Reliable Classification

Unlocking Post-hoc Dataset Inference with Synthetic Data

ReFocus: Visual Editing as a Chain of Thought for Structured Image Understanding

Sparse-pivot: Dynamic correlation clustering for node insertions

Transformative or Conservative? Conservation laws for ResNets and Transformers

SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation

Learning Adaptive Lighting via Channel-Aware Guidance

BCE vs. CE in Deep Feature Learning

Discriminative Policy Optimization for Token-Level Reward Models

Time to Spike? Understanding the Representational Power of Spiking Neural Networks in Discrete Time

On Zero-Initialized Attention: Optimal Prompt and Gating Factor Estimation

Continuous Visual Autoregressive Generation via Score Maximization

Time-VLM: Exploring Multimodal Vision-Language Models for Augmented Time Series Forecasting

DragLoRA: Online Optimization of LoRA Adapters for Drag-based Image Editing in Diffusion Model

Beyond Message Passing: Neural Graph Pattern Machine

Function-to-Style Guidance of LLMs for Code Translation

Pixel2Feature Attack (P2FA): Rethinking the Perturbed Space to Enhance Adversarial Transferability

Flexible, Efficient, and Stable Adversarial Attacks on Machine Unlearning

Preference-CFR: Beyond Nash Equilibrium for Better Game Strategies

Robust Multi-Agent Reinforcement Learning with Stochastic Adversary

An in depth look at the Procrustes-Wasserstein distance: properties and barycenters

Intersectional Fairness in Reinforcement Learning with Large State and Constraint Spaces

AdvAgent: Controllable Blackbox Red-teaming on Web Agents

CoCoA-Mix: Confusion-and-Confidence-Aware Mixture Model for Context Optimization

LLaVA-ReID: Selective Multi-image Questioner for Interactive Person Re-Identification

Learning Policy Committees for Effective Personalization in MDPs with Diverse Tasks

LLMs can see and hear without any training

Guided Zeroth-Order Methods for Stochastic Non-convex Problems with Decision-Dependent Distributions

Comparing Few to Rank Many: Active Human Preference Learning Using Randomized Frank-Wolfe Method

Leveraging Online Olympiad-Level Math Problems for LLMs Training and Contamination-Resistant Evaluation

VIP: Vision Instructed Pre-training for Robotic Manipulation

Balancing Efficiency and Expressiveness: Subgraph GNNs with Walk-Based Centrality

Improving LLMs for Recommendation with Out-Of-Vocabulary Tokens

On the Alignment between Fairness and Accuracy: from the Perspective of Adversarial Robustness

Tuning LLM Judge Design Decisions for 1/1000 of the Cost

Revisiting Non-Acyclic GFlowNets in Discrete Environments

Grokking at the Edge of Linear Separability

Ranking with Multiple Oracles: From Weak to Strong Stochastic Transitivity

Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation

Certifiably Robust Model Evaluation in Federated Learning under Meta-Distributional Shifts

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Identifiable Object Representations under Spatial Ambiguities

Learning Bayesian Nash Equilibrium in Auction Games via Approximate Best Response

Addressing Imbalanced Domain-Incremental Learning through Dual-Balance Collaborative Experts

Safety Alignment Can Be Not Superficial With Explicit Safety Signals

Unified Analysis of Continuous Weak Features Learning with Applications to Learning from Missing Data

Policy-Regret Minimization in Markov Games with Function Approximation

Active Learning with Selective Time-Step Acquisition for PDEs

Reflection-Bench: Evaluating Epistemic Agency in Large Language Models

A Cognac Shot To Forget Bad Memories: Corrective Unlearning for Graph Neural Networks

Global curvature for second-order optimization of neural networks

On Volume Minimization in Conformal Regression

Proactive Agents for Multi-Turn Text-to-Image Generation Under Uncertainty

Elucidating Flow Matching ODE Dynamics via Data Geometry and Denoisers

Constrained Belief Updates Explain Geometric Structures in Transformer Representations

Rethinking the Temperature for Federated Heterogeneous Distillation

AutoGFM: Automated Graph Foundation Model with Adaptive Architecture Customization

A Bayesian Model Selection Criterion for Selecting Pretraining Checkpoints

Maximum Update Parametrization and Zero-Shot Hyperparameter Transfer for Fourier Neural Operators

Revisiting Neural Networks for Few-Shot Learning: A Zero-Cost NAS Perspective

To Steer or Not to Steer? Mechanistic Error Reduction with Abstention for Language Models

From Token to Rhythm: A Multi-Scale Approach for ECG-Language Pretraining

Importance Sampling for Nonlinear Models

Modeling All-Atom Glycan Structures via Hierarchical Message Passing and Multi-Scale Pre-training

Calibrated Value-Aware Model Learning with Probabilistic Environment Models

Are Large Language Models Ready for Multi-Turn Tabular Data Analysis?

Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM’s Reasoning Capability

Learning Cascade Ranking as One Network

MIPT: Multilevel Informed Prompt Tuning for Robust Molecular Property Prediction

Enhancing Graph Contrastive Learning for Protein Graphs from Perspective of Invariance

Lightweight Online Adaption for Time Series Foundation Model Forecasts

Visual Generation Without Guidance

How does Labeling Error Impact Contrastive Learning? A Perspective from Data Dimensionality Reduction

Matrix Completion with Incomplete Side Information via Orthogonal Complement Projection

Neutral residues: revisiting adapters for model extension

FedECADO: A Dynamical System Model of Federated Learning

Adversarial Robustness in Two-Stage Learning-to-Defer: Algorithms and Guarantees

Contextual Online Decision Making with Infinite-Dimensional Functional Regression

On the Generalization Ability of Next-Token-Prediction Pretraining

Joker: Joint Optimization Framework for Lightweight Kernel Machines

On the Benefits of Active Data Collection in Operator Learning

Energy-Based Flow Matching for Generating 3D Molecular Structure

DipLLM: Fine-Tuning LLM for Strategic Decision-making in Diplomacy

Olica: Efficient Structured Pruning of Large Language Models without Retraining

Retraining-free Merging of Sparse MoE via Hierarchical Clustering

PRIME: Deep Imbalanced Regression with Proxies

Divide and Conquer: Learning Label Distribution with Subtasks

Deep Electromagnetic Structure Design Under Limited Evaluation Budgets

Low-Rank Thinning

Automatically Identify and Rectify: Robust Deep Contrastive Multi-view Clustering in Noisy Scenarios

Rethinking Causal Ranking: A Balanced Perspective on Uplift Model Evaluation

ENSUR: Equitable and Statistically Unbiased Recommendation

Random Feature Representation Boosting

FrameBridge: Improving Image-to-Video Generation with Bridge Models

Subobject-level Image Tokenization

Complex Wavelet Mutual Information Loss: A Multi-Scale Loss Function for Semantic Segmentation

Gradient Inversion of Multimodal Models

Fair Clustering via Alignment

On Learning Parallel Pancakes with Mostly Uniform Weights

Fast, Accurate Manifold Denoising by Tunneling Riemannian Optimization

Enabling Optimal Decisions in Rehearsal Learning under CARE Condition

Enforcing Idempotency in Neural Networks

LongRoPE2: Near-Lossless LLM Context Window Scaling

Learning With Multi-Group Guarantees For Clusterable Subpopulations

A Generalization Theory for Zero-Shot Prediction

Controllable Data Generation with Hierarchical Neural Representations

Stray Intrusive Outliers-Based Feature Selection on Intra-Class Asymmetric Instance Distribution or Multiple High-Density Clusters

How Do Transformers Learn Variable Binding in Symbolic Programs?

Poly2Vec: Polymorphic Fourier-Based Encoding of Geospatial Objects for GeoAI Applications

A Unified Framework for Generalization Error Analysis of Learning with Arbitrary Discrete Weak Features

Hyperspherical Normalization for Scalable Deep Reinforcement Learning

Generalized Venn and Venn-Abers Calibration with Applications in Conformal Prediction

SPRI: Aligning Large Language Models with Context-Situated Principles

Rethinking Chain-of-Thought from the Perspective of Self-Training

Aligning Spoken Dialogue Models from User Interactions

Curvature Enhanced Data Augmentation for Regression

Self-Discriminative Modeling for Anomalous Graph Detection

Focal-SAM: Focal Sharpness-Aware Minimization for Long-Tailed Classification

Ranked Entropy Minimization for Continual Test-Time Adaptation

Rhomboid Tiling for Geometric Graph Deep Learning

Learning to Match Unpaired Data with Minimum Entropy Coupling

SING: Spatial Context in Large Language Model for Next-Gen Wearables

Faster Rates for Private Adversarial Bandits

GaussMarker: Robust Dual-Domain Watermark for Diffusion Models

Learning from Loss Landscape: Generalizable Mixed-Precision Quantization via Adaptive Sharpness-Aware Gradient Aligning

Competing Bandits in Matching Markets via Super Stability

RISE: Radius of Influence based Subgraph Extraction for 3D Molecular Graph Explanation

Permutation Equivariant Neural Networks for Symmetric Tensors

Primitive Vision: Improving Diagram Understanding in MLLMs

CogMath: Assessing LLMs' Authentic Mathematical Ability from a Human Cognitive Perspective

Enhancing Logits Distillation with Plug&Play Kendall's $\tau$ Ranking Loss

Learning Dynamics under Environmental Constraints via Measurement-Induced Bundle Structures

Toward a Unified Theory of Gradient Descent under Generalized Smoothness

STAR: Learning Diverse Robot Skill Abstractions through Rotation-Augmented Vector Quantization

GHOST: Generalizable One-Shot Federated Graph Learning with Proxy-Based Topology Knowledge Retention

Hi-Patch: Hierarchical Patch GNN for Irregular Multivariate Time Series

GeoPixel: Pixel Grounding Large Multimodal Model in Remote Sensing

You Always Recognize Me (YARM): Robust Texture Synthesis Against Multi-View Corruption

An Efficient Pruner for Large Language Model with Theoretical Guarantee

Online Learning with Unknown Constraints

Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging

Splitting with Importance-aware Updating for Heterogeneous Federated Learning with Large Language Models

SPACE: Your Genomic Profile Predictor is a Powerful DNA Foundation Model

Dataflow-Guided Neuro-Symbolic Language Models for Type Inference

Can Biologically Plausible Temporal Credit Assignment Rules Match BPTT for Neural Similarity? E-prop as an Example

Holistic Physics Solver: Learning PDEs in a Unified Spectral-Physical Space

Stable Fair Graph Representation Learning with Lipschitz Constraint

Epsilon-VAE: Denoising as Visual Decoding

Minimax Optimal Regret Bound for Reinforcement Learning with Trajectory Feedback

Contrastive Private Data Synthesis via Weighted Multi-PLM Fusion

A Classification View on Meta Learning Bandits

Differentiable Solver Search for Fast Diffusion Sampling

When do neural networks learn world models?

Unifying Specialized Visual Encoders for Video Language Models

Tackling View-Dependent Semantics in 3D Language Gaussian Splatting

Potemkin Understanding in Large Language Models

On Temperature Scaling and Conformal Prediction of Deep Classifiers

Privacy-Shielded Image Compression: Defending Against Exploitation from Vision-Language Pretrained Models

Efficient Fine-Grained Guidance for Diffusion Model Based Symbolic Music Generation

Concentration Distribution Learning from Label Distributions

Accelerating Large Language Model Reasoning via Speculative Search

Generation from Noisy Examples

Fast and Low-Cost Genomic Foundation Models via Outlier Removal

Patch-wise Structural Loss for Time Series Forecasting

Stealix: Model Stealing via Prompt Evolution

Improved Expressivity of Hypergraph Neural Networks through High-Dimensional Generalized Weisfeiler-Leman Algorithms

$S^2$FGL: Spatial Spectral Federated Graph Learning

Reasoning Through Execution: Unifying Process and Outcome Rewards for Code Generation

Semantics-aware Test-time Adaptation for 3D Human Pose Estimation

UDora: A Unified Red Teaming Framework against LLM Agents by Dynamically Hijacking Their Own Reasoning

Is Noise Conditioning Necessary for Denoising Generative Models?

POROver: Improving Safety and Reducing Overrefusal in Large Language Models with Overgeneration and Preference Optimization

On the Convergence of Continuous Single-timescale Actor-critic

Bridging Layout and RTL: Knowledge Distillation based Timing Prediction

Compact Matrix Quantum Group Equivariant Neural Networks

MxMoE: Mixed-precision Quantization for MoE with Accuracy and Performance Co-Design

When Data-Free Knowledge Distillation Meets Non-Transferable Teacher: Escaping Out-of-Distribution Trap is All You Need

Matryoshka Quantization

GANQ: GPU-Adaptive Non-Uniform Quantization for Large Language Models

Information Bottleneck-guided MLPs for Robust Spatial-temporal Forecasting

Prediction models that learn to avoid missing values

Inductive Moment Matching

G-Adaptivity: optimised graph-based mesh relocation for finite element methods

TimeBridge: Non-Stationarity Matters for Long-term Time Series Forecasting

Of Mice and Machines: A Comparison of Learning Between Real World Mice and RL Agents

Beyond Task-Specific Reasoning: A Unified Conditional Generative Framework for Abstract Visual Reasoning

Finite-Time Convergence Rates in Stochastic Stackelberg Games with Smooth Algorithmic Agents

Discrepancies are Virtue: Weak-to-Strong Generalization through Lens of Intrinsic Dimension

Persistent Topological Features in Large Language Models

OmniAudio: Generating Spatial Audio from 360-Degree Video

Distributed Event-Based Learning via ADMM

Linear Mode Connectivity between Multiple Models modulo Permutation Symmetries

Stream-level Flow Matching with Gaussian Processes

Combinatorial Reinforcement Learning with Preference Feedback

A Two-Stage Learning-to-Defer Approach for Multi-Task Learning

Latent Imputation before Prediction: A New Computational Paradigm for De Novo Peptide Sequencing

Policy Design for Two-sided Platforms with Participation Dynamics

Parallel Simulation for Log-concave Sampling and Score-based Diffusion Models

Flow of Reasoning: Training LLMs for Divergent Reasoning with Minimal Examples

Bridging Protein Sequences and Microscopy Images with Unified Diffusion Models

Decision Making under the Exponential Family: Distributionally Robust Optimisation with Bayesian Ambiguity Sets

Approximate Forest Completion and Learning-Augmented Algorithms for Metric Minimum Spanning Trees

High Probability Bound for Cross-Learning Contextual Bandits with Unknown Context Distributions

TeDS: Joint Learning of Diachronic and Synchronic Perspectives in Quaternion Space for Temporal Knowledge Graph Completion

Breaking the Quadratic Barrier: Robust Cardinality Sketches for Adaptive Queries

Expressive Power of Graph Neural Networks for (Mixed-Integer) Quadratic Programs

ReferSplat: Referring Segmentation in 3D Gaussian Splatting

Compositional Generalization via Forced Rendering of Disentangled Latents

Prediction-Powered E-Values

HiRemate: Hierarchical Approach for Efficient Re-materialization of Neural Networks

Vision Graph Prompting via Semantic Low-Rank Decomposition

GRAM: A Generative Foundation Reward Model for Reward Generalization

Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM

Variational Phylogenetic Inference with Products over Bipartitions

Agent Reviewers: Domain-specific Multimodal Agents with Shared Memory for Paper Review

PANDAS: Improving Many-shot Jailbreaking via Positive Affirmation, Negative Demonstration, and Adaptive Sampling

Provably Efficient RL for Linear MDPs under Instantaneous Safety Constraints in Non-Convex Feature Spaces

Update Your Transformer to the Latest Release: Re-Basin of Task Vectors

Are High-Quality AI-Generated Images More Difficult for Models to Detect?

Memorization Sinks: Isolating Memorization during LLM Training

Aggregation Buffer: Revisiting DropEdge with a New Parameter Block

SNS-Bench: Defining, Building, and Assessing Capabilities of Large Language Models in Social Networking Services

Gamma Distribution PCA-Enhanced Feature Learning for Angle-Robust SAR Target Recognition

Plausible Token Amplification for Improving Accuracy of Differentially Private In-Context Learning Based on Implicit Bayesian Inference

Policy Optimization for CMDPs with Bandit Feedback: Learning Stochastic and Adversarial Constraints

Learning Condensed Graph via Differentiable Atom Mapping for Reaction Yield Prediction

Distributed Parallel Gradient Stacking (DPGS): Solving Whole Slide Image Stacking Challenge in Multi-Instance Learning

Demonstration Selection for In-Context Learning via Reinforcement Learning

Bayesian Inference for Correlated Human Experts and Classifiers

Long-Short Alignment for Effective Long-Context Modeling in LLMs

In-Context Learning and Occam's Razor

Deep Principal Support Vector Machines for Nonlinear Sufficient Dimension Reduction

Simple Path Structural Encoding for Graph Transformers

Heterogeneous Sufficient Dimension Reduction and Subspace Clustering

VideoRoPE: What Makes for Good Video Rotary Position Embedding?

M3-JEPA: Multimodal Alignment via Multi-gate MoE based on the Joint-Embedding Predictive Architecture

Adaptive Localization of Knowledge Negation for Continual LLM Unlearning

RuleAdapter: Dynamic Rules for training Safety Reward Models in RLHF

RUN: Reversible Unfolding Network for Concealed Object Segmentation

Noise-Guided Predicate Representation Extraction and Diffusion-Enhanced Discretization for Scene Graph Generation

LADA: Scalable Label-Specific CLIP Adapter for Continual Learning

Optimal Information Retention for Time-Series Explanations

Video-Enhanced Offline Reinforcement Learning: A Model-Based Approach

Taming Rectified Flow for Inversion and Editing

Navigating Conflicting Views: Harnessing Trust for Learning

Optimizing Large Language Model Training Using FP4 Quantization

LLMs Can Reason Faster Only If We Let Them

FlatQuant: Flatness Matters for LLM Quantization

Counting atoms faster: policy-based nuclear magnetic resonance pulse sequencing for atomic abundance measurement

Constrained Exploitability Descent: An Offline Reinforcement Learning Method for Finding Mixed-Strategy Nash Equilibrium

UniDB: A Unified Diffusion Bridge Framework via Stochastic Optimal Control

Multi-Modal Object Re-identification via Sparse Mixture-of-Experts

Few-Shot Learner Generalizes Across AI-Generated Image Detection

Can MLLMs Reason in Multimodality? EMMA: An Enhanced MultiModal ReAsoning Benchmark

RIFLEx: A Free Lunch for Length Extrapolation in Video Diffusion Transformers

CUPS: Improving Human Pose-Shape Estimators with Conformalized Deep Uncertainty

A Chaotic Dynamics Framework Inspired by Dorsal Stream for Event Signal Processing

An Error Analysis of Flow Matching for Deep Generative Modeling

S2-Track: A Simple yet Strong Approach for End-to-End 3D Multi-Object Tracking

Mitigating Heterogeneous Token Overfitting in LLM Knowledge Editing

Watch Out Your Album! On the Inadvertent Privacy Memorization in Multi-Modal Large Language Models

LOGO: Long cOntext aliGnment via efficient preference Optimization

Towards Learning to Complete Anything in Lidar

Curvature-aware Graph Attention for PDEs on Manifolds

Clone-Robust AI Alignment

AKORN: Adaptive Knots generated Online for RegressioN splines

Latent Diffusion Planning for Imitation Learning

Certification for Differentially Private Prediction in Gradient-Based Training

EraseAnything: Enabling Concept Erasure in Rectified Flow Transformers

Feature out! Let Raw Image as Your Condition for Blind Face Restoration

Efficient Robust Conformal Prediction via Lipschitz-Bounded Networks

TtBA: Two-third Bridge Approach for Decision-Based Adversarial Attack

GrokFormer: Graph Fourier Kolmogorov-Arnold Transformers

On Exact Bit-level Reversible Transformers Without Changing Architecture

EnsLoss: Stochastic Calibrated Loss Ensembles for Preventing Overfitting in Classification

Mixture of Lookup Experts

Automatic Differentiation of Optimization Algorithms with Time-Varying Updates

Nonparametric Teaching for Graph Property Learners

DocVXQA: Context-Aware Visual Explanations for Document Question Answering

Learning Mixtures of Experts with EM: A Mirror Descent Perspective

MixMin: Finding Data Mixtures via Convex Minimization

Hierarchical Graph Tokenization for Molecule-Language Alignment

Relational Conformal Prediction for Correlated Time Series

Heads up! Large Language Models Can Perform Tasks Without Your Instruction via Selective Attention Head Masking

Orient Anything: Learning Robust Object Orientation Estimation from Rendering 3D Models

GoIRL: Graph-Oriented Inverse Reinforcement Learning for Multimodal Trajectory Prediction

Ex-VAD: Explainable Fine-grained Video Anomaly Detection Based on Visual-Language Models

PaperBench: Evaluating AI’s Ability to Replicate AI Research

Dimensionality Reduction on Complex Vector Spaces for Euclidean Distance with Dynamic Weights

Improving the Continuity of Goal-Achievement Ability via Policy Self-Regularization for Goal-Conditioned Reinforcement Learning

Whoever Started the interference Should End It: Guiding Data-Free Model Merging via Task Vectors

Variance-Reduced Forward-Reflected-Backward Splitting Methods for Nonmonotone Generalized Equations

WMAdapter: Adding WaterMark Control to Latent Diffusion Models

Vector Grimoire: Codebook-based Shape Generation under Raster Image Supervision

Interpolating Neural Network-Tensor Decomposition (INN-TD): a scalable and interpretable approach for large-scale physics-based problems

Nonparametric Modern Hopfield Models

Strategic Planning: A Top-Down Approach to Option Generation

Demystifying Cost-Efficiency in LLM Serving over Heterogeneous GPUs

FlexiClip: Locality-Preserving Free-Form Character Animation

ADHMR: Aligning Diffusion-based Human Mesh Recovery via Direct Preference Optimization

video-SALMONN-o1: Reasoning-enhanced Audio-visual Large Language Model

FedPHA: Federated Prompt Learning for Heterogeneous Client Adaptation

Modulated Diffusion: Accelerating Generative Modeling with Modulated Quantization

On the Impact of Performative Risk Minimization for Binary Random Variables

Convex Markov Games: A New Frontier for Multi-Agent Reinforcement Learning

Think Twice, Act Once: A Co-Evolution Framework of LLM and RL for Large-Scale Decision Making

How Transformers Learn Regular Language Recognition: A Theoretical Study on Training Dynamics and Implicit Bias

Divide and Conquer: Exploring Language-centric Tree Reasoning for Video Question-Answering

BiMaCoSR: Binary One-Step Diffusion Model Leveraging Flexible Matrix Compression for Real Super-Resolution

MTSTRec: Multimodal Time-Aligned Shared Token Recommender

EEG-Language Pretraining for Highly Label-Efficient Clinical Phenotyping

PatchPilot: A Cost-Efficient Software Engineering Agent with Early Attempts on Formal Verification

Generative Point Cloud Registration

Learning to Quantize for Training Vector-Quantized Networks

Understanding Model Ensemble in Transferable Adversarial Attack

TuCo: Measuring the Contribution of Fine-Tuning to Individual Responses of LLMs

Beyond Matryoshka: Revisiting Sparse Coding for Adaptive Representation

Thinking LLMs: General Instruction Following with Thought Generation

CtrlSynth: Controllable Image Text Synthesis for Data-Efficient Multimodal Learning

Cross-environment Cooperation Enables Zero-shot Multi-agent Coordination

Beyond One-Hot Labels: Semantic Mixing for Model Calibration

Generalists vs. Specialists: Evaluating LLMs on Highly-Constrained Biophysical Sequence Optimization Tasks

Lightweight Protocols for Distributed Private Quantile Estimation

Accurate and Efficient World Modeling with Masked Latent Transformers

MIRROR: Make Your Object-Level Multi-View Generation More Consistent with Training-Free Rectification

GCAL: Adapting Graph Models to Evolving Domain Shifts

ENAHPool: The Edge-Node Attention-based Hierarchical Pooling for Graph Neural Networks

STP: Self-play LLM Theorem Provers with Iterative Conjecturing and Proving

Towards LLM Unlearning Resilient to Relearning Attacks: A Sharpness-Aware Minimization Perspective and Beyond

Exploring and Mitigating Adversarial Manipulation of Voting-Based Leaderboards

Enhancing Parallelism in Decentralized Stochastic Convex Optimization

Surrogate Prompt Learning: Towards Efficient and Diverse Prompt Learning for Vision-Language Models

PARM: Multi-Objective Test-Time Alignment via Preference-Aware Autoregressive Reward Model

Dynamic Mixture of Curriculum LoRA Experts for Continual Multimodal Instruction Tuning

XAttnMark: Learning Robust Audio Watermarking with Cross-Attention

A Mathematical Framework for AI-Human Integration in Work

DAMA: Data- and Model-aware Alignment of Multi-modal LLMs

SpikeVideoFormer: An Efficient Spike-Driven Video Transformer with Hamming Attention and $\mathcal{O}(T)$ Complexity

MP-Nav: Enhancing Data Poisoning Attacks against Multimodal Learning

Position: Deep Learning is Not So Mysterious or Different

Position: Truly Self-Improving Agents Require Intrinsic Metacognitive Learning

Position: It Is Time We Test Neural Computation In Vitro

Position: Enough of Scaling LLMs! Lets Focus on Downscaling

Position: Graph Matching Systems Deserve Better Benchmarks

Position: The Right to AI

Position: The Most Expensive Part of an LLM *should* be its Training Data

Position: Retrieval-augmented systems can be dangerous medical communicators

Position: AI Should Not Be An Imitation Game: Centaur Evaluations

Position: Causal Machine Learning Requires Rigorous Synthetic Experiments for Broader Adoption

Position: LLMs Need a Bayesian Meta-Reasoning Framework for More Robust and Generalizable Reasoning

Position: AI's growing due process problem

Position: Probabilistic Modelling is Sufficient for Causal Inference

Position: Don't Use the CLT in LLM Evals With Fewer Than a Few Hundred Datapoints

Position: Contextual Integrity is Inadequately Applied to Language Models

Position: Rethinking Explainable Machine Learning as Applied Statistics

Position: Challenges and Future Directions of Data-Centric AI Alignment

Position: Generative AI Regulation Can Learn from Social Media Regulation

Position: Iterative Online-Offline Joint Optimization is Needed to Manage Complex LLM Copyright Risks

Position: Build Agent Advocates, Not Platform Agents

Position: Humanity Faces Existential Risk from Gradual Disempowerment

Position: Machine Learning Models Have a Supply Chain Problem

When Maximum Entropy Misleads Policy Optimization

Position: We Need Responsible, Application-Driven (RAD) AI Research

AxBench: Steering LLMs? Even Simple Baselines Outperform Sparse Autoencoders

Scaling Laws in Patchification: An Image Is Worth 50,176 Tokens And More

TabICL: A Tabular Foundation Model for In-Context Learning on Large Data

Stable Offline Value Function Learning with Bisimulation-based Representations

QPRL: Learning Optimal Policies with Quasi-Potential Functions for Asymmetric Traversal

Masked Autoencoders Are Effective Tokenizers for Diffusion Models

Rethinking the Bias of Foundation Model under Long-tailed Distribution

Robust Multimodal Large Language Models Against Modality Conflict

Dimension-Free Adaptive Subgradient Methods with Frequent Directions

Multi-objective Linear Reinforcement Learning with Lexicographic Rewards

In-Context Reinforcement Learning From Suboptimal Historical Data

Meta Optimality for Demographic Parity Constrained Regression via Post-Processing

PDE-Controller: LLMs for Autoformalization and Reasoning of PDEs

Tractable Transformers for Flexible Conditional Generation

Scaling Probabilistic Circuits via Monarch Matrices

Pessimism Principle Can Be Effective: Towards a Framework for Zero-Shot Transfer Reinforcement Learning

Subspace Optimization for Large Language Models with Convergence Guarantees

Elucidating the design space of language models for image generation

MCU: An Evaluation Framework for Open-Ended Game Agents

Volume-Aware Distance for Robust Similarity Learning

Let LLM Tell What to Prune and How Much to Prune

Strategy Coopetition Explains the Emergence and Transience of In-Context Learning

MODULI: Unlocking Preference Generalization via Diffusion Models for Offline Multi-Objective Reinforcement Learning

R*: Efficient Reward Design via Reward Structure Evolution and Parameter Alignment Optimization with Large Language Models

Avoiding Catastrophe in Online Learning by Asking for Help

(How) Can Transformers Predict Pseudo-Random Numbers?