

Spotlight Poster

Prediction models that learn to avoid missing values

Lena Stempfle · Anton Matsson · Newton Mwai · Fredrik Johansson

East Exhibition Hall A-B #E-1701
Tue 15 Jul 4:30 p.m. PDT — 7 p.m. PDT

Abstract:

Handling missing values at test time is challenging for machine learning models, especially when aiming for both high accuracy and interpretability. Established approaches often add bias through imputation or excessive model complexity via missingness indicators. Moreover, either method can obscure interpretability, making it harder to understand how the model utilizes the observed variables in predictions. We propose missingness-avoiding (MA) machine learning, a general framework for training models to rarely require the values of missing (or imputed) features at test time. We create tailored MA learning algorithms for decision trees, tree ensembles, and sparse linear models by incorporating classifier-specific regularization terms in their learning objectives. The tree-based models leverage contextual missingness by reducing reliance on missing values based on the observed context. Experiments on real-world datasets demonstrate that MA-DT, MA-LASSO, MA-RF, and MA-GBT effectively reduce the reliance on features with missing values while maintaining predictive performance competitive with their unregularized counterparts. This shows that our framework gives practitioners a powerful tool to maintain interpretability in predictions with test-time missing values.
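To make the sparse-linear case concrete, here is a minimal sketch of a missingness-avoiding LASSO-style objective: an L1 penalty whose per-feature weight grows with how often that feature is missing, so the learner is pushed toward coefficients of reliably observed features. This is an illustrative reconstruction under stated assumptions, not the authors' MA-LASSO implementation; the function name `ma_lasso`, the penalty weighting, and the synthetic data are all hypothetical.

```python
import numpy as np

def ma_lasso(X, y, miss_rate, alpha=0.5, n_iter=2000):
    """Linear regression with a missingness-weighted L1 penalty,
    fit by proximal gradient descent (ISTA).

    Each coefficient j is penalized by alpha * miss_rate[j], so features
    that are frequently missing at test time are shrunk toward zero.
    Hypothetical sketch of the MA-LASSO idea, not the authors' code.
    """
    n, d = X.shape
    w = np.zeros(d)
    # step size from the Lipschitz constant of the least-squares gradient
    lr = n / (np.linalg.norm(X, 2) ** 2)
    pen = alpha * np.asarray(miss_rate, dtype=float)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n
        w = w - lr * grad
        # soft-thresholding proximal step with a per-feature penalty
        w = np.sign(w) * np.maximum(np.abs(w) - lr * pen, 0.0)
    return w

# Two highly correlated features; feature 1 is missing 60% of the time
# at test time, feature 0 is always observed.
rng = np.random.default_rng(0)
n = 500
z = rng.normal(size=n)
x0 = z + 0.1 * rng.normal(size=n)
x1 = z + 0.1 * rng.normal(size=n)
X = np.column_stack([x0, x1])
y = z + 0.05 * rng.normal(size=n)

w = ma_lasso(X, y, miss_rate=[0.0, 0.6])
```

Because the two features carry nearly the same signal, the missingness-weighted penalty moves all the predictive weight onto the always-observed feature, leaving accuracy essentially unchanged.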

Lay Summary:

Machine learning models often struggle when values are missing at test time, a common issue in real-world applications like healthcare or finance. Existing solutions either fill in missing values (which can introduce bias) or make models more complex (which can reduce interpretability). This makes it harder to understand and trust model decisions.

We tackle this problem by developing missingness-avoiding (MA) machine learning, a new framework that encourages models to minimize reliance on missing features during prediction. We designed custom versions of this method for several models, including decision trees, random forests, gradient boosting, and sparse linear models. Our approach includes tailored regularization that encourages models to base decisions on observed inputs rather than imputed or missing data.

In experiments across several real-world datasets, our models remained accurate while becoming significantly less dependent on missing features. This work matters because it allows practitioners to make accurate, interpretable predictions even when some values are missing, offering safer, more trustworthy ML tools for critical domains.
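For the tree-based models, the same idea can be sketched as a penalized split criterion: when scoring a candidate split, subtract a penalty proportional to how often the splitting feature is missing among the samples reaching that node, so the tree prefers splits it can actually evaluate at test time. This is a hypothetical illustration of the concept, not the MA-DT algorithm from the paper; `ma_split`, the variance-reduction gain, and the penalty form are assumptions.

```python
import numpy as np

def ma_split(X, y, mask, alpha=0.5):
    """Choose the best single split on a regression target, penalizing
    features that are often missing at this node.

    X    : (n, d) feature matrix (missing entries already imputed)
    mask : (n, d) boolean array, True where a value was originally missing
    Returns (feature index, threshold, penalized score).
    Hypothetical MA-DT-style criterion, not the authors' code.
    """
    n, d = X.shape
    parent_var = y.var()
    best = (None, None, -np.inf)
    for j in range(d):
        for t in np.unique(X[:, j])[:-1]:
            left = X[:, j] <= t
            right = ~left
            # variance reduction achieved by the split
            gain = parent_var - (left.mean() * y[left].var()
                                 + right.mean() * y[right].var())
            # penalty: fraction of samples at this node whose value
            # for feature j was missing
            score = gain - alpha * mask[:, j].mean()
            if score > best[2]:
                best = (j, t, score)
    return best

# Two identical features; feature 1 is missing for ~70% of samples.
rng = np.random.default_rng(1)
n = 200
x0 = rng.normal(size=n)
X = np.column_stack([x0, x0])
y = (x0 > 0).astype(float)
mask = np.zeros((n, 2), dtype=bool)
mask[:, 1] = rng.random(n) < 0.7

j, t, score = ma_split(X, y, mask)
```

Even though both features split the target equally well, the penalty steers the tree to the reliably observed one, which is the "contextual missingness" behavior the summary describes.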
