Poster
Fast Min-$\epsilon$ Segmented Regression using Constant-Time Segment Merging
Ansgar Lößer · Max Schlecht · Florian Schintke · Joel Witzke · Matthias Weidlich · Björn Scheuermann
East Exhibition Hall A-B #E-2000
A well-known and fundamental technique of machine learning and statistical analysis is regression. Given a dataset of samples with a known input value and a correlated measured result, it is possible to derive a function that approximates the correlation between these two variables as closely as possible. This can be used to either derive knowledge about the underlying data or to predict the output value for unseen input values. In some use cases, the dataset contains ordered samples and suddenly changes behavior at a certain point. Correctly detecting these breakpoints is quite challenging. Current state-of-the-art algorithms are either exceedingly compute-heavy with a growing number of samples or result in a significantly worse regression function.In this paper, we present and evaluate a new algorithm for segmented regression. In our evaluation, this new approach needed much fewer computational resources -- even compared to the other heuristics -- while resulting in regressions very close to the optimal solution, without creating additional breakpoints. Given that more data enables more precise models, we think that our approach enables faster analysis and much more precise models in many different fields, including time-series analysis, ecology, econometrics, and gene analysis.