Poster in Workshop: Methods and Opportunities at Small Scale (MOSS)
Parity Requires Unified Input Dependence and Negative Eigenvalues in SSMs
Behnoush Khavari · Jayesh Khullar · Mehran Shakerinava · Jerry Huang · Siamak Ravanbakhsh · Sarath Chandar
Keywords: [ State-tracking ] [ expressivity ] [ Linear Recurrent Neural Networks ] [ SSMs ]
Recent work has shown that linear recurrent neural network (LRNN) models such as S4D, Mamba, and DeltaNet lack state-tracking capability because their transition matrices are either time-invariant or restricted in eigenvalue range. To address this, input-dependent transition matrices, particularly those that are complex or non-triangular, have been proposed to improve SSM performance on such tasks. Existing theorems show that input-independent SSMs and SSMs with non-negative eigenvalues are each incapable of solving simple state-tracking tasks like parity, regardless of depth, but they do not address whether combining the two types of layer in a multilayer SSM could help. We investigate this question for efficient SSMs with diagonal transition matrices and show that such combinations still fail to solve parity. This implies that a recurrence layer must both be input-dependent and have negative eigenvalues. Our experiments support this conclusion by analyzing an SSM that combines S4D and Mamba layers.
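As a rough illustration of why the two properties are needed in the same layer, the sketch below runs a one-dimensional multiplicative recurrence h_t = a(x_t) · h_{t-1} with h_0 = 1. It is a toy under stated assumptions, not the authors' construction or experimental setup: the function names, the choice a(x) = 1 - 2x, and the omission of any additive input term are all hypothetical simplifications made for this example. With an input-dependent transition that can take the value -1, the sign of the state tracks parity; with a fixed transition, the state of this simplified recurrence depends only on sequence length.

```python
from itertools import product


def scalar_recurrence(bits, transition):
    """Run h_t = transition(x_t) * h_{t-1} from h_0 = 1 and return the final state."""
    h = 1.0
    for x in bits:
        h = transition(x) * h
    return h


def signed_transition(x):
    # Assumed toy transition: input-dependent AND allowed to be negative.
    # A 1-bit flips the sign of the state, a 0-bit leaves it unchanged.
    return 1.0 - 2.0 * x


def fixed_negative_transition(x):
    # Input-independent transition with a negative eigenvalue: ignores x.
    return -1.0


def parity_from_state(h):
    # Read parity off the sign: +1 means even (0), -1 means odd (1).
    return int(h < 0)


def solves_parity(length, transition):
    """Check the sign readout against true parity on every bit string of this length."""
    return all(
        parity_from_state(scalar_recurrence(bits, transition)) == sum(bits) % 2
        for bits in product((0, 1), repeat=length)
    )


if __name__ == "__main__":
    print(solves_parity(6, signed_transition))
    print(solves_parity(6, fixed_negative_transition))
```

In this toy, the fixed transition produces the same final state for every input of a given length, so no readout of that state can distinguish even from odd inputs. The sketch says nothing about the multilayer, finite-precision setting the abstract addresses; it only shows the sign-flip mechanism that negative, input-dependent eigenvalues make available.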