Poster in Workshop: 1st Workshop on Foundation Models for Structured Data (FMSD)
Lights Out, Tabs On: Advancing Row-Column Encoding for Tabular LLMs
Yi-Kai Zhang · Huai-Hong Yin · Xin Li · Haoyu Cao · Yinsong Liu · Deqiang Jiang · Xing Sun · De-Chuan Zhan · Han-Jia Ye
Large language models (LLMs) excel at understanding diverse real-world data and achieving cross-domain generalization, but struggle with row-level tabular prediction and table-level question answering (QA). Existing tabular LLMs serialize tables into 1D text using either language templates (e.g., "feature name is value"), which lose 2D spatial relationships, or structured formats (e.g., HTML tables), which disrupt feature name-value associations. In this paper, we introduce LoTo: Lights out, Tabs on, a novel tabular LLM equipped with an axial row-column encoder. Inspired by the "Lights Out" game, LoTo prioritizes attention on cells sharing the same row and column. It incorporates tunable 2D positional encodings to enhance structural awareness, binned embeddings to improve numerical recognition, and a fine-grained cell projector to preserve tabular information. We develop a comprehensive training and evaluation benchmark for general tabular instruction fine-tuning. Experimental results demonstrate that LoTo achieves leading performance on both row-level and table-level tasks, establishing a foundation for general tabular LLMs.
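The axial row-column attention pattern described above can be illustrated with a minimal sketch: build a boolean mask over flattened table cells in which each cell may attend only to cells sharing its row or column. This is an assumption-laden illustration of the general idea, not the paper's actual encoder; the function name and mask construction are hypothetical.

```python
import numpy as np

def row_column_attention_mask(n_rows: int, n_cols: int) -> np.ndarray:
    """Boolean mask over flattened cells: True where attention is allowed.

    Illustrative sketch: cell (r, c) may attend to any cell sharing its
    row r or its column c, mirroring the "Lights Out" pattern in the
    abstract. Cells are flattened in row-major order.
    """
    rows = np.repeat(np.arange(n_rows), n_cols)  # row index of each flattened cell
    cols = np.tile(np.arange(n_cols), n_rows)    # column index of each flattened cell
    same_row = rows[:, None] == rows[None, :]
    same_col = cols[:, None] == cols[None, :]
    return same_row | same_col

mask = row_column_attention_mask(3, 4)
# Each cell attends to its row (4 cells) plus its column (3 cells),
# counting itself once: 4 + 3 - 1 = 6 allowed positions per cell.
print(mask.sum(axis=1))
```

Such a mask would typically be passed to a transformer attention layer (disallowed positions set to -inf before the softmax) so that off-row, off-column cells are "lights out" while the shared row and column stay on.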