TabFlex: Scaling Tabular Learning to Millions with Linear Attention
Authors: Yuchen Zeng, Tuan Dinh, Wonjun Kang, Andreas C Mueller
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our extensive evaluations demonstrate that TABFLEX can achieve over a 2× speedup compared to TABPFN and a 1.5× speedup over XGBoost, outperforming 25 tested baselines in terms of efficiency across a diverse range of datasets. |
| Researcher Affiliation | Collaboration | 1Work done during an internship at the Gray Systems Lab, Microsoft 2University of Wisconsin-Madison 3University of California San Francisco 4Furiosa AI 5Seoul National University 6Gray Systems Lab, Microsoft. Correspondence to: Andreas C. Müller <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 Conditional Model Selection Input :A dataset D with n instances of d features |
| Open Source Code | Yes | Our code is available at https://github.com/microsoft/ticl. |
| Open Datasets | Yes | We evaluate TABFLEX's performance and speed across 115 OpenML tabular datasets (Vanschoren et al., 2013). |
| Dataset Splits | Yes | For each dataset, we consider ten different train/test splits, computing the score mean and standard deviation, as well as the total runtime per 1000 instances. |
| Hardware Specification | Yes | Each model is trained on a single Nvidia A100 80GB PCIe GPU. |
| Software Dependencies | No | In our implementation, we adopt a straightforward PyTorch approach to linear attention rather than an HBM-efficient method. We employ the concise two-line implementation presented in Listing 1. In the following lemma, we demonstrate that this straightforward implementation only incurs a marginal increase in HBM accesses and HBM memory usage. |
| Experiment Setup | Yes | Table 6 summarizes the hyperparameters selected for training TABFLEX-S100, TABFLEX-L100, and TABFLEX-H1K. For all three methods, we utilize the same embedding size of 512, consistent with TABPFN. |
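The Software Dependencies row above quotes the paper's description of a "straightforward PyTorch approach to linear attention." The paper's actual Listing 1 is not reproduced in this report; the sketch below is only an illustration of the standard linear-attention trick it refers to (reassociating the matmuls so cost is O(n·d²) rather than O(n²·d)). The `elu(x) + 1` feature map is an assumption borrowed from common linear-attention formulations, not confirmed to be the paper's choice.

```python
import torch
import torch.nn.functional as F

def linear_attention(q, k, v, eps=1e-6):
    """Illustrative linear attention (not the paper's exact Listing 1).

    q, k: (..., n, d); v: (..., n, e). Returns (..., n, e).
    """
    # Positive feature map (assumed): elu(x) + 1 keeps scores non-negative.
    q, k = F.elu(q) + 1, F.elu(k) + 1
    # Reassociate: compute K^T V first, an O(n * d * e) contraction.
    kv = torch.einsum("...nd,...ne->...de", k, v)
    # Per-query normalizer: q_i . (sum over keys), analogous to the softmax denominator.
    z = 1.0 / (torch.einsum("...nd,...d->...n", q, k.sum(dim=-2)) + eps)
    # Numerator Q (K^T V), scaled row-wise by the normalizer.
    return torch.einsum("...nd,...de,...n->...ne", q, kv, z)
```

With a single key/value pair, the normalized output reduces to `v` exactly, which is a quick sanity check on the normalization.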