Language Models Are Good Tabular Learners

Authors: Zhenhan Huang, Kavitha Srinivas, Horst Samulowitz, Niharika S. D'Souza, Charu C. Aggarwal, Pin-Yu Chen, Jianxi Gao

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We examine the performance of TDTransformer on the standard tabular data benchmark OpenML. Extensive experiments on more than 70 tabular data sets show the superiority of TDTransformer. In summary, the main contributions of this work are as follows: (...) We also test the performance using RoBERTa (Liu, 2019) as the backbone model (see Appendix). The [CLS] embedding is used for the prediction. The training pipeline, similar to the classic pre-training and fine-tuning paradigm, consists of two steps: the first step pre-trains the model; the second step fine-tunes the model initialized with the pre-trained weights.
Researcher Affiliation | Collaboration | Zhenhan Huang, Department of Computer Science, Rensselaer Polytechnic Institute; Kavitha Srinivas, IBM Research; Horst Samulowitz, IBM Research; Niharika S. D'Souza, IBM Research; Charu C. Aggarwal, IBM Research; Pin-Yu Chen, IBM Research; Jianxi Gao, Department of Computer Science, Rensselaer Polytechnic Institute
Pseudocode | No | The paper includes a figure (Figure 1) illustrating the TDTransformer framework pipeline, but it does not contain any structured pseudocode or algorithm blocks describing the methodology step by step in a code-like format.
Open Source Code | Yes | We release our code at https://github.com/Zhenhan-Huang/TDTransformer.
Open Datasets | Yes | We examine the performance of TDTransformer on the standard tabular data benchmark OpenML. Extensive experiments on more than 70 tabular data sets show the superiority of TDTransformer. (...) We use 76 real-world tabular classification datasets in the standard OpenML benchmark (which are manually curated for effective benchmarking). The details of the tables are given in Appendix Section A.4. OpenML benchmark: https://www.openml.org/
Dataset Splits | Yes | We use 76 real-world tabular classification datasets in the standard OpenML benchmark (which are manually curated for effective benchmarking). The train/validation/test split is 72%/8%/20% for each OpenML dataset. We use accuracy as the metric to measure performance for all classification data sets.
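The reported 72%/8%/20% split can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, seeding, and rounding behavior are assumptions.

```python
import random

def split_indices(n, train_frac=0.72, val_frac=0.08, seed=0):
    """Partition n sample indices into train/validation/test sets.

    The 72%/8%/20% ratio follows the paper's stated OpenML splits;
    the shuffling seed and floor-based rounding are illustrative choices.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    # Remaining samples (here 20%) form the test set.
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

train, val, test = split_indices(1000)
# 720 train, 80 validation, 200 test indices
```

In practice a library helper such as scikit-learn's `train_test_split` (applied twice) would give the same partition sizes with stratification support.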
Hardware Specification | Yes | We conducted all experiments using a single A40 Tensor Core GPU and an EPYC 7232P CPU.
Software Dependencies | No | TDTransformer uses the pre-trained BERT tokenizer (Devlin, 2018) and the Adam optimizer (Kingma, 2014) without weight decay. The hidden dimension is 512 and the model depth is 12. The number of quantiles for PLE is 64. In both the pre-training and fine-tuning processes, we use an early stopping strategy (Yao et al., 2007) with a patience of 10. The maximum number of training epochs is 200, with a batch size of 128. The corruption parameter of the pre-training process is set to 0.5. When there are empty cells in a column, we replace them with the most common value in that column. While software components such as the BERT tokenizer and the Adam optimizer are mentioned, specific version numbers for these or other crucial libraries (e.g., PyTorch, TensorFlow) are not provided.
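The stated missing-value handling (replace empty cells with the column's most common value) amounts to mode imputation. A minimal sketch, assuming empty cells are represented as `None`; the function name is hypothetical:

```python
from collections import Counter

def impute_most_common(column):
    """Replace empty cells (None) with the most common non-empty value,
    mirroring the paper's stated handling of missing cells in a column."""
    observed = [v for v in column if v is not None]
    fill = Counter(observed).most_common(1)[0][0]
    return [fill if v is None else v for v in column]

impute_most_common(["a", None, "b", "a", None])
# → ["a", "a", "b", "a", "a"]
```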
Experiment Setup | Yes | The hidden dimension is 512 and the model depth is 12. The number of quantiles for PLE is 64. In both the pre-training and fine-tuning processes, we use an early stopping strategy (Yao et al., 2007) with a patience of 10. The maximum number of training epochs is 200, with a batch size of 128. The corruption parameter of the pre-training process is set to 0.5.
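The early-stopping schedule (patience 10, at most 200 epochs) can be sketched as below. The patience and epoch cap match the paper; the exact stopping rule (strict improvement in validation loss) is an assumption, and the helper is illustrative rather than the authors' implementation.

```python
def early_stop_epochs(val_losses, patience=10, max_epochs=200):
    """Return the number of epochs run under early stopping.

    Training halts once validation loss has failed to improve for
    `patience` consecutive epochs, or when `max_epochs` is reached."""
    best = float("inf")
    since_improve = 0
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        if loss < best:
            best = loss
            since_improve = 0
        else:
            since_improve += 1
        if since_improve >= patience:
            return epoch  # stopped early
    return min(len(val_losses), max_epochs)
```

For example, a run whose validation loss improves for three epochs and then plateaus would stop ten epochs after the last improvement, well short of the 200-epoch cap.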