Small Models are LLM Knowledge Triggers for Medical Tabular Prediction
Authors: Jiahuan Yan, Jintai Chen, Chaowen Hu, Bo Zheng, Yaojun Hu, Jimeng Sun, Jian Wu
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive experiments on widely used medical domain tabular datasets show that, without access to gold labels, applying SERSAL to the OpenAI GPT reasoning process attains substantial improvement compared to linguistic prompting methods, which serves as an orthogonal direction for tabular LLMs, and an increasing prompting bonus is observed as more powerful LLMs appear. Codes are available at https://github.com/jyansir/sersal. |
| Researcher Affiliation | Academia | 1 College of Computer Science and Technology, Zhejiang University; 2 Thrust of Artificial Intelligence, Information Hub, HKUST (GZ); 3 Computer Science Department, University of Illinois Urbana-Champaign; 4 The Second Affiliated Hospital, Zhejiang University School of Medicine; 5 Zhejiang Key Laboratory of Medical Imaging Artificial Intelligence |
| Pseudocode | Yes | Algorithm 1 Unsupervised SERSAL. Line 2: LLM pseudo labeling (Sec. 2.1); Line 3-5: Small model teaching (Sec. 2.2); Line 6: Quality control (Sec. 2.3); Line 7-9: Reverse tuning (Sec. 2.4). |
| Open Source Code | Yes | Codes are available at https://github.com/jyansir/sersal. |
| Open Datasets | Yes | We evaluate on ten widely recognized medical diagnosis tabular datasets on various diseases: Heart Failure Prediction (HF, Detrano et al. (1989)), Lung Cancer Prediction (LC, Ahmad & Mayya (2020)), Early Classification of Diabetes (ECD, Islam et al. (2020)), Indian Liver Patient Records (LI, Ramana et al. (2012)), Hepatitis C Prediction (HE, Hoffmann et al. (2018)), Pima Indians Diabetes Database (PID, Smith et al. (1988)), Framingham Heart Study (FH, O'Donnell & Elosua (2008)), Stroke Prediction (ST, Fedesoriano (2020)), COVID-19 Presence (CO, Hemanthhari (2020)) and Anemia Disease (AN, Kilicarslan et al. (2021)). |
| Dataset Splits | Yes | We split each tabular dataset (80% for training and 20% for testing), and keep the same label distribution in each split. |
| Hardware Specification | Yes | All experiments are conducted with PyTorch on Python 3.8 and run on NVIDIA RTX 3090. |
| Software Dependencies | Yes | All experiments are conducted with PyTorch on Python 3.8 and run on NVIDIA RTX 3090. |
| Experiment Setup | Yes | For the small model, we uniformly use FT-Transformer with the default model and training configurations provided in the original paper (Gorishniy et al., 2021). For SERSAL, the only adjustable hyper-parameter is the temperature of DivideMix (Li et al., 2019) with choices of 0.5, 5.0 and 10.0 in line 5 of Algorithm 1, which is selected by the metric of the early stopping set (D_es^(t) in line 4 of Algorithm 1). ... Additionally, we uniformly set the early stopping patience m to 5. The best temperature is selected based on the training loss of the early stopping subset D_es. |
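The Pseudocode row summarizes Algorithm 1's four stages (LLM pseudo labeling, small model teaching, quality control, reverse tuning). A hypothetical control-flow skeleton of that loop is sketched below; every function name is an assumption for illustration, not the authors' API.

```python
# Hypothetical skeleton of Algorithm 1 (Unsupervised SERSAL), following the
# stage breakdown in the Pseudocode row. All callables are stand-ins.
def sersal_loop(unlabeled_data, llm_label, teach_small_model,
                quality_control, reverse_tune, n_rounds=1):
    labels = llm_label(unlabeled_data)                 # line 2: LLM pseudo labeling
    for _ in range(n_rounds):
        model = teach_small_model(unlabeled_data, labels)           # lines 3-5
        clean, noisy = quality_control(model, unlabeled_data, labels)  # line 6
        labels = reverse_tune(clean, noisy)            # lines 7-9: refine labels
    return labels

# toy stand-ins, only to show the control flow end to end
out = sersal_loop(
    [1, 2, 3],
    llm_label=lambda xs: [0] * len(xs),
    teach_small_model=lambda xs, ys: "small_model",
    quality_control=lambda m, xs, ys: (xs[:2], xs[2:]),
    reverse_tune=lambda clean, noisy: [1] * (len(clean) + len(noisy)),
)
print(out)  # [1, 1, 1]
```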
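The Dataset Splits row describes an 80/20 split that preserves the label distribution, i.e. a stratified split. A minimal stdlib sketch of that procedure, assuming binary labels and per-class shuffling (the paper does not specify its exact splitting code):

```python
# Minimal stratified 80/20 split sketch: partition each class separately so
# both splits keep the same label ratio as the full dataset.
import random
from collections import defaultdict

def stratified_split(labels, test_frac=0.2, seed=0):
    """Return (train_idx, test_idx) preserving per-class label proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train_idx, test_idx = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        n_test = round(len(idxs) * test_frac)
        test_idx.extend(idxs[:n_test])
        train_idx.extend(idxs[n_test:])
    return sorted(train_idx), sorted(test_idx)

labels = [0] * 80 + [1] * 20          # toy imbalanced binary dataset
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))  # 80 20
```

In practice `sklearn.model_selection.train_test_split(..., stratify=labels)` does the same job; the manual version is shown only to make the preserved-ratio invariant explicit.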
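The Experiment Setup row states that the only tunable hyper-parameter is the DivideMix temperature, chosen from {0.5, 5.0, 10.0} by the training loss on the early-stopping subset. A hedged sketch of that selection step, with a mock loss table standing in for actual training runs:

```python
# Hypothetical sketch of SERSAL's temperature selection: evaluate each
# candidate DivideMix temperature and keep the one with the lowest loss
# on the early-stopping subset D_es. `train_fn` is an assumed callable.
def select_temperature(train_fn, temperatures=(0.5, 5.0, 10.0)):
    """train_fn(T) -> loss on D_es; return the temperature minimizing it."""
    losses = {T: train_fn(T) for T in temperatures}
    return min(losses, key=losses.get)

# mock losses standing in for three real training runs
mock_losses = {0.5: 0.42, 5.0: 0.31, 10.0: 0.55}
best = select_temperature(lambda T: mock_losses[T])
print(best)  # 5.0
```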