Generative Modeling Reinvents Supervised Learning: Label Repurposing with Predictive Consistency Learning
Authors: Yang Li, Jiale Ma, Yebin Yang, Qitian Wu, Hongyuan Zha, Junchi Yan
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on vision, text, and graph tasks show the superiority of PCL over conventional supervised training in complex label prediction tasks. (...) 5. Experiments (...) Table 1. Prediction error (×10⁻²) of SL and PCL on top of graph models on various types of N-body simulation systems. (...) Table 3. Results on Semantic Segmentation. (...) Table 4. Evaluation on LLM fine-tuning. |
| Researcher Affiliation | Academia | 1School of Artificial Intelligence, Shanghai Jiao Tong University 2Shanghai Innovation Institute 3Broad Institute of MIT and Harvard 4The Chinese University of Hong Kong, Shenzhen. |
| Pseudocode | Yes | Algorithm 1 Predictive Consistency Training (...) Algorithm 2 Multistep Prediction |
| Open Source Code | Yes | Code available at github repository. |
| Open Datasets | Yes | We utilize the ADE20K dataset (Zhou et al., 2019) (...) The Alpaca (Taori et al., 2023) dataset is based on the self-instruct method (Wang et al., 2022) |
| Dataset Splits | Yes | We collect 5000 trajectories for training, 2000 for validation, and 2000 for testing for each configuration. |
| Hardware Specification | Yes | Experiments for constrained n-body simulation are conducted on a single GPU of NVIDIA RTX 4090. For semantic segmentation, a single NVIDIA H100 GPU was employed, and experiments for next-token prediction are performed on 8 GPUs of NVIDIA H800. |
| Software Dependencies | No | The paper does not explicitly provide specific software names with version numbers. |
| Experiment Setup | Yes | $\mathcal{L}_{\mathrm{PCL}}(\theta) = \mathbb{E}\big[\lambda_1\, d(f_\theta(x, y_t, t), y) + \lambda_1\, d(f_\theta(x, y_{t'}, t'), y) + \lambda_2\, d(f_\theta(x, y_t, t), f_\theta(x, y_{t'}, t'))\big]$ (...) we adopt 4 graph neural layers (...) with a maximum noise step of 1000 and 5 iterations, the time steps t are set as [1000, 800, 600, 400, 200], ensuring a gradual reduction of noise over the course of iterations. |
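The quoted objective combines two prediction terms, each anchoring the model's output on the clean label $y$ at an independently sampled noise step, with a consistency term tying the two predictions together. A minimal sketch of that loss is below; the `corrupt` function (a simple linear interpolation toward Gaussian noise), the squared-error choice of distance $d$, and all function names are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def corrupt(y, t, rng, max_t=1000):
    # Illustrative corruption (assumption): clean label at t = 0,
    # pure Gaussian noise at t = max_t, linear interpolation between.
    alpha = 1.0 - t / max_t
    return alpha * y + (1.0 - alpha) * rng.standard_normal(y.shape)

def pcl_loss(model, x, y, rng, lambda1=1.0, lambda2=1.0, max_t=1000):
    """Sketch of the PCL objective: two prediction terms against the
    clean label y plus a consistency term between the model's outputs
    at two independently sampled noise steps t and t'."""
    t, t2 = rng.integers(1, max_t + 1, size=2)
    pred1 = model(x, corrupt(y, t, rng, max_t), t)
    pred2 = model(x, corrupt(y, t2, rng, max_t), t2)
    d = lambda a, b: float(np.mean((a - b) ** 2))  # squared-error distance
    return (lambda1 * d(pred1, y)
            + lambda1 * d(pred2, y)
            + lambda2 * d(pred1, pred2))
```

At inference, the quoted setup runs 5 refinement iterations with time steps [1000, 800, 600, 400, 200], so each call starts from the previous prediction at a progressively smaller noise level.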