T-JEPA: Augmentation-Free Self-Supervised Learning for Tabular Data
Authors: Hugo Thimonier, José Lucas De Melo Costa, Fabrice Popineau, Arpad Rimmel, Bich-Liên Doan
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experimental results demonstrate a substantial improvement in both classification and regression tasks, outperforming models trained directly on samples in their original data space. [Section 4.1, Experimental Setting, Datasets:] Following previous work (Ye et al., 2024), we experiment on 7 datasets with heterogeneous features to test the effectiveness of T-JEPA. We test our approach on several supervised tabular deep learning tasks such as binary and multi-class classification, as well as regression. We use as performance metrics Accuracy (↑) and RMSE (↓) for classification and regression respectively. Table 1: Performance metrics for different downstream models trained on the original data space and the generated T-JEPA representations across datasets. |
| Researcher Affiliation | Collaboration | 1 Université Paris-Saclay, CNRS, CentraleSupélec, Laboratoire Interdisciplinaire des Sciences du Numérique, 91190 Gif-sur-Yvette, France. 2 Emobot, France. {name}.{surname}@centralesupelec.fr |
| Pseudocode | No | The paper describes the T-JEPA training pipeline (Figure 1) and the formal equations for its components (equations 1-5). It outlines the steps of the method in Section 3, but it does not present a structured, explicitly labeled pseudocode block or algorithm. |
| Open Source Code | Yes | Each experiment detailed in the present work can be reproduced using the following code: https://github.com/jose-melo/t-jepa |
| Open Datasets | Yes | The datasets we include in our experiments are Adult (AD) (Kohavi et al., 1996), Higgs (HI) (Vanschoren et al., 2014), Helena (HE) (Guyon et al., 2019), Jannis (JA) (Guyon et al., 2019), ALOI (AL) (Geusebroek et al., 2005) and California housing (CA) (Pace and Barry, 1997). We also add MNIST (interpreted as tabular data) to our benchmark following Yoon et al. (2020). |
| Dataset Splits | Yes | T-JEPA Training We split each dataset into training/validation/test sets (80/10/10) which were used for selecting both the hyperparameters of T-JEPA and of the models used for the downstream task. |
| Hardware Specification | Yes | The training was done on a single NVIDIA HGX A100 GPU with 40GB of memory. |
| Software Dependencies | Yes | Table 4: Main libraries used in the project. Python v3.12.2 (the programming language used for the project); einops v0.8.0 (flexible and powerful tensor operations); matplotlib v3.8.4 (static, animated, and interactive plots); numpy v2.1.0 (fundamental package for scientific computing with arrays); pandas v2.2.2 (data manipulation and analysis); pytorch_lightning v2.2.1 (a PyTorch wrapper for high-performance deep learning research); scikit_learn v1.4.1.post1 (machine learning library for Python); scipy v1.14.1 (scientific and technical computing); torch v2.3.0.post301 (PyTorch deep learning library); torchinfo v1.8.0 (model summaries in PyTorch); tqdm v4.66.2 (progress bar utility); xgboost v2.1.1 (optimized gradient boosting library). |
| Experiment Setup | Yes | We employed Bayesian optimization to tune the hyperparameters of T-JEPA. The batch size was fixed at 512 for all configurations, while the exponential moving average (EMA) decay rate was set to vary from 0.996 to 1. Additionally, we used four prediction masks throughout the training process. For optimization, we selected the AdamW optimizer (Loshchilov and Hutter, 2019) due to its proven robustness in large-scale models. The learning rate was adaptively adjusted using a cosine annealing scheduler (Loshchilov and Hutter, 2017), which gradually reduced it from the initial value to a minimum, ηmin = 0. Table 5: Hyperparameter Configuration for Bayesian Optimization |
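The 80/10/10 train/validation/test split reported under "Dataset Splits" can be reproduced with two chained `train_test_split` calls. This is a minimal sketch, not the authors' code: the random data, feature count, and `random_state` are illustrative placeholders.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for one of the benchmark tabular datasets.
rng = np.random.default_rng(0)
X = rng.random((1000, 10))
y = rng.integers(0, 2, size=1000)

# First split off 20%, then halve it: 80% train / 10% validation / 10% test,
# as used for selecting both T-JEPA and downstream-model hyperparameters.
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, test_size=0.2, random_state=0
)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.5, random_state=0
)

print(len(X_train), len(X_val), len(X_test))  # 800 100 100
```

For classification datasets, passing `stratify=y` (and `stratify=y_tmp` on the second call) would preserve class proportions across the three splits.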
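The two schedules named in the experiment setup (an EMA decay ramping from 0.996 to 1, and cosine annealing of the learning rate down to ηmin = 0) can be sketched as closed-form functions. This is a hedged illustration: the paper gives only the endpoints, so the linear EMA ramp, the step counts, and the initial learning rate below are assumptions, not the authors' exact schedule.

```python
import math

def ema_decay(step, total_steps, start=0.996, end=1.0):
    """Assumed linear ramp of the EMA decay for the target encoder."""
    return start + (end - start) * step / total_steps

def cosine_lr(step, total_steps, lr_init, eta_min=0.0):
    """Cosine annealing from lr_init down to eta_min (here eta_min = 0)."""
    return eta_min + 0.5 * (lr_init - eta_min) * (
        1 + math.cos(math.pi * step / total_steps)
    )

# Endpoints match the reported configuration: decay 0.996 -> 1.0,
# learning rate lr_init -> 0 over training.
print(ema_decay(0, 1000), ema_decay(1000, 1000))
print(cosine_lr(0, 1000, 1e-3), cosine_lr(1000, 1000, 1e-3))
```

In a PyTorch training loop, the second function corresponds to `torch.optim.lr_scheduler.CosineAnnealingLR` with `eta_min=0`, while the EMA decay would be applied per step when updating the target-encoder weights.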