TAROT: Targeted Data Selection via Optimal Transport

Authors: Lan Feng, Fan Nie, Yuejiang Liu, Alexandre Alahi

ICML 2025

Reproducibility Variable Result LLM Response
Research Type Experimental We evaluate TAROT across multiple tasks, including semantic segmentation, motion prediction, and instruction tuning. Results consistently show that TAROT outperforms state-of-the-art methods, highlighting its versatility across various deep learning tasks.
Researcher Affiliation Academia 1EPFL, Switzerland 2Stanford, USA. Correspondence to: Yuejiang Liu <EMAIL>, Alexandre Alahi <EMAIL>.
Pseudocode Yes Algorithm 1: Fixed-Size Selection; Algorithm 2: OT-Distance Minimization Selection (OTM)
Open Source Code Yes Code is available at: https://github.com/vita-epfl/TAROT.
Open Datasets Yes Following Park et al. (2023), we evaluate image classification using ResNet-9 classifiers trained on the CIFAR-10 dataset. For motion prediction, we adopt AutoBots (Girgis et al., 2021), training on the nuScenes (Caesar et al., 2020) dataset (32k samples) and validating on 9k target samples. The GTA5 dataset (Richter et al., 2016) serves as the candidate dataset, while the Cityscapes (Cordts et al., 2016) training split (2975 samples) is used as the target dataset, with its validation split for evaluation. We use the UniTraj framework (Feng et al., 2024) for unified training and evaluation across multiple datasets, including Waymo Open Motion (WOMD) (Ettinger et al., 2021), Argoverse 2 (Wilson et al., 2021), nuScenes (Caesar et al., 2020), and nuPlan (H. Caesar, 2021). We utilize the same candidate dataset, comprising FLAN V2 (Longpre et al., 2023), COT (Wei et al., 2022), DOLLY (Conover et al., 2023) and OPEN ASSISTANT 1 (Köpf et al., 2024), with MMLU (Hendrycks et al., 2021b;a) and BBH (Suzgun et al., 2023) serving as the target tasks for evaluation.
Dataset Splits Yes The Cityscapes (Cordts et al., 2016) training split (2975 samples) is used as the target dataset, with its validation split for evaluation. The nuScenes training set (32k samples) serves as the target dataset, while the candidate pool comprises WOMD, Argoverse 2, and nuPlan. From nuPlan, we filter trajectories with a moving distance over 2 meters, yielding 1000k samples. We use the official training splits of WOMD and Argoverse 2, including 2000k samples. The evaluation is conducted on the nuScenes validation set. We evaluate selection ratios of 5%, 20%, 50% and OTM (Section 3.3). OTM selects approximately 24% of the data. OT-Distance Minimization Selection (OTM): The target dataset D_t is randomly split into k equal subsets. In each fold, 1/k of D_t is used for selection, while the OT distance is evaluated against the remaining (k - 1)/k data. ... In our experiments, k = 10 ensures a good match with the target distribution while avoiding overfitting.
Hardware Specification Yes Table 8: The wall clock runtime (measured as single H100 GPU hours) of TAROT compared with LESS and TSDS on instruction tuning task.
Software Dependencies No The paper mentions specific models and frameworks used (DeepLabV3, ResNet50, AutoBots, Wayformer, LLAMA-3.1-8B, QWEN-2.5-7B) but does not provide specific version numbers for underlying software dependencies like programming languages (e.g., Python), libraries (e.g., PyTorch, TensorFlow), or CUDA versions.
Experiment Setup Yes Table 3: Training Hyperparameters for Semantic Segmentation; Table 4: Experiment Settings for Motion Prediction; Table 5: Training Hyperparameters for Instruction Tuning
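
The k-fold scheme quoted in the Dataset Splits row (1/k of the target set used for selection, the OT distance evaluated on the remaining (k - 1)/k) can be illustrated with a small sketch. This is not the paper's Algorithm 2: the function names, the scalar features, the nearest-neighbour ranking, and the prefix search below are all assumptions made for illustration, and the exact 1-D Wasserstein-1 distance stands in for whatever OT distance the authors compute on their own features.

```python
import random
from bisect import bisect_right


def kfold_indices(n, k, seed=0):
    """Randomly split range(n) into k near-equal folds.

    Yields (selection_idx, heldout_idx): one fold (about 1/k of the data)
    for selection and the remaining folds (about (k - 1)/k) held out.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        heldout = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield folds[i], heldout


def wasserstein_1d(a, b):
    """Exact W1 distance between two 1-D empirical distributions.

    Both samples get uniform weights; the distance is the integral of
    |F_a(x) - F_b(x)|, computed piecewise between consecutive sample points.
    """
    a, b = sorted(a), sorted(b)
    points = sorted(set(a) | set(b))
    total = 0.0
    for left, right in zip(points, points[1:]):
        fa = bisect_right(a, left) / len(a)
        fb = bisect_right(b, left) / len(b)
        total += abs(fa - fb) * (right - left)
    return total


def select_for_fold(candidates, selection_part, heldout_part):
    """Hypothetical stand-in for a per-fold selection step.

    Ranks scalar candidates by distance to their nearest sample in the
    selection part, then keeps the ranked prefix whose W1 distance to the
    held-out part is smallest.
    """
    ranked = sorted(
        candidates, key=lambda c: min(abs(c - t) for t in selection_part)
    )
    best_size, best_dist = 1, float("inf")
    for size in range(1, len(ranked) + 1):
        dist = wasserstein_1d(ranked[:size], heldout_part)
        if dist < best_dist:
            best_size, best_dist = size, dist
    return ranked[:best_size], best_dist
```

Under these assumptions, the selected size is whatever minimizes the held-out distance rather than a fixed ratio, which is the property the quoted description attributes to OTM. How the per-fold selections are combined is not stated in the quoted text and is left out of the sketch.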