Rethinking Knowledge Transfer in Learning Using Privileged Information
Authors: Danil Provodin, Bram van den Akker, Christina Katsimerou, Maurits Clemens Kaptein, Mykola Pechenizkiy
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments for a wide variety of application domains further demonstrate that state-of-the-art LUPI approaches fail to effectively transfer knowledge from PI. Thus, we advocate for practitioners to exercise caution when working with PI to avoid unintended inductive biases. ... We conduct experiments on four real-world datasets from various application domains and find that no improvement from the PI model is observed, which adds evidence to the limited contribution of LUPI in practical applications. |
| Researcher Affiliation | Collaboration | Danil Provodin (Eindhoven University of Technology); Bram van den Akker (Booking.com); Christina Katsimerou (Booking.com); Maurits Kaptein (Eindhoven University of Technology); Mykola Pechenizkiy (Eindhoven University of Technology) |
| Pseudocode | No | The paper describes methods like Generalized distillation and Marginalization with weight sharing using mathematical equations and descriptive text, but it does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The source code of the experiments can be found at https://github.com/danilprov/rethinking_lupi. |
| Open Datasets | Yes | Repeat Buyers (Alibaba, 2024) ... Heart Disease (BRFSS, 2024) ... NASA-NEO (NASA, 2024) ... Smoker or Drinker (Soo, 2024) ... All datasets are distributed under CC BY-NC 4.0 license. |
| Dataset Splits | Yes | We perform a timestamp-based train test split and use 70% of data for training each model and 30% of data for reporting performance. |
| Hardware Specification | Yes | We distribute all runs across 6 CPU nodes (Intel(R) CPU i7-10750H) and 1 GPU Nvidia Quadro T1000 per run for experiments. |
| Software Dependencies | No | The paper mentions optimizers (rmsprop, Adam) and loss functions (mean squared error, cross-entropy) but does not provide specific version numbers for any programming languages, libraries, or frameworks (e.g., Python, PyTorch, TensorFlow, scikit-learn). |
| Experiment Setup | Yes | For both Experiment 1 and Experiment 3, as the no-PI, student, and teacher models, we use 1 linear layer of dimension 50 with softmax activation. The networks were trained using an rmsprop optimizer with a mean squared error loss function. The temperature and imitation parameters for Generalized distillation were set to 1. ... For the MNIST and SARCOS experiments, we use two-layer fully connected neural networks of dimension 20, with ReLU hidden activations and softmax output activation for the no-PI, student, and teacher models. The networks were trained using an rmsprop optimizer with a mean squared error loss function. The temperature and imitation parameters for Generalized distillation in the MNIST experiment were set to 10 and 1, respectively, as the best parameter set from the original paper (Lopez-Paz et al., 2016). ... For both of them, as a no-PI model, we use two-layer fully connected neural networks of dimension 64, with tanh hidden activations and linear output activation for regression and sigmoid for classification. The TRAM model has an extra hidden layer of size 64 with a tanh activation function in the PI head. Both TRAM and no-PI networks are fit using the Adam optimizer (Kingma & Ba, 2017) with a mean squared error loss function. ... All models are trained for 50 epochs with a cross-entropy loss function and Adam optimizer with a base learning rate of 0.001, β1 = 0.9, β2 = 0.95, ϵ = 1e-07. All models are trained with L2 weight regularization with a decay weight of 0.1. |
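The paper's timestamp-based 70/30 split can be illustrated with a minimal sketch. The exact procedure is not spelled out in the quoted text, so the assumption here is a simple chronological cutoff: sort by timestamp, train on the earliest 70% of rows, test on the rest. The `timestamp_split` helper and the `rows` record layout are illustrative, not from the paper.

```python
from datetime import date, timedelta

def timestamp_split(rows, train_frac=0.7):
    """Chronological split: sort records by timestamp and take the earliest
    train_frac of them for training, the remainder for testing.
    (Sketch of a 'timestamp-based train test split'; the paper's exact
    procedure is an assumption here.)"""
    ordered = sorted(rows, key=lambda r: r["ts"])
    cutoff = int(len(ordered) * train_frac)
    return ordered[:cutoff], ordered[cutoff:]

# Toy usage with ten synthetic daily records (illustrative only).
rows = [{"ts": date(2024, 1, 1) + timedelta(days=i), "y": i} for i in range(10)]
train, test = timestamp_split(rows)
```

A chronological cutoff (rather than a random shuffle) avoids leaking future information into the training set, which matters for datasets like Repeat Buyers where labels depend on later behavior.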
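The setup for Experiments 1 and 3 (a single linear layer with softmax, MSE loss, and Generalized distillation with temperature and imitation parameters of 1) can be sketched as follows. This is a NumPy sketch, not the authors' code: the function names are hypothetical, and the exact form of the teacher-softening step is an assumption based on the standard Generalized distillation objective.

```python
import numpy as np

def softmax(z, temperature=1.0):
    """Temperature-scaled softmax with the usual max-shift for stability."""
    z = z / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def linear_softmax_forward(x, W, b):
    """One linear layer with softmax output, as described for the
    no-PI, student, and teacher models in Experiments 1 and 3."""
    return softmax(x @ W + b)

def generalized_distillation_loss(student_probs, teacher_logits, y_onehot,
                                  imitation=1.0, temperature=1.0):
    """Convex mix of the hard-label MSE and the imitation MSE against the
    teacher's temperature-softened outputs. The paper sets both the
    temperature and imitation parameters to 1 for these experiments."""
    soft = softmax(teacher_logits, temperature)
    hard_term = np.mean((student_probs - y_onehot) ** 2)
    soft_term = np.mean((student_probs - soft) ** 2)
    return (1 - imitation) * hard_term + imitation * soft_term
```

With imitation set to 1, the student is trained purely against the teacher's (PI-informed) outputs; with imitation set to 0, the objective reduces to ordinary supervised training on the hard labels.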