Personalized Representation from Personalized Generation

Authors: Shobhita Sundaram, Julia Chae, Yonglong Tian, Sara Beery, Phillip Isola

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our learned representations for four downstream tasks: classification, retrieval, detection, and segmentation, and find that performance universally improves. We show that our method improves personalized representation learning for diverse downstream tasks, from recognition to segmentation, and analyze characteristics of image generation approaches that are key to this gain.
Researcher Affiliation | Collaboration | Shobhita Sundaram (MIT), Julia Chae (MIT), Yonglong Tian (OpenAI), Sara Beery (MIT), Phillip Isola (MIT)
Pseudocode | No | The paper describes methods and pipelines in prose and with figures (e.g., Figure 2 for the training pipeline, Section 3.4 for the InfoNCE loss), but it does not contain any explicit pseudocode blocks or algorithms labeled as such.
Open Source Code | Yes | Our website (https://personalized-rep.github.io/) and GitHub repository (https://github.com/ssundaram21/personalized-rep) contain the source code for our work, including the necessary metadata to reproduce results, such as the LLM-generated captions used for dataset synthesis.
Open Datasets | Yes | We introduce a new dataset, PODS (Personal Object Discrimination Suite). PODS features common personal and household objects, enabling instance-level evaluation across classification, retrieval, detection, and segmentation tasks. We release our new dataset, PODS, and the reformulated DeepFashion2 and DogFaceNet datasets.
Dataset Splits | Yes | All datasets are split such that for each object there are exactly 3 training images and at least 3 test images. Each dataset is randomly divided class-wise into a validation set (30 classes) and a test set (varying size).
Hardware Specification | Yes | We report the wall-clock runtimes of synthetic data generation methods, using a single NVIDIA A100 GPU, in Table 3.
Software Dependencies | Yes | We generate personalized data from DR using Stable Diffusion 1.5, a T2I model, as our generator gθ. We adapt gθ using DreamBooth (Ruiz et al., 2022) to generate novel images of c when conditioned on an identifier token. Following prior works, we generate image captions with GPT-4 (OpenAI, 2023).
Experiment Setup | Yes | We fine-tune via Low-Rank Adaptation (LoRA), which is more parameter-efficient than full fine-tuning (Hu et al., 2021). We LoRA fine-tune with the InfoNCE loss for 2 epochs over 4500 anchor-positive pairs, drawn from 450 synthetic positives and 1000 synthetic negatives. We use the following hyperparameters to LoRA fine-tune each backbone: Learning rate: 0.0003, Batch size: 16, LoRA rank: 16, LoRA alpha: 0.5, LoRA dropout: 0.3.
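The InfoNCE objective referenced in the rows above can be sketched as follows. This is a minimal NumPy illustration of the per-anchor loss over one positive and a pool of negatives, not the authors' released implementation; the temperature value is an assumed placeholder, and the function name is hypothetical.

```python
import numpy as np

def info_nce_loss(anchor, positive, negatives, temperature=0.07):
    """InfoNCE loss for one anchor-positive pair against a set of negatives.

    anchor, positive: (d,) embedding vectors; negatives: (n, d) matrix.
    Embeddings are L2-normalized so dot products are cosine similarities.
    """
    def normalize(x):
        return x / np.linalg.norm(x, axis=-1, keepdims=True)

    a = normalize(anchor)
    p = normalize(positive)
    negs = normalize(negatives)

    pos_sim = (a @ p) / temperature       # scalar similarity to the positive
    neg_sim = (negs @ a) / temperature    # (n,) similarities to the negatives

    # Cross-entropy with the positive as the target class (index 0).
    logits = np.concatenate([[pos_sim], neg_sim])
    return float(-pos_sim + np.log(np.sum(np.exp(logits))))
```

The loss is near zero when the anchor matches the positive and is orthogonal to the negatives, and grows as the anchor drifts toward a negative, which is the behavior the fine-tuning objective relies on.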