Lifelong Learning in StyleGAN through Latent Subspaces

Authors: Adarsh Kappiyath, Anmol Garg, Ramya Hebbalaguppe, Prathosh AP

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 4 Experiments and Results: We employ StyleGAN2 Karras et al. (2020a) as the base architecture for all experiments. StyleCL is compared to GAN Memory, CAM-GAN with task-similarity learning, and MerGAN. We implemented all the baselines with the StyleGAN2 backbone to ensure a fair comparison. Evaluation metrics include the Fréchet Inception Distance (FID) Heusel et al. (2017), Density, and Coverage Naeem et al. (2020), which are pivotal for assessing the quality and diversity of generated images. Additionally, we examine the computational and memory overhead using FLOPs and parameter count Dehghani et al. (2022), essential for evaluating the scalability of continual learning approaches. 4.2 Results for perceptually distant tasks: Qualitative comparisons of generated samples (Fig. 2) demonstrate that StyleCL consistently produces superior image quality compared to the baselines. We provide additional qualitative results in Appendix B.2 of the supplementary materials. Quantitative results, detailed in Tab. 1, confirm that StyleCL surpasses the baseline methods on most fronts according to the FID, Density, and Coverage metrics. Additionally, Tab. 2 illustrates StyleCL's reduction in parameters and the marginal increase in FLOPs, training time, and inference time for each of the methods. 5 Analysis and Ablations; 5.2 Ablation Studies on Latent Dictionary and Feature Adaptors
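The FID metric cited in this row compares Gaussian fits to Inception embeddings of real and generated images; lower is better. In the standard form of Heusel et al. (2017), with (μ_r, Σ_r) and (μ_g, Σ_g) the mean and covariance of the real and generated embeddings:

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g
  - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```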
Researcher Affiliation | Collaboration | Adarsh Kappiyath (EMAIL), TCS Research, Delhi, India; Anmol Garg (EMAIL), Indian Institute of Science (IISc), Bengaluru, India and TCS Research, Delhi, India; Ramya Hebbalaguppe (EMAIL), TCS Research, Delhi, India; Dr. Prathosh A. P. (EMAIL), Indian Institute of Science (IISc), Bengaluru, India
Pseudocode | Yes |
Algorithm 1 StyleCL: Training Procedure
Input: {X^t}, t = 1..T: stream of T datasets.
Output: learned parameters {U_t, b_t} for t = 2..T, generator G_1, and adaptors {φ_t} for t = 2..T
1: Train StyleGAN2 with dataset X^1 to obtain G_1
2: for t = 2 to T do
3:   Initialize discriminator ψ, U_t, b_t
4:   Optimize U_t and b_t using L1(D_ψ, G_1) as in equation 8
5:   Determine most similar task k using equation 3 to get G_k
6:   for each training iteration do
7:     Sample latent vector w_t using equation 1 and equation 2
8:     Optimize U_t and b_t using L1(D_ψ, G_k) as in equation 8
9:   end for
10:  Calculate task similarity sim(t, k) using equation 5
11:  Initialize feature adaptor parameters φ_t and α_t
12:  if sim(t, k) > 0 then
13:    Obtain f_m^t of G_t using equation 7
14:    Initialise U_t, b_t using weights from step 8
15:  else
16:    Obtain f_m^t of G_t using equation 6
17:    Initialise U_t, b_t using weights from step 4
18:  end if
19:  Optimize U_t, b_t, and φ_t using L1(D_ψ, G_t) as in equation 8
20: end for
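The control flow of Algorithm 1 can be sketched in plain Python. This is an illustrative skeleton only, not the authors' code: every callable (train_stylegan2, optimize_subspace, most_similar_task, task_similarity, build_adaptor) is a hypothetical stub standing in for the optimization steps and equations referenced in the pseudocode, and the mapping of equations 6/7 to the adaptor construction is an assumption.

```python
def train_stylecl(datasets, train_stylegan2, optimize_subspace,
                  most_similar_task, task_similarity, build_adaptor):
    """Sketch of Algorithm 1: returns per-task subspaces (U_t, b_t)
    and feature adaptors phi_t for tasks t = 2..T."""
    generators = {1: train_stylegan2(datasets[0])}  # step 1: G_1 on X^1
    subspaces, adaptors = {}, {}
    for t in range(2, len(datasets) + 1):
        data = datasets[t - 1]
        # Steps 3-4: warm-start (U_t, b_t) against the base generator G_1.
        U, b = optimize_subspace(data, generators[1])
        # Step 5: pick the most similar previous task k.
        k = most_similar_task(t, generators)
        # Steps 6-9: refine (U_t, b_t) against G_k.
        U_k, b_k = optimize_subspace(data, generators[k])
        # Steps 10-18: branch on the similarity score sim(t, k).
        if task_similarity(t, k) > 0:
            U, b = U_k, b_k                    # reuse step-8 weights (step 14)
            adaptors[t] = build_adaptor(generators[k])   # eq. 7 branch
        else:
            adaptors[t] = build_adaptor(generators[1])   # eq. 6 branch
        subspaces[t] = (U, b)
        # Step 19 (final joint optimization of U_t, b_t, phi_t) omitted here.
        generators[t] = (generators[k], adaptors[t])
    return subspaces, adaptors
```

A usage sketch with trivial stubs confirms that one subspace and one adaptor are produced per task after the first.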
Open Source Code | Yes | Code for this work is available at link
Open Datasets | Yes | Adhering to protocols established in CAM-GAN and GAN-Memory, our initial model training employs the CelebA-HQ Karras et al. (2018) dataset. This is followed by sequential training on six datasets that are significantly varied in visual content, and thus perceptually distinct: Oxford 102 Flowers Nilsback & Zisserman (2008), LSUN Church Yu et al. (2015), LSUN Cats Yu et al. (2015), Brain MRI Cheng et al. (2016), Chest X-Ray Kermany et al. (2018), and Anime Faces (https://github.com/jayleicn/animeGAN)
Dataset Splits | No | The paper mentions using several datasets (CelebA-HQ, Oxford 102 Flowers, LSUN Church, LSUN Cats, Brain MRI, Chest X-Ray, Anime Faces, ImageNet) but does not provide explicit details on how these datasets were split into training, validation, or test sets, nor does it refer to specific standard splits by name or citation in the main text.
Hardware Specification | No | The paper does not specify the hardware used to run the experiments, such as GPU or CPU models, memory configurations, or cloud computing instance types.
Software Dependencies | No | We employ StyleGAN2 Karras et al. (2020a) as the base architecture for all experiments. Additionally, we examine the computational and memory overhead using FLOPs and parameter count Dehghani et al. (2022), essential for evaluating the scalability of continual learning approaches. As pointed out in Dehghani et al. (2022), data from major cloud service providers indicate that 85-90% of ML workloads are inference processing; considering this, we measure FLOPs during the inference stage using the fvcore package¹. ¹https://github.com/facebookresearch/fvcore/blob/main/docs/flop_count.md — The paper mentions using the fvcore package for FLOPs measurement, but it does not specify a version number for this package or for any other software dependency.
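To make the measured quantities concrete, here is a stdlib-only illustration of what an inference-time FLOP and parameter count means for a stack of fully connected layers. The layer shapes are hypothetical, and the count follows fvcore's convention of treating one multiply-add as one FLOP; fvcore's `FlopCountAnalysis` computes this automatically for real PyTorch models.

```python
def linear_layer_costs(shapes):
    """Parameter and inference-FLOP counts for a stack of fully connected
    layers given as (in_features, out_features) pairs. Counts one
    multiply-add as one FLOP (fvcore's convention) and includes a bias
    vector per layer in the parameter count."""
    params = sum(fin * fout + fout for fin, fout in shapes)
    flops = sum(fin * fout for fin, fout in shapes)
    return params, flops

# Hypothetical two-layer stack: 512 -> 512 -> 256.
params, flops = linear_layer_costs([(512, 512), (512, 256)])
```

For an actual model, the equivalent fvcore call is `FlopCountAnalysis(model, inputs).total()`, which is presumably what the paper's footnoted documentation describes.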
Experiment Setup | No | We follow the training paradigm in Karras et al. (2020a), which includes the adversarial loss L1. We also utilize the Perceptual Path Length (PPL) regularizer Karras et al. (2020b) and R1 regularization Mescheder et al. (2018) as in Karras et al. (2020b) to ensure smoothness and facilitate better convergence. Algorithm 1 (StyleCL: Training Procedure) is quoted in full under Pseudocode above. The parameter K, representing the number of dictionary vectors per style block, is a critical design variable. An in-depth analysis of K's influence on generative quality and the rationale behind its selection is presented in Section C.2 of the supplementary materials. While the paper outlines the overall training procedure in Algorithm 1 and mentions specific regularization techniques and loss functions, it does not provide concrete hyperparameter values such as learning rates, batch sizes, number of epochs, or optimizer settings in the main text. It defers detailed analysis of some parameters (like K) to the supplementary materials.
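The R1 regularization referenced in this row penalizes the discriminator's gradient on real data. In the standard form of Mescheder et al. (2018), with ψ the discriminator parameters, γ the penalty weight, and p_D the data distribution:

```latex
R_1(\psi) = \frac{\gamma}{2}\,
  \mathbb{E}_{x \sim p_{\mathcal{D}}}\!\left[
    \lVert \nabla_x D_\psi(x) \rVert^2
  \right]
```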