Weakly Supervised Disentangled Generative Causal Representation Learning
Authors: Xinwei Shen, Furui Liu, Hanze Dong, Qing Lian, Zhitang Chen, Tong Zhang
JMLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide theoretical justification on the identifiability and asymptotic convergence of the proposed method. We conduct extensive experiments on both synthesized and real data sets to demonstrate the effectiveness of DEAR in causal controllable generation, and the benefits of the learned representations for downstream tasks in terms of sample efficiency and distributional robustness. |
| Researcher Affiliation | Collaboration | Xinwei Shen EMAIL, Department of Mathematics, The Hong Kong University of Science and Technology, Hong Kong, China; Furui Liu EMAIL, Zhejiang Laboratory, Hangzhou, China; Hanze Dong EMAIL, Department of Mathematics, The Hong Kong University of Science and Technology, Hong Kong, China; Qing Lian EMAIL, Department of Computer Science, The Hong Kong University of Science and Technology, Hong Kong, China; Zhitang Chen EMAIL, Huawei Noah's Ark Lab, Shenzhen, China; Tong Zhang EMAIL, Department of Computer Science and Mathematics, The Hong Kong University of Science and Technology, Hong Kong, China |
| Pseudocode | Yes | Algorithm 1: Disentangled gEnerative cAusal Representation (DEAR) Learning |
| Open Source Code | Yes | The code and data sets are available at https://github.com/xwshen51/DEAR. |
| Open Datasets | Yes | The code and data sets are available at https://github.com/xwshen51/DEAR. We evaluate our methods on two data sets where the ground-truth generative factors are causally related... Pendulum, similar to the one in Yang et al. (2021)... CelebA (Liu et al., 2015)... In addition, we test our method on benchmark data sets (Gondal et al., 2019) where the generative factors are independent. The results are given in Appendix E. |
| Dataset Splits | Yes | The sample sizes for the training, validation and test set are all 6,724. The second one is a real human face data set, CelebA (Liu et al., 2015), with 40 labeled binary attributes... The sample sizes for the training, validation and test set are 162,770, 19,867, and 19,962. |
| Hardware Specification | Yes | Models were trained for around 150 epochs on CelebA, 600 epochs on Pendulum, and 50 epochs on MPI3D on NVIDIA RTX 2080 Ti. |
| Software Dependencies | No | We adopt Adam with β1 = 0, β2 = 0.999, and a learning rate of 1 × 10−4 for D, 5 × 10−5 for E, G and F, and 1 × 10−3 for the weighted adjacency matrix A. We use a mini-batch size of 128. We follow the architectures used in Shen et al. (2020). Specifically, for such realistic data, we adopt the SAGAN (Zhang et al., 2019) architecture for D and G. |
| Experiment Setup | Yes | We pre-process the images by taking center crops of 128 × 128 for CelebA and resizing all images in CelebA and Pendulum to the 64 × 64 resolution. We adopt Adam with β1 = 0, β2 = 0.999, and a learning rate of 1 × 10−4 for D, 5 × 10−5 for E, G and F, and 1 × 10−3 for the weighted adjacency matrix A. We use a mini-batch size of 128. For adversarial training in Algorithm 1, we train D once on each mini-batch. The coefficient λ of the supervised regularizer is set to 5 unless indicated otherwise. ... Models were trained for around 150 epochs on CelebA, 600 epochs on Pendulum, and 50 epochs on MPI3D... In downstream tasks, for BGMs with an encoder, we train a two-level MLP classifier with 100 hidden nodes using Adam with a learning rate of 1 × 10−2 and a mini-batch size of 128. |
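For reference, the training hyperparameters quoted in the Software Dependencies and Experiment Setup rows can be collected into a single configuration. The sketch below is illustrative only, not the authors' code; the module names D (discriminator), E (encoder), G (generator), F (causal prior network), and A (weighted adjacency matrix) follow the paper's notation, and the dict layout is an assumption.

```python
# Hedged sketch of the reported DEAR training configuration.
# Values are taken from the quotes above; the structure is hypothetical.
adam_betas = (0.0, 0.999)  # Adam with beta1 = 0, beta2 = 0.999

train_config = {
    "optimizer": "Adam",
    "betas": adam_betas,
    "lr": {
        "D": 1e-4,   # discriminator
        "E": 5e-5,   # encoder
        "G": 5e-5,   # generator
        "F": 5e-5,   # causal prior network
        "A": 1e-3,   # weighted adjacency matrix
    },
    "batch_size": 128,
    "d_steps_per_minibatch": 1,  # D is trained once per mini-batch
    "lambda_sup": 5,             # supervised-regularizer coefficient
    "epochs": {"CelebA": 150, "Pendulum": 600, "MPI3D": 50},
    "image_resolution": 64,      # all images resized to 64 x 64
}
```

A downstream classifier, per the Experiment Setup row, would use a separate Adam instance with `lr=1e-2` and the same batch size of 128.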