Optimization-based Causal Estimation from Heterogeneous Environments
Authors: Mingzhang Yin, Yixin Wang, David M. Blei
JMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We describe the theoretical foundations of this approach and demonstrate its effectiveness on simulated and real datasets. Compared to classical ML and existing methods, CoCo provides more accurate estimates of the causal model and more accurate predictions under interventions. Keywords: Causal estimation, Robust prediction, Constrained optimization, Directional derivative, Interventional data |
| Researcher Affiliation | Academia | Mingzhang Yin (EMAIL), Warrington College of Business, University of Florida, Gainesville, FL 32611, USA; Yixin Wang (EMAIL), Department of Statistics, University of Michigan, Ann Arbor, MI 48109, USA; David M. Blei (EMAIL), Department of Computer Science and Department of Statistics, Columbia University, New York, NY 10027, USA |
| Pseudocode | Yes | Algorithm 1 (CoCo with known exogenous variables). Input: data Dᵉ = {Yᵉ, Xᵉ}, Xᵉ ∈ ℝ^{nᵉ×p}; the risk function Rᵉ for each environment e ∈ E; the set of known non-descendant variables C; the predictor f(·). Output: coefficient estimate α̂ with causal interpretation. Initialize α randomly; while not converged: for each e in E, compute the gradient of the empirical risk gᵉ(α) = ∇α (1/nᵉ) Σᵢ₌₁^{nᵉ} Rᵉ(α; yᵢᵉ, ŷᵢᵉ), where ŷᵢᵉ = f(xᵢᵉ; α); set α̃ = α ⊙ (1 − 1_C) + 1_C; compute the optimization objective Lᵉ(α) = ‖gᵉ(α̃) ⊙ α̃‖²; end for; update α ← α − η ∇α Σ_{e∈E} Lᵉ(α) with step size η; end while. |
| Open Source Code | Yes | Code implementations for the empirical studies are available at https://github.com/mingzhang-yin/CoCo. |
| Open Datasets | Yes | 7.3 Colored MNIST (CMNIST): CMNIST is a semi-synthetic data set for binary classification introduced in Arjovsky et al. (2019), based on the MNIST data set... 7.4 Natural image classification: In this example, following Cloudera (2020), we adapt the iWildCam 2019 dataset (Beery et al., 2019) that contains wildlife images taken in the wild. |
| Dataset Splits | Yes | We generate two environments with γᵉ ∈ {0.5, 2.0}, each environment with 10,000 data points. As required, the DGPs leave the causal coefficient invariant. ... For the training environments, pᵉ ∈ {0.1, 0.2}; for the validation environment, pᵉ = 0.5; and for the testing environment, pᵉ = 0.9. ... Based on the setting of Cloudera (2020), we use images from two locations as the training data and images from another location as the test data. We use images from an additional location as the validation data. |
| Hardware Specification | No | No specific hardware details (like GPU/CPU models or cloud instance types) were mentioned for running the experiments. The paper only mentions using ResNet18 features, which is a model, not hardware. |
| Software Dependencies | No | No specific software versions (e.g., Python 3.x, PyTorch 1.x) were mentioned. The paper mentions using 'Adam optimizer' and referring to an 'IRM implementation' but without version numbers for these or other core libraries. |
| Experiment Setup | Yes | For the algorithms with tuning parameter λ, we report the best result for IRM with λ ∈ {2, 20, 200}, and for V-REx and RVP with λ ∈ {10, 10², 10³, 10⁴}. We choose the step size from {0.01, 0.1} that produces the lowest objective for each method. For all methods, the algorithm is considered to converge if the mean absolute difference between the parameters in consecutive iterations is less than 10⁻³ and the total iterations exceed 10⁴. ... For both CoCo and IRM, the penalty weight is chosen from ten values equally spaced from 1 to 100 on a log scale using the validation environments. The weight on the empirical risk term is reduced to 0 after 5k iterations. ... We use the Adam optimizer (Kingma and Ba, 2014) with learning rate 10⁻⁴. ... We set the weak condition weight λw = 10⁻⁴ and the risk regularizer weight λr = 1. λr is reduced to 10⁻⁵ after 100 epochs. The risk regularizer is an inductive bias to encourage nonzero solutions. After the optimizer is sufficiently far from the zero point, annealing the risk regularizer prevents the algorithm from minimizing the objective by reducing the risk function, hence preventing it from using the spurious association. CoCo is compared with ERM, IRM (Arjovsky et al., 2019), and V-REx (Krueger et al., 2020). All methods are trained with Adam with a learning rate of 10⁻³. |
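The pseudocode row above (Algorithm 1) can be sketched in NumPy for the linear, squared-error case. This is a minimal illustration, not the authors' implementation: the function names (`coco_objective`, `coco_fit`), the squared-error risk, and the finite-difference gradient of the objective are all assumptions made for the sketch; the paper's code at the linked repository is the authoritative version.

```python
import numpy as np

def coco_objective(alpha, envs, exo_mask):
    """Sum over environments of ||g_e(a) * a||^2, where a fixes the
    coordinates of known exogenous (non-descendant) variables to 1
    and g_e is the gradient of the per-environment empirical risk.
    Here the risk is assumed to be mean squared error (an assumption)."""
    a = alpha * (1.0 - exo_mask) + exo_mask   # mask step: a = alpha⊙(1−1_C) + 1_C
    total = 0.0
    for X, y in envs:
        n = len(y)
        g = (2.0 / n) * X.T @ (X @ a - y)     # gradient of the squared-error risk
        total += np.sum((g * a) ** 2)         # elementwise product, squared norm
    return total

def coco_fit(envs, p, exo_mask, eta=0.01, iters=500, eps=1e-6, seed=0):
    """Gradient descent on the CoCo objective; the gradient is taken by
    finite differences for simplicity (an assumption of this sketch)."""
    rng = np.random.default_rng(seed)
    alpha = rng.normal(size=p)
    for _ in range(iters):
        base = coco_objective(alpha, envs, exo_mask)
        grad = np.zeros(p)
        for j in range(p):
            bumped = alpha.copy()
            bumped[j] += eps
            grad[j] = (coco_objective(bumped, envs, exo_mask) - base) / eps
        alpha -= eta * grad                   # update: alpha ← alpha − η ∇L(alpha)
    return alpha
```

Note that when every environment's risk gradient vanishes at the same coefficient vector (e.g., noiseless data generated by an invariant linear model), that vector makes the objective exactly zero, which is the fixed point the algorithm targets.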
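The stopping rule and the risk-regularizer annealing quoted in the Experiment Setup row are simple enough to state directly. The sketch below is illustrative only: the helper names are invented here, and the exponents are read as negative powers of ten (10⁻³ threshold, 10⁻⁵ annealed weight), which is an interpretation of the extracted text.

```python
import numpy as np

def has_converged(prev, curr, iteration, tol=1e-3, min_iters=10_000):
    """Convergence test described in the setup: mean absolute parameter
    change below tol AND more than min_iters total iterations."""
    return iteration > min_iters and np.mean(np.abs(curr - prev)) < tol

def risk_weight(epoch, lam_r=1.0, annealed=1e-5, cutoff=100):
    """Risk-regularizer weight λr = 1, reduced to 10⁻⁵ after 100 epochs."""
    return lam_r if epoch < cutoff else annealed
```

Annealing λr this way keeps the regularizer's pull toward nonzero solutions early on, then removes it so the optimizer cannot shrink the objective by exploiting the risk term (and thus the spurious association).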