reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

CONTRA: Conformal Prediction Region via Normalizing Flow Transformation

Authors: Zhenhan Fang, Aixin Tan, Jian Huang

ICLR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In this section, we systematically compare the performance of the proposed CONTRA and ResCONTRA to that of other conformal prediction methods reviewed in section 4, including PCP, NLE, RCP, MCQR, Dist-split and CQR. The last two were designed for one-dimensional outputs. We employ the Bonferroni approach to produce valid multi-dimensional regions, labeled as Dist-split_bon and CQR_bon respectively. Throughout the experiments, the miscoverage rate is set at α = 0.1 for each prediction region, hence a nominal coverage rate of 90%. Experiments are conducted on four synthetic and six real datasets.
Researcher Affiliation	Academia	Zhenhan Fang, Aixin Tan: Department of Statistics and Actuarial Science, The University of Iowa, Iowa City, IA 52242, USA (EMAIL). Jian Huang: Department of Applied Mathematics, The Hong Kong Polytechnic University, Hong Kong, China (EMAIL).
Pseudocode	Yes	Algorithm 1 Conformal Region via Normalizing Flow Transformation (CONTRA) 1: Data $\{(x_i, y_i)\}_{i=1}^{n} \subset \mathbb{R}^p \times \mathbb{R}^q$. 2: Miscoverage level $\alpha \in [0, 1]$. 3: A CNF algorithm $A$ with a standard Gaussian base distribution. 4: A point $x_{n+1}$ that needs a prediction region for its output, $y_{n+1}$. Procedure: 1: Randomly split $\{(x_i, y_i)\}_{i=1}^{n}$ into two disjoint sets $D_1$ and $D_2$. 2: Fit $t_{\hat\theta}$ by $A(D_1)$. 3: Obtain $Z_{cal}$ as in equation 4. 4: Compute $r_{1-\alpha}$ and define $\hat{E}$ as in equation 5. 5: Compute $\hat{C}(x_{n+1}) = t_{\hat\theta}(\hat{E}, x_{n+1})$. Output: A prediction region for $y_{n+1}$ is given by $\hat{C}(x_{n+1})$.
Open Source Code	No	The paper does not provide an explicit statement about releasing source code or a link to a code repository for the described methodology.
Open Datasets	Yes	Experiments are conducted on four synthetic and six real datasets. Briefly, Bio focuses on the physicochemical properties of protein tertiary structures, derived from CASP 5-9 experiment (Rana, 2013). Energy is used to train a predictive model for heating and cooling loads given eight building-related predictors like relative compactness, roof area, surface area and so on (Tsanas & Xifara, 2012). 2D RF and 4D RF are based on the same river flow dataset, which comprises more than one year of hourly flow observations from eight sites within the Mississippi River network (Spyromitros-Xioufis et al., 2016). SCM20D contains 5000 records from the 2010 Trading Agent Competition in Supply Chain Management (Spyromitros-Xioufis et al., 2016).
Dataset Splits	Yes	The sample sizes of the training, calibration and testing sets of the mixture Gaussian and spiral examples are 3375, 1125, and 500, respectively. Each dataset was partitioned into 60% training, 20% calibration, and 20% testing. For ResCONTRA, the 60% training portion was further split into 60% for training and 40% for the first calibration.
Hardware Specification	Yes	All models were trained on an A30 GPU with 32 GB of memory.
Software Dependencies	No	The paper mentions using the Adam gradient descent method and Real NVP model, but does not provide specific version numbers for any software libraries or dependencies.
Experiment Setup	Yes	Throughout the experiments, the miscoverage rate is set at α = 0.1 for each prediction region, hence a nominal coverage rate of 90%. Specifically, we used 6 to 10 coupling layers for each CNF. Each coupling layer (details in Appendix C) involves two neural networks, each including 2 hidden layers with 512 hidden units and ReLU activation function. Optimization is done with the Adam gradient descent method (Kingma & Ba, 2014) with a learning rate of 1e-3 for most of cases. Training epochs are mostly set to be 200. To implement ResCONTRA, we selected support vector regression as the predictive model.
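
Because the paper releases no code, the following is only an illustrative sketch of the procedure quoted in the Pseudocode row, using the 60/20/20 split and α = 0.1 from the rows above. It rests on several assumptions that are not taken from the paper: the data are synthetic; a simple conditional affine map (linear mean plus a Cholesky factor) stands in for the Real NVP conditional normalizing flow; and, since equations 4 and 5 are not reproduced on this page, the calibration scores are assumed to be Euclidean norms of the latent points, with the latent region taken to be a ball. All function names (`fit_affine_flow`, `forward`, `inverse`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic data: 2-D output that depends linearly on a 3-D input.
n, p, q = 5000, 3, 2
X = rng.normal(size=(n, p))
B = rng.normal(size=(p, q))
noise = rng.normal(size=(n, q)) @ np.array([[1.0, 0.0], [0.6, 0.5]])
Y = X @ B + noise

# 60% training, 20% calibration, 20% testing, as in the Dataset Splits row.
idx = rng.permutation(n)
n_tr, n_cal = int(0.6 * n), int(0.2 * n)
tr, cal, te = idx[:n_tr], idx[n_tr:n_tr + n_cal], idx[n_tr + n_cal:]

def fit_affine_flow(X, Y):
    """Stand-in for the CNF (assumption): y = mean(x) + L z with z ~ N(0, I)."""
    Xb = np.column_stack([X, np.ones(len(X))])
    W, *_ = np.linalg.lstsq(Xb, Y, rcond=None)
    resid = Y - Xb @ W
    L = np.linalg.cholesky(np.cov(resid, rowvar=False))
    return W, L

def forward(params, Z, x):
    """Map latent points Z (m, q) to output space for a single input x (p,)."""
    W, L = params
    return np.append(x, 1.0) @ W + Z @ L.T

def inverse(params, Y, X):
    """Map outputs Y (m, q) back to latent space, row by row, given inputs X (m, p)."""
    W, L = params
    Xb = np.column_stack([X, np.ones(len(X))])
    return np.linalg.solve(L, (Y - Xb @ W).T).T

alpha = 0.1

# Step 2: fit the transformation on the training part only.
params = fit_affine_flow(X[tr], Y[tr])

# Step 3: calibration scores, assumed here to be latent-space norms.
scores = np.linalg.norm(inverse(params, Y[cal], X[cal]), axis=1)

# Step 4: conformal radius, the ceil((1 - alpha)(n_cal + 1))-th smallest score.
k = int(np.ceil((1 - alpha) * (n_cal + 1)))
r = np.sort(scores)[k - 1]

# Step 5: the region for a new x is the image of the latent ball of radius r.
# Its boundary is traced by pushing a circle through the forward map.
angles = np.linspace(0.0, 2.0 * np.pi, 200)
circle = r * np.column_stack([np.cos(angles), np.sin(angles)])
x_new = X[te][0]
boundary = forward(params, circle, x_new)

# Membership test: y is in the region iff its latent norm is at most r.
test_scores = np.linalg.norm(inverse(params, Y[te], X[te]), axis=1)
coverage = float(np.mean(test_scores <= r))
print(f"radius = {r:.3f}, empirical test coverage = {coverage:.3f}")
```

With exchangeable data, the empirical test coverage should land near the nominal 90%, whatever the quality of the fitted map; a better-fitting flow would mainly shrink the region rather than change its coverage.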