Disconfounding Fake News Video Explanation with Causal Inference

Authors: Lizhi Chen, Zhong Qian, Peifeng Li, Qiaoming Zhu

IJCAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on the Fake VE dataset demonstrate the effectiveness of CIFE, which generates more faithful explanations by mitigating object entanglement and aspect imbalance. Our code is available at https://github.com/Lieberk/CIFE. ... Our causal framework CIFE is compatible with existing multimodal models. On the Fake VE benchmark, it achieves improvements of 16.2% in BLEU-1 and 20.1% in ROUGE-L, with extensive experiments validating the critical role of causal intervention in ensuring explanation faithfulness. ... 4 Experiments ... 4.4 Experimental Results ... 4.5 Ablation Study
Researcher Affiliation | Academia | Lizhi Chen, Zhong Qian, Peifeng Li, Qiaoming Zhu, School of Computer Science and Technology, Soochow University
Pseudocode | No | The paper describes the CIFE framework and its components (IVOD, IEAM) using text, causal graphs, architectural diagrams, and mathematical equations. It does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Our code is available at https://github.com/Lieberk/CIFE.
Open Datasets | Yes | We conducted experimental research on Fake VE [Chen et al., 2025], currently the largest and most comprehensive publicly available FNVE dataset, with brief statistical details presented in Table 1.
Dataset Splits | Yes | Table 1: Statistics of the Fake VE dataset, where Avg., Dur. and Exp. refer to Average, Duration and Explanation, respectively.
Split | # of News | Avg. Title | Avg. Dur. (s) | Avg. Exp.
Train | 2138 | 21.40 | 61.23 | 49.76
Val | 267 | 16.17 | 63.45 | 50.50
Test | 267 | 15.37 | 60.32 | 50.08
Total | 2672 | 20.27 | 61.78 | 49.86
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory used for running the experiments. It only mentions general training parameters like batch size.
Software Dependencies | No | The paper mentions using "AdamW [Loshchilov and Hutter, 2017] as the optimizer" but does not specify version numbers for any key software components or libraries (e.g., Python, PyTorch, TensorFlow).
Experiment Setup | Yes | During training, we uniformly sample video frames with a maximum sequence length of 55 frames per video, applying pooling operations to each frame as the visual source representation. We employ AdamW [Loshchilov and Hutter, 2017] as the optimizer with a learning rate of 1e-4, a batch size of 16, and train the models for a maximum of 15 epochs.
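The reported setup can be sketched as a minimal configuration plus a uniform frame-sampling helper. This is an illustrative reconstruction, not the authors' code: `uniform_sample_indices` is a hypothetical name for the described sampling step, and the model, pooling, and training loop are not reproduced.

```python
# Training hyperparameters as reported in the paper's experiment setup.
CONFIG = {
    "max_frames": 55,        # maximum frames sampled per video
    "optimizer": "AdamW",    # Loshchilov and Hutter, 2017
    "learning_rate": 1e-4,
    "batch_size": 16,
    "max_epochs": 15,
}

def uniform_sample_indices(num_frames: int, max_frames: int = 55) -> list[int]:
    """Uniformly sample up to `max_frames` frame indices from a video.

    Videos shorter than the cap keep all frames; longer videos are
    sampled at an even stride across their full length.
    """
    if num_frames <= max_frames:
        return list(range(num_frames))
    step = num_frames / max_frames
    return [min(int(i * step), num_frames - 1) for i in range(max_frames)]

print(len(uniform_sample_indices(120)))  # 55 (capped)
print(len(uniform_sample_indices(30)))   # 30 (kept whole)
```

In a PyTorch setting, the optimizer line would then be `torch.optim.AdamW(model.parameters(), lr=CONFIG["learning_rate"])`, matching the reported learning rate.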