Position: AI Safety Must Embrace an Antifragile Perspective

Authors: Ming Jin, Hyunin Lee

ICML 2025

Reproducibility (Variable / Result / LLM Response)
Research Type: Theoretical. This position paper contends that modern AI research must adopt an antifragile perspective on safety, one in which the system's capacity to guarantee long-term AI safety, such as handling rare or out-of-distribution (OOD) events, expands over time. ... In this position paper, we first identify key limitations of static testing, including scenario diversity, reward hacking, and over-alignment. We then explore the potential of antifragile solutions to manage rare events. ... We adapt three theorems from (Lee et al., 2025), which show that in complex multi-step settings, a non-zero gap is unavoidable. Theorem 3.2 (Trivial Cases Without Black Swan). ... Theorem 3.3 (Multi-State, Multi-Step Gaps). ... Corollary 3.4 (Robustness Gap Lower Bound).
Researcher Affiliation: Academia. Ming Jin (Virginia Tech), Hyunin Lee (UC Berkeley). Correspondence to: Ming Jin <EMAIL>.
Pseudocode: No. The paper describes theoretical concepts and arguments. While it refers to algorithms and methods from other works (e.g., Algorithm of Thoughts, BSAFE), it does not present structured pseudocode or algorithm blocks of its own for the antifragile framework it proposes.
Open Source Code: No. The paper contains no explicit statement about releasing source code for its methodology, nor does it link to any code repositories.
Open Datasets: No. As a position paper, it conducts no experiments requiring datasets of its own. It mentions benchmarks and datasets from the broader field (e.g., Adversarial NLI, Dynabench) to illustrate its points, but these are neither generated nor released by this paper.
Dataset Splits: No. The paper is a theoretical position paper without experimental results on specific datasets, so training/validation/test splits are not applicable and not provided.
Hardware Specification: No. The paper presents theoretical arguments and guidelines rather than empirical experiments, so no hardware used for running experiments is mentioned.
Software Dependencies: No. The paper focuses on theoretical concepts and arguments and describes no experiments requiring software dependencies with version numbers.
Experiment Setup: No. The paper outlines a theoretical framework and ethical guidelines without experimental results, so hyperparameters, training configurations, and other setup details are not provided.