Statistical Test for Feature Selection Pipelines by Selective Inference

Authors: Tomohiro Shiraishi, Tatsuya Matsukawa, Shuichi Nishino, Ichiro Takeuchi

ICML 2025

Reproducibility Variable Result LLM Response
Research Type Experimental We theoretically prove that our statistical test can control the probability of false positive feature selection at any desired level, and demonstrate its validity and effectiveness through experiments on synthetic and real data. Additionally, we present an implementation framework that facilitates testing across any configuration of these feature selection pipelines without extra implementation costs.
Researcher Affiliation Academia Equal contribution. 1: Nagoya University, Aichi, Japan; 2: RIKEN, Tokyo, Japan. Correspondence to: Ichiro Takeuchi <EMAIL>.
Pseudocode Yes The overall procedure for computing the interval [Lz, Uz] by applying the update rules in the order of the topological sorting of the DAG is summarized in Algorithm 1, where the operation pa receives the index of the target node and returns the indexes of its parent nodes, and pa(1) is set to 0. Algorithm 1 satisfies the specifications described in Section 4.2, i.e., the following theorem holds.
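As a rough illustration of the traversal structure only (not the paper's actual update rules, which operate on selection events of each pipeline component), the sketch below assumes each node contributes a feasible interval that is intersected while visiting nodes in topological order, mirroring how Algorithm 1 applies per-node updates along the DAG:

```python
from graphlib import TopologicalSorter

def compute_interval(node_intervals, parents):
    """Intersect per-node intervals in topological order of the DAG.

    parents maps each node to the set of its parent nodes (the role of
    the operation pa in the paper); node_intervals maps each node to a
    hypothetical feasible interval (lk, uk) it contributes.
    """
    order = TopologicalSorter(parents).static_order()
    l, u = float("-inf"), float("inf")
    for k in order:
        lk, uk = node_intervals[k]
        l, u = max(l, lk), min(u, uk)  # intersect feasible regions
    return l, u

# Toy chain DAG 1 -> 2 -> 3 (node 1 has no parents, as with pa(1) = 0)
parents = {1: set(), 2: {1}, 3: {2}}
intervals = {1: (-5.0, 4.0), 2: (-2.0, 6.0), 3: (-3.0, 3.0)}
print(compute_interval(intervals, parents))  # (-2.0, 3.0)
```

The topological ordering guarantees every node is processed only after its parents, which is the property the paper's update rules rely on.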
Open Source Code Yes For reproducibility, our experimental code is available at https://github.com/shirara1016/statistical_test_for_feature_selection_pipelines.
Open Datasets Yes We compared the proposed and w/o-pp in terms of power, for the cv pipeline on eight real-world datasets from the UCI Machine Learning Repository (all licensed under the CC BY 4.0; see Appendix D.5 for more details).
Dataset Splits Yes For the experiments to see the type I error rate, we change the number of samples n ∈ {100, 200, 300, 400} and set the number of features d to 20. ... For each configuration, we generated 10,000 null datasets (X, y)... Missing values were introduced by randomly setting each yi to NaN with a probability of 0.03. ... From each original dataset, we randomly generated 1,000 sub-sampled datasets with sample sizes of n ∈ {100, 150, 200}.
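A minimal sketch of the data-preparation steps quoted above (the design and noise distributions are assumed standard normal here, since the quote does not specify them):

```python
import numpy as np

rng = np.random.default_rng(0)

# One null dataset: y is independent of X, so any selected feature
# is a false positive by construction.
n, d = 200, 20
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

# Missing values: each y_i becomes NaN with probability 0.03.
mask = rng.random(n) < 0.03
y[mask] = np.nan

# Sub-sampling, as in the real-data power study: draw a smaller
# dataset of size 100 without replacement.
idx = rng.choice(n, size=100, replace=False)
X_sub, y_sub = X[idx], y[idx]
print(X_sub.shape, y_sub.shape)
```

Repeating the first block 10,000 times per configuration, and the sub-sampling 1,000 times per original dataset, reproduces the experimental design described in the quote.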
Hardware Specification Yes All numerical experiments were conducted on a computer with a 96-core 3.60GHz CPU and 512GB of memory.
Software Dependencies No "import numpy as np" and "from si4pipeline import *" The paper mentions using 'numpy' and a custom package 'si4pipeline' but does not provide specific version numbers for these software dependencies.
Experiment Setup Yes In all experiments, we set the significance level α = 0.05. For the experiments to see the type I error rate, we change the number of samples n ∈ {100, 200, 300, 400} and set the number of features d to 20. ... To investigate the power, we set n = 200 and d = 20... We change the true coefficients in {0.2, 0.4, 0.6, 0.8}. ... Missing values were introduced by randomly setting each yi to NaN with a probability of 0.03.
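The type I error rate experiment amounts to a Monte Carlo check: under the null, p-values from a valid test are uniform, so the rejection rate at α = 0.05 should be close to 0.05. The sketch below illustrates this with an ordinary one-sample t-test standing in for the paper's selective test, which is not reimplemented here:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
alpha = 0.05

# Repeatedly generate null data and count rejections; for a test that
# controls the type I error rate, the fraction rejected is ~ alpha.
n_trials, n = 10_000, 100
rejections = 0
for _ in range(n_trials):
    y = rng.standard_normal(n)     # null: true mean is zero
    _, p = stats.ttest_1samp(y, 0.0)
    rejections += p < alpha

rate = rejections / n_trials
print(rate)  # close to 0.05 up to Monte Carlo error
```

The paper's experiments apply the same logic with 10,000 null datasets per configuration, but with p-values computed by the proposed selective inference procedure rather than a classical test.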