Feature Shift Localization Network

Authors: Mı́riam Barrabés, Daniel Mas Montserrat, Kapal Dev, Alexander G. Ioannidis

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our evaluation setup is consistent with (Barrabés et al., 2023), using the same reference and query sets and optimized benchmarking methods. Hyperparameter tuning for FSL-Net is detailed in Section E. We compare FSL-Net against five feature shift localization methods (DataFix, MB-SM, MB-KS, KNN-KS, and Deep-SM) and four feature selection methods (MI, SelectKBest, MRMR, and Fast-CMIM). ... Performance is evaluated using the F1 score for feature shift localization accuracy and wall-clock runtime for computational efficiency. ... Ablation Analysis. We assess the impact of each component of FSL-Net's Statistical Descriptor Network by training models with different combinations of its three components...
Researcher Affiliation | Academia | 1Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA; 2Department of Computer Science, Munster Technological University, Cork T12 P928, Ireland; 3Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA 95060, USA. Correspondence to: Alexander G. Ioannidis <EMAIL>.
Pseudocode | No | The paper describes the FSL-Net architecture and its components (Statistical Descriptor Network, Prediction Network) using prose and diagrams (Figure 1), along with mathematical formulations for loss functions and statistical measures. However, it does not include a dedicated section or figure presenting pseudocode or a formal algorithm block.
Open Source Code | Yes | The code and ready-to-use trained model are available at https://github.com/AI-sandbox/FSL-Net.
Open Datasets | Yes | We source a total of 1,032 diverse tabular datasets from OpenML (Van Rijn et al., 2013)... The continuous datasets are sourced from the UCI repository (Gas (Huerta et al., 2016), Energy (Candanedo et al., 2017), and Musk2 (Blake, 1998)) and OpenML (Scene (Boutell et al., 2004), MNIST (Deng, 2012), and Dilbert (Vanschoren et al., 2014)). Additionally, a Covid-19 dataset (Force, 2022)... The categorical datasets consist of high-dimensional biomedical data, including the Phenotypes dataset (Qian et al., 2020), a subset of categorical traits from the UK Biobank, the Founders dataset containing binary-coded human DNA sequences (Perera et al., 2022), and the Canine dataset comprising binary-coded dog DNA sequences (Barrabés et al., 2023).
Dataset Splits | Yes | In total, 1,350 datasets are used for training, with 50 reserved for validation. ... Each subset is then split equally into reference and query samples. ... The samples are evenly divided into two subsets, forming the reference and query sets.
Hardware Specification | Yes | All evaluations were conducted on an Intel Xeon Gold with 12 CPU cores. ... To expedite the training process, a single NVIDIA GPU with 32GB of memory was used.
Software Dependencies | No | The paper describes the implementation of FSL-Net as a neural network and references common machine learning models like random forests and k-nearest neighbors. However, it does not explicitly state any specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | Table 7 provides an overview of the search spaces and the optimal values determined for each network-related hyperparameter in FSL-Net. These parameters encompass configurations for the statistical measures, Moment Extraction Network, Neural Embedding Network, and Prediction Network. ... Table 9 outlines the search space and optimal values for optimization hyperparameters in FSL-Net's training strategy. This includes the loss function and Adam optimizer settings.
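The Research Type row notes that localization accuracy is measured with the F1 score. A minimal sketch of what such a metric could look like when localization is framed as predicting the set of shifted feature indices (illustrative only; `localization_f1` and its signature are assumptions, not the authors' implementation):

```python
def localization_f1(predicted, actual):
    """F1 score over sets of feature indices flagged as shifted.

    predicted: indices a method flags as shifted.
    actual: ground-truth shifted indices.
    """
    predicted, actual = set(predicted), set(actual)
    if not predicted or not actual:
        return 0.0
    tp = len(predicted & actual)          # correctly localized features
    precision = tp / len(predicted)
    recall = tp / len(actual)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Toy example: 3 features truly shifted, method flags 4, of which 2 correct.
print(localization_f1({1, 5, 9, 12}, {1, 5, 7}))  # → 4/7 ≈ 0.571
```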
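The Dataset Splits row describes dividing each dataset's samples evenly into reference and query sets. A hedged sketch of that split (the function name and shuffling choice are assumptions; the paper's exact splitting code lives in the FSL-Net repository):

```python
import numpy as np

def reference_query_split(X, seed=0):
    """Shuffle samples and divide them evenly into reference and query halves."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    half = len(X) // 2
    return X[idx[:half]], X[idx[half:]]

# Toy dataset: 10 samples, 2 features.
X = np.arange(20).reshape(10, 2)
reference, query = reference_query_split(X)
print(reference.shape, query.shape)  # (5, 2) (5, 2)
```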