Vertical Federated Learning with Missing Features During Training and Inference

Authors: Pedro Valdeira, Shiqiang Wang, Yuejie Chi

ICLR 2025

Reproducibility variables, results, and LLM responses:
- Research Type: Experimental. Numerical experiments show improved performance of LASER-VFL over the baselines, remarkably even in the absence of missing features. For example, on CIFAR-100, accuracy improves by 19.3% when each of four feature blocks is observed with probability 0.5, and by 9.5% when all features are observed. The code for this work is available at https://github.com/Valdeira/LASER-VFL.
- Researcher Affiliation: Collaboration. Pedro Valdeira (Carnegie Mellon University & Instituto Superior Técnico), Shiqiang Wang (IBM Research), Yuejie Chi (Carnegie Mellon University).
- Pseudocode: Yes. Algorithm 1: LASER-VFL Training; Algorithm 2: LASER-VFL Inference.
- Open Source Code: Yes. The code for this work is available at https://github.com/Valdeira/LASER-VFL.
- Open Datasets: Yes. HAPT (Reyes-Ortiz et al., 2016), a human activity recognition dataset; Credit (Yeh & Lien, 2009), a dataset for predicting credit card payment defaults; MIMIC-IV (Johnson et al., 2020), a healthcare time-series dataset; CIFAR-10 & CIFAR-100 (Krizhevsky et al., 2009), two popular image datasets.
- Dataset Splits: No. The paper mentions splitting the features of each dataset into K = 4 feature blocks, but does not provide train/validation/test splits (e.g., percentages, exact counts, or citations to predefined splits).
- Hardware Specification: No. The paper does not specify the hardware used for the experiments, such as CPU or GPU models or memory.
- Software Dependencies: No. The paper mentions models such as MLPs, an LSTM, and ResNet-18, and methods such as FedSVD, but does not specify software libraries (e.g., PyTorch, TensorFlow) or their version numbers, which are crucial for reproducibility.
- Experiment Setup: No. The paper does not provide experimental setup details such as learning rates, batch sizes, number of epochs, or optimizer configurations, which are essential hyperparameters for reproducing the results.
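The missingness setting referenced above (K = 4 vertical feature blocks, each observed independently with probability 0.5) can be sketched as follows. This is an illustrative NumPy sketch under assumed conventions (contiguous feature blocks, per-sample Bernoulli availability); the function names are hypothetical and this is not the authors' implementation, which lives in the linked repository.

```python
import numpy as np

def split_into_blocks(X, K=4):
    """Split the feature dimension of X into K contiguous blocks,
    mimicking a vertical federated setting with K parties."""
    return np.array_split(X, K, axis=1)

def sample_availability(n_samples, K=4, p=0.5, seed=None):
    """Bernoulli availability mask: entry [i, k] is True if
    party k observes feature block k of sample i."""
    rng = np.random.default_rng(seed)
    return rng.random((n_samples, K)) < p

# Toy data: 3 samples with 8 features, split across 4 parties.
X = np.arange(24, dtype=float).reshape(3, 8)
blocks = split_into_blocks(X, K=4)            # four (3, 2) feature blocks
mask = sample_availability(len(X), K=4, p=0.5, seed=0)  # (3, 4) bool mask
```

With p = 0.5, each party sees roughly half the samples, so a method must handle any subset of blocks being present at both training and inference time, which is the regime the paper evaluates.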