A Unified View of Double-Weighting for Marginal Distribution Shift
Authors: José I. Segovia-Martín, Santiago Mazuelas, Anqi Liu
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, the proposed methods achieve enhanced classification performance in both synthetic and empirical experiments. (...) 6 Experiments This section shows experimental results for the proposed approaches in comparison with the state-of-the-art methods. (...) 6.1 Experiments for single-source label shift adaptation (...) 6.2 Experiments for multi-source covariate shift adaptation (...) 6.3 Experiments for multi-source label shift adaptation |
| Researcher Affiliation | Academia | José I. Segovia-Martín EMAIL Basque Center for Applied Mathematics (BCAM) Bilbao, Spain Santiago Mazuelas EMAIL Basque Center for Applied Mathematics (BCAM) IKERBASQUE-Basque Foundation for Science Bilbao, Spain Anqi Liu EMAIL CS department, Whiting School of Engineering Johns Hopkins University Baltimore, Maryland, USA |
| Pseudocode | Yes | 5 Practical Algorithm and Implementation This section describes the implementation of the proposed techniques for double-weighting label shift (DW-LS), double-weighting multi-source (MS) covariate shift (DW-MSCS), and double-weighting MS label shift (DW-MSLS) detailed in Algorithm 1, Algorithm 2 and Algorithm 3, respectively. Algorithm 1 The proposed algorithm for label shift adaptation: DW-LS Algorithm 2 The proposed algorithm for multi-source covariate shift adaptation: DW-MSCS Algorithm 3 The proposed algorithm for multi-source label shift adaptation: DW-MSLS |
| Open Source Code | Yes | The source code for the methods and the experimental setup presented are publicly available in https://github.com/MachineLearningBCAM/Unified-Double-Weighting-TMLR-2025. |
| Open Datasets | Yes | In the second set of experiments, we assess the performance of the proposed methods in comparison with existing techniques using real datasets publicly available in the UCI repository Dua & Graff (2017). (...) We consider Spam Detection, 20 Newsgroups, and Sentiment classification datasets. (...) 20 Newsgroups, available at http://qwone.com/~jason/20Newsgroups/, Sentiment Analysis, available at https://www.cs.jhu.edu/~mdredze/datasets/sentiment/, and Spam detection, available at http://www.ecmlpkdd2006.org/challenge.html. (...) DomainNet, available at https://ai.bu.edu/M3SDA/, (Peng et al., 2019), and Office-31, available at https://github.com/jindongwang/transferlearning/blob/master/data/dataset.md. |
| Dataset Splits | Yes | In addition, for each type of label shift (value of δ), we carried out 200 random repetitions with 100 training samples and 100 testing samples. (...) In the tweak-one shift, the training distribution is uniform over the set of possible labels, ptr(y) = 1/|Y|, while in the testing distribution, we assign probability pte(y) = δ to half of the classes (rounded up). We set δ = 0.05 for 10 repetitions and δ = 0.10 for another 10 repetitions. In the knock-out shift, the testing distribution is uniform over the set of possible labels, while in the training distribution, we remove a proportion δ of the samples from the selected classes. We set δ = 0.9 and select half of the classes (rounded up) for all 20 repetitions. (...) We carried out 100 random repetitions with 200 samples from each source and 200 testing samples and considered linear feature mapping. (...) For the experiments using the Sentiment dataset, (...) We randomly sample 1,000 training samples from each source and 150 testing samples in each repetition. For the experiments using the DomainNet dataset, (...) randomly sample 100 training samples from each source and 200 testing samples in each repetition. For the experiments using the Office-31 dataset, (...) randomly sample half of the samples from each domain as the training set and the remaining half as the testing set, ensuring the same number of samples from each domain, in each repetition. |
| Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processors, or memory amounts used for running the experiments. It mentions using 'pretrained ResNets to map images into feature vectors' but does not specify the hardware these models ran on. |
| Software Dependencies | No | The paper mentions several methods and tools like 'KMM', 'BBSE', 'RLLS', 'MLLS', '2SW-MDA', 'MS-DRL', 'CW KMM', and 'ResNet' models. However, it does not specify the version numbers for these software components or any programming language versions used to implement them, which is necessary for reproducibility. |
| Experiment Setup | No | The paper mentions how hyperparameters D and λ are determined ('We select the value of D to achieve the lowest minimax risk R(U)', 'hyperparameters $\{\lambda_s\}_{s=1}^S$ are determined solving $\min_{p,\lambda_s} \mathbf{1}^\top \lambda_s$') and that 'For the existing methods, we consider the default hyperparameter values provided by the authors.' However, it does not provide concrete values for crucial training hyperparameters such as learning rate, batch size, number of epochs, or the specific optimizer settings used for the models (e.g., logistic regression, ResNets) in its experiments. |
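The tweak-one and knock-out shift protocols quoted in the Dataset Splits row can be sketched as follows. This is a minimal illustration under assumptions the report does not pin down (function names, the convention of selecting the first ⌈|Y|/2⌉ classes, and spreading the remaining probability mass uniformly are all hypothetical, not from the paper):

```python
import numpy as np

def tweak_one_shift(n_classes, delta):
    """Test-label distribution for a 'tweak-one' shift: probability delta
    is assigned to each class in a selected half of the classes (rounded
    up); the remaining mass is spread uniformly over the other classes.
    Training labels stay uniform, p_tr(y) = 1/|Y|."""
    k = int(np.ceil(n_classes / 2))               # half the classes, rounded up
    p_te = np.full(n_classes, (1.0 - k * delta) / (n_classes - k))
    p_te[:k] = delta                              # selected classes get delta
    return p_te

def knock_out_shift(y_train, delta, rng):
    """Training-set 'knock-out' shift: drop a proportion delta of the
    samples from a selected half of the classes (rounded up); the test
    distribution stays uniform over the labels. Returns a boolean mask
    of training samples to keep."""
    classes = np.unique(y_train)
    k = int(np.ceil(len(classes) / 2))
    keep = np.ones(len(y_train), dtype=bool)
    for c in classes[:k]:
        idx = np.flatnonzero(y_train == c)
        drop = rng.choice(idx, size=int(delta * len(idx)), replace=False)
        keep[drop] = False
    return keep
```

For example, with |Y| = 10 and δ = 0.05 (the first tweak-one setting quoted above), five classes receive probability 0.05 each and the other five share the remaining 0.75; with δ = 0.9, the knock-out mask retains only 10% of the samples in each selected class.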