Rethinking Multiple-Instance Learning From Feature Space to Probability Space
Authors: Zhaolong Du, Shasha Mao, Xuequan Lu, Mengnan Qi, Yimeng Zhang, Jing Gu, Licheng Jiao
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results illustrate that PSMIL could potentially achieve performance close to supervised learning level in complex tasks (gap within 5%), with the incremental alignment in probability space bringing more than 19% accuracy improvements for existing mainstream models on simulated CIFAR datasets. For existing publicly available MIL benchmarks/datasets, attention in probability space also achieves competitive performance to the state-of-the-art deep MIL models. ... To demonstrate the potential degradation issues that current MIL models may face in feature space, we introduce comprehensive simulated datasets to evaluate the ability of MIL models in learning instance representations. On complex tasks, experiments show that the designed probability-space alignment objective effectively constrains instance representations to a more stable space during the representation learning stage, meanwhile bringing non-trivial performance improvements and stability across current MIL methods. ... We also validated our model on various existing MIL datasets with SOTA-level performance. |
| Researcher Affiliation | Academia | Zhaolong Du1, Shasha Mao1, Xuequan Lu2, Mengnan Qi1, Yimeng Zhang1, Jing Gu1, Licheng Jiao1; 1Xidian University, 2The University of Western Australia. Corresponding author. Email: EMAIL. |
| Pseudocode | No | The paper describes algorithms and derivations using mathematical equations and text, but does not present any explicitly labeled 'Pseudocode' or 'Algorithm' blocks with structured steps. |
| Open Source Code | Yes | Codes are available at https://github.com/LMBDA-design/PSAMIL. |
| Open Datasets | Yes | Experimental results illustrate that PSMIL could potentially achieve performance close to supervised learning level in complex tasks (gap within 5%), with the incremental alignment in probability space bringing more than 19% accuracy improvements for existing mainstream models on simulated CIFAR datasets. ... For existing publicly available MIL benchmarks/datasets, attention in probability space also achieves competitive performance to the state-of-the-art deep MIL models. ... Table 1: Statistics of synthesized datasets. Dataset FMNIST SVHN CIFAR-10 CIFAR-100 ... CAMELYON16 is a significant publicly available Whole Slide Image (WSI) dataset for lymph node classification and metastasis detection. ... The TCGA Lung Cancer dataset comprises two non-small cell lung cancer subtypes, LUAD and LUSC, with 1053 slides... |
| Dataset Splits | Yes | In the new synthesized datasets, training set instances are reorganized into the form of multi-instance bags for training. Instances with class label 0 serve as background instances, randomly sampled within each multi-instance bag as non-key instances. Each multi-instance bag has a fixed length of 64, of which 5 are key instances, accounting for 7.8%. The number of bag categories generated is the same as the number of classes in each original dataset. ... During testing, we directly evaluate instance representation quality based on the classification accuracy of single instance images from the original CIFAR test set. ... CAMELYON16 is a significant publicly available Whole Slide Image (WSI) dataset for lymph node classification and metastasis detection. It includes 270 training and 129 test slides from two medical centers... In our experiment, we followed the standard 5-fold-cv-standalone-test evaluation scheme from DSMIL, as shown in the logs in the supplementary detail and the code on GitHub. |
| Hardware Specification | Yes | All the algorithms including PSMIL are implemented on a single Nvidia RTX 4090 GPU. |
| Software Dependencies | No | The paper mentions using 'stochastic gradient descent (SGD) optimizer', but does not specify any software libraries or their version numbers (e.g., PyTorch, TensorFlow, Python version, etc.). |
| Experiment Setup | Yes | We apply the stochastic gradient descent (SGD) optimizer with a momentum of 0.9 and a weight decay of 0.0001. The initial learning rate is chosen from the set {0.01, 0.001} and is decayed by steps. In the first epoch, we freeze the backbone as a warm-up to improve stability. The value of λ is selected from the set {0.1, 0.01}, with the threshold epoch τ being 10. |
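The bag-synthesis protocol quoted under Dataset Splits (fixed bag length 64, 5 key instances per bag, class-0 instances as background) can be sketched as below. This is a minimal illustration of that description only; `make_bag`, `key_pool`, and `background_pool` are hypothetical names, not identifiers from the released PSAMIL code.

```python
import random

def make_bag(key_pool, background_pool, bag_len=64, n_key=5):
    """Assemble one synthetic multi-instance bag per the paper's description:
    fixed length 64, of which 5 are key instances (about 7.8%); the remaining
    slots are filled by background instances randomly sampled from class 0.
    The bag's label would be the class of its key instances."""
    keys = random.sample(key_pool, n_key)                      # 5 key instances
    background = random.choices(background_pool, k=bag_len - n_key)  # 59 non-key
    bag = keys + background
    random.shuffle(bag)  # instance order inside a bag carries no information
    return bag
```

A full reproduction would repeat this for every bag class (one per class in the original dataset, e.g. 10 for CIFAR-10) and evaluate instance-level accuracy on the untouched single-image test set, as the quoted protocol specifies.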