QPM: Discrete Optimization for Globally Interpretable Image Classification

Authors: Thomas Norrenbrock, Timo Kaiser, Sovan Biswas, Ramesh Manuvinakurike, Bodo Rosenhahn

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive evaluations show that QPM delivers unprecedented global interpretability across small and large-scale datasets while setting the state of the art for the accuracy of interpretable models. ... The usual metrics for compactness-based globally interpretable models are shown in table 2. ... The results for the interpretability metrics are shown in table 3. ... This section validates the impact of the individual objectives in the quadratic problem in table 4 and presents the compactness trade-off in fig. 7. We focus on CUB-2011 but observed similar results for other datasets.
Researcher Affiliation | Collaboration | Thomas Norrenbrock, Timo Kaiser & Bodo Rosenhahn, Institute for Information Processing (TNT), L3S, Leibniz Universität Hannover, Germany (EMAIL); Sovan Biswas & Ramesh Manuvinakurike, Intel Labs, USA (EMAIL)
Pseudocode | No | The paper presents mathematical formulations for the Quadratic Problem (equations 1-7) and details its standard form in Appendix O, which defines variables, objective functions, and constraints. However, it does not include a figure, block, or section explicitly labeled 'Pseudocode' or 'Algorithm', nor does it provide structured, step-by-step procedures formatted like code; it gives a mathematical optimization problem definition instead.
Open Source Code | Yes | Code: https://github.com/ThomasNorr/QPM
Open Datasets | Yes | Following prototype-based methods we applied our method to CUB-2011 (Wah et al., 2011) and Stanford Cars (Krause et al., 2013). ... we also include results on the large-scale dataset ImageNet-1K (Russakovsky et al., 2015).
Dataset Splits | Yes | An overview of the used datasets is shown in Suppl. table 5. (Table 5 shows # Training and # Testing counts for CUB-2011, Stanford Cars, Traveling Birds, ImageNet-1K.)
Hardware Specification | Yes | In our experiments, the global optimum for the relaxed problem is usually found in less than 4 hours for fine-grained datasets, and roughly 11 hours for ImageNet-1K using a CPU like EPYC 72F3.
Software Dependencies | No | We implement our method using PyTorch (Paszke et al., 2019) and its ImageNet-1K pretrained models as feature extractors. ... This section presents details on how the described quadratic problem with eq. (7) as objective is solved using Gurobi (Gurobi Optimization, LLC, 2023). While these name the software tools used, they do not give version numbers for PyTorch or Gurobi, which a reproducible description of ancillary software requires.
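The row above notes that QPM's quadratic problem is handed to Gurobi. As a toy illustration of the general shape of such a problem, a binary quadratic objective under a cardinality constraint, here is a brute-force enumerator in plain Python. The matrix Q, the linear costs c, and the budget k below are made up for illustration; this stands in for a solver like Gurobi and is not the paper's actual formulation from Appendix O.

```python
from itertools import product

def solve_binary_qp(Q, c, k):
    """Brute-force minimizer of x^T Q x + c^T x over binary vectors x
    with exactly k ones. Toy illustration only: real problems of this
    kind are handed to a dedicated solver such as Gurobi."""
    n = len(c)
    best_x, best_val = None, float("inf")
    for bits in product((0, 1), repeat=n):
        if sum(bits) != k:
            continue
        # Quadratic term x^T Q x plus linear term c^T x.
        val = sum(Q[i][j] * bits[i] * bits[j]
                  for i in range(n) for j in range(n))
        val += sum(c[i] * bits[i] for i in range(n))
        if val < best_val:
            best_x, best_val = bits, val
    return best_x, best_val

# Hypothetical 3-variable instance: select exactly k=2 features.
# Off-diagonal entries of Q penalize (positive) or reward (negative)
# picking two features together.
Q = [[0, 2, -1],
     [2, 0, 0],
     [-1, 0, 0]]
c = [-1, -1, -1]
x, v = solve_binary_qp(Q, c, 2)  # features 0 and 2 attract each other
```

Enumeration is exponential in the number of variables, which is exactly why the paper relies on a relaxation solved by Gurobi rather than exhaustive search.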
Experiment Setup | Yes | The remaining parameters, including dense training for 150 epochs on fine-grained datasets and directly using the pretrained model on ImageNet-1K with subsequent 40 epochs of fine-tuning, mirror the SLDD-Model and are described in appendix C. ... We fine-tune the pretrained models on the fine-grained datasets using stochastic gradient descent with a batch size of 16 for 150 epochs. The learning rate starts at 5 x 10^-3 for the pretrained layers and 0.01 for the final linear layer and gets multiplied by 0.4 every 30 epochs. We set momentum to 0.9, ℓ2-regularization to 5 x 10^-4 and apply dropout with rate 0.2 to the features. ... After solving the quadratic problem, the model is trained with the final layer fixed to the sparse assignment of selected features W for 40 epochs. The learning rate starts at 100 times the final learning rate of the dense training and decreases by 60% every 10 epochs. During fine-tuning, momentum is increased to 0.95 and dropout on the features is reduced to 10%.
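The two-phase learning-rate schedule quoted above can be sketched in a few lines of plain Python. This is a sketch under stated assumptions, not the authors' code: it reads "decreases by 60%" as multiplying by 0.4, and assumes step decays land at epoch boundaries 30, 60, ... (dense phase) and 10, 20, ... (fine-tuning phase).

```python
def dense_lr(epoch, base_lr=5e-3, decay=0.4, step=30):
    """Learning rate for the pretrained layers during the
    150-epoch dense training phase (x0.4 every 30 epochs)."""
    return base_lr * decay ** (epoch // step)

def finetune_lr(epoch, dense_final_lr, decay=0.4, step=10):
    """Fine-tuning phase after the quadratic problem is solved:
    starts at 100x the final dense lr, decreases by 60%
    (i.e. x0.4) every 10 epochs for 40 epochs."""
    return 100 * dense_final_lr * decay ** (epoch // step)

final_dense = dense_lr(149)          # 5e-3 * 0.4**4, the lr in the last dense epoch
start_finetune = finetune_lr(0, final_dense)  # 100x the final dense lr
```

Under these assumptions the fine-tuning phase starts at a higher rate than dense training ever used, consistent with the final layer now being frozen to the sparse assignment W.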