reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

pyDML: A Python Library for Distance Metric Learning

Authors: Juan Luis Suárez, Salvador García, Francisco Herrera

JMLR 2020 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	The pyDML package currently provides more than 20 algorithms... In addition, the library also provides some utilities for the visualization of classifier regions, parameter tuning and a stats website with the performance of the implemented algorithms. The pyDML library also incorporates graphical tools for the representation and evaluation of the learned distances... Finally, a stats website is also provided, where the performance of the implemented algorithms is evaluated under different conditions (Suárez et al., 2019).
Researcher Affiliation	Academia	Juan Luis Suárez EMAIL Salvador García EMAIL Francisco Herrera EMAIL DaSCI, Andalusian Research Institute in Data Science and Computational Intelligence, University of Granada, Granada, Spain
Pseudocode	No	The paper includes a code snippet in Figure 1, but it is an example of how to use the library's functions in Python, not pseudocode or an algorithm block detailing the logic of any of the DML algorithms themselves.
Open Source Code	Yes	Source code and documentation can be found at https://github.com/jlsuarezdiaz/pyDML.
Open Datasets	Yes	Figure 1 shows a basic example... >>> from sklearn.datasets import load_iris # Iris dataset
Dataset Splits	No	The paper mentions 'tune functions, which allow the parameters of the DML algorithms to be easily estimated with cross validation', implying the use of cross-validation for parameter tuning. However, it does not specify any particular dataset splits (e.g., percentages, folds) for general experimental reproduction within the paper's main text.
Hardware Specification	No	The paper describes a software library and its features, compatibility, and installation. It does not provide any specific hardware details (like CPU, GPU models, or memory) used for developing, testing, or evaluating the library or its algorithms within the paper's content.
Software Dependencies	No	The package relies on the scipy ecosystem, it is fully compatible with scikit-learn, and is distributed under GPLv3 license... The main one is Scikit-Learn (Pedregosa et al., 2011), an efficient open-source library for machine learning, which relies on the Scipy ecosystem, which contains numerical calculus libraries, such as NumPy, data processing libraries, such as Pandas, or data visualization libraries, such as Matplotlib. These software components are mentioned, but specific version numbers are not provided for them.
Experiment Setup	No	The paper states, 'It is important to emphasize that these algorithms include different hyperparameters that can be modified to improve the performance or to change the conditions of the learned distances. To this end the package includes tune functions, which allow the parameters of the DML algorithms to be easily estimated with cross validation...' While it mentions hyperparameters and tuning, it does not provide specific values for any experimental setup (e.g., learning rates, batch sizes, epochs) for any experiment detailed within the paper itself.
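
The Software Dependencies row notes that the paper names its dependencies without version numbers. As a minimal sketch of how a reader might close that gap when rerunning the library, the snippet below records the installed versions of the packages named in that row. It uses only the Python standard library. The distribution names in `PACKAGES` and the helper `installed_versions` are illustrative assumptions, not something taken from the paper or from pyDML.

```python
from importlib import metadata

# Assumed PyPI distribution names for the dependencies named in the table above.
PACKAGES = ["scikit-learn", "scipy", "numpy", "pandas", "matplotlib"]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Print pinned requirement lines for installed packages, comments for missing ones.
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name}=={version}" if version else f"# {name}: not installed")
```

Saving the printed lines next to any experiment output gives later readers the version information that the table marks as missing.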