Generalized Ambiguity Decomposition for Ranking Ensemble Learning
Authors: Hongzhi Liu, Yingpeng Du, Zhonghai Wu
JMLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on recommendation and information retrieval tasks demonstrate the effectiveness and theoretical advantages of the proposed method compared with several state-of-the-art methods. |
| Researcher Affiliation | Academia | Hongzhi Liu EMAIL Yingpeng Du EMAIL School of Software and Microelectronics Peking University Beijing, P.R.China, 102600 Zhonghai Wu EMAIL National Engineering Center of Software Engineering Peking University Beijing, P.R.China, 100871 |
| Pseudocode | Yes | Specifically, the pseudocode of the algorithm for parameter learning is shown in Algorithm 1. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing source code, nor does it provide a link to a code repository. |
| Open Datasets | Yes | We conduct experiments on four real-world data sets, including Amazon-Movies, Amazon-Kindle, Epinion, and ML-20m. The Amazon-Movies and Amazon-Kindle data sets were collected from Amazon.com... The Epinion data set was collected from a rating website... The ML-20m data set was collected from the MovieLens website (movielens.umn.edu)... We conduct experiments on two benchmark data sets, MQ2007-agg and MQ2008-agg, which are provided by LETOR 4.0 (Qin and Liu, 2013). |
| Dataset Splits | Yes | For the recommendation data sets: each data set is randomly divided into two subsets, 80% for training and 20% for testing. The training set is used for training the base models and learning the parameters of ensemble models. The test set is used for evaluating the performance of ensemble models. In addition, we split 20% of the data from the training set as the validation set, which is used for hyper-parameter selection. For the information retrieval data sets: for each data set, we divide it into two equal subsets as in Shen et al. (2014), one for learning the parameters of ensemble models (including 10% for hyper-parameter selection) and the other for evaluating the performance of ensemble models. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, or cloud configurations) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions) used for implementing the described methodology. |
| Experiment Setup | Yes | For the proposed model, we set embedding dimension d = 16, initial learning rate τ = 0.01, maximum iteration threshold ξ = 100·\|DT\| for all data sets except ξ = 300·\|DT\| for ML-20m, trade-off coefficient λ = 0.1 and interpolation constants c1 = ⋯ = cT = cd = 0.25 and 1.0 for the Amazon (Movies and Kindle) and ML-20m data sets, and λ = 1.0 and c1 = ⋯ = cT = cd = 0.5 for the Epinion data set. Experimental results show that the proposed method performs well when embedding dimension d = 5, initial learning rate τ = 0.01, interpolation constants c1 = ⋯ = cT = cd = 0.5, trade-off coefficients λ = 0.1 and λ = 0.01, and maximum iteration thresholds ξ = 200·\|DT\| and ξ = 500·\|DT\| for MQ2007 and MQ2008, respectively. |
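The recommendation-task split protocol quoted above (80% train / 20% test, with a further 20% of the training set held out for validation) can be sketched as follows. This is a minimal illustration of the described procedure; the function name, seed, and exact shuffling are assumptions, not taken from the paper.

```python
import random

def split_dataset(records, seed=42):
    """Illustrative 80/20 train/test split, then 20% of the training
    portion held out as a validation set, per the quoted protocol.
    (Name, seed, and shuffle order are assumptions.)"""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = int(0.2 * len(shuffled))           # 20% for testing
    test, train_full = shuffled[:n_test], shuffled[n_test:]
    n_val = int(0.2 * len(train_full))          # 20% of training for validation
    val, train = train_full[:n_val], train_full[n_val:]
    return train, val, test

train, val, test = split_dataset(range(1000))
# 1000 records -> 640 train, 160 validation, 200 test
```

For the LETOR data sets the paper instead uses two equal halves (with 10% of the learning half for hyper-parameter selection), which would change the ratios above to 0.5 and 0.1.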
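The reported hyper-parameters for the recommendation experiments can be collected into a plain configuration table. This is a hedged sketch: the key names, the helper function, and the grouping are illustrative; in particular, the pairing of the interpolation constants 0.25 and 1.0 with the Amazon and ML-20m data sets is one plausible reading of an ambiguous sentence in the quote.

```python
def max_iterations(multiplier, train_size):
    # The paper states the iteration threshold xi as a multiple of the
    # training-set size |D_T|, e.g. xi = 100*|D_T|.
    return multiplier * train_size

# d: embedding dimension, tau: initial learning rate (shared across data sets)
common = {"d": 16, "tau": 0.01}

# xi_mult: multiplier on |D_T|; lam: trade-off coefficient;
# c: interpolation constant c1 = ... = cT = cd.
# NOTE: the c values per data set reflect an assumed reading of the quote.
per_dataset = {
    "Amazon-Movies": {**common, "xi_mult": 100, "lam": 0.1, "c": 0.25},
    "Amazon-Kindle": {**common, "xi_mult": 100, "lam": 0.1, "c": 0.25},
    "ML-20m":        {**common, "xi_mult": 300, "lam": 0.1, "c": 1.0},
    "Epinion":       {**common, "xi_mult": 100, "lam": 1.0, "c": 0.5},
}

# Example: the iteration budget scales linearly with training-set size.
budget = max_iterations(per_dataset["Epinion"]["xi_mult"], 50_000)
```

For the LETOR experiments the quote reports d = 5, c = 0.5, and (λ, xi_mult) of (0.1, 200) for MQ2007 and (0.01, 500) for MQ2008 under the same scheme.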