Semantic Visualization with Neighborhood Graph Regularization

Authors: Tuan M. V. Le, Hady W. Lauw

JAIR 2016

Reproducibility variables (result, followed by the LLM's response):
Research Type: Experimental. "To validate the efficacy of Semafore, our comprehensive experiments on a number of real-life text datasets of news articles and Web pages show that the proposed methods outperform the state-of-the-art baselines on objective evaluation metrics."
Researcher Affiliation: Academia. "Tuan M. V. Le EMAIL, Hady W. Lauw EMAIL, School of Information Systems, Singapore Management University, 80 Stamford Road, Singapore 178902"
Pseudocode: No. The paper describes the EM algorithm steps and gradient computations in prose and mathematical equations in Section 5, "Model Fitting", but it does not present them in a structured pseudocode or algorithm block.
Open Source Code: No. The paper does not state that source code is available, nor does it link to a code repository.
Open Datasets: Yes. "We use three real-life, publicly available datasets (Cardoso-Cachopo, 2007) for evaluation." 20News contains newsgroup articles (in English) from 20 classes; Reuters8 contains newswire articles (in English) from 8 classes; Cade12 contains web pages (in Brazilian Portuguese) classified into 12 classes.
Dataset Splits: No. The paper mentions creating balanced classes by sampling documents from each class, and describes an evaluation in which a document's true class is hidden and then predicted from its nearest neighbors. However, it does not specify explicit training, validation, and test splits, so the data partitioning used for model training cannot be reproduced.
Hardware Specification: No. The paper does not provide any details about the hardware (e.g., CPU or GPU model, memory) used to run the experiments.
Software Dependencies: No. The paper does not name the software libraries or frameworks used in the implementation, or their version numbers.
Experiment Setup: Yes. "We set the hyper-parameters to α = 0.01, β = 0.1N and γ = 0.1Z following PLSV (Iwata et al., 2008)." In the E-step, P(z | n, m, Ψ̂) is updated as follows: ... In the M-step, maximizing Q(Ψ | Ψ̂) with respect to θ_zw gives the next estimate of the word probability θ_zw: ... φ_z and x_n cannot be solved in closed form, and are estimated by maximizing Q(Ψ | Ψ̂) using a quasi-Newton method (Liu & Nocedal, 1989).
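The E-step/M-step structure quoted above can be sketched as a toy PLSV-style fitting loop. This is emphatically not the authors' Semafore implementation: the neighborhood-graph regularizer and the priors governed by β and γ are omitted, all array names are assumptions, and the quasi-Newton (L-BFGS) update for φ_z and x_n is replaced by a plain gradient step purely for illustration.

```python
import numpy as np

def plsv_em_sketch(C, Z=4, dims=2, iters=20, alpha=0.01, lr=0.01, seed=0):
    """Toy EM loop in the spirit of PLSV (Iwata et al., 2008).

    C: (N, V) document-word count matrix.  Returns document coordinates,
    topic coordinates, and topic-word distributions.  Hypothetical names;
    graph regularization and coordinate priors are omitted.
    """
    rng = np.random.default_rng(seed)
    N, V = C.shape
    x = rng.normal(scale=0.1, size=(N, dims))    # document coordinates x_n
    phi = rng.normal(scale=0.1, size=(Z, dims))  # topic coordinates phi_z
    theta = rng.dirichlet(np.ones(V), size=Z)    # topic-word probs theta_zw

    for _ in range(iters):
        # P(z | x_n): softmax over negative squared distances in the map
        d2 = ((x[:, None, :] - phi[None, :, :]) ** 2).sum(-1)   # (N, Z)
        pz = np.exp(-0.5 * d2)
        pz /= pz.sum(1, keepdims=True)

        # E-step: responsibilities P(z | n, w) ∝ P(z | x_n) * theta[z, w]
        R = pz[:, :, None] * theta[None, :, :]                  # (N, Z, V)
        R /= R.sum(1, keepdims=True) + 1e-12

        # M-step, closed form: theta[z, w] ∝ Σ_n C[n, w] R[n, z, w] + alpha
        counts = np.einsum('nv,nzv->zv', C, R)
        theta = (counts + alpha) / (counts + alpha).sum(1, keepdims=True)

        # Coordinates: gradient-ascent stand-in for the quasi-Newton step
        W = np.einsum('nv,nzv->nz', C, R)          # expected topic counts
        g = W - W.sum(1, keepdims=True) * pz       # dQ/d(log P(z|x_n)) weights
        diff = phi[None, :, :] - x[:, None, :]     # (N, Z, dims)
        x += lr * np.einsum('nz,nzd->nd', g, diff)
        phi -= lr * np.einsum('nz,nzd->zd', g, diff)
    return x, phi, theta
```

Running this on a small random count matrix produces a 2-D layout in which documents drift toward the topics that explain their words, which is the qualitative behavior the quoted EM updates describe.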