Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty, so scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Correcting Exposure Bias for Link Recommendation
Authors: Shantanu Gupta, Hao Wang, Zachary Lipton, Yuyang Wang
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, experiments on semi-synthetic data based on real-world citation networks show that our methods reliably identify (truly) relevant citations. Additionally, our methods lead to greater diversity in the recommended papers' fields of study. We empirically validate our methods on real-world citation data from the Microsoft Academic Graph (MAG) (Sinha et al., 2015) (Section 6). |
| Researcher Affiliation | Collaboration | Shantanu Gupta 1 2 Hao Wang 3 Zachary Lipton 2 Yuyang Wang 4 1Work done while interning at Amazon 2Machine Learning Department, Carnegie Mellon University 3Department of Computer Science, Rutgers University 4Amazon Web Services (AWS) AI Labs, Palo Alto, CA, USA. |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is available at github.com/shantanu95/exposure-bias-link-rec. |
| Open Datasets | Yes | We empirically validate our methods on real-world citation data from the Microsoft Academic Graph (MAG) (Sinha et al., 2015). |
| Dataset Splits | Yes | We generate train-test-validation splits by taking a topological ordering of the nodes and use the subgraph created from the first 70% for training, next 10% for validation, and the remaining 20% for testing. |
| Hardware Specification | No | We use Amazon Sagemaker (Liberty et al., 2020) to run our experiments. (This mentions a cloud platform but does not specify the exact hardware specifications like GPU models or CPU types.) |
| Software Dependencies | No | We use a SciBERT model (Beltagy et al., 2019), which is a BERT model trained on scientific text, with this library. For training, we use the Adam optimizer (Kingma & Ba, 2014) with a learning rate of 10⁻⁴ and a batch size of 32. (While specific software libraries and models are mentioned, no version numbers are provided for them.) |
| Experiment Setup | Yes | For training, we use the Adam optimizer (Kingma & Ba, 2014) with a learning rate of 10⁻⁴ and a batch size of 32. |