Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty, so scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Correcting Exposure Bias for Link Recommendation
Authors: Shantanu Gupta, Hao Wang, Zachary Lipton, Yuyang Wang
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, experiments on semi-synthetic data based on real-world citation networks show that our methods reliably identify (truly) relevant citations. Additionally, our methods lead to greater diversity in the recommended papers' fields of study. We empirically validate our methods on real-world citation data from the Microsoft Academic Graph (MAG) (Sinha et al., 2015) (Section 6). |
| Researcher Affiliation | Collaboration | Shantanu Gupta 1 2 Hao Wang 3 Zachary Lipton 2 Yuyang Wang 4 1Work done while interning at Amazon 2Machine Learning Department, Carnegie Mellon University 3Department of Computer Science, Rutgers University 4Amazon Web Services (AWS) AI Labs, Palo Alto, CA, USA. |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is available at github.com/shantanu95/exposure-bias-link-rec. |
| Open Datasets | Yes | We empirically validate our methods on real-world citation data from the Microsoft Academic Graph (MAG) (Sinha et al., 2015). |
| Dataset Splits | Yes | We generate train-test-validation splits by taking a topological ordering of the nodes and use the subgraph created from the first 70% for training, next 10% for validation, and the remaining 20% for testing. |
| Hardware Specification | No | We use Amazon Sagemaker (Liberty et al., 2020) to run our experiments. (This mentions a cloud platform but does not specify the exact hardware specifications like GPU models or CPU types.) |
| Software Dependencies | No | We use a SciBERT model (Beltagy et al., 2019), which is a BERT model trained on scientific text, with this library. For training, we use the Adam optimizer (Kingma & Ba, 2014) with a learning rate of 10⁻⁴ and a batch size of 32. (While specific software libraries and models are mentioned, no version numbers are provided for them.) |
| Experiment Setup | Yes | For training, we use the Adam optimizer (Kingma & Ba, 2014) with a learning rate of 10⁻⁴ and a batch size of 32. |