LabelCoRank: Revolutionizing Long Tail Multi-Label Classification with Co-Occurrence Reranking
Authors: Yan Yan, Junyuan Liu, Bo-Wen Zhang
JAIR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental evaluations on popular datasets including MAG-CS, PubMed, and AAPD demonstrate the effectiveness and robustness of LabelCoRank. |
| Researcher Affiliation | Academia | YAN YAN, China University of Mining & Technology, Beijing, China; JUNYUAN LIU, China University of Mining & Technology, Beijing, China; BO-WEN ZHANG, University of Science and Technology Beijing, China |
| Pseudocode | No | The paper describes the methodology using text and mathematical equations, and provides a block diagram in Fig. 1. However, it does not include explicitly labeled pseudocode or algorithm blocks with structured steps. |
| Open Source Code | Yes | The implementation code is publicly available at https://github.com/821code/LabelCoRank. |
| Open Datasets | Yes | The proposed method is evaluated on three well-known public datasets: MAG-CS, PubMed, and AAPD. ... MAG-CS[32]: The dataset consists of 705,407 papers from the Microsoft Academic Graph (MAG), ... PubMed[21]: The dataset comprises 898,546 papers sourced from PubMed, ... AAPD[41]: The dataset comprises English abstracts of computer science papers sourced from arxiv.org, ... |
| Dataset Splits | Yes | Table 1. Dataset statistics. 𝑁trn and 𝑁tst represent the number of documents in the training set and test set, respectively. ... MAG-CS: 564,340 train / 70,534 test; PubMed: 718,837 / 89,855; AAPD: 54,840 / 1,000 |
| Hardware Specification | Yes | Experiments were conducted on a computer equipped with an Nvidia RTX 4090 GPU and 128 GB of RAM. |
| Software Dependencies | No | The paper mentions "The AdamW optimizer was employed" and "utilized the RoBERTa pre-trained model as a feature extractor," but it does not specify version numbers for any software libraries, frameworks (e.g., PyTorch, TensorFlow), or programming languages used to implement the methodology. |
| Experiment Setup | Yes | The AdamW optimizer was employed with a learning rate of 1e-5, sentence truncation set to 512, and a batch size of 16. The threshold hyperparameter, 𝛼, was set to 0.3, and the loss-function weight hyperparameter, 𝛽, was set to 0.3, 0.3, and 0.25 for the MAG-CS, PubMed, and AAPD datasets, respectively. The number of selected labels, 𝛾, was 30, 35, and 20 for these datasets, respectively. |
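The reported hyperparameters can be collected into a small configuration sketch. This is a minimal illustration of the settings quoted above, not code from the paper's released repository; the `config_for` helper and the dictionary layout are assumptions for readability.

```python
# Dataset-specific hyperparameters reported in the paper:
# beta is the loss-function weight, gamma the number of selected labels.
PER_DATASET = {
    "MAG-CS": {"beta": 0.30, "gamma": 30},
    "PubMed": {"beta": 0.30, "gamma": 35},
    "AAPD":   {"beta": 0.25, "gamma": 20},
}

# Settings shared across all three datasets.
COMMON = {
    "optimizer": "AdamW",
    "learning_rate": 1e-5,
    "max_sequence_length": 512,  # sentence truncation
    "batch_size": 16,
    "alpha": 0.3,                # threshold hyperparameter
}

def config_for(dataset: str) -> dict:
    """Merge the shared settings with the dataset-specific ones (hypothetical helper)."""
    return {**COMMON, **PER_DATASET[dataset]}
```

For example, `config_for("AAPD")` yields the common settings plus `beta=0.25` and `gamma=20`.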