COMM: Concentrated Margin Maximization for Robust Document-Level Relation Extraction
Authors: Zhichao Duan, Tengyu Pan, Zhenyu Li, Xiuxing Li, Jianyong Wang
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments and analysis demonstrate the versatility and effectiveness of COMM, especially its robustness when trained on low-quality data (achieves >10% performance gains). |
| Researcher Affiliation | Academia | Zhichao Duan¹, Tengyu Pan¹, Zhenyu Li¹, Xiuxing Li², Jianyong Wang¹* ¹Tsinghua University ²Beijing Institute of Technology |
| Pseudocode | Yes | Algorithm 1: COMM. Input: REncoder, Batch Data, Relations R. Parameters: γ, m. 1: for all D in Batch Data do 2: L_CMM ← 0 3: T ← IARA(REncoder, D, \|R\| + 1) 4: for all entity pairs (e_s, e_o) in D do 5: t_TH ← T(e_s, e_o)[0] 6: for all r in R do 7: if r ∈ P_T then 8: t_r⁺ ← T(e_s, e_o)[r] 9: d_r⁺ ← t_r⁺ − t_TH 10: q_r⁺ ← log(σ(d_r⁺)) 11: L_CMM ← L_CMM − q_r⁺ (1 − q_r⁺)^γ 12: else if r ∈ N_T then 13: t_r⁻ ← T(e_s, e_o)[r] 14: d_r⁻ ← t_TH − t_r⁻ 15: q_r⁻ ← log(min(σ(d_r⁻) + m, 1)) 16: L_CMM ← L_CMM − q_r⁻ 17: end if 18: end for 19: end for 20: Perform Stochastic Gradient Descent using L_CMM 21: end for |
| Open Source Code | No | All results are reported based on the public code of these models with three repeated executions. (This refers to the relational encoders used, not the COMM framework itself) - No explicit statement for COMM code. |
| Open Datasets | Yes | DocRED (Yao et al. 2019) represents a comprehensive, crowd-sourced dataset for DocRE created from Wikipedia articles. [...] Re-DocRED (Tan et al. 2022b) constitutes an augmented variant of the DocRED dataset, specifically designed to rectify the prevalent issue of false negatives in DocRE. |
| Dataset Splits | Yes | For training, we utilize the training sets of both DocRED and Re-DocRED, each comprising 3,053 documents. During validation and testing phases, in order to better assess the model's performance, we employ the higher-quality Re-DocRED validation and test sets, each consisting of 500 documents. |
| Hardware Specification | No | COMM is implemented utilizing the PyTorch framework (Paszke et al. 2019). - No specific hardware details provided. |
| Software Dependencies | No | COMM is implemented utilizing the PyTorch framework (Paszke et al. 2019) [...] The COMM framework is optimized using the AdamW optimizer (Loshchilov and Hutter 2017). - Specific version numbers for software are not provided. |
| Experiment Setup | Yes | The parameter m is selected through a grid search over the values [0.1, 0.2, 0.3, 0.4], while γ is chosen through [1, 1.2, 1.4, 1.6, 2]. The COMM framework is optimized using the AdamW optimizer (Loshchilov and Hutter 2017). |
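
The loss accumulation in Algorithm 1 (steps 4 to 19) and the hyperparameter grid from the Experiment Setup row can be illustrated with a small, framework-free Python sketch. This is not the authors' code, which the table notes is not stated to be released. It makes several assumptions: the logits `T(e_s, e_o)` are taken as already computed (the IARA step and the relational encoder are not modeled), index 0 holds the threshold-class logit, every relation not listed as positive is treated as a member of N_T, and the subtraction signs in the loss update are reconstructed from the extracted pseudocode. The function names `cmm_pair_loss`, `cmm_batch_loss`, and `hyperparameter_grid` are hypothetical.

```python
import math
from itertools import product

# Search grids quoted in the Experiment Setup row.
M_GRID = [0.1, 0.2, 0.3, 0.4]
GAMMA_GRID = [1, 1.2, 1.4, 1.6, 2]


def sigmoid(x: float) -> float:
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def log_sigmoid(x: float) -> float:
    """log(sigmoid(x)) computed without forming sigmoid(x) first."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def cmm_pair_loss(logits, positives, gamma, m):
    """Loss contribution of one entity pair (steps 5 to 18, as reconstructed).

    logits[0] is the threshold logit t_TH; logits[r] for r >= 1 is the logit
    of relation r. `positives` is the set of relation indices in P_T.
    Assumes m > 0 (as in the quoted grid) or moderate logits, so that the
    argument of the logarithm in the negative branch stays positive.
    """
    t_th = logits[0]
    loss = 0.0
    for r in range(1, len(logits)):
        if r in positives:
            d = logits[r] - t_th              # positive logit should exceed threshold
            q = log_sigmoid(d)                # q <= 0
            loss -= q * (1.0 - q) ** gamma    # assumed sign: L <- L - q (1 - q)^gamma
        else:
            d = t_th - logits[r]              # threshold should exceed negative logit
            q = math.log(min(sigmoid(d) + m, 1.0))  # margin m caps the penalty at 0
            loss -= q                         # assumed sign: L <- L - q
    return loss


def cmm_batch_loss(pairs, gamma, m):
    """Sum the per-pair losses over (logits, positives) tuples (steps 4 to 19)."""
    return sum(cmm_pair_loss(logits, pos, gamma, m) for logits, pos in pairs)


def hyperparameter_grid():
    """All (m, gamma) combinations from the two quoted search ranges."""
    return list(product(M_GRID, GAMMA_GRID))


if __name__ == "__main__":
    # Toy pair: threshold logit 0.0, relation 1 is positive, relation 2 is negative.
    print(cmm_pair_loss([0.0, 2.0, -3.0], {1}, gamma=1.0, m=0.1))
    print(len(hyperparameter_grid()), "configurations")
```

Under these assumptions, a negative relation whose `sigmoid(d) + m` reaches 1 contributes nothing to the loss, so negatives already separated from the threshold by the margin stop producing gradient. In an actual PyTorch implementation the same arithmetic would be applied to tensors so that the loss is differentiable.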