COMM: Concentrated Margin Maximization for Robust Document-Level Relation Extraction

Authors: Zhichao Duan, Tengyu Pan, Zhenyu Li, Xiuxing Li, Jianyong Wang

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments and analysis demonstrate the versatility and effectiveness of COMM, especially its robustness when trained on low-quality data (achieves >10% performance gains).
Researcher Affiliation | Academia | Zhichao Duan¹, Tengyu Pan¹, Zhenyu Li¹, Xiuxing Li², Jianyong Wang¹*; ¹Tsinghua University, ²Beijing Institute of Technology
Pseudocode | Yes | Algorithm 1: COMM
Input: REncoder, Batch Data, Relations R
Parameter: γ, m
for all D in Batch Data do
    L_CMM ← 0
    T ← IARA(REncoder, D, |R| + 1)
    for all Entity Pair (e_s, e_o) in D do
        t_TH ← T(e_s, e_o)[0]
        for all r in R do
            if r ∈ P_T then
                t_r^+ ← T(e_s, e_o)[r]
                d_r^+ ← t_r^+ − t_TH
                q_r^+ ← log(σ(d_r^+))
                L_CMM ← L_CMM − q_r^+ (1 − q_r^+)^γ
            else if r ∈ N_T then
                t_r^− ← T(e_s, e_o)[r]
                d_r^− ← t_TH − t_r^−
                q_r^− ← log(min(σ(d_r^−) + m, 1))
                L_CMM ← L_CMM − q_r^−
            end if
        end for
    end for
    Perform Stochastic Gradient Descent using L_CMM
end for
Open Source Code | No | All results are reported based on the public code of these models with three repeated executions. - This sentence refers to the relational encoders used, not to the COMM framework itself; the paper contains no explicit statement about releasing COMM code.
Open Datasets | Yes | DocRED (Yao et al. 2019) represents a comprehensive, crowd-sourced dataset for DocRE created from Wikipedia articles. [...] Re-DocRED (Tan et al. 2022b) constitutes an augmented variant of the DocRED dataset, specifically designed to rectify the prevalent issue of false negatives in DocRE.
Dataset Splits | Yes | For training, we utilize the training sets of both DocRED and Re-DocRED, each comprising 3,053 documents. During validation and testing phases, in order to better assess the model's performance, we employ the higher-quality Re-DocRED validation and test sets, each consisting of 500 documents.
Hardware Specification | No | COMM is implemented utilizing the PyTorch framework (Paszke et al. 2019). - No specific hardware details are provided.
Software Dependencies | No | COMM is implemented utilizing the PyTorch framework (Paszke et al. 2019) [...] The COMM framework is optimized using the AdamW optimizer (Loshchilov and Hutter 2017). - Specific version numbers for software are not provided.
Experiment Setup | Yes | The parameter m is selected through a grid search over the values [0.1, 0.2, 0.3, 0.4], while γ is chosen through [1, 1.2, 1.4, 1.6, 2]. The COMM framework is optimized using the AdamW optimizer (Loshchilov and Hutter 2017).
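
To make the Pseudocode and Experiment Setup rows more concrete, the following is a minimal pure-Python sketch of the per-entity-pair loss from Algorithm 1, together with the hyperparameter grid quoted above. It is an illustration under stated assumptions, not the authors' implementation: the function names (`log_sigmoid`, `sigmoid`, `comm_pair_loss`) are hypothetical, the logits are plain floats rather than outputs of IARA and a relational encoder, and the minus signs in the loss updates are read from the garbled extraction of the algorithm.

```python
import math
from itertools import product


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def comm_pair_loss(logits, positives, negatives, gamma=1.0, m=0.1):
    """Loss contribution of one entity pair, following Algorithm 1.

    Assumptions (not from the source): `logits` is a list of floats where
    logits[0] is the threshold-class logit t_TH and logits[r] (r >= 1) is
    the logit of relation r; `positives` and `negatives` are sets of
    relation indices standing in for P_T and N_T.
    """
    t_th = logits[0]
    loss = 0.0
    for r in range(1, len(logits)):
        if r in positives:
            d_pos = logits[r] - t_th           # margin above the threshold
            q_pos = log_sigmoid(d_pos)         # q <= 0
            loss -= q_pos * (1.0 - q_pos) ** gamma
        elif r in negatives:
            d_neg = t_th - logits[r]           # margin below the threshold
            q_neg = math.log(min(sigmoid(d_neg) + m, 1.0))
            loss -= q_neg
    return loss


# Hyperparameter grid quoted in the Experiment Setup row.
M_VALUES = [0.1, 0.2, 0.3, 0.4]
GAMMA_VALUES = [1, 1.2, 1.4, 1.6, 2]
GRID = list(product(M_VALUES, GAMMA_VALUES))  # 4 * 5 = 20 candidate settings

if __name__ == "__main__":
    toy_logits = [0.0, 2.0, -2.0]  # threshold, one positive, one negative
    for m, gamma in GRID:
        value = comm_pair_loss(toy_logits, {1}, {2}, gamma=gamma, m=m)
        print(f"m={m}, gamma={gamma}: loss={value:.4f}")
```

In this sketch the margin m caps the negative term: once σ(d_r^−) + m reaches 1, that negative relation contributes zero loss, so well-separated negatives stop contributing. The loop at the end only prints toy loss values for each setting; actual selection of m and γ would rely on validation performance, which the sketch does not model.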