Soft-Label Integration for Robust Toxicity Classification

Authors: Zelei Cheng, Xian Wu, Jiahao Yu, Shuo Han, Xin-Qiang Cai, Xinyu Xing

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results demonstrate that our approach outperforms existing baseline methods in terms of both average and worst-group accuracy, confirming its effectiveness in leveraging crowdsourced annotations to achieve more effective and robust toxicity classification.
Researcher Affiliation | Academia | Zelei Cheng, Northwestern University, Evanston, USA; Xian Wu, Northwestern University, Evanston, USA; Jiahao Yu, Northwestern University, Evanston, USA; Shuo Han, Northwestern University, Evanston, USA; Xin-Qiang Cai, The University of Tokyo, Tokyo, Japan; Xinyu Xing, Northwestern University, Evanston, USA
Pseudocode | Yes | We provide the full algorithm in Algorithm 1.
Open Source Code | Yes | We release the data and code at https://github.com/chengzelei/crowdsource_toxicity_classification.
Open Datasets | Yes | Additionally, we conduct our experiments on the public HateXplain dataset [20].
Dataset Splits | Yes | For each classification task, we have a large training set with crowdsourced annotations (i.e., 6,941 samples for toxic question classification and 28,194 samples for toxic response classification) and a testing set containing 2,000 samples with ground truth. The validation set with ground truth includes a small number of samples (i.e., 1,000 samples) from the training set.
Hardware Specification | Yes | We train the machine learning models on a server with 8 NVIDIA A100 80GB GPUs and 4 TB of memory for all the learning algorithms.
Software Dependencies | Yes | The toxicity classifier and soft-label weight estimator are both implemented with version 4.34.1 of the transformers library [62].
Experiment Setup | Yes | We list the hyper-parameter settings for all experiments in Appendix C.3.
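
Based on the Open Datasets, Dataset Splits, and Software Dependencies rows above, the following is a minimal, illustrative sketch of how the reported environment could be reproduced with the transformers and datasets libraries. It is not the authors' released implementation: the Hub identifier for HateXplain, the bert-base-uncased checkpoint, the binary label count, and the split seed are assumptions made only for illustration; the authoritative code is in the linked repository.

```python
# Hypothetical setup sketch (not the authors' released code): load the public
# HateXplain dataset and a transformer-based toxicity classifier using the
# transformers library (the report cites version 4.34.1).
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# HateXplain is hosted on the Hugging Face Hub; the identifier and any
# preprocessing the authors applied are assumptions here.
dataset = load_dataset("hatexplain")

# The backbone checkpoint and binary label count are illustrative choices,
# not details confirmed by the paper.
checkpoint = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# Carve a small ground-truth validation set out of the training split,
# mirroring the 1,000-sample validation set described in the Dataset Splits row.
split = dataset["train"].train_test_split(test_size=1000, seed=0)
train_set, val_set = split["train"], split["test"]
```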