A General Framework for Adversarial Label Learning
Authors: Chidubem Arachie, Bert Huang
JMLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show that our method can train without labels and outperforms other approaches for weakly supervised learning. |
| Researcher Affiliation | Academia | Chidubem Arachie, Department of Computer Science, Virginia Tech, Blacksburg, VA 24061, USA; Bert Huang, Department of Computer Science, Data Intensive Studies Center, Tufts University, Medford, MA 02155, USA |
| Pseudocode | Yes | Algorithm 1 Adversarial Label Learning and Algorithm 2 Multiclass Adversarial Label Learning |
| Open Source Code | No | The paper does not provide an explicit statement or link to open-source code for the methodology described. |
| Open Datasets | Yes | Fashion-MNIST (Xiao et al., 2017), Breast Cancer (Blake and Merz, 1998; Street et al., 1993), OBS Network (Rajab et al., 2016), Cardiotocography (Ayres-de Campos et al., 2000), Clave Direction (Vurkaç, 2011), Credit Card (Blake and Merz, 1998), Statlog Satellite (Blake and Merz, 1998), Phishing Websites (Mohammad et al., 2012), Wine Quality (Cortez et al., 2009), Microsoft COCO: Common Objects in Context (Plummer et al., 2015), RTE (Snow et al., 2008), Word Similarity (Snow et al., 2008), Street View House Numbers (SVHN) (Netzer et al., 2018) |
| Dataset Splits | Yes | We randomly split each dataset such that 30% is used as weak supervision data, 40% is used as training data, and 30% is used as test data. For our experiments, we use 10 such random splits and report the mean of the results. We assume we have access to a labeled validation set consisting of 1% of the available data. We calculate error and precision bounds for the pseudolabels with four-fold cross-validation on the validation set. |
| Hardware Specification | No | We thank NVIDIA for their support through the GPU Grant Program and Amazon for their support via the AWS Cloud Credits for Research program. (This acknowledges support but does not specify hardware models or configurations used for experiments.) |
| Software Dependencies | No | The paper mentions using Adagrad for optimization and describes model architectures (e.g., six-layer convolutional neural network, logistic regression), and references pre-trained models (BERT, ResNet-101), but it does not provide specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x). |
| Experiment Setup | Yes | We use the sigmoid function as the parameterized model fθ for estimating class probabilities in ALL and GE, i.e., pθ = fθ(x) = 1/(1 + exp(−θᵀx)). We use a fixed upper bound of b1 = b2 = b3 = 0.3. For multiclass classification with deep neural networks, the model uses the standard cross-entropy loss. We optimize the loss functions using Adagrad (Duchi et al., 2011). Across the experiments, Multi-ALL outperforms both Snorkel and averaging in all settings, showing a strong ability to fuse noisy signals and to avoid being confounded by redundant signals. We use the same deep neural network architecture for all experiments: a six-layer convolutional neural network in which each layer contains a max-pooling unit, a ReLU activation, and dropout; the final layer is fully connected with softmax output. |
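The dataset-split protocol reported above (30% weak supervision, 40% training, 30% test, repeated over 10 random splits) can be sketched as follows. This is a minimal illustration of that protocol, not code from the paper; the function name and seeding scheme are assumptions.

```python
import numpy as np

def random_splits(n_samples, n_repeats=10, seed=0):
    """Sketch of the paper's split protocol: 30% weak-supervision data,
    40% training data, 30% test data, over 10 random splits.
    The function name and seed handling are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    splits = []
    for _ in range(n_repeats):
        idx = rng.permutation(n_samples)
        n_weak = int(0.3 * n_samples)
        n_train = int(0.4 * n_samples)
        weak = idx[:n_weak]
        train = idx[n_weak:n_weak + n_train]
        test = idx[n_weak + n_train:]
        splits.append((weak, train, test))
    return splits

splits = random_splits(1000)
weak, train, test = splits[0]
print(len(weak), len(train), len(test))  # 300 400 300
```

Reporting the mean over the 10 splits then amounts to training and evaluating once per `(weak, train, test)` tuple and averaging the resulting metrics.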
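The experiment-setup row describes a sigmoid-parameterized model pθ = 1/(1 + exp(−θᵀx)) optimized with Adagrad (Duchi et al., 2011). A minimal sketch of that combination on plain log loss is below; the learning rate, epoch count, and function names are assumptions for illustration, and this omits the adversarial constraints that define ALL itself.

```python
import numpy as np

def sigmoid(z):
    # p_theta = f_theta(x) = 1 / (1 + exp(-theta^T x)), as in the setup description
    return 1.0 / (1.0 + np.exp(-z))

def train_adagrad(X, y, lr=0.1, eps=1e-8, epochs=100):
    """Logistic model trained with Adagrad on mean log loss.
    Hyperparameters here are illustrative, not the paper's."""
    theta = np.zeros(X.shape[1])
    g2 = np.zeros_like(theta)  # Adagrad's accumulated squared gradients
    for _ in range(epochs):
        p = sigmoid(X @ theta)
        grad = X.T @ (p - y) / len(y)      # gradient of the mean log loss
        g2 += grad ** 2
        theta -= lr * grad / (np.sqrt(g2) + eps)  # per-coordinate adaptive step
    return theta

# Toy usage on a linearly separable 1-D problem
X = np.array([[1.0], [2.0], [-1.0], [-2.0]])
y = np.array([1.0, 1.0, 0.0, 0.0])
theta = train_adagrad(X, y)
print((sigmoid(X @ theta) > 0.5).astype(int))
```

Adagrad's per-coordinate step scaling is what makes it a reasonable default here: features that receive large gradients early are automatically given smaller steps later, without per-dataset tuning.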