Learning from weak labelers as constraints
Authors: Vishwajeet Agrawal, Rattana Pukdee, Maria-Florina Balcan, Pradeep Ravikumar
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we demonstrate the superior performance and robustness of our method on a popular weak supervision benchmark. ... 5 EXPERIMENTAL EVALUATION ... Table 1: Average test accuracy and the corresponding standard error (over 5 random train-val-test split of the data) of our proposed algorithm and the baselines. |
| Researcher Affiliation | Academia | Vishwajeet Agrawal*, Rattana Pukdee*, Maria-Florina Balcan, Pradeep Ravikumar Carnegie Mellon University EMAIL |
| Pseudocode | Yes | We also provide a compact version of our algorithm in Algorithm 1 in Appendix C. ... Algorithm 1 Learning from weak labeler constraints |
| Open Source Code | No | The paper does not provide an explicit statement or link for the open-sourcing of the code for the methodology described. |
| Open Datasets | Yes | We show a comparison of our proposed method and baselines on 8 text classification datasets from the WRENCH benchmark Zhang et al. (2021). |
| Dataset Splits | Yes | Table 1: Average test accuracy and the corresponding standard error (over 5 random train-val-test split of the data) of our proposed algorithm and the baselines. ... We tune these hyperparameters on the validation set of size 100 for each dataset. |
| Hardware Specification | No | The paper does not specify the hardware used to run the experiments; only the neural network architecture and training parameters are described. |
| Software Dependencies | No | The paper mentions 'BERT text embeddings', 'Adam optimizer', 'scipy.linprog in Python' and 'cvxpy.CLARABEL convex solver' but does not provide specific version numbers for these software dependencies. |
| Experiment Setup | Yes | For all methods and datasets, we use a neural network with a single hidden layer and a hidden size of 16 on top of the pre-trained BERT text embeddings. The neural network is trained with a full batch gradient descent with an Adam optimizer with a learning rate in [0.001, 0.003, 0.01], weight decay in [0.001, 0.003, 0.01] and a number of epochs in range(1, 500, 5). |
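
The hyperparameter search described in the Experiment Setup row can be sketched as a small grid enumeration. This is not code from the paper; it only reproduces the stated search space (learning rate, weight decay, and epoch count), with the model and training details noted in comments.

```python
from itertools import product

# Search space quoted in the paper's experiment setup.
# Model (per the paper): a single-hidden-layer MLP, hidden size 16,
# on top of frozen pre-trained BERT text embeddings, trained
# full-batch with Adam and tuned on a validation set of size 100.
learning_rates = [0.001, 0.003, 0.01]
weight_decays = [0.001, 0.003, 0.01]
epoch_choices = list(range(1, 500, 5))  # 1, 6, 11, ..., 496

# Every (lr, weight_decay, epochs) combination to be tried.
grid = list(product(learning_rates, weight_decays, epoch_choices))
print(len(grid))  # 3 * 3 * 100 = 900 configurations
```

Enumerating the grid up front makes the tuning budget explicit: 900 configurations per dataset, each scored on the 100-example validation split.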