Consistent Multiclass Algorithms for Complex Metrics and Constraints
Authors: Harikrishna Narasimhan, Harish G. Ramaswamy, Shiv Kumar Tavker, Drona Khurana, Praneeth Netrapalli, Shivani Agarwal
JMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on a variety of multiclass classification tasks and fairness constrained problems show that our algorithms compare favorably to the state-of-the-art baselines. |
| Researcher Affiliation | Collaboration | Harikrishna Narasimhan (Google Research, Mountain View, USA); Harish G. Ramaswamy (Indian Institute of Technology Madras, Chennai, India); Shiv Kumar Tavker (Amazon Inc., Bengaluru, India); Drona Khurana (University of Colorado Boulder, USA); Praneeth Netrapalli (Google Research India, Bengaluru, India); Shivani Agarwal (University of Pennsylvania, Philadelphia, USA) |
| Pseudocode | Yes | Algorithm 1: Frank-Wolfe (FW) Algorithm for OP1 with Smooth Convex ψ; Algorithm 2: Gradient Descent-Ascent (GDA) Algorithm for OP1 with Non-smooth Convex ψ; Algorithm 3: Ellipsoid Algorithm for OP1 with Non-smooth Convex ψ; Algorithm 3(a): John-Löwner Ellipsoid (JLE) Construction; Algorithm 4: Bisection Algorithm for OP1 with Ratio-of-linear ψ; Algorithm 5: Split Frank-Wolfe (Split FW) Algorithm for OP2 with Smooth Convex ψ; Algorithm 6: Constrained GDA (Con GDA) Algorithm for OP2 with Non-smooth Convex ψ; Algorithm 7: Constrained Ellipsoid (Con Ellipsoid) Algorithm for OP2 with Non-smooth Convex ψ; Algorithm 8: Constrained Bisection (Con Bisection) Algorithm for OP2 with Ratio-of-linear ψ; Algorithm 9: Plug-in Based LMO; Algorithm 10: Plug-in Based LMO for Fairness Problems |
| Open Source Code | Yes | Code available at: https://github.com/shivtavker/constrained-classification |
| Open Datasets | Yes | A summary of the datasets we use is provided in Tables 4 and 5, along with the model architecture we use in each case. The details of the data pre-processing are provided in Appendix B. With the exception of the CIFAR datasets, which come with standard train-test splits, we split all other datasets into 2/3-rd for training and 1/3-rd for testing, and repeat our experiments over multiple such random splits. All our methods were implemented in Python using PyTorch and Scikit-learn. Table 4: Multi-class datasets used in our experiments. Table 5: Multi-group fairness datasets with binary labels used in our experiments. UCI Machine Learning repository (Frank and Asuncion, 2010) |
| Dataset Splits | Yes | With the exception of the CIFAR datasets, which come with standard train-test splits, we split all other datasets into 2/3-rd for training and 1/3-rd for testing, and repeat our experiments over multiple such random splits. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. It mentions using a ResNet-50 model, but not the specific hardware it was trained or evaluated on. |
| Software Dependencies | No | All our methods were implemented in Python using PyTorch and Scikit-learn. The paper mentions software names (Python, PyTorch, Scikit-learn) but does not provide specific version numbers for any of them. |
| Experiment Setup | Yes | We train a ResNet-50 model for the class probability estimator η̂, using SGD to minimize the standard cross-entropy loss. We use a batch size of 64, a base learning rate of 0.01 (with a warm-up cosine schedule), and a momentum of 0.9. We apply a weight decay of 0.01 and train for 39 epochs. In Appendix B.1, we provide other details such as how we choose the hyper-parameters for our algorithms and the baselines. |
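The 2/3-train, 1/3-test random-split protocol quoted above can be sketched as follows. This is a minimal illustration using only the Python standard library; the dataset size, seed values, and number of repetitions are placeholders, not taken from the paper:

```python
import random

def random_split(n_examples, train_frac=2/3, seed=0):
    """Shuffle example indices and split them into train/test portions."""
    rng = random.Random(seed)
    indices = list(range(n_examples))
    rng.shuffle(indices)
    n_train = int(round(train_frac * n_examples))
    return indices[:n_train], indices[n_train:]

# Repeat over multiple random splits, as the report describes.
for seed in range(5):  # the number of repetitions is illustrative
    train_idx, test_idx = random_split(300, seed=seed)
    assert len(train_idx) == 200 and len(test_idx) == 100
```

Each repetition uses a fresh seed so the reported results average over several independent train/test partitions, which is what "multiple such random splits" implies.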
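The "warm-up cosine schedule" in the experiment setup is not specified further in this report; a common reading is a linear warm-up to the base learning rate followed by cosine decay. The sketch below assumes a hypothetical 5-epoch warm-up over the 39 training epochs; the warm-up length is my assumption, while the base rate (0.01) and epoch count (39) come from the setup quoted above:

```python
import math

def warmup_cosine_lr(epoch, base_lr=0.01, warmup_epochs=5, total_epochs=39):
    """Linear warm-up to base_lr, then cosine decay toward zero.

    warmup_epochs is a hypothetical choice; the paper only states the
    base learning rate (0.01) and the total number of epochs (39).
    """
    if epoch < warmup_epochs:
        # Linear ramp: 0 at epoch 0, base_lr at the end of warm-up.
        return base_lr * epoch / warmup_epochs
    # Cosine decay over the remaining epochs.
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# The learning rate peaks at the end of warm-up, then decays smoothly:
# warmup_cosine_lr(5) == 0.01
```

In a PyTorch training loop this rate would be applied per epoch (or per step) to the SGD optimizer configured with momentum 0.9 and weight decay 0.01, as described in the setup.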