Three-Way Trade-Off in Multi-Objective Learning: Optimization, Generalization and Conflict-Avoidance
Authors: Lisha Chen, Heshan Fernando, Yiming Ying, Tianyi Chen
JMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on various multi-task learning benchmarks are performed to demonstrate the practical applicability. |
| Researcher Affiliation | Academia | Lisha Chen, Department of Electrical, Computer & Systems Engineering, Rensselaer Polytechnic Institute, United States; Heshan Fernando, Department of Electrical, Computer & Systems Engineering, Rensselaer Polytechnic Institute, United States; Yiming Ying, School of Mathematics and Statistics, University of Sydney, NSW, Australia; Tianyi Chen, Department of Electrical, Computer & Systems Engineering, Rensselaer Polytechnic Institute, United States |
| Pseudocode | Yes | Algorithm 1 Regularized MGDA; Algorithm 2 MoDo Stochastic MGDA; Algorithm 3 SMG (Liu and Vicente, 2021); Algorithm 4 MoCo (Fernando et al., 2023) |
| Open Source Code | Yes | Code is available at https://github.com/heshandevaka/Trade-Off-MOL. |
| Open Datasets | Yes | We further verify our theory in the NC case on MNIST image classification (LeCun, 1998) using a multi-layer perceptron and three objectives: cross-entropy, mean squared error (MSE), and Huber loss. |
| Dataset Splits | Yes | The training, validation, and testing data sizes are 50k, 10k, and 10k, respectively. |
| Hardware Specification | Yes | Experiments are done on a machine with GPU NVIDIA RTX A5000. |
| Software Dependencies | Yes | We use MATLAB R2021a for the synthetic experiments in the strongly convex case, and Python 3.8, CUDA 11.7, and PyTorch 1.8.0 for the other experiments. |
| Experiment Setup | Yes | The default parameters are T = 100, α = 0.01, γ = 0.001. In other words, in Figure 3a, we fix α = 0.01, γ = 0.001, and vary T; in Figure 3b, we fix T = 100, γ = 0.001, and vary α; and in Figure 3c, we fix T = 100, α = 0.01, and vary γ. |
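
The one-at-a-time sweep described in the Experiment Setup row can be sketched as below. This is a minimal illustration, not the authors' code: the defaults come from the row above, but the grid values in `GRIDS` and the helper name `one_at_a_time` are assumptions made for the example, since the excerpt does not list the values that were varied.

```python
# Defaults quoted in the Experiment Setup row: T = 100, alpha = 0.01, gamma = 0.001.
DEFAULTS = {"T": 100, "alpha": 0.01, "gamma": 0.001}

# Hypothetical sweep grids (assumption): the source does not state the varied values.
GRIDS = {
    "T": [10, 100, 1000],            # varied in Figure 3a
    "alpha": [0.001, 0.01, 0.1],     # varied in Figure 3b
    "gamma": [0.0001, 0.001, 0.01],  # varied in Figure 3c
}


def one_at_a_time(defaults, grids):
    """Yield (varied_name, config) pairs, changing one parameter per config."""
    for name, values in grids.items():
        for value in values:
            config = dict(defaults)  # copy so the defaults stay untouched
            config[name] = value     # override only the swept parameter
            yield name, config


configs = list(one_at_a_time(DEFAULTS, GRIDS))

for name, config in configs:
    print(f"vary {name}: {config}")
```

Each generated configuration differs from the defaults in at most one entry, which matches the description of fixing two of the three parameters in each panel of Figure 3.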