Selective Task Group Updates for Multi-Task Optimization

Authors: Wooseong Jeong, Kuk-Jin Yoon

ICLR 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Experimental settings. We assess the proposed techniques using three datasets: NYUD-V2 for indoor vision tasks (Silberman et al., 2012), PASCAL-Context for outdoor scenarios (Mottaghi et al., 2014), and Taskonomy (Zamir et al., 2018) for a large number of tasks. Multi-task performance is compared using the metric introduced by Maninis et al. (2019). This metric averages the per-task performance relative to the single-task baseline b: Δ_m = (1/K) Σ_{i=1}^{K} (−1)^{l_i} (M_{m,i} − M_{b,i}) / M_{b,i}, where l_i = 1 if a lower value of measure M_i indicates better performance for task τ_i, and 0 otherwise. More details are introduced in Appendix C.
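The Δ_m metric above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name and the parallel-list signature are assumptions, and the formula follows the definition quoted from the paper.

```python
def multi_task_delta(M_m, M_b, lower_is_better):
    """Average relative per-task performance vs. single-task baselines.

    M_m: per-task metric values of the multi-task model.
    M_b: per-task metric values of the single-task baselines.
    lower_is_better: per-task flags l_i (True if a lower metric is better).
    """
    K = len(M_m)
    total = 0.0
    for m, b, lower in zip(M_m, M_b, lower_is_better):
        sign = -1.0 if lower else 1.0  # (-1)^{l_i}
        total += sign * (m - b) / b
    return total / K
```

For example, a model that improves a higher-is-better score from 78 to 80 and a lower-is-better error from 0.6 to 0.5 yields a positive Δ_m, since both relative changes favor the multi-task model.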
Researcher Affiliation Academia Wooseong Jeong & Kuk-Jin Yoon Korea Advanced Institute of Science and Technology EMAIL
Pseudocode Yes Algorithm 1: Tracking Proximal Inter-Task Affinity for Task Group Updates
Open Source Code No We implement our experiments on top of publicly available code from Ye & Xu (2022b). We run our experiments on A6000 GPUs.
Open Datasets Yes We assess the proposed techniques using three datasets: NYUD-V2 for indoor vision tasks (Silberman et al., 2012), PASCAL-Context for outdoor scenarios (Mottaghi et al., 2014), and Taskonomy (Zamir et al., 2018) for a large number of tasks.
Dataset Splits No We assess the proposed techniques using three datasets: NYUD-V2 for indoor vision tasks (Silberman et al., 2012), PASCAL-Context for outdoor scenarios (Mottaghi et al., 2014), and Taskonomy (Zamir et al., 2018) for a large number of tasks. Explanation: The paper mentions using well-known datasets but does not explicitly provide details about the training, validation, or test splits (e.g., percentages or sample counts) used for these datasets in the provided text.
Hardware Specification Yes We run our experiments on A6000 GPUs.
Software Dependencies No Table C.1: Hyperparameters for experiments. Optimizer: Adam (Kingma & Ba, 2014); Scheduler: Polynomial Decay; Minibatch size: 8; Number of iterations: 40,000; Backbone (Transformer): ViT (Dosovitskiy et al., 2020); Learning rate: 0.00002; Weight decay: 0.000001; Affinity decay factor β: 0.001. Explanation: The paper lists software components like "Adam" and "ViT" but does not provide specific version numbers for these or other software dependencies.
Experiment Setup Yes Implementation Details. For experiments, we adopt ViT (Dosovitskiy et al., 2020) pre-trained on ImageNet-22K (Deng et al., 2009) as the multi-task encoder. Task-specific decoders merge the multi-scale features extracted by the encoder to generate the outputs for each task. The models are trained for 40,000 iterations on both NYUD (Silberman et al., 2012) and PASCAL (Everingham & Winn, 2012) datasets with batch size 8. We used the Adam optimizer with a learning rate of 2×10⁻⁵ and a weight decay of 1×10⁻⁶ with a polynomial learning rate schedule. The cross-entropy loss was used for semantic segmentation, human parts estimation, saliency detection, and edge detection. Surface normal prediction and depth estimation used L1 loss. The tasks are weighted equally to ensure a fair comparison. For the Taskonomy benchmark (Zamir et al., 2018), we use the dataloader from the open-access code provided by Chen et al. (2023), while maintaining experimental settings identical to those used for NYUD-v2 and PASCAL-Context. We use the same experimental setup for the other hyperparameters as in previous works (Ye & Xu, 2022a;c), as detailed in Table C.1.
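For reference, the reported hyperparameters can be collected into a single config sketch. The dictionary keys below are illustrative names chosen for readability, not the authors' actual configuration fields; the values come from Table C.1 as quoted above.

```python
# Hyperparameters reported in Table C.1 (key names are hypothetical).
HPARAMS = {
    "optimizer": "Adam",                 # Kingma & Ba (2014)
    "scheduler": "polynomial_decay",
    "batch_size": 8,
    "num_iterations": 40_000,
    "backbone": "ViT",                   # pre-trained on ImageNet-22K
    "learning_rate": 2e-5,
    "weight_decay": 1e-6,
    "affinity_decay_beta": 1e-3,         # β in the affinity tracking update
}
```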