Commander-GPT: Dividing and Routing for Multimodal Sarcasm Detection
Authors: Yazhou Zhang, Chunwang Zou, Bo Wang, Jing Qin, Prayag Tiwari
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate Commander-GPT on the MMSD and MMSD 2.0 benchmarks, comparing five prompting strategies. Experimental results show that our framework achieves 4.4% and 8.5% improvements in F1 score over state-of-the-art (SoTA) baselines on average, demonstrating its effectiveness. |
| Researcher Affiliation | Academia | Yazhou Zhang EMAIL School of Computer Science and Technology, Tianjin University Chunwang Zou EMAIL Software Engineering College, Zhengzhou University of Light Industry Bo Wang EMAIL School of Computer Science and Technology, Tianjin University Jing Qin* EMAIL Center for Smart Health, School of Nursing, The Hong Kong Polytechnic University Prayag Tiwari EMAIL School of Information Technology, Halmstad University |
| Pseudocode | Yes | A.2 Algorithm The algorithm is shown in Alg. 1. Algorithm 1 Commander-GPT: Modular Multimodal Sarcasm Understanding |
| Open Source Code | No | The paper does not provide an explicit statement about releasing the source code for the described methodology, nor does it provide a direct link to a code repository. |
| Open Datasets | Yes | In this section, we conduct comprehensive experiments on two widely-used multimodal sarcasm detection benchmarks, MMSD (Cai et al., 2019) and MMSD 2.0 (Qin et al., 2023). |
| Dataset Splits | Yes | Table 1: Statistics of the MMSD and MMSD 2.0 datasets. MMSD — Train: 19,816, Validation: 2,410, Test: 2,409, Sarcastic: 10,560, Non-sarcastic: 14,075, Source: Twitter. MMSD 2.0 — Train: 19,816, Validation: 2,410, Test: 2,409, Sarcastic: 11,651, Non-sarcastic: 12,980, Source: Twitter. |
| Hardware Specification | Yes | All experiments were conducted on a server equipped with two NVIDIA RTX 4090 GPUs and 256GB RAM. |
| Software Dependencies | No | Commander-GPT was implemented using PyTorch, Hugging Face Transformers, and the OpenMMLab toolkit. The paper mentions software tools but does not provide specific version numbers for them. |
| Experiment Setup | Yes | For supervised components (e.g., the BERT-based commander and the routing classifier), we fine-tuned for 10 epochs (approximately 12 hours in total). We used the AdamW optimizer with an initial learning rate of 2×10⁻⁵, batch size of 64, maximum sequence length of 512, and weight decay of 0.01. A linear warm-up and decay scheduler was applied. Early stopping was triggered if the validation F1 score did not improve for 3 consecutive epochs. |
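The early-stopping rule reported in the setup above (halt if validation F1 fails to improve for 3 consecutive epochs) can be sketched as a small stdlib-only helper. This is an illustrative sketch, not the authors' code; the class name and API are assumptions.

```python
class EarlyStopper:
    """Stop training when a monitored metric fails to improve for `patience` epochs."""

    def __init__(self, patience: int = 3):
        self.patience = patience          # consecutive non-improving epochs allowed
        self.best = float("-inf")         # best validation F1 seen so far
        self.bad_epochs = 0               # epochs since the last improvement

    def step(self, val_f1: float) -> bool:
        """Record one epoch's validation F1; return True if training should stop."""
        if val_f1 > self.best:
            self.best = val_f1
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```

In a training loop one would call `stopper.step(f1)` once per epoch after evaluation and break when it returns `True`, matching the patience-3 criterion the paper describes.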