Open-Det: An Efficient Learning Framework for Open-Ended Detection
Authors: Guiping Cao, Tao Wang, Wenjian Huang, Xiangyuan Lan, Jianguo Zhang, Dongmei Jiang
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4. Experiments, 4.1. Main Results, 4.2. Ablation Studies, 4.3. Improvements in VL Alignment Scores |
| Researcher Affiliation | Academia | 1Research Institute of Trustworthy Autonomous Systems and Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, China. 2Pengcheng Laboratory, Shenzhen, China. 3Pazhou Laboratory (Huangpu). 4Guangdong Provincial Key Laboratory of Brain-inspired Intelligent Computation, Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, China. Correspondence to: Xiangyuan Lan <EMAIL>, Jianguo Zhang <EMAIL>. |
| Pseudocode | No | The paper describes methods in prose and mathematical formulations and includes architectural diagrams (Figure 2, 3, 5), but does not contain any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The source codes are available at: https://github.com/Med-Process/Open-Det. |
| Open Datasets | Yes | Datasets. We train our model with a small set of detection data Visual Genome (VG) (Krishna et al., 2017)... our model is evaluated on the commonly used zero-shot LVIS (Gupta et al., 2019) dataset... The COCO2017 (Lin et al., 2014) and Objects365 (Shao et al., 2019) are also used for performance evaluation. |
| Dataset Splits | Yes | We train our model with a small set of detection data Visual Genome (VG) (Krishna et al., 2017), which contains 77,398 images for training... evaluating on the 5k Mini Val subset of the LVIS (Gupta et al., 2019) dataset. |
| Hardware Specification | Yes | fewer GPU resources (4 V100 vs. 16 A100); our models are trained with a mini-batch size of 8 on 4 Tesla V100 GPUs; Swin-Small (using 4 V100 GPUs) and Swin-Large (using 4 A800 GPUs). |
| Software Dependencies | No | The paper mentions software components like 'Flan-T5-base' and the 'AdamW optimizer' but does not provide specific version numbers for these or for other ancillary software libraries or programming environments. |
| Experiment Setup | Yes | our models are trained with a mini-batch size of 8 on 4 Tesla V100 GPUs, using the AdamW optimizer (Loshchilov, 2017) with a weight decay of 0.05. The learning rates are configured as follows: 1×10⁻⁴ for both the Object Detector and Prompts Distiller, and 2×10⁻⁴ for the Object Name Generator; threshold τ (default: 0.99), α (default: 0.25). |
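The reported training setup can be expressed as an AdamW-style configuration with per-module learning rates. The sketch below is illustrative only: the module names (`object_detector`, `prompts_distiller`, `object_name_generator`) are assumptions, not identifiers from the Open-Det codebase; the numeric values (batch size 8, weight decay 0.05, lr 1e-4 / 2e-4, τ = 0.99, α = 0.25) come from the paper's quoted setup.

```python
def build_param_groups():
    """AdamW-style parameter groups mirroring the reported setup:
    lr 1e-4 for the Object Detector and Prompts Distiller,
    lr 2e-4 for the Object Name Generator, weight decay 0.05.
    Group names are hypothetical placeholders."""
    base = {"weight_decay": 0.05}
    return [
        {"name": "object_detector", "lr": 1e-4, **base},
        {"name": "prompts_distiller", "lr": 1e-4, **base},
        {"name": "object_name_generator", "lr": 2e-4, **base},
    ]

# Remaining hyperparameters quoted in the review table.
TRAIN_CFG = {
    "batch_size": 8,        # mini-batch size of 8 on 4 Tesla V100 GPUs
    "threshold_tau": 0.99,  # default τ
    "alpha": 0.25,          # default α
    "param_groups": build_param_groups(),
}
```

In a real PyTorch run, a list of this shape (with `params` tensors added to each group) would be passed directly to `torch.optim.AdamW`.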