Zero-Shot Learning in Industrial Scenarios: New Large-Scale Benchmark, Challenges and Baseline
Authors: Zekai Zhang, Qinghui Chen, Maomao Xiong, Shijiao Ding, Zhanzhi Su, Xinjie Yao, Yiming Sun, Cong Bai, Jinglin Zhang
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, ablation studies and comparative experiments are performed to demonstrate the effectiveness of RTVP. Dataset and Evaluation Metrics: This paper conducts experiments on the MMIO-80K, MSCOCO, and LVIS datasets. For the MMIO zero-shot task, this paper splits MMIO into 65 base classes for training and 35 novel classes for testing. For the MMIO closed task, this paper splits MMIO into training and test sets at a ratio of 80% to 20%. To evaluate generalization, this paper performs closed-scene validation on COCO and zero-shot validation on LVIS. This paper uses the COCO and LVIS metrics to measure the model's accuracy. |
| Researcher Affiliation | Academia | Zekai Zhang (1), Qinghui Chen (1), Maomao Xiong (1), Shijiao Ding (1), Zhanzhi Su (2), Xinjie Yao (3), Yiming Sun (4), Cong Bai (5), Jinglin Zhang (1, *). (1) School of Control Science and Engineering, Shandong University, Jinan, China; (2) National Supercomputing Center in Jinan, Qilu University of Technology, Jinan, China; (3) College of Intelligence and Computing, Tianjin University, Tianjin, China; (4) School of Automation, Southeast University, Nanjing, China; (5) College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, China |
| Pseudocode | No | The paper describes its methodology using mathematical equations and descriptive text, but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | MMIO-80K: To the best of our knowledge, this paper constructed the first object detection dataset, MMIO-80K, for industrial open scenarios. MMIO-80K consists of more than 80K samples, effectively alleviating the lack of domain expertise in industrial open scenarios. Extended version: https://github.com/hellozzk/MMIO |
| Open Datasets | Yes | MMIO-80K: To the best of our knowledge, this paper constructed the first object detection dataset, MMIO-80K, for industrial open scenarios. MMIO-80K consists of more than 80K samples, effectively alleviating the lack of domain expertise in industrial open scenarios. Extended version: https://github.com/hellozzk/MMIO |
| Dataset Splits | Yes | MMIO is divided into a visible-class training set and an invisible-class test set for zero-shot tasks. The visible classes contain 18,811 images and annotations, and the invisible classes contain 3,025 images and annotations. This paper also provides a training/test split for closed-scenario tasks, with a training-to-test ratio of 80% to 20%. |
| Hardware Specification | Yes | The experimental model is built on PyTorch 2.0.1, and the hardware environment is 4 Nvidia RTX 4090 GPUs. |
| Software Dependencies | Yes | The experimental model is built on PyTorch 2.0.1, and the hardware environment is 4 Nvidia RTX 4090 GPUs. |
| Experiment Setup | Yes | The model is trained for 200 epochs using AdamW with a batch size of 32. The input image size is 640. The initial learning rate is 2e-3, the weight decay is 0.025, and the text encoder (CLIP-Text Encoder) and Mobile-SAM-T are frozen during pre-training. |
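
The two split protocols quoted in the table (65 base and 35 novel classes for the zero-shot task, and an 80%/20% training/test split for the closed task) can be illustrated with a minimal Python sketch. This is not the authors' code: the class names, the seeded random shuffling, the annotation format, and the helper names `zero_shot_class_split`, `closed_set_split`, and `filter_annotations` are all assumptions made for illustration. The paper's actual class assignment is presumably fixed in the released dataset rather than drawn at random.

```python
import random


def zero_shot_class_split(class_names, num_base=65, seed=0):
    """Partition class names into disjoint base (seen) and novel (unseen) sets.

    Assumption: classes are assigned by a seeded shuffle; the real MMIO
    split is defined by the dataset release, not by this function.
    """
    rng = random.Random(seed)
    shuffled = list(class_names)
    rng.shuffle(shuffled)
    return sorted(shuffled[:num_base]), sorted(shuffled[num_base:])


def closed_set_split(image_ids, train_ratio=0.8, seed=0):
    """Split image ids into training and test subsets at the given ratio."""
    rng = random.Random(seed)
    shuffled = list(image_ids)
    rng.shuffle(shuffled)
    cut = int(round(len(shuffled) * train_ratio))
    return shuffled[:cut], shuffled[cut:]


def filter_annotations(annotations, allowed_classes):
    """Keep only annotations whose category belongs to the allowed classes."""
    allowed = set(allowed_classes)
    return [a for a in annotations if a["category"] in allowed]


# Hypothetical placeholder class names: 65 + 35 = 100 classes in total.
classes = [f"class_{i:03d}" for i in range(100)]
base, novel = zero_shot_class_split(classes)

# Hypothetical image ids for the closed-scenario task.
train_ids, test_ids = closed_set_split(range(1000))

print(len(base), len(novel), len(train_ids), len(test_ids))
```

In the zero-shot setting, a helper such as `filter_annotations` would be applied with `base` when building the training set and with `novel` when building the test set, so that no novel-class box is seen during training.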