Exploring Vision Semantic Prompt for Efficient Point Cloud Understanding
Authors: Yixin Zha, Chuxin Wang, Wenfei Yang, Tianzhu Zhang, Feng Wu
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments conducted on datasets including ScanObjectNN, ModelNet40, and ShapeNetPart demonstrate the effectiveness of our proposed paradigm. In particular, our method achieves 95.6% accuracy on ModelNet40 and attains 90.09% performance on the most challenging classification split, ScanObjectNN (PB-T50-RS). |
| Researcher Affiliation | Academia | 1University of Science and Technology of China, Hefei, China 2Deep Space Exploration Lab, Hefei, China. Correspondence to: Yixin Zha <EMAIL>. |
| Pseudocode | No | The paper describes processes using mathematical formulas and architectural diagrams (e.g., Figure 2, Figure 3) but does not contain a clearly labeled pseudocode or algorithm block. |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code, nor does it include a link to a code repository. |
| Open Datasets | Yes | Extensive experiments conducted on datasets including ScanObjectNN, ModelNet40, and ShapeNetPart demonstrate the effectiveness of our proposed paradigm. ... In this section, we first present the implementation details in Sec. 4.1. After that, in Sec. 4.2, to demonstrate the effectiveness of the proposed paradigm, we evaluate its performance using four combinations of 2D and 3D pre-trained models on four downstream tasks, including synthetic object classification, real-world object classification, part segmentation and few-shot learning. |
| Dataset Splits | Yes | Real-World Shape Classification: ScanObjectNN (Uy et al., 2019) is one of the most challenging 3D datasets, which covers 15K real-world objects from 15 categories. We report classification results of three variants. ... Synthetic Shape Classification: In addition to the experiments conducted on a real-world dataset, we perform experiments on a synthetic dataset, ModelNet40 (Wu et al., 2015), which consists of 12,311 clean 3D CAD models, covering 40 object categories. ... Part Segmentation: We conduct part segmentation experiments on the challenging ShapeNetPart (Yi et al., 2016) dataset, which comprises 16,880 models with 16 different shape categories and 50 part labels. |
| Hardware Specification | Yes | All experiments are conducted on a single GeForce TRX 3090. ... Table 5. Training recipes for Parameter-Efficient Transfer Learning. ... GPU device: GTX 3090 |
| Software Dependencies | No | The paper mentions 'AdamW' as the optimizer and 'cosine' for the learning rate scheduler, but does not specify version numbers for any software frameworks or libraries like Python, PyTorch, or TensorFlow. |
| Experiment Setup | Yes | Performing fine-tuning on ScanObjectNN (Uy et al., 2019) as an example, the overall training includes 300 epochs, with a cosine learning rate (Loshchilov & Hutter, 2016) of 5e-4, and a 10-epoch warm-up period. We adopt AdamW (Loshchilov, 2017) as the optimizer. Besides, we show the BN rank of our proposed Hybrid Attention Adapter (HAA) and the rank of α and β generator, which is the dimension of the feature passed through downward projection. We also provide relevant setup of 3D-to-2D projection and 2D pretrained models, such as the resolution of 2D depth maps and the image patch size of 2D transformers. ... Table 5. Training recipes for Parameter-Efficient Transfer Learning. Config (ScanObjectNN) ... learning rate: 2e-5; weight decay: 5e-2; ... training epochs: 300; warmup epochs: 10; batch size: 32; drop path rate: 0.2; Generator rank: 16; q rank of HAA: 18; k rank of HAA: 18; BN-v rank of HAA: 64; image resolution: 224; image patch size: 16/14; number of points: 2048; number of point patches: 128; point patch size: 32 |
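
The Experiment Setup row quotes a cosine learning rate schedule with a 10-epoch warm-up over 300 epochs. The sketch below shows one common way such a schedule can be computed per epoch. It is an illustration only: the linear warm-up shape, the decay floor of 0, and the function name `lr_at_epoch` are assumptions, since the quoted text does not specify them. The base rate is left as a parameter because the quoted prose gives 5e-4 while the quoted Table 5 entry gives 2e-5.

```python
import math


def lr_at_epoch(epoch, base_lr=5e-4, warmup_epochs=10, total_epochs=300, min_lr=0.0):
    """Learning rate for a 0-indexed epoch.

    Assumed shape (not confirmed by the quoted text): linear warm-up to
    base_lr, then cosine decay from base_lr down to min_lr.
    """
    if epoch < warmup_epochs:
        # Linear ramp that reaches base_lr on the last warm-up epoch.
        return base_lr * (epoch + 1) / warmup_epochs
    # Fraction of the post-warm-up phase that has elapsed, in [0, 1).
    progress = (epoch - warmup_epochs) / max(1, total_epochs - warmup_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Print the rate at a few representative epochs.
    for e in (0, 9, 10, 155, 299):
        print(e, lr_at_epoch(e))
```

Passing `base_lr=2e-5` instead reproduces the same shape with the Table 5 value.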