Learning 2D Invariant Affordance Knowledge for 3D Affordance Grounding
Authors: Xianqiang Gao, Pingrui Zhang, Delin Qu, Dong Wang, Zhigang Wang, Yan Ding, Bin Zhao
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments are conducted on the MIPA dataset and our method outperforms previous state-of-the-art methods. Specifically, MIFAG achieves scores of 85.10 in AUC and 20.50 in aIOU, outperforming the second-best methods, LASO (Li et al. 2024c) and IAGNet (Yang et al. 2023), with improvements of +1.97 in AUC and +2.58 in aIOU. |
| Researcher Affiliation | Academia | 1University of Science and Technology of China 2Shanghai AI Laboratory 3Fudan University 4Northwestern Polytechnical University |
| Pseudocode | No | The paper describes the methodology in narrative text and block diagrams (e.g., Figure 2) but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code https://github.com/goxq/MIFAG-code |
| Open Datasets | Yes | To address these challenges, we constructed the Multi-Image and Point Affordance (MIPA) Dataset, which comprises paired multi-images and point clouds. We leverage point clouds and affordance annotations from 3D AffordanceNet (Deng et al. 2021), while gathering paired multiple images from IAGNet (Yang et al. 2023), HICO (Chao et al. 2015) and AGD20K (Luo et al. 2022). |
| Dataset Splits | No | In addition, we conducted our training and evaluation under the seen and unseen settings following previous works (Yang et al. 2023; Li et al. 2024c). The seen setting shares identical object categories in training and evaluation, whereas the unseen setting utilizes different splits of categories. The paper does not provide specific percentages or counts for these splits for the MIPA dataset itself. |
| Hardware Specification | Yes | We train the MIFAG model on a single NVIDIA A100 GPU with a batch size of 64, using the Adam optimizer with a learning rate of 4e-5. |
| Software Dependencies | No | The paper mentions using 'PointNet++' and 'ResNet18' as backbones, but does not specify version numbers for these or other software dependencies. |
| Experiment Setup | Yes | We train the MIFAG model on a single NVIDIA A100 GPU with a batch size of 64, using the Adam optimizer with a learning rate of 4e-5. |