Multimodal Knowledge Retrieval-Augmented Iterative Alignment for Satellite Commonsense Conversation
Authors: Qian Li, Xuchen Li, Zongyu Chang, Yuzheng Zhang, Cheng Ji, Shangguang Wang
IJCAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate that Sat-RIA outperforms existing large language models and provides more comprehensible answers with fewer hallucinations. The paper includes a dedicated section '5 Experiments' with subsections '5.1 Evaluation Datasets', '5.2 Evaluation Metrics', '5.3 Comparison Methods', and '5.5 Main Results' which features a performance comparison table. |
| Researcher Affiliation | Academia | ¹School of Computer Science, Beijing University of Posts and Telecommunications, China; ²Institute of Automation, Chinese Academy of Sciences and Zhongguancun Academy, China; ³SKLCCSE, School of Computer Science and Engineering, Beihang University, China |
| Pseudocode | No | The paper describes methods through text and mathematical formulas (e.g., equations 1-7) but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing the source code for the methodology, nor does it provide a link to a code repository. |
| Open Datasets | No | To evaluate our models on satellite commonsense conversation, we construct two datasets: one for satellite multi-turn dialogues (Sat Diag) and one for satellite visual question-answering (Sat VQA) (more details in Appendix C). The paper does not provide concrete access information (link, DOI, repository, or external citation) for these constructed datasets in the main body. |
| Dataset Splits | No | The paper describes the size and content of the constructed datasets (e.g., 'The Sat Diag dataset includes 2,000 dialogues', 'The Sat VQA dataset consists of 2,000 labeled examples') but does not specify any training, validation, or test splits, nor does it mention cross-validation or specific splitting methodologies. |
| Hardware Specification | Yes | We have trained our model through the method of full parameter fine-tuning, using a 2× A800 80G machine, and all experiments were conducted on the same machine. |
| Software Dependencies | No | The paper mentions the 'PyTorch framework' and specific LLM models such as 'InternVL2-8B' and 'LLaMA3-8B', but it does not provide specific version numbers for PyTorch or any other ancillary software libraries or tools. |
| Experiment Setup | Yes | We use a total batch size of 1 throughout the training process. The AdamW [Loshchilov and Hutter, 2019] optimizer is applied with a cosine learning rate decay and a warm-up period. In the training stage, each alignment stage runs for one epoch with a learning rate of 1×10⁻⁵ and a warmup ratio of 0.05. |
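The schedule quoted in the Experiment Setup row (AdamW with cosine decay, a warm-up ratio of 0.05, and a peak learning rate of 1×10⁻⁵) can be sketched as a standalone step-to-LR function. This is a minimal illustrative sketch, not the paper's code: the function name and the minimum-LR floor of 0 are assumptions.

```python
import math

def lr_at_step(step, total_steps, base_lr=1e-5, warmup_ratio=0.05, min_lr=0.0):
    """Linear warm-up followed by cosine decay, as in the reported setup.

    Hyperparameters mirror the quoted configuration (peak LR 1e-5,
    warm-up ratio 0.05); the min_lr floor is an assumed default.
    """
    warmup_steps = max(1, int(total_steps * warmup_ratio))
    if step < warmup_steps:
        # Linear warm-up from ~0 up to base_lr over the first 5% of steps.
        return base_lr * (step + 1) / warmup_steps
    # Cosine decay from base_lr down to min_lr over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))
```

In a PyTorch training loop this shape is typically obtained with `torch.optim.AdamW` plus a scheduler such as `get_cosine_schedule_with_warmup`; the pure-Python version above just makes the schedule's arithmetic explicit.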