Advancing Generalization Across a Variety of Abstract Visual Reasoning Tasks
Authors: Mikołaj Małkiński, Jacek Mańdziuk
IJCAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experiments demonstrate strong generalization capabilities of the proposed model, which in several settings outperforms the existing literature methods. |
| Researcher Affiliation | Academia | 1Warsaw University of Technology, Warsaw, Poland 2AGH University of Krakow, Krakow, Poland EMAIL, EMAIL |
| Pseudocode | No | The paper describes the model architecture (Fig. 5) and its components in detail but does not present any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code for reproducing all experiments is publicly accessible at: https://github.com/mikomel/raven |
| Open Datasets | Yes | We utilize four RPM datasets: PGM [Barrett et al., 2018], I-RAVEN [Hu et al., 2021], I-RAVEN-Mesh and A-I-RAVEN [Małkiński and Mańdziuk, 2025a]. We extend the evaluation of PoNG beyond RPMs, to two benchmarks comprising visual analogies with both synthetic [Hill et al., 2019] and real-world [Bitton et al., 2023] images. |
| Dataset Splits | Yes | PGM: Each regime contains 1.42M RPMs, where 1.2M, 20K, and 200K belong to the train, validation, and test splits, resp. I-RAVEN: The benchmark consists of 10K matrices per configuration, totaling 70K matrices, split into the train, validation, and test with a 60/20/20 ratio. VAP: Each regime contains 710K matrices, with 600K, 10K, and 100K devoted to the train, validation, and test splits, resp. VASR: We use Silver data for training, which includes 150K, 2.25K, and 2.55K matrices in the train, validation, and test splits, resp. |
| Hardware Specification | Yes | All experiments are performed on a single GPU (NVIDIA DGX A100). |
| Software Dependencies | No | The paper mentions the Adam optimizer, Vision Transformer (ViT), and TCN, but does not specify any software versions for these or other libraries. |
| Experiment Setup | Yes | PoNG is trained using a standard training strategy involving the Adam optimizer [Kingma and Ba, 2014] with default hyperparameters (λ = 0.001, β1 = 0.9, β2 = 0.999, ϵ = 10⁻⁸). Learning rate λ is reduced by a factor of 10 after 5 epochs without improvement in validation loss. Early stopping is applied after 10 epochs without validation loss reduction. We use batch size B = 128 for experiments on RAVEN-like datasets and B = 256 in the remaining cases to reduce training time on large datasets. |
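The learning-rate schedule and early-stopping rule quoted above can be sketched as a small framework-agnostic controller. This is a hypothetical helper (`PlateauController` and its fields are assumptions, not the authors' code; their actual implementation lives in the linked repository), assuming the plateau counter restarts after each reduction, in the style of common reduce-on-plateau schedulers:

```python
class PlateauController:
    """Reduce the learning rate by 10x after `lr_patience` epochs without
    validation-loss improvement; signal early stopping after `stop_patience`
    consecutive epochs without improvement."""

    def __init__(self, lr=1e-3, lr_patience=5, stop_patience=10):
        self.lr = lr
        self.lr_patience = lr_patience
        self.stop_patience = stop_patience
        self.best = float("inf")
        self.bad_epochs = 0  # epochs since the last improvement

    def step(self, val_loss):
        """Call once per epoch with the validation loss.
        Returns True when training should stop early."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
            return False
        self.bad_epochs += 1
        if self.bad_epochs % self.lr_patience == 0:
            self.lr /= 10.0  # reduce LR on plateau
        return self.bad_epochs >= self.stop_patience
```

In a training loop one would call `controller.step(val_loss)` at the end of every epoch and break out of the loop when it returns `True`, feeding `controller.lr` back into the optimizer.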