All-in-One: Transferring Vision Foundation Models into Stereo Matching
Authors: Jingyi Zhou, Haoyu Zhang, Jiakang Yuan, Peng Ye, Tao Chen, Hao Jiang, Meiya Chen, Yangyang Zhang
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that AIO-Stereo achieves state-of-the-art performance on multiple datasets, ranks 1st on the Middlebury benchmark, and outperforms all published work on the ETH3D benchmark. |
| Researcher Affiliation | Collaboration | Jingyi Zhou¹*, Haoyu Zhang¹*, Jiakang Yuan¹*, Peng Ye¹˒³˒⁴, Tao Chen¹, Hao Jiang², Meiya Chen², Yangyang Zhang². ¹School of Information Science and Technology, Fudan University, China; ²Xiaomi Inc., Beijing, China; ³Shanghai AI Laboratory, Shanghai, China; ⁴The Chinese University of Hong Kong |
| Pseudocode | No | The paper describes the methodology in the 'Method' section, but it does not include any explicitly labeled pseudocode or algorithm blocks. The steps are explained in narrative form. |
| Open Source Code | No | The paper does not provide an explicit statement about the availability of source code, nor does it include any links to code repositories or mention code in supplementary materials. |
| Open Datasets | Yes | Following Selective-Stereo (Wang et al. 2024), we verify the effectiveness of AIO-Stereo on four widely used datasets including Scene Flow (Mayer et al. 2016), Middlebury 2014 (Scharstein et al. 2014), KITTI-2015 (Menze and Geiger 2015) and ETH3D (Schops et al. 2017). ... For the Middlebury dataset, following (Wang et al. 2024), we first finetune our pre-trained model for 200k steps on the mixed Tartan Air (Wang et al. 2020), CREStereo Dataset (Li et al. 2022), Scene Flow, Falling Things (Tremblay, To, and Birchfield 2018), InStereo2k (Bao et al. 2020), CARLA HR-VS (Yang et al. 2019), and Middlebury datasets... |
| Dataset Splits | Yes | Scene Flow (Mayer et al. 2016) contains more than 39,000 synthetic stereo frames which are divided into training and testing sets. Middlebury 2014 (Scharstein et al. 2014) provides a training set with images of 23 indoor scenes and a testing set with images of 10 indoor scenes, and both sets are available at three resolutions. KITTI-2015 (Menze and Geiger 2015) contains 200 training pairs and 200 testing pairs with sparse disparity maps collected in real-world driving scenes. |
| Hardware Specification | Yes | We implement our AIO-Stereo with the PyTorch framework and perform our experiments on NVIDIA A100 GPUs using the AdamW optimizer. |
| Software Dependencies | No | The paper mentions the 'PyTorch framework' for implementation but does not specify its version or any other software dependencies with their respective version numbers. |
| Experiment Setup | Yes | We implement our AIO-Stereo with the PyTorch framework and perform our experiments on NVIDIA A100 GPUs using the AdamW optimizer. For pre-training, we trained our model on the augmented Scene Flow training set (i.e., both cleanpass and finalpass) for 200k steps with a batch size of 8 and a random crop size of 320 × 720. We use a one-cycle learning rate schedule with a warm-up strategy: the learning rate gradually increases to 0.0002 over the first 1% of steps and gradually decreases thereafter. For finetuning, the learning rate linearly decays from 0.0003 to 0. |
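Since the paper releases no code, the learning-rate schedules described in the setup row can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a simple linear-warmup/linear-decay variant of the one-cycle schedule (warm up to 2e-4 over the first 1% of 200k steps, then decay toward 0) and a linear finetuning decay from 3e-4 to 0; the function names are hypothetical.

```python
# Sketch of the schedules reported in the paper (assumptions: linear warm-up
# and linear decay; the paper does not specify the exact curve shapes).

TOTAL_STEPS = 200_000   # pre-training steps reported in the paper
PEAK_LR = 2e-4          # peak learning rate reached after warm-up
WARMUP_FRAC = 0.01      # "first 1% of steps"

def one_cycle_lr(step, total_steps=TOTAL_STEPS, peak=PEAK_LR,
                 warmup_frac=WARMUP_FRAC):
    """Pre-training: linear warm-up to `peak`, then linear decay to 0."""
    warmup_steps = int(total_steps * warmup_frac)
    if step < warmup_steps:
        return peak * step / warmup_steps
    return peak * (total_steps - step) / (total_steps - warmup_steps)

def finetune_lr(step, total_steps, start_lr=3e-4):
    """Finetuning: learning rate linearly decays from 3e-4 to 0."""
    return start_lr * (1.0 - step / total_steps)
```

In an actual PyTorch training loop, the same shape could be obtained with `torch.optim.lr_scheduler.OneCycleLR` (e.g. `max_lr=2e-4`, `total_steps=200_000`, `pct_start=0.01`) wrapped around an `AdamW` optimizer, matching the optimizer named in the paper.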