MamBEV: Enabling State Space Models to Learn Birds-Eye-View Representations

Authors: Hongyu Ke, Jack Morris, Kentaro Oguchi, Xiaofei Cao, Yongkang Liu, Haoxin Wang, Yi Ding

ICLR 2025

Reproducibility Variable — Result — LLM Response
Research Type — Experimental — Extensive experiments demonstrate MamBEV's promising performance across diverse visual perception metrics, highlighting its advantages in input scaling efficiency compared to existing benchmark models. A thorough set of ablation studies is provided to showcase model scaling and other properties. We open-source our code and provide a strong baseline and evaluation framework for future experimentation.
Researcher Affiliation — Collaboration — ¹Georgia State University; ²InfoTech Labs, Toyota Motor North America R&D
Pseudocode — Yes — A.3 ALGORITHMS: The pseudocode of our proposed Spatial Cross Mamba is shown in Algorithm 1. The details of the Cross Quasi-Separable State Space Model (XQSSM) are shown in Algorithm 2.
Open Source Code — Yes — The code is available at https://github.com/amai-gsu/MamBEV. We open-source our code and provide a strong baseline and evaluation framework for future experimentation.
Open Datasets — Yes — We conduct our experiments using the nuScenes dataset (Caesar et al., 2020). The nuScenes dataset is a large-scale autonomous driving dataset containing 1000 driving scenes from Boston and Singapore.
Dataset Splits — No — The paper mentions using the nuScenes dataset but does not explicitly describe how the data was split into training, validation, or test sets for the experiments (e.g., specific percentages, counts, or a reference to the predefined splits used by the authors).
Hardware Specification — Yes — We trained with an effective batch size of 32 with no gradient accumulation on 8 A100s for 30 epochs, truncated at 24 epochs. The FPS is the average number of samples per second processed by the model in evaluation mode on an RTX 4090 GPU.
Software Dependencies — No — The paper mentions using an AdamW optimizer and an automatic mixed-precision optimizer wrapper, but does not provide specific version numbers for any software libraries, frameworks, or programming languages used.
Experiment Setup — Yes — We used a learning rate of 8×10⁻⁴, with a linear warmup for 10% of the scheduled steps starting from (8/3)×10⁻⁴. Following the warmup, the learning rate follows an epoch-based cosine annealing schedule with a minimum learning rate of 8×10⁻⁷. We trained with an effective batch size of 32 with no gradient accumulation on 8 A100s for 30 epochs, truncated at 24 epochs. Starting from step 100, an exponential moving average w̄_t = (1 − 0.0002)·w̄_{t−1} + 0.0002·w_t is applied to all weights. An AdamW optimizer with a 0.01 weight decay is used, and training employs an automatic mixed-precision optimizer wrapper with an initial gradient scaling of 512. A 0.1 multiplier is applied to the learning rate of the backbone weights and the deformable attention sampling offsets (Zhu et al., 2020).
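The learning-rate schedule and weight EMA quoted above can be sketched as standalone functions. This is a minimal sketch under stated assumptions: the cosine phase is written per-step rather than strictly per-epoch, the warmup start is read as (8/3)×10⁻⁴, and the names `lr_at` and `ema_update` are illustrative, not from the paper's codebase.

```python
import math

BASE_LR = 8e-4          # peak learning rate (quoted: 8 x 10^-4)
WARMUP_START = 8e-4 / 3 # warmup start, read as (8/3) x 10^-4
MIN_LR = 8e-7           # cosine-annealing floor (quoted: 8 x 10^-7)
EMA_DECAY = 0.0002      # EMA coefficient, applied from step 100 onward

def lr_at(step, total_steps, warmup_frac=0.1):
    """Linear warmup over the first 10% of steps, then cosine annealing."""
    warmup_steps = int(total_steps * warmup_frac)
    if step < warmup_steps:
        # Linear ramp from WARMUP_START up to BASE_LR.
        t = step / max(warmup_steps, 1)
        return WARMUP_START + t * (BASE_LR - WARMUP_START)
    # Cosine decay from BASE_LR down to MIN_LR over the remaining steps.
    t = (step - warmup_steps) / max(total_steps - warmup_steps, 1)
    return MIN_LR + 0.5 * (BASE_LR - MIN_LR) * (1 + math.cos(math.pi * t))

def ema_update(ema_w, w):
    """One EMA step: w_ema <- (1 - 0.0002) * w_ema + 0.0002 * w."""
    return (1 - EMA_DECAY) * ema_w + EMA_DECAY * w
```

In a real training loop these would correspond to a warmup-plus-cosine scheduler attached to the AdamW optimizer and an EMA hook over the model parameters, with the 0.1 multiplier realized as a separate parameter group for the backbone and sampling-offset weights.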