Top-Down Guidance for Learning Object-Centric Representations

Authors: Junhong Zou, Xiangyu Zhu, Zhaoxiang Zhang, Zhen Lei

IJCAI 2025

Reproducibility Variable | Result | LLM Response

Research Type | Experimental | We evaluate TDGNet and compare it with current SOTA models on multiple tasks. We first introduce CLEVRTex [Karazija et al., 2021], MOVi-C [Greff et al., 2022] and COCO to evaluate the object-centric representations, where TDGNet outperforms current SOTA models in terms of common object discovery metrics. Furthermore, we expand the downstream task scope of TDGNet by applying it to the field of robotics. We introduce RoboNet [Dasari et al., 2020] and VP2 [Tian et al., 2023] to evaluate TDGNet on downstream tasks including video prediction and visual planning, demonstrating that TDGNet adapts well to these tasks.

Researcher Affiliation | Academia | (1) MAIS, Institute of Automation, Chinese Academy of Sciences, Beijing, China; (2) School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China; (3) CAIR, HKSIS, Chinese Academy of Sciences, Hong Kong, China; (4) School of Computer Science and Engineering, Faculty of Innovation Engineering, M.U.S.T., Macau, China

Pseudocode | No | The paper describes the methodology using text and diagrams (Figure 2), but does not contain any clearly labeled pseudocode or algorithm blocks.

Open Source Code | No | Code will be available at https://github.com/zoujunhong/RHGNet.

Open Datasets | Yes | We first introduce CLEVRTex [Karazija et al., 2021], MOVi-C [Greff et al., 2022] and COCO [Caesar et al., 2018]. ... We introduce RoboNet [Dasari et al., 2020] and VP2 [Tian et al., 2023]...

Dataset Splits | Yes | The model's generalization ability is also evaluated using CLEVRTex-OOD and -CAMO, two out-of-distribution test sets. ... Following previous works [Song et al., 2024], the model predicts 8 future frames from 6 context frames with no conditions. ... The models are required to predict 10 future frames given 2 context frames and the robotic arm's actions as conditions.

Hardware Specification | No | The paper does not provide any specific hardware details used for running the experiments.

Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers.

Experiment Setup | Yes | Here we use a combination of L1 loss and perceptual loss (LPIPS) [Zhang et al., 2018] for optimization. ... The top-down guidance loss L_TD is formulated as: L_TD := 1 − CosSim(P(F̂), F_H). Overall, TDGNet is trained by L, the weighted sum of the reconstruction loss and the top-down guidance loss: L = L_rec + λ_TD · L_TD. ... We set a threshold th to determine whether a conflict is large or not. ... As for the choice of th, we propose a heuristic method: for each trained model, we calculate the average distance between slots and set th to half of this distance.
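The loss formulation and threshold heuristic quoted in the Experiment Setup row can be sketched in PyTorch. This is a minimal illustration, not the paper's implementation: the function names, the default λ_TD value, and the use of plain L1 for L_rec (the paper additionally adds an LPIPS perceptual term, omitted here) are all assumptions; P(F̂) is taken as an already-projected bottom-up feature passed in by the caller.

```python
import torch
import torch.nn.functional as F


def top_down_guidance_loss(projected_bottom_up, top_down_feat):
    # L_TD := 1 - CosSim(P(F_hat), F_H), averaged over all feature vectors.
    cos = F.cosine_similarity(projected_bottom_up, top_down_feat, dim=-1)
    return (1.0 - cos).mean()


def total_loss(recon, target, projected_bottom_up, top_down_feat, lambda_td=0.1):
    # L = L_rec + lambda_TD * L_TD. Here L_rec is plain L1; the paper also
    # adds an LPIPS perceptual term, omitted for brevity. lambda_td=0.1 is
    # a placeholder, not a value reported in the paper.
    l_rec = F.l1_loss(recon, target)
    l_td = top_down_guidance_loss(projected_bottom_up, top_down_feat)
    return l_rec + lambda_td * l_td


def conflict_threshold(slots):
    # Heuristic from the paper: th = half the average pairwise slot distance.
    # slots: (K, D) tensor of slot vectors for one trained model.
    d = torch.cdist(slots, slots)        # (K, K) pairwise Euclidean distances
    k = slots.shape[0]
    mean_d = d.sum() / (k * (k - 1))     # mean over off-diagonal entries only
    return 0.5 * mean_d
```

For two slots at distance 2, `conflict_threshold` returns 1.0 (half the mean pairwise distance), matching the heuristic described in the quote.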