ReMask-Animate: Refined Character Image Animation Using Mask-Guided Adapters

Authors: Xunzhi Xiang, Haiwei Xue, Zonghong Dai, Di Wang, Minglei Li, Ye Yue, Fei Ma, Weijiang Yu, Heng Chang, Fei Richard Yu

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments show that our method outperforms state-of-the-art methods on five metrics in public datasets. Additionally, qualitative evaluations highlight a significant improvement in the quality of generated videos, demonstrating our approach's superiority.
Researcher Affiliation | Collaboration | 1 Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), Shenzhen, China; 2 01AI, Beijing, China; 3 Tsinghua University, Shenzhen, Guangdong, China; 4 Sun Yat-sen University, Guangzhou, Guangdong, China; 5 Shenzhen University, Shenzhen, Guangdong, China; 6 Carleton University, Canada. EMAIL, EMAIL, EMAIL
Pseudocode | No | The paper describes methods using text and mathematical formulations but does not contain a clearly labeled pseudocode block or algorithm.
Open Source Code | No | The paper states that "Our method ... exclusively utilizes open-source datasets" but provides no concrete access information (a link or an explicit statement of release) for the source code of its own methodology.
Open Datasets | Yes | We propose a Mask-guided Human-Centric framework, ReMask-Animate, which exclusively utilizes open-source datasets to achieve character image animation and significantly enhances the quality of visual generation. Datasets: the TikTok dataset comprises 350 dance videos... In contrast, the Fashion dataset is characterized by a minimalistic, pure white background and limited motion variation...
Dataset Splits | Yes | The Fashion dataset... including 500 training videos and 100 testing videos...
Hardware Specification | Yes | We train our model using 4 NVIDIA A800 GPUs in a two-stage process.
Software Dependencies | No | The paper mentions freezing the CLIP image encoder and VAE but does not provide specific version numbers for any programming languages, libraries, or other software components used in the implementation.
Experiment Setup | Yes | In the initial stage... we randomly center-crop the input images to 768×768, use a batch size of 4, and train the model for 60,000 steps with a learning rate of 0.0001. In the subsequent stage, we randomly center-crop video frames to 512×512, use a batch size of 1, and train for an additional 20,000 steps while maintaining the same learning rate.
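The two-stage schedule reported in the Experiment Setup row can be summarized as a pair of config dictionaries. This is a minimal illustrative sketch: the field names (`crop`, `batch_size`, etc.) and the `total_steps` helper are assumptions for presentation, not taken from the authors' (unreleased) code; only the numeric values come from the paper.

```python
# Hedged sketch of the reported two-stage training schedule.
# Values are from the paper's experiment setup; key names are illustrative.
STAGES = [
    {   # Stage 1: image-level training
        "crop": (768, 768),      # random center-crop of input images
        "batch_size": 4,
        "steps": 60_000,
        "learning_rate": 1e-4,
    },
    {   # Stage 2: video-frame training
        "crop": (512, 512),      # random center-crop of video frames
        "batch_size": 1,
        "steps": 20_000,
        "learning_rate": 1e-4,   # same learning rate as stage 1
    },
]

def total_steps(stages):
    """Total optimizer steps across all stages."""
    return sum(s["steps"] for s in stages)

print(total_steps(STAGES))  # 80000
```

Summing the two stages gives 80,000 optimizer steps overall, with both stages sharing the 1e-4 learning rate.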