Improving Diffusion Models for Scene Text Editing with Dual Encoders

Authors: Jiabao Ji, Guanhua Zhang, Zhaowen Wang, Bairu Hou, Zhifei Zhang, Brian L. Price, Shiyu Chang

TMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our approach on five datasets and demonstrate its superior performance in terms of text correctness, image naturalness, and style controllability. Our code is publicly available at https://github.com/UCSB-NLP-Chang/DiffSTE.
Researcher Affiliation | Collaboration | University of California, Santa Barbara; Adobe Research
Pseudocode | No | The paper describes the methodology, dual-encoder design, and instruction-tuning framework in prose, but it does not contain a formally labeled pseudocode or algorithm block.
Open Source Code | Yes | Our code is publicly available at https://github.com/UCSB-NLP-Chang/DiffSTE.
Open Datasets | Yes | As described in Section 3.2, we collect 1.3M examples by combining the synthetic dataset (Synthetic) and three real-world datasets (ArT Chng et al. (2019), COCOText Gomez et al. (2017), and TextOCR Singh et al. (2021)) for instruction tuning. For the Synthetic dataset, we randomly pick 100 font families from the Google Fonts library and 954 XKCD colors for text rendering.
Dataset Splits | Yes | We randomly select 200 images from each dataset for validation and 1000 images for testing.
Hardware Specification | Yes | In total, the training has 80k steps, which requires approximately two days of training time using eight NVIDIA V100 GPUs.
Software Dependencies | No | The paper mentions software like "diffusers" and specific "stable-diffusion-inpainting" models, but it does not provide explicit version numbers for these components.
Experiment Setup | Yes | The batch size is set to 256. We use the AdamW optimizer with a fixed learning rate of 5e-5 to train the model for 15 epochs.
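As a quick consistency check of the quoted training setup, the reported figures (1.3M examples, batch size 256, 15 epochs) can be cross-checked against the stated 80k total steps. This is a minimal arithmetic sketch using only numbers quoted above; the step count is approximate since the paper does not state whether partial batches are dropped.

```python
# Numbers taken from the quoted paper text above.
DATASET_SIZE = 1_300_000  # 1.3M instruction-tuning examples
BATCH_SIZE = 256
EPOCHS = 15

# Approximate optimizer steps implied by the setup (dropping the partial batch).
steps_per_epoch = DATASET_SIZE // BATCH_SIZE
total_steps = steps_per_epoch * EPOCHS

print(steps_per_epoch)  # ~5k steps per epoch
print(total_steps)      # ~76k, broadly consistent with the reported 80k
```

The implied ~76k steps is within about 5% of the reported 80k, so the quoted batch size, epoch count, and dataset size are mutually consistent.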