Efficient Fine-Grained Guidance for Diffusion Model Based Symbolic Music Generation

Authors: Tingyu Zhu, Haoyu Liu, Ziyu Wang, Zhimin Jiang, Zeyu Zheng

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We provide numerical experiments and subjective evaluation to demonstrate the effectiveness of our approach. We have published a demo page to showcase performances, which enables real-time interactive generation.
Researcher Affiliation | Collaboration | (1) University of California, Berkeley, USA; (2) New York University, New York, USA; (3) Touka Technologies. Correspondence to: Haoyu Liu <EMAIL>, Zeyu Zheng <EMAIL>.
Pseudocode | Yes | Algorithm 1: DDPM sampling with fine-grained harmonic control. Algorithm 2: DDPM sampling with fine-grained textural guidance.
Open Source Code | Yes | The demo page is available at https://huajianduzhuocode.github.io/FGG-diffusion-music/, and the complete source code is released at https://github.com/huajianduzhuocode/FGG-music-code
Open Datasets | Yes | We use the POP909 dataset (Wang et al., 2020a) for training and evaluation. This dataset consists of 909 MIDI pieces of pop songs, each containing lead melodies, chord progression, and piano accompaniment tracks.
Dataset Splits | Yes | We use the POP909 dataset (Wang et al., 2020a) for training and evaluation. We exclude 29 pieces that are in triple meter. 90% of the data are used to train our model, and the remaining 10% are used for evaluation.
Hardware Specification | Yes | It takes 0.4 seconds to generate the 4-measure accompaniment on an NVIDIA RTX 6000 Ada Generation GPU.
Software Dependencies | No | The paper mentions an 'AdamW optimizer' but does not specify software names with version numbers for the libraries or frameworks used (e.g., Python or PyTorch versions).
Experiment Setup | Yes | We set diffusion timesteps T = 1000 with β0 = 8.5e-4 and βT = 1.2e-2. We use the AdamW optimizer with a learning rate of 5e-5, β1 = 0.9, and β2 = 0.999. We applied data augmentation by transposing each 4-measure piece into all 12 keys. ... Training is conducted with a batch size of 16, utilizing random sampling without replacement. ... resulting in a total of 23,642 iterations.
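The dataset-splits row implies the following counts: 909 pieces, minus 29 triple-meter pieces, split 90/10. A minimal sketch of the resulting sizes, assuming the split is taken over the 880 remaining pieces (the report does not quote the exact counts):

```python
# Piece counts implied by the POP909 split described in the report.
total = 909
excluded_triple_meter = 29
remaining = total - excluded_triple_meter      # pieces actually used
n_train = int(remaining * 0.9)                 # 90% for training
n_eval = remaining - n_train                   # remaining 10% for evaluation
print(remaining, n_train, n_eval)              # → 880 792 88
```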
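The experiment-setup row reports a DDPM noise schedule with T = 1000, β0 = 8.5e-4, and βT = 1.2e-2. A minimal sketch of how such a schedule is commonly constructed, assuming linear interpolation between the two endpoints (the report does not quote the paper's exact schedule shape):

```python
import numpy as np

def linear_beta_schedule(T=1000, beta_0=8.5e-4, beta_T=1.2e-2):
    """Linearly interpolate the per-step noise variances beta_t."""
    return np.linspace(beta_0, beta_T, T)

betas = linear_beta_schedule()
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)  # \bar{alpha}_t used in the forward process q(x_t | x_0)
```

With these endpoints, the cumulative product \bar{alpha}_T decays to well under 1%, so x_T is close to pure Gaussian noise, which is what DDPM sampling assumes at the first reverse step.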
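The pseudocode row lists Algorithms 1 and 2, both of which inject fine-grained (harmonic or textural) control into DDPM sampling. A generic sketch of a standard DDPM ancestral sampling loop with a per-step guidance hook, where `denoise_model` and `apply_guidance` are hypothetical placeholder interfaces, not the authors' implementation:

```python
import numpy as np

def ddpm_sample_with_guidance(denoise_model, apply_guidance, shape,
                              betas, rng=np.random.default_rng(0)):
    """Standard DDPM ancestral sampling with a per-step guidance hook.

    denoise_model(x_t, t) -> predicted noise eps_hat (hypothetical interface).
    apply_guidance(x, t)  -> x adjusted toward the fine-grained control
                             (hypothetical; stands in for the paper's
                             harmonic/textural guidance steps).
    """
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    x = rng.standard_normal(shape)  # x_T ~ N(0, I)
    for t in range(len(betas) - 1, -1, -1):
        eps_hat = denoise_model(x, t)
        # DDPM posterior mean of x_{t-1} given x_t and the predicted noise
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        x = mean + (np.sqrt(betas[t]) * rng.standard_normal(shape) if t > 0 else 0.0)
        x = apply_guidance(x, t)  # fine-grained control applied at every step
    return x
```

Applying the hook at every reverse step, rather than only conditioning at t = T, is what makes the control "fine-grained": the sample is steered toward the constraint throughout denoising instead of once at the start.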