JEN-1 DreamStyler: Customized Musical Concept Learning via Pivotal Parameters Tuning

Authors: Boyu Chen, Peike Li, Yao Yao, Alex Wang

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our proposed JEN-1 DreamStyler outperforms several baselines in both qualitative and quantitative evaluations. To validate our method, we introduce a new benchmark dataset and an evaluation protocol. Through a combination of qualitative and quantitative assessments, we demonstrate the effectiveness of our proposed method. We generate five baseline models for comparative analysis. As demonstrated in Table 1, ours outperforms these baselines considering the balance of Text and Audio Alignment. Finally, the paper concludes with an in-depth ablation study, providing insights into the contributory elements of our method.
Researcher Affiliation | Industry | Boyu Chen, Peike Li*, Yao Yao, Alex Wang — Jen Music AI (EMAIL, EMAIL)
Pseudocode | No | The paper describes its methods using equations and textual descriptions, but provides no explicit pseudocode or algorithm blocks.
Open Source Code | No | The paper provides links for "Demo: https://www.jenmusic.ai/audio-demos" and "Datasets: https://huggingface.co/datasets/JenMusicAI/DreamStyler", but it does not state that source code for the described methodology is openly available, nor does it link to a code repository.
Open Datasets | Yes | Datasets: https://huggingface.co/datasets/JenMusicAI/DreamStyler ... "To validate our method, we introduce a new benchmark dataset and an evaluation protocol. Through a combination of qualitative and quantitative assessments, we demonstrate the effectiveness of our proposed method. This dataset serves as a benchmark for assessing our method and establishes a foundation for future research in this area."
Dataset Splits | No | The paper states: "For each concept, we collect a two-minute audio segment to form the training set." and "In our evaluation suite, we generated 50 audio clips for each concept and prompt." This describes the data used for concept-specific fine-tuning and for evaluation, but it provides no explicit train/validation/test splits, either for a larger corpus or for the newly introduced dataset, beyond the two-minute segment used as each concept's training set.
Hardware Specification | Yes | All experiments are conducted on an A6000 GPU using the PyTorch framework.
Software Dependencies | No | The paper mentions the "PyTorch framework" and its use of the JEN-1 model (Li et al. 2024) and FLAN-T5 (Chung et al. 2022), but it does not give version numbers for any of these software components or libraries.
Experiment Setup | No | The paper states: "More training details are shown in the Supplementary Materials." Specific experimental setup details, such as hyperparameters and training configurations, are therefore not provided in the main text.