MindCustomer: Multi-Context Image Generation Blended with Brain Signal
Authors: Muzhou Yu, Shuyun Lin, Lei Ma, Bo Lei, Kaisheng Ma
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 4: Experiments. This section includes subsections like "4.1. Qualitative Results", "4.2. Comparisons and Analysis", "4.3. Few-shot Generation on New Subject", and "4.4. Ablation", along with numerous figures and tables presenting empirical data, metrics (e.g., CLIP-I, DINOv2, CLIP-IQA), and comparisons against baselines. |
| Researcher Affiliation | Collaboration | The authors are affiliated with: (1) Xi'an Jiaotong University, (2) Tsinghua University, (3) Beijing Academy of Artificial Intelligence, (4) Peking University. This mix includes academic institutions (Xi'an Jiaotong University, Tsinghua University, Peking University) and a research institution (Beijing Academy of Artificial Intelligence), indicating a collaborative effort. |
| Pseudocode | No | The paper describes its methods and processes using descriptive text and mathematical formulations. It does not contain any explicitly labeled "Pseudocode" or "Algorithm" blocks with structured, code-like steps. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code, nor does it provide a link to a code repository or mention code in supplementary materials for the methodology described in the paper. It mentions using "publicly available versatile diffusion" which refers to a third-party tool. |
| Open Datasets | Yes | In Section 3.1 Preliminaries, the paper states: "In this paper, we utilize the widely-used data, Natural Scenes Dataset (NSD) (Allen, 2022), as our brain contexts." and "...sourced from the MS-COCO dataset (Lin et al., 2014)." Both NSD and MS-COCO are well-known public datasets, and formal citations are provided. |
| Dataset Splits | Yes | In Section 3.1 Preliminaries, under 'NSD Data', the paper specifies: "we utilize all subject-wise data from four subjects (Subj1, 2, 5, 7) as the training data. Each subject viewed 8859 individual images. And the remaining data of 982 images were viewed by all four subjects as the test data." |
| Hardware Specification | Yes | In Appendix B. Implementation, the paper states: "The time of one image generation is about 6 minutes on a single Tesla A100 GPU." |
| Software Dependencies | No | The paper mentions "publicly available versatile diffusion" as the base structure but does not specify any version numbers for this or any other software libraries, programming languages, or environments crucial for replication. |
| Experiment Setup | Yes | Appendix B. Implementation, details specific parameters: "We train the subject-wise IBT with a learning rate of 5e-5 for 200 epochs and then adopt AdamW schedule for our Brain Embedder with a learning rate of 3e-3 for 600 epochs." and "We fine-tune the VD for 200 epochs with a learning rate of 5e-8 utilizing Adam schedule. Then we optimize the brain embedding for 100 epochs with a learning rate of 1e-5." Table 5 provides a detailed breakdown of optimizers, learning rates, weight decay, training epochs, batch sizes, and LR schedulers for each component. |
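The training settings quoted in the Experiment Setup row can be collected into a minimal configuration sketch. Only the numeric values (learning rates, epoch counts, optimizer names) come from the paper's Appendix B; the dictionary structure and stage keys below are illustrative assumptions, not the authors' code.

```python
# Hypothetical summary of the per-stage training settings quoted from
# Appendix B of the paper. Keys and structure are assumptions; values
# (optimizers, learning rates, epochs) are taken from the quoted text.
TRAINING_STAGES = {
    "ibt": {                      # subject-wise IBT training
        "learning_rate": 5e-5,
        "epochs": 200,
    },
    "brain_embedder": {           # Brain Embedder with AdamW schedule
        "optimizer": "AdamW",
        "learning_rate": 3e-3,
        "epochs": 600,
    },
    "vd_finetune": {              # Versatile Diffusion fine-tuning (Adam)
        "optimizer": "Adam",
        "learning_rate": 5e-8,
        "epochs": 200,
    },
    "brain_embedding_opt": {      # final brain-embedding optimization
        "learning_rate": 1e-5,
        "epochs": 100,
    },
}

if __name__ == "__main__":
    # Print a compact per-stage summary of the quoted hyperparameters.
    for stage, cfg in TRAINING_STAGES.items():
        print(f"{stage}: lr={cfg['learning_rate']:g}, epochs={cfg['epochs']}")
```

A structured record like this makes it easy to check whether a replication attempt matches the reported setup stage by stage; the paper's Table 5 reportedly also lists weight decay, batch sizes, and LR schedulers, which are not reproduced here.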