Feature Denoising Diffusion Model for Blind Image Quality Assessment
Authors: Xudong Li, Yan Zhang, Yunhang Shen, Ke Li, Runze Hu, Xiawu Zheng, Sicheng Zhao
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the performance of the proposed PFD-IQA model on eight typical BIQA datasets, including four synthetic LIVE (Sheikh, Sabir, and Bovik 2006), CSIQ (Larson and Chandler 2010), TID2013 (Ponomarenko et al. 2015), KADID (Lin, Hosu, and Saupe 2019) and four authentic datasets LIVEC (Ghadiyaram and Bovik 2015), KONIQ (Hosu et al. 2020), LIVEFB (Ying et al. 2020), SPAQ (Fang et al. 2020). ... Tab. 1 presents a comparative analysis between the proposed PFD-IQA and 14 state-of-the-art BIQA methods. ... Tab. 3, Tab. 4, Tab. 5, Tab. 6 show ablation experiments about ... |
| Researcher Affiliation | Collaboration | Xudong Li¹, Yan Zhang¹*, Yunhang Shen², Ke Li², Runze Hu³, Xiawu Zheng¹, Sicheng Zhao⁴ — ¹Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University, 361005, P.R. China; ²Tencent Youtu Lab, Shanghai, China; ³School of Information and Electronics, Beijing Institute of Technology, Beijing, China; ⁴BNRist, Tsinghua University, Beijing, China |
| Pseudocode | Yes | Algorithm 1: Pseudocode for proposed PFD-IQA. Input: image x; label y_g; text T_q and T_d; diffusion steps T; mode ∈ {train, infer}; student and teacher networks N_s, N_t; softmax σ. Output: predicted quality score ŷ |
| Open Source Code | No | The paper does not explicitly state that the source code for the proposed PFD-IQA method is publicly available. There is no mention of a code repository link or supplementary materials containing the code. |
| Open Datasets | Yes | We evaluate the performance of the proposed PFD-IQA model on eight typical BIQA datasets, including four synthetic LIVE (Sheikh, Sabir, and Bovik 2006), CSIQ (Larson and Chandler 2010), TID2013 (Ponomarenko et al. 2015), KADID (Lin, Hosu, and Saupe 2019) and four authentic datasets LIVEC (Ghadiyaram and Bovik 2015), KONIQ (Hosu et al. 2020), LIVEFB (Ying et al. 2020), SPAQ (Fang et al. 2020). |
| Dataset Splits | Yes | For each dataset, 80% of images are used for training and 20% for testing. This process is repeated 10 times to mitigate bias, and we report the average SRCC and PLCC. For synthetic distortion datasets, training and testing sets are divided by reference images to ensure content independence. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or other accelerator specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions using models like ViT-B from DeiT III and CLIP-B/16 but does not specify the version numbers of any underlying software dependencies such as PyTorch, TensorFlow, Python, or CUDA. |
| Experiment Setup | Yes | Our model is trained for 9 epochs with a learning rate of 8×10⁻⁵, decaying by a factor of 10 every 3 epochs. The batch size is 16 for LIVEC and 64 for KonIQ. For the student network, the image encoder is based on ViT-B from DeiT III (Touvron, Cord, and Jégou 2022) with a decoder depth of one, and the parameters of the text encoder are frozen. ... In all experiments, we empirically set λ1 = 0.5, λ2 = 1, and λ3 = 0.01. ... We find that an iteration of 5 is adequate for effective performance in our approach. |
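The step-decay schedule quoted in the experiment-setup row (base learning rate 8×10⁻⁵, divided by 10 every 3 epochs over 9 epochs) can be sketched as a small helper. This is a hypothetical illustration only; the paper releases no code, and the function name is ours:

```python
def lr_at_epoch(epoch, base_lr=8e-5, decay=10, every=3):
    """Learning rate under the quoted schedule:
    base 8e-5, divided by `decay` every `every` epochs."""
    return base_lr / (decay ** (epoch // every))

# Schedule over the 9 training epochs reported in the paper:
schedule = [lr_at_epoch(e) for e in range(9)]
# epochs 0-2 -> 8e-5, epochs 3-5 -> 8e-6, epochs 6-8 -> 8e-7
```

Equivalently, PyTorch users would reach for `torch.optim.lr_scheduler.StepLR(optimizer, step_size=3, gamma=0.1)`, though the paper does not name its framework.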