HVDualformer: Histogram-Vision Dual Transformer for White Balance

Authors: Yan-Tsung Peng, Guan-Rong Chen

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experimental results on public benchmark datasets demonstrate that the proposed model performs favorably against state-of-the-art methods.
Researcher Affiliation | Academia | Yan-Tsung Peng* and Guan-Rong Chen, National Chengchi University, EMAIL, EMAIL
Pseudocode | No | The paper describes the architecture and methodology using textual descriptions, mathematical equations (e.g., equations 1, 2, 4, 5, 6), and block diagrams (Figure 2, Figure 3), but no explicit pseudocode or algorithm blocks are provided.
Open Source Code | Yes | Code: https://github.com/ytpeng-aimlab/HVDualformer
Open Datasets | Yes | The Rendered WB Dataset Set1 (Afifi et al. 2019) consists of 62,535 sRGB images rendered from two public illumination estimation datasets: the NUS dataset (Cheng, Prasad, and Brown 2014) and the Gehler dataset (Gehler et al. 2008)... To evaluate the performance, we use three commonly used evaluation datasets: Set1-Test (21,046 images), Set2 of the Rendered WB Dataset (2,881 images), and the sRGB rendered version of the Cube+ Dataset (10,242 images). Set1-Test corresponds to fold1 of the Rendered WB Dataset Set1 (Afifi et al. 2019). The sRGB images in the Rendered WB Dataset Set2 are rendered from raw images of the NUS dataset (Cheng, Prasad, and Brown 2014)... The Rendered Cube+ Dataset... derived from... the Cube+ Dataset (Banić, Košćević, and Lončarić 2017).
Dataset Splits | Yes | Following (Afifi and Brown 2020; Li, Kang, and Ming 2023; Li et al. 2023), we randomly select 12,000 images from Set1's fold2 and fold3... for training. Testing sets: to evaluate the performance, we use three commonly used evaluation datasets: Set1-Test (21,046 images)... Set1-Test corresponds to fold1 of the Rendered WB Dataset Set1 (Afifi et al. 2019).
Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, memory, or other computer specifications used for running experiments. It only mentions model sizes and parameter counts.
Software Dependencies | No | The paper mentions using the 'AdamW optimizer (Loshchilov and Hutter 2017)' and the 'Adam optimizer (Kingma and Ba 2014)' but does not provide version numbers for key software components such as the deep learning framework (e.g., PyTorch, TensorFlow) or the programming language used.
Experiment Setup | Yes | During the training phase, we simultaneously optimize Histoformer and Visformer for 350 epochs. Histoformer is trained using the AdamW optimizer (Loshchilov and Hutter 2017) with decay rates of the gradient moving averages β1 = 0.9 and β2 = 0.999, while Visformer is trained using the Adam optimizer (Kingma and Ba 2014) with β1 = 0.5 and β2 = 0.999. The learning rate is set to 2e-4. ... During training, we randomly crop four 128×128 patches from the training images as input. Additionally, we apply geometric transformations, including rotation and flipping, for data augmentation.
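The patch-cropping and augmentation steps of the reported setup can be sketched in plain Python. This is a minimal illustration, not the authors' code: `random_crop_coords` and `augment_patch` are hypothetical helper names, the hyperparameter constants merely restate values quoted above, and the models and optimizers themselves are omitted (in a PyTorch script they would be, e.g., `torch.optim.AdamW(histoformer.parameters(), lr=2e-4, betas=(0.9, 0.999))` and `torch.optim.Adam(visformer.parameters(), lr=2e-4, betas=(0.5, 0.999))`).

```python
import random

# Hyperparameters quoted from the paper.
EPOCHS = 350
LEARNING_RATE = 2e-4
HISTOFORMER_BETAS = (0.9, 0.999)  # AdamW
VISFORMER_BETAS = (0.5, 0.999)    # Adam
PATCH_SIZE = 128
PATCHES_PER_IMAGE = 4

def random_crop_coords(h, w, size=PATCH_SIZE, n=PATCHES_PER_IMAGE, rng=random):
    """Top-left corners for n random size x size crops of an h x w image."""
    if h < size or w < size:
        raise ValueError("image smaller than crop size")
    return [(rng.randrange(h - size + 1), rng.randrange(w - size + 1))
            for _ in range(n)]

def augment_patch(patch, rng=random):
    """Random geometric augmentation: rotate by a multiple of 90 degrees,
    then optionally flip horizontally. `patch` is a list of pixel rows."""
    for _ in range(rng.randrange(4)):           # rotate 90° clockwise k times
        patch = [list(row) for row in zip(*patch[::-1])]
    if rng.random() < 0.5:                      # horizontal flip
        patch = [row[::-1] for row in patch]
    return patch
```

In a real pipeline these helpers would act on image tensors; here they only demonstrate the sampling and augmentation logic described in the quoted setup.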