SegFace: Face Segmentation of Long-Tail Classes

Authors: Kartik Narayan, Vibashan VS, Vishal M. Patel

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct extensive experiments demonstrating that SegFace significantly outperforms previous state-of-the-art models, achieving a mean F1 score of 88.96 (+2.82) on the CelebAMask-HQ dataset and 93.03 (+0.65) on the LaPa dataset.
Researcher Affiliation | Academia | Kartik Narayan, Vibashan VS, Vishal M. Patel (Johns Hopkins University) EMAIL
Pseudocode | No | The paper describes the proposed method in prose and provides a diagram (Figure 2) but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Code: https://github.com/Kartik-3004/SegFace
Open Datasets | Yes | We conduct our experiments on three standard face segmentation datasets: LaPa (Liu et al. 2020), CelebAMask-HQ (Lee et al. 2020b), and Helen (Le et al. 2012).
Dataset Splits | Yes | The LaPa dataset contains a total of 22,168 images, with 18,176 used for training, 2,000 for validation, and 2,000 for testing. This dataset is annotated for 11 classes, including skin, hair, nose, left eye, right eye, left brow, right brow, upper lip, and lower lip. The CelebAMask-HQ dataset comprises 30,000 face images, split into 24,183 for training, 2,993 for validation, and 2,824 for testing. It features 19 semantic classes, including accessories such as earring, necklace, eyeglass, and hat, which are considered long-tail classes due to their infrequent occurrence in the dataset. The other classes are the same as those in the LaPa dataset, with the addition of left/right ear, cloth, and neck. The Helen dataset, being the smallest, consists of 2,000 training samples, 230 validation samples, and 100 test samples, annotated for 11 classes.
Hardware Specification | Yes | All code was implemented in PyTorch, and the models were trained on eight A6000 GPUs, each equipped with 48 GB of memory.
Software Dependencies | No | The paper mentions "PyTorch" but does not specify a version number or other software dependencies with version numbers.
Experiment Setup | Yes | The models were optimized for 300 epochs using the AdamW optimizer, with an initial learning rate of 1e-4 and a weight decay of 1e-5. We employed a step LR scheduler with a gamma value of 0.1, which reduces the learning rate by a factor of 0.1 at epochs 80 and 200. A batch size of 32 was used for training on the LaPa and CelebAMask-HQ datasets, and 16 for the Helen dataset. We did not perform any augmentations on the CelebAMask-HQ and Helen datasets. For the LaPa dataset, we applied random rotation [-30, 30], random scaling [0.5, 3], and random translation [-20px, 20px], along with RoI tanh warping (Lin et al. 2019) to ensure that the network focused on the face region. The λ1 and λ2 values were set at 0.5 for dice loss and cross-entropy loss, respectively.
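The reported optimizer and schedule settings can be sketched in PyTorch as follows. This is a minimal illustration of the stated hyperparameters only, not the authors' training code; the model here is a hypothetical stand-in (the actual SegFace architecture lives in the linked repository), and the loss combination assumes placeholder dice/cross-entropy terms weighted by λ1 = λ2 = 0.5.

```python
import torch

# Hypothetical stand-in model; the real SegFace architecture is defined in
# the authors' repository (https://github.com/Kartik-3004/SegFace).
model = torch.nn.Conv2d(3, 19, kernel_size=1)  # 19 classes, as in CelebAMask-HQ

# Reported optimizer: AdamW with lr 1e-4 and weight decay 1e-5.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=1e-5)

# Reported schedule: step decay by gamma=0.1 at epochs 80 and 200.
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[80, 200], gamma=0.1
)

# Reported loss weights: lambda1 = lambda2 = 0.5 for dice and cross-entropy.
lambda1, lambda2 = 0.5, 0.5

for epoch in range(300):
    # ... iterate over batches here (batch size 32 for LaPa/CelebAMask-HQ,
    # 16 for Helen), computing total = lambda1 * dice + lambda2 * ce ...
    scheduler.step()

# After both milestones the learning rate has decayed twice: 1e-4 -> 1e-6.
final_lr = optimizer.param_groups[0]["lr"]
```

Under this schedule the learning rate is 1e-4 for epochs 0-79, 1e-5 for epochs 80-199, and 1e-6 thereafter.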