SegFace: Face Segmentation of Long-Tail Classes
Authors: Kartik Narayan, Vibashan VS, Vishal M. Patel
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments demonstrating that SegFace significantly outperforms previous state-of-the-art models, achieving a mean F1 score of 88.96 (+2.82) on the CelebAMask-HQ dataset and 93.03 (+0.65) on the LaPa dataset. |
| Researcher Affiliation | Academia | Kartik Narayan, Vibashan VS, Vishal M. Patel Johns Hopkins University EMAIL |
| Pseudocode | No | The paper describes the proposed method in prose and provides a diagram (Figure 2) but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code: https://github.com/Kartik-3004/SegFace |
| Open Datasets | Yes | We conduct our experiments on three standard face segmentation datasets: LaPa (Liu et al. 2020), CelebAMask-HQ (Lee et al. 2020b), and Helen (Le et al. 2012). |
| Dataset Splits | Yes | The LaPa dataset contains a total of 22,168 images, with 18,176 used for training, 2,000 for validation, and 2,000 for testing. This dataset is annotated for 11 classes, including skin, hair, nose, left eye, right eye, left brow, right brow, upper lip, and lower lip. The CelebAMask-HQ dataset comprises 30,000 face images, split into 24,183 for training, 2,993 for validation, and 2,824 for testing. It features 19 semantic classes, including accessories such as earring, necklace, eyeglass, and hat, which are considered long-tail classes due to their infrequent occurrence in the dataset. The other classes are the same as those in the LaPa dataset, with the addition of left/right ear, cloth, and neck. The Helen dataset, being the smallest, consists of 2,000 training samples, 230 validation samples, and 100 test samples, annotated for 11 classes. |
| Hardware Specification | Yes | All code was implemented in PyTorch, and the models were trained on eight A6000 GPUs, each equipped with 48 GB of memory. |
| Software Dependencies | No | The paper mentions "PyTorch" but does not specify a version number or other software dependencies with version numbers. |
| Experiment Setup | Yes | The models were optimized for 300 epochs using the AdamW optimizer, with an initial learning rate of 1e-4 and a weight decay of 1e-5. We employed a step LR scheduler with a gamma value of 0.1, which reduces the learning rate by a factor of 0.1 at epochs 80 and 200. A batch size of 32 was used for training on the LaPa and CelebAMask-HQ datasets, and 16 for the Helen dataset. We did not perform any augmentations on the CelebAMask-HQ and Helen datasets. For the LaPa dataset, we applied random rotation [-30°, 30°], random scaling [0.5, 3], and random translation [-20px, 20px], along with RoI Tanh-warping (Lin et al. 2019) to ensure that the network focused on the face region. The λ1 and λ2 values were set at 0.5 for dice loss and cross-entropy loss, respectively. |
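The optimization setup quoted above can be sketched in PyTorch. This is a minimal reconstruction, not the authors' released code: the model here is a placeholder, and the loss terms inside the loop are indicated only as comments. The reported schedule (step decay at epochs 80 and 200) corresponds to `MultiStepLR` in PyTorch.

```python
import torch
from torch import nn, optim

# Placeholder model; the actual SegFace architecture lives in the authors' repo.
model = nn.Conv2d(3, 19, kernel_size=1)

# Reported setup: AdamW, initial lr 1e-4, weight decay 1e-5,
# lr multiplied by gamma=0.1 at epochs 80 and 200.
optimizer = optim.AdamW(model.parameters(), lr=1e-4, weight_decay=1e-5)
scheduler = optim.lr_scheduler.MultiStepLR(optimizer, milestones=[80, 200], gamma=0.1)

lambda1, lambda2 = 0.5, 0.5  # reported weights for dice and cross-entropy losses

for epoch in range(300):
    # per-epoch training would go here, e.g.:
    # total_loss = lambda1 * dice_loss + lambda2 * ce_loss
    scheduler.step()

final_lr = optimizer.param_groups[0]["lr"]  # 1e-4 * 0.1 * 0.1 = 1e-6 after both milestones
```

After both milestones fire, the learning rate has been reduced twice, ending at 1e-6.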