EgoPrivacy: What Your First-Person Camera Says About You?

Authors: Yijiang Li, Genpei Zhang, Jiacheng Cheng, Yi Li, Xiaojun Shan, Dashan Gao, Jiancheng Lyu, Yuan Li, Ning Bi, Nuno Vasconcelos

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | An extensive comparison of the different attacks possible under all threat models is presented, showing that private information of the wearer is highly susceptible to leakage. For instance, our findings indicate that foundation models can effectively compromise wearer privacy even in zero-shot settings by recovering attributes such as identity, scene, gender, and race with 70-80% accuracy.
Researcher Affiliation | Collaboration | 1) University of California, San Diego; 2) Carnegie Mellon University; 3) Qualcomm AI Research, an initiative of Qualcomm Technologies, Inc.
Pseudocode | No | The paper describes methods and formulations but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Our code and data are available at https://github.com/williamium3000/ego-privacy.
Open Datasets | Yes | EgoPrivacy is a benchmark of synchronized ego-exo video, built upon Ego-Exo4D (Grauman et al., 2024) and Charades-Ego (Sigurdsson et al., 2018a). ... For the hand-based model, we trained a ResNet50 classifier on the publicly available 11K Hands dataset (Afifi, 2019).
Dataset Splits | Yes | Following the train/test split proposed in (Grauman et al., 2024), we split the Ego-Exo4D videos into a training set Dtrain, which can be used for model finetuning, and a test set Dtest for ID evaluation. Charades-Ego is then solely used as a test set for OOD evaluation.
Hardware Specification | Yes | All models are trained with 1 A100 with a batch size of 8. ... We also acknowledge the NRP Nautilus cluster, used for some of the experiments discussed above.
Software Dependencies | No | The paper mentions various models and tools such as CLIP, VideoMAE, EgoVLPv2, LLaVA-1.5, VideoLLaMA2, ResNet50, YOLO, FairFace, and RetinaFace, but it does not specify version numbers for any of these software components or the programming languages used.
Experiment Setup | Yes | All models are trained with 1 A100 with a batch size of 8. We use a learning rate of 1e-5 and adopt the AdamW optimizer with cosine learning rate decay. The default number of frames for one video is 8.
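The cosine learning-rate decay named in the experiment setup can be sketched with the standard closed-form schedule below. This is a minimal illustration, not the paper's training code: the base learning rate 1e-5 comes from the report, while the total step count and the decay-to-zero floor are assumptions for illustration.

```python
import math

BASE_LR = 1e-5       # learning rate reported in the paper
TOTAL_STEPS = 1000   # hypothetical training length, not from the paper

def cosine_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS):
    """Cosine decay from base_lr at step 0 down to 0 at total_steps."""
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * step / total_steps))

# The schedule starts at the base rate, halves at the midpoint,
# and approaches zero at the end of training.
start = cosine_lr(0)
mid = cosine_lr(TOTAL_STEPS // 2)
end = cosine_lr(TOTAL_STEPS)
print(start, mid, end)
```

In practice this is what `torch.optim.lr_scheduler.CosineAnnealingLR` computes per step when paired with an AdamW optimizer; the closed form above makes the shape of the decay explicit.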