Toward Ethical Facial Age Estimation: A Generalized Zero-Shot Benchmark Without Training on Children's Data
Quick Take
This study introduces a generalized zero-shot benchmark for facial age estimation that excludes children's data from training. Across nine state-of-the-art methods, performance drops by 46.4% on average relative to the supervised baseline when models must generalize to unseen age groups. The findings expose ethical concerns in current practice and emphasize the need for responsible data use.
Key Points
- Proposes a generalized zero-shot benchmark for facial age estimation that excludes children's data from training.
- Evaluates nine state-of-the-art methods, revealing an average performance drop of 46.4% (up to 52.8%) relative to the supervised baseline.
- Models exhibit seen-class bias, anchoring predictions to nearby known classes.
- Standardized splits ensure strict age-group separation for training and evaluation.
- Highlights ethical gaps in current modeling practices and data use.
Article Content
Original abstract (arXiv:2605.29230v1): Age estimation from facial images typically relies on training data that includes images of minors, a practice that raises serious ethical, legal, and privacy concerns. In this work, we propose a generalized zero-shot benchmark for facial age estimation that explicitly excludes children's data during training while still assessing model performance on younger populations.
We revisit six widely used datasets and introduce standardized splits with strict age-group separation: samples aged 18-59 for training, validation, and testing; samples under 18 reserved exclusively for zero-shot evaluation; and samples 60+ as an unseen validation set for model selection under distribution shift. For datasets with identity annotations, subject-exclusive splits prevent identity leakage and better reflect real-world deployment conditions.
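The split protocol described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names, the `(sample_id, age, subject_id)` tuple layout, and the 80/10/10 ratios are all assumptions made for the example. The abstract also does not say how subjects whose images span an age boundary are handled, so the sketch enforces subject exclusivity only among the 18-59 partitions.

```python
import random
from collections import defaultdict


def age_group(age):
    """Map an age in years to one of the benchmark's three age ranges."""
    if age < 18:
        return "zero_shot_eval"  # under 18: never used for training
    if age < 60:
        return "seen"            # 18-59: train / validation / test
    return "unseen_val"          # 60+: unseen validation set


def make_splits(samples, ratios=(0.8, 0.1, 0.1), seed=0):
    """Build age-separated, subject-exclusive splits.

    samples: iterable of (sample_id, age, subject_id) tuples (hypothetical layout).
    ratios:  train/val/test fractions over subjects (hypothetical values).
    """
    splits = defaultdict(list)
    seen_by_subject = defaultdict(list)

    for sample_id, age, subject_id in samples:
        group = age_group(age)
        if group == "seen":
            seen_by_subject[subject_id].append(sample_id)
        else:
            splits[group].append(sample_id)

    # Shuffle subjects, not images, so one identity lands in exactly one partition.
    subjects = sorted(seen_by_subject)
    random.Random(seed).shuffle(subjects)
    n_train = int(ratios[0] * len(subjects))
    n_val = int(ratios[1] * len(subjects))
    partitions = {
        "train": subjects[:n_train],
        "val": subjects[n_train:n_train + n_val],
        "test": subjects[n_train + n_val:],
    }
    for name, subject_ids in partitions.items():
        for subject_id in subject_ids:
            splits[name].extend(seen_by_subject[subject_id])

    return dict(splits)
```

Splitting by subject rather than by image is what prevents the same identity from appearing in both the training and test partitions. For datasets without identity annotations, each image could be given its own unique `subject_id`, which reduces this to an ordinary random split.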
Evaluating nine state-of-the-art age estimation methods under this protocol reveals that all evaluated methods consistently fail to generalize to unseen age groups, suffering substantial performance degradation (46.4% on average, and up to 52.8%) relative to the supervised baseline. Moreover, models do not simply degrade: they systematically anchor predictions for unseen ages to nearby seen classes, a manifestation of the well-known seen-class bias in generalized zero-shot learning.
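To make the two effects above concrete, here is a toy Python sketch. The abstract does not state which metric or formula underlies the reported percentages, so this example assumes mean absolute error (MAE) and defines degradation as the relative increase in MAE over a baseline. The helper names and all numbers are hypothetical and are not results from the paper.

```python
def mean_absolute_error(true_ages, predicted_ages):
    """Average absolute difference between true and predicted ages, in years."""
    return sum(abs(t - p) for t, p in zip(true_ages, predicted_ages)) / len(true_ages)


def relative_degradation(baseline_mae, zero_shot_mae):
    """Percent increase in MAE relative to a baseline MAE (assumed definition)."""
    return 100.0 * (zero_shot_mae - baseline_mae) / baseline_mae


def anchor_to_seen_range(predicted_ages, lowest_seen=18, highest_seen=59):
    """Mimic seen-class bias: a model that only knows ages 18-59
    cannot output anything outside that range."""
    return [min(max(p, lowest_seen), highest_seen) for p in predicted_ages]


# Toy example: even a predictor that is otherwise perfect is pulled
# to the nearest seen age (18) for every child in the evaluation set.
true_child_ages = [5, 10, 15]
anchored = anchor_to_seen_range(true_child_ages)            # [18, 18, 18]
anchored_mae = mean_absolute_error(true_child_ages, anchored)  # 8.0
```

In this toy setting the error grows with the distance from the seen range: the 5-year-old is off by 13 years, the 15-year-old by only 3. That is one way to picture why predictions cluster at the boundary of the training ages rather than failing at random.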
By formalizing age estimation without children's data as a generalized zero-shot benchmark on existing datasets, this work highlights a critical gap between current modeling practices and real-world ethical constraints. Our benchmark provides a principled basis for evaluating models under restricted data regimes and encourages the development of methods that are robust to distribution shift and aligned with responsible data use.