Few-class Fidelity: Evaluating Explanations of Real-conditions CNN classifiers with Optimized Perturbations
Quick Take
This paper introduces a variation of Fidelity-based XAI metrics tailored to real-world CNN applications with few classes. It generates in-distribution, uncertainty-provoking perturbations so that the faithfulness of XAI methods is measured properly. The paper demonstrates the framework's usefulness by comparing it with human-centric object localization and segmentation metrics on medical and natural imaging, revealing the complex interplay between domain, data curation, and choice of XAI solution.
Key Points
- Proposes a variation of Fidelity-based XAI metrics for CNN applications with few classes.
- Generates in-distribution perturbations to measure XAI method faithfulness.
- Compares the framework with human-centric object localization and segmentation metrics on medical and natural imaging.
- Highlights the correlation between domain, data curation, and XAI choices.
- Positions this evaluation as a way to validate the training of a new CNN model.
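To make the idea of a Fidelity-style measurement concrete, here is a minimal sketch of a generic deletion-based check. It is not the paper's method: the paper generates in-distribution, uncertainty-provoking perturbations, whereas this sketch simply overwrites the top-ranked features with a fixed baseline. The names `toy_model`, `fidelity_drop`, `baseline`, and both attribution vectors are hypothetical and exist only for illustration.

```python
import math
from typing import Callable, List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_model(x: Sequence[float]) -> List[float]:
    """Hypothetical two-class linear classifier standing in for a CNN."""
    weights = [[2.0, 0.1, -1.0, 0.0], [-2.0, -0.1, 1.0, 0.0]]
    logits = [sum(w * v for w, v in zip(row, x)) for row in weights]
    return softmax(logits)


def fidelity_drop(
    model: Callable[[Sequence[float]], List[float]],
    x: Sequence[float],
    attribution: Sequence[float],
    baseline: Sequence[float],
    k: int,
) -> float:
    """Drop in predicted-class probability after perturbing the k features
    that the explanation ranks as most important.

    Assumption: the perturbation is a plain baseline replacement, a simple
    stand-in for the optimized perturbations described in the paper.
    """
    probs = model(x)
    target = max(range(len(probs)), key=probs.__getitem__)
    top_k = sorted(range(len(x)), key=lambda i: attribution[i], reverse=True)[:k]
    perturbed = list(x)
    for i in top_k:
        perturbed[i] = baseline[i]
    return probs[target] - model(perturbed)[target]


x = [1.0, 1.0, 0.5, 1.0]
baseline = [0.0, 0.0, 0.0, 0.0]
faithful_attr = [2.0, 0.1, -0.5, 0.0]   # weight * input for class 0
unfaithful_attr = [0.0, 0.1, -0.5, 2.0]  # ranks an unused feature first

faithful_drop = fidelity_drop(toy_model, x, faithful_attr, baseline, k=1)
unfaithful_drop = fidelity_drop(toy_model, x, unfaithful_attr, baseline, k=1)
print(f"faithful: {faithful_drop:.3f}, unfaithful: {unfaithful_drop:.3f}")
```

Under these assumptions, an explanation that ranks truly influential features first produces a larger probability drop than one that highlights a feature the model ignores. Fidelity-style metrics rely on this kind of signal to compare XAI methods.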
Abstract: The wide use of Convolutional Neural Networks (CNN) in numerous domains and real-world classification applications is justified by their high precision and automation speed, helping users concentrate on higher-expertise tasks. To better understand the models and avoid bias during deployment, eXplainable Artificial Intelligence (XAI) techniques can be used after training. But as the list of XAI solutions expands, comparisons between them diverge, and consensus over their evaluation cannot be reached. This paper proposes a variation of Fidelity-based XAI metrics, with a focus on real-conditions applications, where the number of classes is often low. The approach generates in-distribution, uncertainty-provoking perturbations to ensure proper measurement of the XAI methods' faithfulness. As a demonstration of the evaluation framework's usefulness, it is compared with human-centric object localization and segmentation metrics. Once applied to both medical and natural imaging applications, it highlights the intricate correlation between domain, data curation, and XAI solution choices in order to validate training of a new CNN model.
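The abstract does not specify which human-centric localization and segmentation metrics are used. As an illustration only, the sketch below shows two common choices: intersection-over-union (IoU) between a thresholded saliency map and a human-annotated mask, and a pointing-game hit test. The function names, the 0.5 threshold, and the toy arrays are all assumptions, not details from the paper.

```python
from typing import List, Sequence


def binarize(saliency: Sequence[Sequence[float]], threshold: float) -> List[List[bool]]:
    """Turn a saliency map into a binary mask by thresholding."""
    return [[value >= threshold for value in row] for row in saliency]


def iou(pred: Sequence[Sequence[bool]], truth: Sequence[Sequence[int]]) -> float:
    """Intersection-over-union between a predicted mask and a ground-truth mask."""
    inter = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                inter += 1
            if p or t:
                union += 1
    return inter / union if union else 0.0


def pointing_game_hit(
    saliency: Sequence[Sequence[float]], truth: Sequence[Sequence[int]]
) -> bool:
    """True when the most salient pixel falls inside the annotated region."""
    best_row, best_col, best_val = 0, 0, float("-inf")
    for r, row in enumerate(saliency):
        for c, value in enumerate(row):
            if value > best_val:
                best_row, best_col, best_val = r, c, value
    return bool(truth[best_row][best_col])


# Toy 3x3 saliency map and annotation mask (hypothetical values).
saliency_map = [
    [0.1, 0.2, 0.0],
    [0.3, 0.9, 0.8],
    [0.0, 0.7, 0.1],
]
annotation = [
    [0, 0, 0],
    [0, 1, 1],
    [0, 1, 1],
]

iou_score = iou(binarize(saliency_map, threshold=0.5), annotation)
hit = pointing_game_hit(saliency_map, annotation)
print(iou_score, hit)
```

Metrics like these score an explanation against what a human marked as the object, while Fidelity-style metrics score it against what the model actually relies on, so the two can disagree. The abstract's comparison between the two families is about that relationship.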
| Subjects: | Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI) |
| Cite as: | arXiv:2606.28391 [cs.CV] (or arXiv:2606.28391v1 [cs.CV] for this version) |
| DOI: | https://doi.org/10.48550/arXiv.2606.28391 (arXiv-issued DOI via DataCite) |
Submission history
From: Wistan Marchadour
[v1] Tue, 23 Jun 2026 17:26:47 UTC (6,402 KB)
— Originally published at arxiv.org