Beyond Points: Spherical Distributional Part Prototypes for Interpretable Classification

arXiv cs.CV · Duarte Leão, Diogo Pereira Araújo, Catarina Barata, Carlos Santiago

6/29/2026

Quick Answer

The paper proposes vMFProto, a part-prototype framework that models each image class as a mixture of von Mises-Fisher components on the hypersphere, so that prototypes capture the intra-class variability of each semantic part instead of collapsing it to a single point.

Quick Take

vMFProto replaces point prototypes with von Mises-Fisher distributions, each with its own learned concentration, and assigns image patches to prototypes with entropic optimal transport. With frozen DINO backbones and a two-stage training schedule, it reports state-of-the-art explanation quality on CUB-200-2011, Stanford Dogs, and Stanford Cars while keeping accuracy competitive.

Key Points

vMFProto models classes as mixtures of von Mises-Fisher components on the hypersphere.
Reports state-of-the-art explanation quality (consistency, stability, and distinctiveness) with competitive accuracy.
Uses entropic optimal transport for structured patch-to-prototype assignments.
Evaluated on CUB-200-2011, Stanford Dogs, and Stanford Cars with frozen DINO backbones.
Two-stage training includes prototype discovery and end-to-end refinement.
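
To make the first key point concrete, here is a minimal sketch of scoring a unit-norm patch embedding under a mixture of von Mises-Fisher components with per-component concentrations. It is not the paper's implementation: the function names, the mixture weights, and the toy 3-dimensional vectors are illustrative assumptions, and the Bessel function is evaluated with a plain power series that is only suitable for moderate concentrations.

```python
import math

def log_bessel_i(order, kappa, terms=200):
    """Log of the modified Bessel function I_order(kappa), summed in log space."""
    logs = [
        (2 * m + order) * math.log(kappa / 2.0)
        - math.lgamma(m + 1)
        - math.lgamma(m + order + 1)
        for m in range(terms)
    ]
    top = max(logs)
    return top + math.log(sum(math.exp(v - top) for v in logs))

def vmf_log_density(x, mu, kappa):
    """Log-density of a von Mises-Fisher distribution at unit vector x."""
    d = len(x)
    dot = sum(a * b for a, b in zip(x, mu))
    # Normalizer: kappa^(d/2-1) / ((2*pi)^(d/2) * I_{d/2-1}(kappa))
    log_norm = (
        (d / 2.0 - 1.0) * math.log(kappa)
        - (d / 2.0) * math.log(2.0 * math.pi)
        - log_bessel_i(d / 2.0 - 1.0, kappa)
    )
    return log_norm + kappa * dot

def mixture_log_likelihood(x, components):
    """components: list of (weight, mu, kappa) tuples (hypothetical layout)."""
    logs = [math.log(w) + vmf_log_density(x, mu, k) for w, mu, k in components]
    top = max(logs)
    return top + math.log(sum(math.exp(v - top) for v in logs))

def normalize(v):
    """Project a vector onto the unit sphere."""
    n = math.sqrt(sum(a * a for a in v))
    return [a / n for a in v]

# Toy class with two part prototypes: one tight, one diffuse (assumed values).
components = [
    (0.5, normalize([1.0, 0.0, 0.0]), 20.0),
    (0.5, normalize([0.0, 1.0, 0.0]), 2.0),
]
patch = normalize([0.9, 0.1, 0.0])
score = mixture_log_likelihood(patch, components)
```

A larger concentration makes a component peak more sharply around its mean direction, which is how a per-prototype concentration can express how much a part varies within a class.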

[Submitted on 25 Jun 2026]

Abstract: Prototype-based neural networks aim to provide intrinsic interpretability by grounding predictions in a small set of part prototypes. However, modern vision backbones typically operate in normalized, directional embedding spaces where each semantic part exhibits substantial intra-class variability. As a result, point prototypes often become redundant or unstable, hurting both explanation quality and robustness. We propose vMFProto, a distributional part-prototype framework that models each class as a mixture of von Mises-Fisher components on the hypersphere. Each prototype learns its own concentration, capturing part-specific variability, and we use entropic optimal transport (OT) to obtain structured patch-to-prototype assignments. A two-stage training schedule performs OT-driven prototype discovery followed by end-to-end refinement with patch-level distillation and distribution-aware diversity regularization. Experiments on CUB-200-2011, Stanford Dogs, and Stanford Cars with frozen DINO backbones show that vMFProto achieves state-of-the-art explanation quality (consistency, stability, and distinctiveness) with competitive accuracy. Qualitative results confirm that vMFProto yields localized, non-redundant part evidence.
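
The entropic OT step mentioned in the abstract can be illustrated with a generic Sinkhorn iteration. This is a hedged sketch rather than the authors' procedure: the cost `1 - similarity`, the uniform marginals, the `epsilon` value, and the function name `sinkhorn_assignments` are all assumptions made for the example.

```python
import math

def sinkhorn_assignments(similarity, epsilon=0.5, iterations=200):
    """Entropic OT between patches (rows) and prototypes (columns).

    similarity[i][j] is, for example, the cosine similarity between
    patch i and prototype j. Uniform marginals are an assumption here.
    Returns a plan whose rows sum to about 1/n and columns to 1/m.
    """
    n, m = len(similarity), len(similarity[0])
    # Gibbs kernel exp(-cost / epsilon) with cost = 1 - similarity.
    kernel = [[math.exp(-(1.0 - s) / epsilon) for s in row] for row in similarity]
    row_mass, col_mass = 1.0 / n, 1.0 / m
    u, v = [1.0] * n, [1.0] * m
    for _ in range(iterations):
        # Alternately rescale rows and columns to match the marginals.
        u = [row_mass / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [col_mass / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]

# Four patches, two prototypes (toy similarities, not real data).
similarity = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.1, 0.9],
    [0.2, 0.7],
]
plan = sinkhorn_assignments(similarity)
```

Unlike a per-patch argmax, the column constraint forces every prototype to receive a share of the patches, which is one way an OT plan can discourage unused or redundant prototypes. A smaller `epsilon` gives sharper assignments at the cost of slower convergence.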

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2606.27582 [cs.CV]
	(or arXiv:2606.27582v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2606.27582 arXiv-issued DOI via DataCite

Submission history

From: Carlos Santiago
[v1] Thu, 25 Jun 2026 22:16:41 UTC (68,271 KB)

Originally published at arxiv.org
