Concepts Worth Having: Refining VLM-Guided Concept Bottleneck Models with Minimal Annotations

arXiv cs.CV·Nicola Debole, Andrea Passerini, Stefano Teso, Andrea Pugnana, Emanuele Marconato

·~2 min·5/19/2026·en·1

Quick Take

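VH-CBM is a hybrid concept-bottleneck model that combines VLM-derived supervision with a small amount of dense human annotation, aiming for more accurate and better-calibrated concepts than VLM-only guidance provides.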

Key Points

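Combines VLM guidance with a small amount of dense, expert-provided concept annotations.
Uses a Gaussian Process in the VLM's embedding space to propagate the expert's supervision to any target data point.
Predicts more accurate concepts than VLM-guided CBMs even when as little as 1% of the data is annotated, with better concept calibration and support for active learning.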

[Submitted on 13 May 2026]

Abstract: Concept-bottleneck models (CBMs) are neural classifiers that compute predictions from high-level concepts extracted from the input. CBMs ensure stakeholders can understand the concepts (and the predictions they entail) by learning these from concept-level annotations, which are, however, seldom available. Recent CBM architectures work around this issue by obtaining annotations from Vision-Language Models (VLMs). While greatly broadening applicability, doing so can yield lower-quality concepts and therefore less interpretable models. We strike a middle ground by introducing Vision-plus-Human-guided CBM (VH-CBM), a hybrid approach that exploits both VLMs and a small amount of dense annotations. VH-CBM employs a Gaussian Process in the VLM's embedding space, which captures useful global information about the target domain, to propagate the expert's supervision to any target data point. Our empirical evaluation shows that VH-CBM predicts more accurate concepts than VLM-guided CBMs even when annotating as little as 1% of the data, while offering better concept calibration and supporting active learning.
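To make the propagation idea concrete, here is a minimal sketch, not the authors' implementation. It assumes a Gaussian Process regressor with an RBF kernel fitted to binary concept labels, with a prior mean of 0.5, and it uses hand-made 2-D points as stand-ins for VLM embeddings. All names (`rbf`, `solve`, `gp_posterior`, `annotated`, `pool`) are hypothetical. The abstract does not specify the kernel, the likelihood, or the acquisition rule; the "query the most uncertain point" step is one common active-learning heuristic chosen here for illustration.

```python
import math

def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two embedding vectors."""
    sq = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq / (2.0 * length_scale ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def gp_posterior(train_x, train_y, query, noise=1e-2, prior_mean=0.5):
    """GP regression posterior (mean, variance) at one query embedding."""
    K = [[rbf(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(train_x)]
         for i, a in enumerate(train_x)]
    k_star = [rbf(a, query) for a in train_x]
    alpha = solve(K, [y - prior_mean for y in train_y])
    mean = prior_mean + sum(k * a for k, a in zip(k_star, alpha))
    v = solve(K, k_star)
    var = rbf(query, query) - sum(k * w for k, w in zip(k_star, v))
    return mean, max(var, 0.0)

# Toy stand-ins for VLM embeddings (assumption: real embeddings are high-dimensional).
annotated = [(0.0, 0.0), (0.2, 0.1), (3.0, 3.0), (3.1, 2.8)]
labels = [1.0, 1.0, 0.0, 0.0]  # expert says: concept present / absent

# Unannotated points that should receive propagated concept scores.
pool = [(0.1, 0.1), (2.9, 3.1), (1.5, 1.5), (8.0, 8.0)]
predictions = [gp_posterior(annotated, labels, q) for q in pool]

# Active-learning heuristic: ask the expert about the most uncertain point.
next_to_annotate = max(range(len(pool)), key=lambda i: predictions[i][1])

for q, (mean, var) in zip(pool, predictions):
    print(q, round(mean, 3), round(var, 3))
print("query next:", pool[next_to_annotate])
```

Points close to annotated examples inherit their labels with low variance, while a point far from every annotation falls back to the prior mean with high variance, which is what makes it a natural candidate for the next expert query. A GP classifier with a non-Gaussian likelihood would be a more principled choice for binary concepts; regression is used here only to keep the sketch short.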

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2605.16405 [cs.CV]
	(or arXiv:2605.16405v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2605.16405 arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Nicola Debole
[v1] Wed, 13 May 2026 10:07:11 UTC (2,363 KB)

Originally published at arxiv.org
