Knowing When Not to Predict: Self Supervised Learning and Abstention for Safer DR Screening

arXiv cs.CV·Muskaan Chopra, Lorenz Sparrenberg, Jan H. Terheyden, Rafet Sifa

Quick Take

Self-supervised pretraining improves confidence-based abstention in diabetic retinopathy screening compared with training from scratch, but longer pretraining does not consistently make models more reliable.

Key Points

SSL pretraining improves selective prediction compared to training from scratch.
Longer pretraining does not consistently improve reliability, even once accuracy has saturated.
Abstention-aware evaluation is crucial for safety-critical screening tasks.

[Submitted on 18 May 2026]

Abstract: Self-supervised learning (SSL) is now a standard way to pretrain medical image models, but performance is still mostly judged by downstream accuracy. For safety-critical screening tasks such as diabetic retinopathy grading, this is not enough: a model must also know when its predictions are unreliable and defer uncertain cases for clinical review. In this work, we examine how the length of SSL pretraining influences calibrated confidence and confidence-based abstention. We evaluate multiple SSL checkpoints under a fixed fine-tuning protocol and assess calibrated confidence, coverage, selective accuracy, and selective macro-F1. Across datasets and data regimes, SSL pretraining improves selective prediction compared to training from scratch. Unlike prior SSL studies that primarily evaluate downstream accuracy or AUROC, we analyze how SSL pretraining duration influences confidence behavior under calibrated confidence-based abstention. However, once accuracy saturates, selective performance can still change markedly across checkpoints, and longer pretraining does not consistently improve reliability. These results underscore the importance of abstention-aware evaluation and suggest that pretraining length should be treated as an important reliability-related design choice rather than only a computational detail. Code is available at GitHub.
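The abstention metrics named in the abstract (coverage, selective accuracy, selective macro-F1) can be illustrated with a minimal sketch. This is not the authors' code: the function name `selective_metrics`, the use of the maximum class probability as the confidence score, the fixed `threshold`, and the choice to average F1 over the classes that appear among accepted cases are all assumptions made for illustration. The probabilities are assumed to be already calibrated.

```python
import numpy as np


def selective_metrics(probs, labels, threshold):
    """Coverage, selective accuracy, and selective macro-F1 at one threshold.

    Hypothetical helper (not from the paper):
      probs     -- (n_samples, n_classes) class probabilities, assumed calibrated
      labels    -- (n_samples,) integer ground-truth grades
      threshold -- cases whose max probability is below this are deferred
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)

    preds = probs.argmax(axis=1)
    confidence = probs.max(axis=1)      # assumption: max probability as confidence
    accepted = confidence >= threshold  # False means "abstain and defer to review"

    coverage = float(accepted.mean())
    if not accepted.any():
        # Nothing was predicted, so the selective scores are undefined.
        return {
            "coverage": 0.0,
            "selective_accuracy": float("nan"),
            "selective_macro_f1": float("nan"),
        }

    y_true, y_pred = labels[accepted], preds[accepted]
    selective_accuracy = float((y_true == y_pred).mean())

    # Assumption: macro-average over classes seen in accepted labels or predictions.
    f1_scores = []
    for c in np.union1d(y_true, y_pred):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        f1_scores.append(2 * tp / (2 * tp + fp + fn))

    return {
        "coverage": coverage,
        "selective_accuracy": selective_accuracy,
        "selective_macro_f1": float(np.mean(f1_scores)),
    }


if __name__ == "__main__":
    # Toy probabilities for four images and three grades (made-up values).
    toy_probs = [
        [0.90, 0.10, 0.00],
        [0.60, 0.40, 0.00],
        [0.20, 0.80, 0.00],
        [0.40, 0.35, 0.25],
    ]
    toy_labels = [0, 1, 1, 2]
    for t in (0.0, 0.5, 0.7):
        print(t, selective_metrics(toy_probs, toy_labels, t))
```

Sweeping the threshold traces out a coverage versus selective-performance curve. Comparing such curves across checkpoints is one way to see the effect the abstract describes, where selective performance shifts even though plain accuracy has saturated.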

Comments:	Accepted at IJCAI 2026
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2605.19133 [cs.CV]
	(or arXiv:2605.19133v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2605.19133 arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Muskaan Chopra
[v1] Mon, 18 May 2026 21:32:03 UTC (1,526 KB)

Originally published at arxiv.org
