Knowing When Not to Predict: Self Supervised Learning and Abstention for Safer DR Screening
Quick Take
Self-supervised pretraining improves selective prediction in diabetic retinopathy screening, where a model abstains on low-confidence cases and defers them for clinical review.
Key Points
- SSL pretraining improves selective prediction reliability.
- Longer pretraining does not consistently improve reliability, even after accuracy saturates.
- Abstention-aware evaluation is crucial for safety-critical tasks.
Abstract: Self-supervised learning (SSL) is now a standard way to pretrain medical image models, but performance is still mostly judged by downstream accuracy. For safety-critical screening tasks such as diabetic retinopathy grading, this is not enough: a model must also know when its predictions are unreliable and defer uncertain cases for clinical review. In this work, we examine how the length of SSL pretraining influences calibrated confidence and confidence-based abstention. We evaluate multiple SSL checkpoints under a fixed fine-tuning protocol and assess calibrated confidence, coverage, selective accuracy, and selective macro-F1. Across datasets and data regimes, SSL pretraining improves selective prediction compared to training from scratch. Unlike prior SSL studies that primarily evaluate downstream accuracy or AUROC, we analyze how SSL pretraining duration influences confidence behavior under calibrated confidence-based abstention. However, once accuracy saturates, selective performance can still change markedly across checkpoints, and longer pretraining does not consistently improve reliability. These results underscore the importance of abstention-aware evaluation and suggest that pretraining length should be treated as an important reliability-related design choice rather than only a computational detail. Code is available at GitHub.
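The abstract names coverage, selective accuracy, and selective macro-F1 without defining them. A minimal sketch of how such metrics are commonly computed under confidence-based abstention is given here. It is not the authors' code: the function name `selective_metrics`, the use of the maximum class probability as the confidence score, the fixed `threshold`, and averaging F1 over the classes present among retained labels are all assumptions made for illustration. The input probabilities are assumed to be already calibrated by some separate step.

```python
import numpy as np


def selective_metrics(probs, labels, threshold):
    """Coverage, selective accuracy and selective macro-F1 for one threshold.

    Hypothetical helper, not taken from the paper. `probs` is an
    (n_samples, n_classes) array of (assumed calibrated) class probabilities,
    `labels` holds integer class ids, and samples whose top probability is
    below `threshold` are treated as abstentions.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)

    preds = probs.argmax(axis=1)   # predicted class per sample
    conf = probs.max(axis=1)       # confidence = top class probability
    keep = conf >= threshold       # True where the model does not abstain

    coverage = float(keep.mean())  # fraction of samples that get a prediction
    if not keep.any():
        # Everything was deferred, so the selective scores are undefined.
        return {"coverage": 0.0,
                "selective_accuracy": float("nan"),
                "selective_macro_f1": float("nan")}

    p, y = preds[keep], labels[keep]
    accuracy = float((p == y).mean())

    # Macro-F1 over the classes that occur among the retained labels
    # (an assumption; other conventions average over all classes).
    f1_scores = []
    for c in np.unique(y):
        tp = np.sum((p == c) & (y == c))
        fp = np.sum((p == c) & (y != c))
        fn = np.sum((p != c) & (y == c))
        f1_scores.append(2.0 * tp / (2.0 * tp + fp + fn))

    return {"coverage": coverage,
            "selective_accuracy": accuracy,
            "selective_macro_f1": float(np.mean(f1_scores))}


if __name__ == "__main__":
    # Toy two-class example with made-up numbers, for illustration only.
    toy_probs = [[0.90, 0.10], [0.60, 0.40], [0.20, 0.80], [0.55, 0.45]]
    toy_labels = [0, 1, 1, 0]
    for t in (0.0, 0.7):
        print(t, selective_metrics(toy_probs, toy_labels, t))
```

Sweeping `threshold` over a grid and plotting selective accuracy against coverage gives a risk-coverage style view, which is one way to compare checkpoints whose plain accuracy is nearly identical.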
| Comments: | Accepted at IJCAI 2026 |
| Subjects: | Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI) |
| Cite as: | arXiv:2605.19133 [cs.CV] |
| | (or arXiv:2605.19133v1 [cs.CV] for this version) |
| | https://doi.org/10.48550/arXiv.2605.19133 (arXiv-issued DOI via DataCite, pending registration) |
Submission history
From: Muskaan Chopra
[v1] Mon, 18 May 2026 21:32:03 UTC (1,526 KB)
— Originally published at arxiv.org