MetaRanker: Human-in-the-loop Active Ranking for Metalens Image Quality
Quick Take
MetaRanker is a human-in-the-loop active ranking framework for evaluating metalens image quality. It produces rankings that align closely with human assessments while requiring approximately 80% fewer pairwise annotations than exhaustive pairwise evaluation. It combines a probabilistic preference model with uncertainty-aware query selection and uses vision-language models as lightweight semantic priors, targeting the perception-distortion trade-off in metalens imaging pipelines.
Key Points
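- MetaRanker formalizes metalens image quality in terms of semantic interpretability: how reliably humans can recognize objects and structures despite optical artifacts.
- It combines a probabilistic preference model with uncertainty-aware query selection; human judgments remain the primary supervision signal.
- The framework reduces pairwise annotation requirements by approximately 80% relative to exhaustive pairwise evaluation.
- Standard image quality assessment metrics show limited alignment with human interpretability in the metalens domain.
- The authors position MetaRanker as a practical step toward perceptually grounded metalens evaluation and co-design.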
Article Content
Abstract (arXiv:2605.29212v1): Image quality in modern imaging systems emerges from the coupled effects of the sensor, optics, and computational reconstruction. Ultra-thin metalenses offer a path toward substantial miniaturization of optical modules, but practical designs often exhibit pronounced chromatic and field-dependent aberrations that necessitate computational reconstruction.
In current metalens pipelines, reconstruction models are commonly trained and selected using distortion-based fidelity objectives, such as PSNR, yet these proxies can be weakly correlated with human preference and downstream utility, reflecting the well-known perception-distortion trade-off.
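To make "distortion-based fidelity objective" concrete, the following is a minimal sketch of PSNR using its standard definition. The pixel values are made up for illustration and the function name `psnr` is our own; nothing here is taken from the paper's code. Note that the score depends only on pixelwise squared error, which is why it can disagree with how recognizable an image is to a person.

```python
import math


def psnr(reference, reconstruction, max_value=255.0):
    """Peak signal-to-noise ratio (dB) between two equal-length pixel sequences."""
    if len(reference) != len(reconstruction):
        raise ValueError("inputs must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((r - x) ** 2 for r, x in zip(reference, reconstruction)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 4-pixel "images": every pixel is off by 10 grey levels.
reference = [0, 64, 128, 255]
reconstruction = [10, 54, 138, 245]
print(round(psnr(reference, reconstruction), 2))  # 28.13
```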
We introduce MetaRanker, a human-in-the-loop active ranking framework that formalizes metalens image quality in terms of semantic interpretability, defined as the degree to which humans can reliably recognize objects and structures in the presence of optical artifacts. MetaRanker combines a probabilistic preference model with uncertainty-aware query selection, and leverages vision-language models to provide lightweight semantic priors.
Importantly, these priors are used only to guide the sampling of informative comparisons; human judgments remain the primary supervision signal throughout. Across real-world and synthetic metalens datasets with distinct degradation profiles, MetaRanker produces rankings that align most closely with human assessments, while reducing the number of pairwise annotations required by approximately 80% relative to exhaustive pairwise evaluation.
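The abstract does not specify the preference model or the selection rule, so the sketch below is only one plausible reading: it assumes a Bradley-Terry model fitted by gradient ascent and a simple uncertainty heuristic that queries the pair whose predicted outcome is closest to 50/50. The names `fit_bradley_terry`, `most_uncertain_pair`, `active_rank`, and `prior_weight`, the simulated annotator, and the noisy `semantic_prior` (a stand-in for a vision-language model score) are all hypothetical. As in the description above, the prior only influences which pair is queried; the final scores are fitted from the human labels alone.

```python
import itertools
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def fit_bradley_terry(n_items, comparisons, steps=200, lr=0.1, reg=0.1):
    """Fit Bradley-Terry scores by gradient ascent on a regularized log-likelihood.

    comparisons is a list of (winner, loser) index pairs from human judgments.
    """
    scores = [0.0] * n_items
    for _ in range(steps):
        grad = [-reg * s for s in scores]  # zero-centred Gaussian regularizer
        for winner, loser in comparisons:
            p_win = sigmoid(scores[winner] - scores[loser])
            grad[winner] += 1.0 - p_win
            grad[loser] -= 1.0 - p_win
        scores = [s + lr * g for s, g in zip(scores, grad)]
    return scores


def most_uncertain_pair(guide, asked):
    """Pick the unasked pair whose predicted outcome is closest to a coin flip."""
    candidates = [p for p in itertools.combinations(range(len(guide)), 2)
                  if p not in asked]
    return min(candidates,
               key=lambda p: abs(sigmoid(guide[p[0]] - guide[p[1]]) - 0.5))


def active_rank(n_items, ask_human, semantic_prior, budget, prior_weight=1.0):
    comparisons, asked = [], set()
    scores = [0.0] * n_items
    for _ in range(budget):
        # The prior only shapes which pair is queried next (assumption).
        guide = [s + prior_weight * p for s, p in zip(scores, semantic_prior)]
        i, j = most_uncertain_pair(guide, asked)
        asked.add((i, j))
        comparisons.append((i, j) if ask_human(i, j) else (j, i))
        scores = fit_bradley_terry(n_items, comparisons)  # human labels only
    ranking = sorted(range(n_items), key=lambda k: scores[k], reverse=True)
    return ranking, comparisons


rng = random.Random(0)
true_quality = [rng.uniform(-2, 2) for _ in range(8)]           # hidden, simulated
semantic_prior = [q + rng.gauss(0, 0.5) for q in true_quality]  # stand-in prior


def ask_human(i, j):
    """Simulated, noise-free annotator: True if image i is preferred over j."""
    return true_quality[i] > true_quality[j]


ranking, comparisons = active_rank(8, ask_human, semantic_prior, budget=12)
print("queried", len(comparisons), "of", 8 * 7 // 2, "possible pairs")
print("ranking (best first):", ranking)
```

For scale, exhaustive evaluation of n images needs n(n-1)/2 comparisons, so 10 images require 45 pairs and an 80% reduction corresponds to about 9 queries. The toy budget of 12 out of 28 pairs above is arbitrary and is not meant to reproduce the reported savings or ranking accuracy.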
Finally, we show that standard image quality assessment metrics exhibit limited alignment with human interpretability in the metalens domain, positioning MetaRanker as a practical step toward perceptually grounded metalens evaluation and co-design.
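One common way to quantify how well a metric's ordering agrees with a human ordering is a rank correlation. The abstract does not say which agreement measure the authors use, so the sketch below assumes Kendall's tau (the tau-a variant, without tie correction) purely as an illustration. The scores are invented for the example and the helper name `kendall_tau` is our own.

```python
from itertools import combinations


def kendall_tau(a, b):
    """Kendall tau-a between two score lists over the same items (no tie correction)."""
    concordant = discordant = 0
    for i, j in combinations(range(len(a)), 2):
        agreement = (a[i] - a[j]) * (b[i] - b[j])
        if agreement > 0:
            concordant += 1   # both lists order the pair the same way
        elif agreement < 0:
            discordant += 1   # the lists disagree on this pair
    n_pairs = len(a) * (len(a) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical scores for five reconstructions (higher is better in both lists).
human_scores = [5, 4, 3, 2, 1]
metric_scores = [30.1, 31.5, 27.0, 28.2, 25.4]  # e.g. PSNR in dB, made up
print(kendall_tau(human_scores, metric_scores))  # 0.6
```

A value of 1.0 means the metric orders every pair as the human ranking does and -1.0 means it reverses every pair; in this toy example two of the ten pairs are ordered differently.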
