Transition-Aware Best-of-N Sampling for Longitudinal Chest X-ray Reports

arXiv cs.CV·Halil Ibrahim Gulluk, Max Van Puyvelde, Wim Van Criekinge, Olivier Gevaert

6/30/2026

Quick Answer

The study introduces a training-free, transition-aware best-of-N sampling method for chest X-ray report generation. It outperforms random selection across the board, with the largest relative gains on the Impression section.

Quick Take

The method reduces each (prior, current) report pair to a fixed-dimensional directional vector using one of four directional set distances, then scores each candidate report by its cosine distance to a cached bank of ground-truth training transition vectors. This lets a pre-trained generator account for the change between consecutive visits without any additional training.

Key Points

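Introduces transition-aware best-of-N sampling for longitudinal chest X-ray report generation.
Outperforms random selection, with the largest relative gains on the Impression section.
Instantiates four directional set distances: mean-shift, novelty residual, directed-Hausdorff anchor, and cost-weighted optimal transport.
Evaluated on a multi-visit AP-PA cohort, under three prompts on three vision-language generators.
Requires no training: candidates from pre-trained generators are scored and selected at inference time.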

[Submitted on 23 Jun 2026]

Abstract: In longitudinal clinical practice, every chest X-ray is read in the context of the patient's prior exam, and much of what the radiologist communicates is the change from one visit to the next. To the best of our knowledge, we present the first training-free best-of-N sampling scheme for pre-trained chest X-ray report generators that is explicitly aware of this longitudinal prior-to-current transition. We call it transition-aware best-of-N sampling: each report is split into sentences and embedded into an unordered set in $\mathbb{R}^d$; each (prior, current) pair is reduced to a fixed-dimensional directional vector via a set-to-set distance designed to encode the change between the two sets; and candidates are scored by the cosine distance from their candidate transition vector to a cached bank of ground-truth training transition vectors, aggregated as min or kNN. We instantiate the framework with four directional set distances (mean-shift, novelty residual, directed-Hausdorff anchor, and cost-weighted optimal transport) and evaluate on a multi-visit AP-PA cohort, running inference under three prompts on three vision-language generators. Transition-aware best-of-N outperforms random selection across the board, with the largest relative gains on the Impression section.
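
The scoring pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes sentence embeddings are already computed as NumPy arrays of shape (n_sentences, d), it takes the "mean-shift" distance to be the difference between the two set centroids, and it takes kNN aggregation to be the mean of the k smallest cosine distances. All function names are hypothetical.

```python
import numpy as np


def mean_shift_transition(prior_set, current_set):
    # Assumed form of the mean-shift distance: centroid of the current
    # report's sentence embeddings minus centroid of the prior report's.
    return current_set.mean(axis=0) - prior_set.mean(axis=0)


def cosine_distances(vec, bank):
    # Cosine distance (1 - cosine similarity) from vec to every bank row.
    eps = 1e-12  # guards against division by zero for all-zero vectors
    vec_n = vec / (np.linalg.norm(vec) + eps)
    bank_n = bank / (np.linalg.norm(bank, axis=1, keepdims=True) + eps)
    return 1.0 - bank_n @ vec_n


def score_candidate(prior_set, candidate_set, bank, k=None):
    # Lower is better. k=None uses the min over the bank; otherwise the
    # mean of the k nearest distances (an assumed kNN aggregation).
    transition = mean_shift_transition(prior_set, candidate_set)
    dists = cosine_distances(transition, bank)
    if k is None:
        return float(dists.min())
    k = min(k, dists.shape[0])
    return float(np.sort(dists)[:k].mean())


def select_best(prior_set, candidate_sets, bank, k=None):
    # Best-of-N: score every candidate and return the lowest-scoring index.
    scores = [score_candidate(prior_set, c, bank, k) for c in candidate_sets]
    return int(np.argmin(scores)), scores


# Toy data with d = 3; real inputs would come from a sentence encoder.
prior = np.array([[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
bank = np.array([[0.0, 1.0, 0.0]])  # one cached training transition
candidates = [
    np.array([[1.0, 1.0, 0.0]]),   # shifts along the bank direction
    np.array([[2.0, 0.0, 0.0]] * 3),  # shifts orthogonally to it
]
best_index, scores = select_best(prior, candidates, bank)
```

In this toy setup the first candidate's transition is parallel to the single bank vector, so it is the one selected. The other three set distances named in the abstract would replace mean_shift_transition while leaving the scoring and selection steps unchanged.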

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2606.28393 [cs.CV]
	(or arXiv:2606.28393v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2606.28393 arXiv-issued DOI via DataCite

Submission history

From: Halil Ibrahim Gulluk
[v1] Tue, 23 Jun 2026 23:11:59 UTC (1,128 KB)

— Originally published at arxiv.org
