Discrete Diffusion Language Models for Interactive Radiology Report Drafting

arXiv cs.AI·Max Van Puyvelde, Halil Ibrahim Gulluk, Wim Van Criekinge, Olivier Gevaert

·~1 min·7/3/2026·en·0

Quick Answer

This paper shows that DiffusionGemma-26B, a diffusion language model, matches or exceeds its same-size autoregressive sibling Gemma-4-26B on medical visual question answering, decodes 3.5-4.4x faster, and adds an any-order infill capability for report drafting.

Quick Take

Under an identical LoRA recipe, DiffusionGemma-26B matches or exceeds Gemma-4-26B on every medical visual question answering dataset tested. Because it denoises the token canvas bidirectionally, a radiologist can fix report fragments and have the model fill in the text between them, which suits real reports that are often terse or inconsistent across clinicians and institutions.

Key Points

DiffusionGemma-26B matches or exceeds AR performance on all tested datasets.
Decoding with the finetuned diffusion model (3.8B active) is 3.5-4.4x faster.
Diffusion models enable any-order infill, enhancing report drafting.
Medical foundation models remain almost entirely autoregressive, even though diffusion language models have become competitive.
Results are evaluated by a verbosity-robust LLM judge.

Article Excerpt

From source RSS / original summary

arXiv:2607.01436v1 Announce Type: new Abstract: Diffusion language models, which generate text by denoising a token canvas bidirectionally instead of emitting tokens left to right, have become competitive with autoregressive (AR) generation. Medical foundation models, however, remain almost entirely autoregressive.

We adapt a mixture-of-experts diffusion language model, DiffusionGemma-26B, and benchmark it against its same-size AR sibling Gemma-4-26B under an identical LoRA recipe on medical visual question answering datasets, scored by a verbosity-robust LLM judge. Diffusion matches or exceeds AR on all of them, and the finetuned model (3.8B active) is competitive with frontier ; its decoding is also 3.5-4.4x faster.

Beyond this parity, the diffusion model offers a drafting capability AR lacks: any-order infill. Because the canvas is denoised bidirectionally, a radiologist can fix report fragments and have the model fill the text between them, an operation inherent to diffusion but not to autoregression, which is subpar at it. This suits real reports, which are often terse or inconsistent across clinicians and institutions.
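
To make the any-order infill idea concrete, here is a toy sketch in Python. It is not the paper's algorithm or DiffusionGemma's decoder: the bigram table, `propose`, and `infill` are hypothetical stand-ins, and a real diffusion language model would replace the table lookup with a neural denoiser. The sketch only illustrates the general pattern of keeping user-fixed tokens untouched while masked slots are filled, most confident first, using context on both sides.

```python
MASK = "<mask>"

# Hypothetical toy "model": a set of plausible adjacent word pairs.
# A real diffusion LM would score tokens with a neural network instead.
BIGRAMS = {
    ("no", "focal"), ("focal", "consolidation"), ("consolidation", "or"),
    ("or", "pleural"), ("pleural", "effusion"),
}
VOCAB = sorted({tok for pair in BIGRAMS for tok in pair})


def propose(canvas, i):
    """Return (token, score) for masked slot i, using left AND right context."""
    left = canvas[i - 1] if i > 0 else None
    right = canvas[i + 1] if i + 1 < len(canvas) else None
    best_tok, best_score = VOCAB[0], -1
    for tok in VOCAB:
        # One point for agreeing with each neighbour that is already known.
        score = int((left, tok) in BIGRAMS) + int((tok, right) in BIGRAMS)
        if score > best_score:
            best_tok, best_score = tok, score
    return best_tok, best_score


def infill(canvas):
    """Fill masked slots one per step, highest score first; fixed tokens never change."""
    canvas = list(canvas)
    while MASK in canvas:
        masked = [i for i, tok in enumerate(canvas) if tok == MASK]
        scores = {i: propose(canvas, i) for i in masked}
        i = max(masked, key=lambda j: scores[j][1])
        canvas[i] = scores[i][0]
    return canvas


# The radiologist fixes "no", "or", and "effusion"; the rest is left masked.
draft = ["no", MASK, MASK, "or", MASK, "effusion"]
result = infill(draft)
print(" ".join(result))
```

In this toy run the slot between "or" and "effusion" is filled first because both of its neighbours are known, and the slot before "or" can only be resolved by looking at the token to its right, which is context a strictly left-to-right decoder would not have at that point.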

More from arXiv cs.AI

arXiv cs.AI·Ye Liu, Srijan Bansal, Bo Pang, Yang Li, Zeyu Leo Liu, Yifei Ming, Zixuan Ke, Shafiq Joty, Semih Yavuz

Procedural Memory Distillation: Online Reflection for Self-Improving Language Models

AI Summary

Procedural Memory Distillation (PMD) enhances reinforcement learning by converting cross-episode signals into reusable memory, improving Qwen3-8B and OLMo3-Instruct-7B models by 3.8-5.5% on SCIKNOWEVAL and 7.9-13.6% on . The co-evolution of policy and memory allows for more effective self-supervision, demonstrating significant performance gains when both components are active.

#LLM #AI Coding #Inference #Policy