Google's new open model DiffusionGemma generates text from noise instead of word by word

The Decoder·Jonathan Kemper

6/10/2026

Quick Answer

Google has launched DiffusionGemma, a 26-billion-parameter open model that generates text through diffusion rather than token by token. According to Nvidia, it reaches about 1,000 tokens per second on a single H100 GPU.

Quick Take

It runs roughly four times faster than comparable autoregressive models, but its output quality is lower, so Google is positioning it as an experimental tool for developers for now.

Key Points

DiffusionGemma generates text through diffusion, starting from noise, rather than token by token.
The model has 26 billion parameters and reaches about 1,000 tokens per second on a single H100 GPU, according to Nvidia.
It is roughly four times faster than comparable autoregressive models.
Output quality is lower, which limits it to experimental use for now.
Google is positioning it as an experimental tool for developers.
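
To put the reported speed figures in perspective, here is a small back-of-the-envelope calculation. It assumes the approximate numbers above (about 1,000 tokens per second, roughly four times faster than a comparable autoregressive model); the implied baseline of about 250 tokens per second is derived from those two figures, not reported, and the 2,000-token response length is a hypothetical example.

```python
def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time to produce num_tokens at a steady throughput."""
    return num_tokens / tokens_per_second


# Approximate figures from the article.
DIFFUSION_TPS = 1000.0  # ~1,000 tokens/s on a single H100 (reported)
SPEEDUP = 4.0           # "roughly four times faster"

# Derived, not reported: the throughput the speedup implies for a
# comparable autoregressive model.
BASELINE_TPS = DIFFUSION_TPS / SPEEDUP

# Hypothetical response length, chosen only for illustration.
RESPONSE_TOKENS = 2000

diffusion_time = generation_seconds(RESPONSE_TOKENS, DIFFUSION_TPS)
baseline_time = generation_seconds(RESPONSE_TOKENS, BASELINE_TPS)

print(f"Diffusion model:      {diffusion_time:.1f} s")
print(f"Autoregressive model: {baseline_time:.1f} s")
```

Under these assumptions, a 2,000-token response takes about 2 seconds instead of about 8. Real-world latency also depends on batch size, prompt length, and hardware, none of which the article specifies.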

Source Excerpt

Google released DiffusionGemma, a 26-billion-parameter model that generates text not token by token but through diffusion, similar to how image AI turns noise into a picture. According to Nvidia, it hits about 1,000 tokens per second on a single H100 GPU, roughly four times faster than comparable autoregressive models. The speed comes at a cost, though. Output quality is lower, so Google is positioning it as an experimental tool for developers for now.
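
The contrast between generating token by token and refining a whole sequence from noise can be illustrated with a toy sketch. This is not DiffusionGemma's algorithm, which the excerpt does not describe: the vocabulary, the target sentence, and the `toy_denoise` function are all hypothetical, and an "oracle" that already knows the target stands in for a learned denoising model. The only point is that the sequence starts as random tokens and is corrected at several positions per step, so the number of steps can be smaller than the number of tokens.

```python
import random

# Hypothetical vocabulary and target sentence, for illustration only.
VOCAB = ["the", "cat", "sat", "on", "mat", "a", "dog", "ran"]
TARGET = ["the", "cat", "sat", "on", "the", "mat"]


def toy_denoise(target, steps, seed=0):
    """Start from random tokens and fix a growing set of positions each step.

    An oracle (the known target) replaces the learned model a real
    diffusion system would use to predict the clean tokens.
    """
    rng = random.Random(seed)
    n = len(target)
    seq = [rng.choice(VOCAB) for _ in range(n)]  # pure "noise"
    order = list(range(n))
    rng.shuffle(order)  # arbitrary order in which positions get resolved
    history = [list(seq)]
    for step in range(1, steps + 1):
        # Number of positions resolved after this step: ceil(n * step / steps).
        cutoff = (n * step + steps - 1) // steps
        for pos in order[:cutoff]:
            seq[pos] = target[pos]
        history.append(list(seq))
    return history


history = toy_denoise(TARGET, steps=3)
for i, state in enumerate(history):
    print(f"step {i}: {' '.join(state)}")
```

Here six tokens are produced in three refinement steps, whereas a token-by-token generator would need six sequential steps. How many steps a real diffusion language model needs, and how that affects quality, is outside what this sketch can show.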

Read the full article on the-decoder.com

More from The Decoder

An AI model programmed nonstop for 19 days on a single MirrorCode task that cost $2,600 to run

The Decoder·Matthias Bastian

4w ago

AI Summary

Epoch AI's MirrorCode benchmark reveals Claude Opus 4.7 as the leader with a 56% solve rate, reconstructing a 16,000-line toolkit in 14 hours. Despite this, all models tested struggle with the most complex tasks, highlighting limitations in current AI capabilities. The single task consumed $2,600 over 19 days, raising questions about cost-effectiveness in AI development.
