Amazon engineers are reportedly distilling Anthropic models to cut costs before new token-based pricing kicks in

The Decoder · Matthias Bastian · Jun 29, 2026

Quick Answer

Amazon engineers are reportedly distilling Anthropic's Claude models into smaller, cheaper versions for internal use before token-based pricing takes effect next year.

Quick Take

The new pricing could raise Amazon's costs sharply, according to The Information, although an Amazon spokesperson disputes that. Amazon is also reportedly exploring alternatives such as OpenAI's models and its own Nova models.

Key Points

Some Amazon engineers are reportedly distilling Anthropic models to reduce internal costs.
Token-based pricing for Amazon's use of Anthropic models starts next year.
Amazon's Bedrock distillation service does not support Claude models.
Amazon has invested up to $25 billion more in Anthropic this year.
Amazon is also reportedly exploring alternatives such as OpenAI's models.

Worried about rising costs, some Amazon engineers are already distilling Anthropic models to build smaller, cheaper versions for internal use. That's according to a report from The Information. Distillation works by having a smaller model learn from a larger model's outputs. Amazon has certain rights to use Anthropic's models for this purpose, according to a person familiar with the matter, similar to Apple's arrangement with Google Gemini. Amazon does offer a distillation service on its Bedrock cloud platform, but Anthropic's Claude models aren't available there; only Amazon's own Nova models and Meta's Llama models are supported.
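
To illustrate the distillation idea described above, here is a toy sketch in Python. It is not Amazon's or Anthropic's method, and nothing in it comes from the report: the `teacher` function is a hypothetical stand-in for a large model, the `distill` helper and its learning rate and step count are assumptions, and the "student" is just a two-parameter logistic model trained to match the teacher's output probabilities.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def teacher(x):
    # Hypothetical stand-in for a large model: returns a probability for input x.
    return sigmoid(3.0 * x - 1.0)

def distill(inputs, steps=5000, lr=0.5):
    # The student is a tiny logistic model, sigmoid(w * x + b).
    # It never sees ground-truth labels, only the teacher's outputs.
    w, b = 0.0, 0.0
    soft_labels = [teacher(x) for x in inputs]
    n = len(inputs)
    for _ in range(steps):
        grad_w, grad_b = 0.0, 0.0
        for x, p in zip(inputs, soft_labels):
            q = sigmoid(w * x + b)
            # Gradient of the cross-entropy between teacher output p
            # and student output q with respect to the student's logit.
            grad_w += (q - p) * x
            grad_b += (q - p)
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

# 21 evenly spaced inputs between -2 and 2 (arbitrary illustrative data).
inputs = [-2.0 + 0.2 * i for i in range(21)]
w, b = distill(inputs)

# Largest disagreement between student and teacher on the training inputs.
max_gap = max(abs(teacher(x) - sigmoid(w * x + b)) for x in inputs)
print(f"student w={w:.3f}, b={b:.3f}, max gap={max_gap:.6f}")
```

In this toy case the student can match the teacher exactly because both have the same functional form. Real distillation typically trades some quality for a much smaller and cheaper model.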

The effort ties back to a renegotiation of the partnership, according to The Information. Starting next year, Amazon will pay for Anthropic's models based on tokens processed rather than compute hours, which could push costs up sharply. An Amazon spokesperson pushed back, saying the changes from the expanded partnership won't raise costs. Anthropic points to lower prices relative to the performance its models deliver.

Amazon is reportedly exploring alternatives like OpenAI and its own Nova models. The company has invested up to $25 billion more in Anthropic and up to $50 billion in OpenAI this year.

— Originally published at the-decoder.com

