
Nemotron 3 Ultra now available on AI Gateway
Quick Answer
Nvidia's Nemotron 3 Ultra is now available on Vercel AI Gateway. It has a 1M token context window and is optimized for multi-turn agent workflows.
Quick Take
Nvidia's Nemotron 3 Ultra, an open Mixture-of-Experts reasoning model with a 1M token context window, is now available on Vercel AI Gateway. It reaches throughput of up to 350 tokens per second with up to 30% lower cost on agentic tasks, and it targets multi-turn agent work such as planning, sub-agent delegation, and error recovery.
Key Points
- Nemotron 3 Ultra supports long-running agent workflows with a 1M token context window.
- Reaches throughput of up to 350 tokens per second.
- Offers up to 30% lower cost on agentic tasks.
- AI Gateway provides a unified API for calling models, tracking usage and cost, and configuring retries, failover, and performance optimizations.
- No platform fees on inference, including Bring Your Own Key (BYOK) requests.
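Both headline figures are upper bounds ("up to"), so they give best-case estimates rather than guarantees. The sketch below turns them into two small helpers. The function names, the 7,000-token response, and the baseline cost of 100 are hypothetical values chosen for illustration; the announcement does not say which baseline the 30% figure is measured against.

```typescript
// Figures quoted in the announcement; both are stated as "up to" limits.
const MAX_TOKENS_PER_SECOND = 350;
const MAX_COST_REDUCTION = 0.3;

// Best-case (shortest) time to stream a response of the given length.
function minGenerationSeconds(outputTokens: number): number {
  return outputTokens / MAX_TOKENS_PER_SECOND;
}

// Best-case (lowest) cost given a hypothetical baseline cost, in any currency unit.
function minCostAfterSavings(baselineCost: number): number {
  return baselineCost * (1 - MAX_COST_REDUCTION);
}

// Hypothetical example: a 7,000-token response needs at least 20 seconds.
console.log(minGenerationSeconds(7000)); // 20
```

Real latency and cost depend on the provider, prompt length, and workload, so these numbers are bounds for rough planning only.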
Article Excerpt
From source RSS / original summary
Nemotron 3 Ultra from Nvidia is now available on Vercel AI Gateway.
Nemotron 3 Ultra is an open Mixture-of-Experts reasoning model built for orchestrating long-running agent workflows, with a 1M token context window. The model targets multi-turn agent workflows: planning, sub-agent delegation, and error recovery. Throughput reaches up to 350 tokens per second, with up to 30% lower cost on agentic tasks.
To use Nemotron 3 Ultra, set model to nvidia/nemotron-3-ultra-550b-a55b in the AI SDK.
AI Gateway provides a unified API for calling models, tracking usage and cost, and configuring retries, failover, and performance optimizations for higher-than-provider uptime. It includes built-in custom reporting, Zero Data Retention support, dynamic provider sorting by latency and cost, and more. AI Gateway reflects provider pricing with no markup and does not charge a platform fee on inference, including on Bring Your Own Key (BYOK) requests.
Learn more about AI Gateway, view the AI Gateway model leaderboard, or try it in our model playground.
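The excerpt says to set the model to nvidia/nemotron-3-ultra-550b-a55b in the AI SDK. A minimal TypeScript sketch of that step follows. Only the model ID comes from the announcement. The names `buildPlanRequest`, `planWithNemotron`, and `GenerateFn`, and the prompt wording, are hypothetical. The text-generation function is passed in as a parameter so the sketch runs without network access.

```typescript
// Model ID as given in the announcement.
const NEMOTRON_MODEL_ID = 'nvidia/nemotron-3-ultra-550b-a55b';

// Hypothetical minimal shapes for this sketch; the AI SDK's own types are richer.
interface TextRequest {
  model: string;
  prompt: string;
}
type GenerateFn = (request: TextRequest) => Promise<{ text: string }>;

// Build a planning request, one of the agent tasks the model targets.
function buildPlanRequest(task: string): TextRequest {
  return {
    model: NEMOTRON_MODEL_ID,
    prompt: `Break the following task into ordered steps:\n${task}`,
  };
}

// Send the request through whichever generate function the caller supplies.
async function planWithNemotron(generate: GenerateFn, task: string): Promise<string> {
  const { text } = await generate(buildPlanRequest(task));
  return text;
}

// Assumed wiring, with the `ai` package installed and an AI Gateway
// credential (for example AI_GATEWAY_API_KEY) set in the environment:
//   import { generateText } from 'ai';
//   const plan = await planWithNemotron(
//     (request) => generateText(request),
//     'Migrate the billing service',
//   );
```

Injecting the generate function keeps the model ID in one place and lets a stub stand in for the gateway during tests.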
More from Vercel AI
Opus 4.8 on AI Gateway
Claude Opus 4.8, now available on Vercel AI Gateway, excels in long-horizon agentic execution and complex coding tasks and produces clearer prose for knowledge work. Users can access it via the anthropic/claude-opus-4.8 model in the AI SDK, with a unified API and no markup on provider pricing.

