
Gemini 3.5 Flash on AI Gateway
Quick Take
Gemini 3.5 Flash enhances coding, reasoning, and execution efficiency on Vercel AI Gateway.
Key Points
- Improved coding proficiency and reasoning capabilities.
- Improved parallel agentic execution loops compared with previous Flash versions.
- Unified API for calling models and tracking usage and cost.
Gemini 3.5 Flash is now available on Vercel AI Gateway.
Compared with previous Flash versions, this model improves coding proficiency and parallel agentic execution loops. It also brings improvements to core reasoning, instruction following, and multi-turn coherence, with stronger performance on complex tasks and higher-quality reasoning traces in thinking mode.
3.5 Flash defaults to the medium thinking level, balancing response quality with faster, more cost-efficient generation.
To use Gemini 3.5 Flash, set model to google/gemini-3.5-flash in the AI SDK.
import { streamText } from 'ai';

const result = streamText({
  model: 'google/gemini-3.5-flash',
  prompt: 'Refactor this service to run API calls in parallel.',
  providerOptions: {
    google: { // use 'vertex' or 'google' as the key
      thinkingConfig: {
        thinkingLevel: 'high', // overrides the default medium level
        includeThoughts: true,
      },
    },
  },
});
Note that temperature, topP, topK, and thinking_budget are not supported by this model.
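Since these sampling settings are not supported, shared call settings written for other models may need to be filtered before they are reused with this one. The following is a minimal sketch of one way to do that, not part of the AI SDK or AI Gateway: `stripUnsupportedSettings` and `UNSUPPORTED_SETTINGS` are hypothetical names, and the helper only handles the top-level `temperature`, `topP`, and `topK` keys. It does not touch `thinking_budget`, which would live inside provider options.

```typescript
// Sampling settings the post lists as unsupported by Gemini 3.5 Flash.
// Assumption: they appear as top-level keys on the call settings object.
const UNSUPPORTED_SETTINGS = ['temperature', 'topP', 'topK'] as const;

type CallSettings = Record<string, unknown>;

// Hypothetical helper: returns a copy of `settings` without the unsupported
// keys, plus the names of the keys that were dropped so callers can log them.
function stripUnsupportedSettings(settings: CallSettings): {
  cleaned: CallSettings;
  dropped: string[];
} {
  const cleaned: CallSettings = { ...settings };
  const dropped: string[] = [];
  for (const key of UNSUPPORTED_SETTINGS) {
    if (key in cleaned) {
      delete cleaned[key];
      dropped.push(key);
    }
  }
  return { cleaned, dropped };
}

// Example: settings originally tuned for a different model.
const { cleaned, dropped } = stripUnsupportedSettings({
  model: 'google/gemini-3.5-flash',
  prompt: 'Refactor this service to run API calls in parallel.',
  temperature: 0.2,
  topP: 0.9,
});

console.log('Dropped settings:', dropped);
console.log('Remaining keys:', Object.keys(cleaned));
```

The `cleaned` object could then be spread into a `streamText` call alongside any `providerOptions`. Returning the dropped keys rather than discarding them silently makes it easier to notice when a caller expected a setting to take effect.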
AI Gateway provides a unified API for calling models, tracking usage and cost, and configuring retries, failover, and performance optimizations for higher-than-provider uptime. It includes built-in custom reporting, observability, Bring Your Own Key support, and intelligent provider routing with automatic retries.
Learn more about AI Gateway, view the AI Gateway model leaderboard, or try it in our model playground.
— Originally published at vercel.com


