GLM 5.2 Fast via Wafer now available on AI Gateway

Vercel AI·Rohan Taneja

1d ago

~1 min read · 6/24/2026

Quick Answer

GLM 5.2 Fast via Wafer is now available on AI Gateway. In Vercel's own benchmarking, Wafer delivered 2x higher throughput than other providers serving GLM-5.2 on serverless, in both small- and large-context scenarios.

Quick Take

GLM 5.2 Fast, served by Wafer, is now available on AI Gateway. In Vercel's testing it measured 170+ tok/s for small contexts and 200+ tok/s for large contexts, which the announcement puts at 2x higher throughput than other providers. AI Gateway passes through provider pricing with no markup, charges no platform fee on inference, and exposes a unified API for calling models and tracking usage and cost.

Key Points

In Vercel's own benchmarking, Wafer delivers 2x higher throughput than other providers serving GLM-5.2 on serverless.
Measured throughput is 170+ tokens per second for small contexts and 200+ tokens per second for large contexts.
AI Gateway offers a unified API for model calls, usage tracking, and performance optimizations.
No markup on provider pricing; no platform fees for inference, including BYOK requests.
Supports features like custom reporting and zero data retention.
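
To make the throughput figures concrete, here is a minimal sketch that converts tokens per second into decode time for a response. The 170 and 200 tok/s values come from the announcement and are lower bounds. The 1,000-token response length and the half-rate baseline are assumptions for illustration only; the baseline is simply one reading of the "2x" claim, not a measured figure. The calculation ignores time to first token and network latency.

```typescript
// Seconds needed to decode `tokens` output tokens at a steady `tokPerSec` rate.
function decodeSeconds(tokens: number, tokPerSec: number): number {
  if (tokens < 0 || tokPerSec <= 0) {
    throw new RangeError("tokens must be >= 0 and tokPerSec must be > 0");
  }
  return tokens / tokPerSec;
}

// Figures from the announcement (lower bounds: "170+" and "200+").
const SMALL_CONTEXT_TOK_PER_SEC = 170;
const LARGE_CONTEXT_TOK_PER_SEC = 200;

// ASSUMPTION: a hypothetical baseline provider at half the small-context rate.
const baselineTokPerSec = SMALL_CONTEXT_TOK_PER_SEC / 2;

// ASSUMPTION: a hypothetical 1,000-token response.
const outputTokens = 1000;

const fast = decodeSeconds(outputTokens, SMALL_CONTEXT_TOK_PER_SEC);
const large = decodeSeconds(outputTokens, LARGE_CONTEXT_TOK_PER_SEC);
const baseline = decodeSeconds(outputTokens, baselineTokPerSec);

console.log(
  `small: ${fast.toFixed(1)} s, large: ${large.toFixed(1)} s, baseline: ${baseline.toFixed(1)} s`
);
// prints "small: 5.9 s, large: 5.0 s, baseline: 11.8 s"
```

Under these assumptions, doubling throughput halves the decode time for the same response length.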

Article Excerpt

From source RSS / original summary

GLM 5.2 Fast via Wafer is now available on AI Gateway.

Based on our own benchmarking across small-context, large-context, and tool-call scenarios, Wafer delivers a 2x higher throughput than other providers serving GLM-5.2 on serverless, leading on decode and end-to-end speed for sustained generation in the small- and large-context cases.

In our testing, GLM 5.2 Fast on Wafer measured:

Small context: 170+ tok/s
Large context: 200+ tok/s

To use GLM 5.2 Fast, set model to zai/glm-5.2-fast in the AI SDK.

AI Gateway provides a unified API for calling models, tracking usage and cost, and configuring retries, failover, and performance optimizations for higher-than-provider uptime. It includes built-in custom reporting, Zero Data Retention support, budgets for API keys, and more.

AI Gateway reflects provider pricing with no markup and does not charge a platform fee on inference, including on Bring Your Own Key (BYOK) requests.

Try GLM 5.2 Fast in the model playground.
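
The excerpt names the AI SDK as the way to select the model. As a dependency-free illustration, the sketch below instead builds a plain HTTP request carrying the same model ID. It assumes the gateway exposes an OpenAI-compatible chat completions endpoint at the URL shown and accepts a bearer API key; both are assumptions to confirm against the AI Gateway documentation. The helper name buildChatRequest is hypothetical, and only the model ID zai/glm-5.2-fast comes from the announcement.

```typescript
// Model ID taken from the announcement.
const MODEL_ID = "zai/glm-5.2-fast";

// ASSUMPTION: OpenAI-compatible chat completions URL for AI Gateway.
// Confirm the exact endpoint in the AI Gateway documentation.
const GATEWAY_URL = "https://ai-gateway.vercel.sh/v1/chat/completions";

interface GatewayRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: builds the request description without sending it.
function buildChatRequest(prompt: string, apiKey: string): GatewayRequest {
  return {
    url: GATEWAY_URL,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      // ASSUMPTION: bearer-token authentication with a gateway API key.
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({
      model: MODEL_ID,
      messages: [{ role: "user", content: prompt }],
    }),
  };
}

// Build (but do not send) a request with a placeholder key.
const example = buildChatRequest("Summarize this changelog entry.", "YOUR_API_KEY");
console.log(example.url, JSON.parse(example.body).model);
```

To send it, the returned object can be passed to fetch as fetch(example.url, { method: example.method, headers: example.headers, body: example.body }). Keeping construction separate from sending makes the payload easy to inspect before any network call.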

Read on vercel.com


