
Snowflake CEO finds GLM-5.2 competitive with Opus 4.7 at a fraction of the cost
Quick Answer
Zhipu AI's GLM-5.2 competes closely with Claude Opus 4.7 in a Snowflake benchmark, achieving similar performance on 103 coding tasks at less than one-fifth the cost per output token ($4.40 vs. $25 per million).
Quick Take
GLM-5.2 consumes nearly twice as many tokens per task as Opus 4.7, which narrows its price advantage. Even so, its low per-token price puts pressure on Anthropic's and OpenAI's valuations.
Key Points
- GLM-5.2 performs comparably to Opus 4.7 on 103 coding tasks.
- Cost per output token for GLM-5.2 is less than one-fifth that of Opus 4.7.
- GLM-5.2 consumes nearly double the tokens per task compared to Opus 4.7.
- The competitive pricing of GLM-5.2 pressures Western AI labs' valuations.
- Anthropic and OpenAI may face increased scrutiny due to GLM-5.2's performance.
Snowflake compared GLM-5.2 and Opus 4.7 in a hands-on benchmark. The Chinese model held its own.
The test covered 103 tasks, each run three times, where models had to write code that works on both DuckDB and Snowflake. When each model got three attempts per task, the two were neck and neck: 66% vs. 67% of tasks solved.
First-attempt accuracy diverges: Opus hit 53.7%, GLM only 47.6%, showing GLM's output is less consistent. The Chinese model also averaged 99 runs per task versus Opus's 80 and burned through 860 million tokens, nearly double Opus's 439 million.
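
The two headline numbers measure different things, which is why they can diverge. The article does not describe Snowflake's exact scoring method, so the following is only a minimal sketch under stated assumptions: each task has a list of pass/fail outcomes, one per attempt, and the function names and sample data are hypothetical.

```python
def solved_within_attempts(results):
    """Share of tasks where at least one attempt succeeded."""
    return sum(any(attempts) for attempts in results) / len(results)


def per_attempt_rate(results):
    """Share of all individual attempts that succeeded."""
    total = sum(len(attempts) for attempts in results)
    passed = sum(sum(attempts) for attempts in results)
    return passed / total


# Hypothetical outcomes for four tasks with three attempts each.
# These are NOT figures from the Snowflake benchmark.
example = [
    [True, True, True],
    [False, True, False],
    [False, False, False],
    [True, False, True],
]

print(solved_within_attempts(example))
print(per_attempt_rate(example))
```

A model that succeeds only occasionally on a task still counts as solving it under the first measure, so an inconsistent model can match a steadier one there while trailing on the second.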

GLM's strength is validating code reliably across both platforms (DuckDB and Snowflake) at the same time. According to Snowflake CEO Sridhar Ramaswamy, that's why only GLM could solve certain tasks.
Its weaknesses are giving up too early and obsessively checking the wrong things. On one task, GLM fired off 411 tool calls in 24 minutes, checking row counts, distributions, null values, and column types, and still failed all three attempts. Opus solved the same task with 49 calls in 9 minutes.
The claim that GLM produces cleaner code didn't hold up, Ramaswamy said. More checks don't lead to more correct results. Still, the team is excited about GLM-5.2 and wants to make it available to customers.
China's pricing puts real pressure on the Western AI bubble
The results matter most in the context of price. GLM-5.2 costs $1.40 per million input tokens and $4.40 per million output tokens, according to Zhipu's official price sheet. Some third-party providers undercut Zhipu's price even further. Claude Opus 4.7 runs $5 input and $25 output. GPT-5.5 costs $5 input and $30 output.
| Model | Input ($/1M tokens) | Cached input ($/1M tokens) | Output ($/1M tokens) |
|---|---|---|---|
| GLM-5.2 | $1.40 | $0.26 | $4.40 |
| Claude Opus 4.7 | $5.00 | $0.50 | $25.00 |
| GPT-5.5 | $5.00 | $0.50 | $30.00 |
| GPT-5.4 | $2.50 | $0.25 | $15.00 |
GLM's higher token usage eats into that price gap somewhat. But Anthropic and OpenAI still face serious pricing pressure, and they face it in coding, the flagship use case both Western AI labs are betting on.
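
How much the extra tokens eat into the gap depends on how they split between input, cached input, and output, which the article does not report. As a rough sketch, the code below prices each model's total token count entirely at one rate at a time. These are bounding scenarios, not actual bills, and the helper names are hypothetical.

```python
# Prices in USD per million tokens, from the price table in the article.
PRICES = {
    "GLM-5.2": {"input": 1.40, "cached": 0.26, "output": 4.40},
    "Claude Opus 4.7": {"input": 5.00, "cached": 0.50, "output": 25.00},
}

# Total tokens used in the benchmark, in millions, as reported.
TOKENS_M = {"GLM-5.2": 860, "Claude Opus 4.7": 439}


def cost(model, rate):
    """Cost in USD if ALL of a model's tokens were billed at one rate (an assumption)."""
    return TOKENS_M[model] * PRICES[model][rate]


def cost_ratio(rate):
    """GLM-5.2 cost divided by Opus 4.7 cost under the same single-rate assumption."""
    return cost("GLM-5.2", rate) / cost("Claude Opus 4.7", rate)


for rate in ("output", "input", "cached"):
    print(f"{rate:>6}: GLM costs {cost_ratio(rate):.2f}x what Opus costs")
```

Under these assumptions GLM-5.2 comes out at roughly 0.34 times Opus's cost if everything were billed as output, 0.55 times if everything were uncached input, and about level (1.02 times) if everything were cached input. If both models had the same token mix, the real ratio would fall somewhere between those figures, so the advantage shrinks the more a workload is dominated by cached input.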
If that pressure slows revenue growth, or worse, shrinks it, the already inflated AI market faces a real stress test. OpenAI's and Anthropic's valuations rest on the assumption that revenue keeps climbing fast. Those valuations are tied to billions in bets on AI infrastructure buildout, from data centers to chip orders.
— Originally published at the-decoder.com

