Podcast Episode

Alibaba's Qwen3.7-Max Reaches No. 2 in Global AI Coding Rankings

May 27, 2026

Alibaba's Qwen3.7-Max has climbed to second place on Code Arena's coding leaderboard, scoring 1541 in blind testing and beating models from OpenAI and Google. It is now the highest-ranked non-US model, trailing only Anthropic's Claude.

A Milestone for Chinese AI

Alibaba's Qwen3.7-Max has reached the number two position on Code Arena's coding leaderboard, scoring 1541 in blind testing and surpassing models from OpenAI, Google, and other leading AI labs. According to Code Arena's latest rankings, updated on 26 May, Qwen3.7-Max trails only Anthropic's Claude models, placing it above OpenAI's GPT-5.5, Google's Gemini-3.5-Flash, Zhipu's GLM-5.1, and Moonshot's Kimi-K2.6. Alibaba Cloud described the model as "officially the #2 AI coding model globally", citing the platform's blind evaluation methodology, in which human judges compare code outputs without knowing which model generated them.

How Code Arena Works

Code Arena uses randomised, anonymous comparisons to reduce brand bias, testing models on tasks including web development, game creation, data visualisation, and animation. Because evaluators never see which system produced a given answer, the leaderboard is meant to reflect practical coding ability rather than narrow benchmark performance. That makes the second-place finish particularly notable: Qwen3.7-Max was judged on the quality of the code it produced across many head-to-head comparisons, not on a single curated test set.
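
As a rough illustration of how blind head-to-head votes can be turned into a leaderboard score, here is a minimal sketch using an Elo-style update. The episode does not say how Code Arena actually computes its scores, so the rating formula, the starting rating of 1500, the K-factor of 32, and the model names are all assumptions made for illustration.

```python
import random


def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def record_vote(ratings, winner, loser, k=32.0):
    """Shift ratings after one blind comparison (k is an assumed K-factor)."""
    gain = k * (1.0 - expected_score(ratings[winner], ratings[loser]))
    ratings[winner] += gain
    ratings[loser] -= gain


def draw_blind_pair(models, rng):
    """Pick two distinct models at random; the judge would only see 'A' and 'B'."""
    first, second = rng.sample(models, 2)
    return {"A": first, "B": second}


# Hypothetical model names and starting ratings, not real leaderboard data.
ratings = {"model_x": 1500.0, "model_y": 1500.0, "model_z": 1500.0}
rng = random.Random(0)

pair = draw_blind_pair(list(ratings), rng)
# Suppose the judge prefers the output labelled "A" without knowing its source.
record_vote(ratings, winner=pair["A"], loser=pair["B"])
```

In a scheme like this, a win over a higher-rated opponent moves the score more than a win over a lower-rated one, which is why a single number can summarise many anonymous comparisons.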

Building on Rapid Progress

The result caps months of aggressive model releases from Alibaba's Qwen team. Qwen3.7-Max, announced at the Alibaba Cloud Summit in mid-May, is a proprietary reasoning model featuring a one-million-token context window designed for agent-centric workloads including long-horizon coding and debugging tasks. It scored 56.6 on the Artificial Analysis Intelligence Index, placing fifth overall on that benchmark. Alibaba had already been climbing coding rankings earlier this year. In March, the company's Qwen3.5 medium models reached the top ten amongst open models in Code Arena, whilst its Qwen3.6-Plus drew attention in April for outperforming Claude 4.5 Opus on several agentic coding benchmarks.

The Competitive Landscape

The broader AI coding arena remains dominated by US firms. Anthropic's Claude variants hold the top position, while OpenAI, xAI, and Google populate much of the rest of the top tier. But Alibaba's ascent, alongside strong showings from Chinese competitors like DeepSeek, Zhipu, and Moonshot, signals that the US lead in AI model development is being contested across multiple fronts. With Qwen3.7-Max now available via Alibaba Cloud's Model Studio API, the company is positioning itself as a viable alternative for developers worldwide seeking frontier-level coding assistance. The achievement underscores the narrowing gap between Chinese and American AI capabilities, and suggests the race for the best coding model is far from settled.

Published May 27, 2026 at 12:12pm
