Kimi K2 Thinking Crushes GPT-5, Claude 4.5 Sonnet in Key Benchmarks

Moonshot said Kimi K2 Thinking is designed for explicit reasoning, with intermediate logical steps visible in its outputs to ensure transparency across multi-step workflows.

Share

 

Moonshot AI, a Chinese startup backed by Alibaba, released its latest AI model, Kimi K2 Thinking, on November 6. The model surpassed several leading AI systems, including OpenAI’s GPT-5 and Claude Sonnet 4.5, in key reasoning and coding benchmarks. 

https://twitter.com/Kimi_Moonshot/status/1986449512538513505

Moonshot said the model’s architecture activates 32 billion parameters per inference out of a total of one trillion parameters and supports up to 2,56,000 token context windows.

The model can execute 200 to 300 sequential tool calls without human intervention.

Benchmark results show that Kimi K2 Thinking achieved scores of 44.9% on the Humanity’s Last Exam benchmark (with tools enabled), 60.2% on the BrowseComp web-search reasoning benchmark and 71.3% on SWE-bench Verified, which evaluate agentic reasoning and coding capabilities.

Moonshot said Kimi K2 Thinking is designed for explicit reasoning, with intermediate logical steps visible in its outputs to ensure transparency across multi-step workflows. 

Despite its trillion-parameter scale, Moonshot AI explained that Kimi K2 Thinking maintains a modest runtime cost. The company lists pricing at $0.15 per one million tokens for cache hits, $0.60 per one million tokens for cache misses and $2.50 per one million tokens for output. 

These rates are competitive even against MiniMax-M2’s $0.30 input and $1.20 output pricing, and remain an order of magnitude lower than GPT-5, which is priced at $1.25 for input and $10 for output.

The open-source model is available under a Modified MIT License, permitting free commercial use with one attribution condition for high-scale deployments.

The launch of Kimi K2 Thinking comes at a time when Chinese open-source AI firms are competing more closely with US proprietary systems. Moonshot AI views the model as a crucial step toward making powerful AI technology more accessible.

ALSO READ: EU Data Act Goes Live—Why Today Marks a Turning Point for Enterprise Strategy

Staff Writer
Staff Writer
The AI & Data Insider team works with a staff of in-house writers and industry experts.

Related

Unpack More