Cursor Finds GPT-5.2 Better Than Claude Opus 4.5 for Long Autonomous Tasks

The finding emerged as the team set out to build a web browser from scratch using Cursor.

Cursor says it has found OpenAI’s GPT-5.2 models to be significantly more reliable than Anthropic’s Claude Opus 4.5 for long-running, autonomous coding tasks.

On the same day, Cursor also made the GPT-5.2 model available on its platform.

The comparison came out of the team's attempt to build a web browser from scratch using Cursor. CEO Michael Truell said on X that the browser’s rendering engine was written from scratch in Rust, with support for HTML parsing, CSS cascade and layout, text shaping, painting, and a custom JavaScript virtual machine.

“It kind of works,” Truell wrote. “It still has issues and is, of course, very far from WebKit/Chromium parity, but we were astonished that simple websites render quickly and largely correctly.” 

Cursor has released the code on GitHub.

In a research blog post published this week, Cursor described the browser as part of a broader effort to test whether autonomous coding agents can scale to projects “that typically take human teams months to complete.”

Cursor stated that while building the browser, “We found that GPT-5.2 models are much better at extended autonomous work: following instructions, keeping focus, avoiding drift, and implementing things precisely and completely.”

By contrast, “Opus 4.5 tends to stop earlier and take shortcuts when convenient, yielding back control quickly,” Cursor said.

Other long-running experiments include a multi-week, in-place migration of Cursor’s own codebase from Solid to React, with 266,000 lines added and 193,000 removed; a Java Language Server Protocol project with 7,400 commits and 550,000 lines of code; a Windows 7 emulator exceeding 1.2 million lines; and an Excel-like system reaching 1.6 million lines.

In another case, Cursor said a long-running agent rewrote a video-rendering pipeline in Rust, making it “25× faster” while also adding smooth zooming, panning, and motion-blur effects.

Staff Writer
The AI & Data Insider team works with a staff of in-house writers and industry experts.
