OpenAI Rolls Out GPT-5.5 For Agentic Coding and Autonomous Workflows

On SWE-Bench Pro, a test of real-world software debugging tasks, GPT-5.5 achieved 58.6%, outperforming its predecessor in single-pass issue resolution.

OpenAI has released its latest frontier model, GPT-5.5, positioning it as an agentic AI system that can independently execute complex, multi-step tasks across software, data, and research workflows with minimal human supervision.

The AI giant is rolling out the model to paid users across ChatGPT tiers, including Plus, Pro, Business, and Enterprise, as well as in Codex, with API access expected soon. 

OpenAI said GPT-5.5 is its “smartest and most intuitive” system yet, capable of understanding intent faster and completing tasks in one go rather than requiring step-by-step prompting.

The release marks a shift in how OpenAI’s systems are being deployed. Rather than acting as an assistant that only responds to prompts, GPT-5.5 is designed to plan, use tools, verify outputs, and iterate across workflows such as coding, document creation, financial analysis, and software operation.

“GPT-5.5 understands what you’re trying to do faster and can carry more of the work itself,” the company said in its announcement, adding that users can give it “a messy, multi-part task” and expect it to navigate ambiguity and complete it end-to-end.

Early benchmarks suggest meaningful gains in agentic performance. 

The model scored 82.7% on Terminal-Bench 2.0, which evaluates complex command-line workflows that require planning and tool coordination, and 78.7% on OSWorld-Verified, which measures how effectively AI systems can operate in real computer environments. 

On SWE-Bench Pro, a test of real-world software debugging tasks, GPT-5.5 achieved 58.6%, outperforming its predecessor in single-pass issue resolution.

OpenAI said the improvements extend beyond coding into broader knowledge work. On GDPval, which evaluates performance across 44 occupations, GPT-5.5 scored 84.9%, indicating stronger capabilities in tasks such as research, data analysis, and document generation. 

The model also showed gains in scientific workflows, including bioinformatics and multi-stage data analysis, positioning it as a potential “co-scientist” for research applications.

A key differentiator is efficiency. OpenAI claims GPT-5.5 matches GPT-5.4 in per-token latency while using significantly fewer tokens to complete tasks, reducing the need for retries and iterative prompting. This allows the model to deliver higher performance without increasing response time, a common trade-off in larger AI systems.

The company also emphasised improvements in real-world usability. In Codex, GPT-5.5 can handle end-to-end engineering tasks, including implementation, debugging, testing, and refactoring, while maintaining context across large codebases. Early testers cited stronger “conceptual clarity,” with the model better able to diagnose failures and predict downstream impacts.

Pricing reflects the model’s increased capability. GPT-5.5 will be available via API at $5 per million input tokens and $30 per million output tokens, with higher-priced tiers for more advanced variants. OpenAI said efficiency gains are expected to offset costs by reducing total token usage per task.
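To make the quoted rates concrete, here is a minimal sketch of the per-task arithmetic. It uses only the two prices reported above. The function name `task_cost` and the token counts are hypothetical, chosen purely for illustration, and are not figures from OpenAI.

```python
# Prices quoted in the article, in USD per million tokens.
INPUT_PRICE_PER_MILLION = 5.00
OUTPUT_PRICE_PER_MILLION = 30.00


def task_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one task at the quoted GPT-5.5 API rates."""
    total = (input_tokens * INPUT_PRICE_PER_MILLION
             + output_tokens * OUTPUT_PRICE_PER_MILLION)
    return total / 1_000_000


# Hypothetical token counts for a single task (an assumption, not reported data).
example_cost = task_cost(input_tokens=200_000, output_tokens=50_000)
print(f"${example_cost:.2f}")  # prints $2.50
```

Because output tokens cost six times as much as input tokens at these rates, a model that finishes a task with fewer generated tokens and fewer retries lowers the bill mainly through the output term, which is the offset OpenAI describes.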

At the same time, OpenAI highlighted expanded safeguards amid growing security concerns ahead of the potential launch of Anthropic’s Mythos.

GPT-5.5 has undergone extensive internal and external testing, including targeted evaluations for cybersecurity and biological risks. The company said it has introduced tighter controls on high-risk use cases and is expanding its Trusted Access for Cyber (TAC) programme to allow vetted organisations broader use of advanced capabilities for defensive purposes.

Last week, the company launched GPT-5.4-Cyber, a powerful new model fine-tuned specifically for defensive cybersecurity, while scaling its TAC programme to thousands of verified defenders.

Staff Writer
The AI & Data Insider team works with a staff of in-house writers and industry experts.
