OpenAI Rolls Out GPT-5.5 For Agentic Coding and Autonomous Workflows

OpenAI has released its latest frontier model, GPT-5.5, positioning it as an agentic AI system that can independently execute complex, multi-step tasks across software, data, and research workflows with minimal human supervision.

The AI giant is rolling out the model to paid users across ChatGPT tiers, including Plus, Pro, Business, and Enterprise, as well as in Codex, with API access expected soon.

OpenAI said GPT-5.5 is its “smartest and most intuitive” system yet, capable of understanding intent faster and completing tasks in one go rather than requiring step-by-step prompting.

The release marks a shift in how OpenAI’s systems are being deployed. Instead of acting as an assistant that responds to prompts, GPT-5.5 is designed to plan, use tools, verify outputs, and iterate across workflows such as coding, document creation, financial analysis, and software operation.

“GPT-5.5 understands what you’re trying to do faster and can carry more of the work itself,” the company said in its announcement, adding that users can give it “a messy, multi-part task” and expect it to navigate ambiguity and complete it end-to-end.

Early benchmarks suggest meaningful gains in agentic performance.

The model scored 82.7% on Terminal-Bench 2.0, which evaluates complex command-line workflows that require planning and tool coordination, and 78.7% on OSWorld-Verified, which measures how effectively AI systems can operate in real computer environments.

On SWE-Bench Pro, a test of real-world software debugging tasks, GPT-5.5 achieved 58.6%, outperforming its predecessor in single-pass issue resolution.

OpenAI said the improvements extend beyond coding into broader knowledge work. On GDPval, which evaluates performance across 44 occupations, GPT-5.5 scored 84.9%, indicating stronger capabilities in tasks such as research, data analysis, and document generation.

The model also showed gains in scientific workflows, including bioinformatics and multi-stage data analysis, positioning it as a potential “co-scientist” for research applications.

A key differentiator is efficiency. OpenAI claims GPT-5.5 matches GPT-5.4 in per-token latency while using significantly fewer tokens to complete tasks, reducing the need for retries and iterative prompting. This lets the model deliver higher performance without the slower responses that larger AI systems typically incur.

The company also emphasised improvements in real-world usability. In Codex, GPT-5.5 can handle end-to-end engineering tasks, including implementation, debugging, testing, and refactoring, while maintaining context across large codebases. Early testers cited stronger “conceptual clarity,” with the model better able to diagnose failures and predict downstream impacts.

Pricing reflects the model’s increased capability. GPT-5.5 will be available via API at $5 per million input tokens and $30 per million output tokens, with higher-priced tiers for more advanced variants. OpenAI said efficiency gains are expected to offset costs by reducing total token usage per task.
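To make the pricing concrete, the short Python sketch below works out the cost of a single task at the quoted API rates. The token counts are hypothetical figures chosen for illustration, not numbers from OpenAI, and the helper name `task_cost` is likewise an assumption.

```python
# Quoted API rates from the announcement, in USD per million tokens.
INPUT_PRICE_PER_MILLION = 5.00
OUTPUT_PRICE_PER_MILLION = 30.00


def task_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one task at the quoted GPT-5.5 rates."""
    total = (
        input_tokens * INPUT_PRICE_PER_MILLION
        + output_tokens * OUTPUT_PRICE_PER_MILLION
    )
    return total / 1_000_000


# Hypothetical workload: 200,000 input tokens and 40,000 output tokens.
example_cost = task_cost(200_000, 40_000)
print(f"Estimated cost: ${example_cost:.2f}")
```

Because output tokens cost six times as much as input tokens at these rates, any reduction in generated tokens per task has a proportionally larger effect on the bill, which is the basis of OpenAI's claim that efficiency gains can offset the price.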

At the same time, OpenAI highlighted expanded safeguards amid growing security concerns ahead of the potential launch of Anthropic’s Mythos.

GPT-5.5 has undergone extensive internal and external testing, including targeted evaluations for cybersecurity and biological risks. The company said it has introduced tighter controls on high-risk use cases and is expanding its Trusted Access for Cyber (TAC) programme to allow vetted organisations broader use of advanced capabilities for defensive purposes.

Last week, the company launched GPT-5.4-Cyber, a powerful new model fine-tuned specifically for defensive cybersecurity, while scaling its TAC programme to thousands of verified defenders.
