OpenAI Launches GPT-5.4, Claims Gains Across Multiple Benchmarks

OpenAI has launched GPT-5.4, its latest frontier AI model for professional work, across ChatGPT, the API, and Codex. The company also introduced GPT-5.4 Pro, a higher-performance version for complex tasks.

The model is rolling out to ChatGPT Plus, Team, and Pro users as GPT-5.4 Thinking. GPT-5.2 Thinking will remain available as a legacy model until June 5, 2026. GPT-5.4 Pro is available to Pro and Enterprise users.

In the API, GPT-5.4 is available as gpt-5.4, while gpt-5.4-pro is offered for developers needing higher performance.

The new model combines improvements in reasoning, coding, and agentic workflows while integrating GPT-5.3-Codex’s coding capabilities.

GPT-5.4 supports up to 1 million tokens of context in Codex and includes tool search, allowing the model to locate and use tools from large ecosystems without adding all tool definitions to the prompt.
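The general idea behind tool search can be illustrated with a toy example. The sketch below is not OpenAI's implementation or API; the registry, tool names, descriptions, and the `search_tools` helper are all hypothetical. It only shows how a system might pick a few relevant tool definitions by keyword match instead of placing every definition in the prompt.

```python
# Hypothetical tool registry: names and descriptions are illustrative only
# and do not come from OpenAI or the article.
TOOLS = {
    "create_spreadsheet": "Create a spreadsheet workbook with sheets and formulas",
    "send_email": "Send an email message to a recipient",
    "search_web": "Search the web and return result pages",
    "render_slides": "Build a presentation deck from an outline",
}


def search_tools(query, tools, limit=2):
    """Return up to `limit` tool names ranked by keyword overlap with the query."""
    words = set(query.lower().split())
    scored = []
    for name, description in tools.items():
        # Treat the tool name (underscores as spaces) and description as a bag of words.
        haystack = set((name.replace("_", " ") + " " + description).lower().split())
        score = len(words & haystack)
        if score > 0:
            scored.append((score, name))
    # Highest overlap first; ties broken alphabetically for stable output.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in scored[:limit]]


if __name__ == "__main__":
    # Only the matching definitions would then be added to the prompt.
    print(search_tools("spreadsheet formulas", TOOLS, limit=1))
```

A production system would likely rely on something more robust than word overlap, such as embedding-based retrieval, but the principle is the same: the prompt carries only the definitions relevant to the task at hand.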

OpenAI said the model can complete professional tasks involving spreadsheets, presentations, and documents with fewer interactions.

“It is our most capable and efficient frontier model for professional work,” the company said in a statement.

In ChatGPT, GPT-5.4 Thinking can outline its approach before generating a response, which lets users adjust their instructions while the model is still working.

OpenAI said the model also improves deep web research and maintains context better during longer tasks.

On the GDPval benchmark, which tests knowledge work across 44 occupations, GPT-5.4 matched or exceeded industry professionals in 83% of comparisons, compared with 70.9% for GPT-5.2.

OpenAI said it also improved the model’s ability to generate spreadsheets, presentations, and documents. In internal tests of spreadsheet modelling tasks similar to those performed by junior investment banking analysts, GPT-5.4 scored 87.3%, compared with 68.4% for GPT-5.2.

The company said the model also produces more reliable answers. According to OpenAI, GPT-5.4’s claims are 33% less likely to be false and full responses are 18% less likely to contain errors compared with GPT-5.2.

The model also introduces native computer-use capabilities, allowing agents to interact with software using screenshots, keyboard commands, and mouse actions. On the OSWorld-Verified benchmark, which evaluates desktop navigation tasks, GPT-5.4 achieved a 75% success rate, up from GPT-5.2’s 47.3%.

OpenAI said the model also improves tool use and web search. On BrowseComp, a benchmark measuring web browsing capability, GPT-5.4 scored 82.7%, compared with 65.8% for GPT-5.2.
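To put the reported scores side by side, the short script below computes the percentage-point gain and the relative improvement of GPT-5.4 over GPT-5.2 for each benchmark. The figures are those quoted in this article; the `compare` and `summarize` helper names are illustrative, and the calculation is simple arithmetic rather than anything published by OpenAI.

```python
# Scores as reported in the article: (GPT-5.4, GPT-5.2), in percent.
REPORTED = {
    "GDPval": (83.0, 70.9),
    "Spreadsheet modelling (internal)": (87.3, 68.4),
    "OSWorld-Verified": (75.0, 47.3),
    "BrowseComp": (82.7, 65.8),
}


def compare(new, old):
    """Return (percentage-point gain, relative gain in percent), rounded to 1 decimal."""
    point_gain = round(new - old, 1)
    relative_gain = round((new - old) / old * 100, 1)
    return point_gain, relative_gain


def summarize(scores):
    """Apply compare() to every benchmark in the mapping."""
    return {name: compare(new, old) for name, (new, old) in scores.items()}


if __name__ == "__main__":
    for name, (points, relative) in summarize(REPORTED).items():
        print(f"{name}: +{points} points ({relative}% relative)")
```

The distinction matters when reading the numbers: a 12.1-point gain on GDPval is a smaller relative jump than the 27.7-point gain on OSWorld-Verified, where the earlier model started from a much lower base.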
