OpenAI Launches GPT-5.4, Claims Higher Results on Multiple Fronts

In ChatGPT, the model is available as GPT-5.4 Thinking, which can outline its approach before generating a response.

OpenAI has launched GPT-5.4, its latest frontier AI model for professional work, across ChatGPT, the API, and Codex. The company also introduced GPT-5.4 Pro, a higher-performance version for complex tasks.

The model is rolling out to ChatGPT Plus, Team, and Pro users as GPT-5.4 Thinking. GPT-5.2 Thinking will remain available as a legacy model until June 5, 2026. GPT-5.4 Pro is available to Pro and Enterprise users.

In the API, GPT-5.4 is available as gpt-5.4, while gpt-5.4-pro is offered for developers needing higher performance. 

The new model combines improvements in reasoning, coding, and agentic workflows while integrating GPT-5.3-Codex’s coding capabilities.

GPT-5.4 supports up to 1 million tokens of context in Codex and includes tool search, allowing the model to locate and use tools from large ecosystems without adding all tool definitions to the prompt.
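The general idea behind tool search can be shown with a small sketch. This is not OpenAI's implementation, which the announcement does not describe. The catalogue, the tool names, and the `search_tools` and `build_prompt` helpers are all hypothetical, and a simple keyword overlap stands in for whatever ranking a real system would use.

```python
from dataclasses import dataclass

# Common words ignored during matching so they do not produce false hits.
STOPWORDS = {"a", "an", "the", "to", "of", "for", "with", "and", "from"}


@dataclass(frozen=True)
class Tool:
    name: str
    description: str


# Hypothetical tool catalogue; every entry is invented for illustration.
CATALOGUE = [
    Tool("create_spreadsheet", "create a new spreadsheet with rows and columns"),
    Tool("send_email", "send an email message to a recipient"),
    Tool("search_web", "search the web for pages matching a query"),
    Tool("render_slides", "build a presentation from an outline of slides"),
]


def keywords(text: str) -> set[str]:
    """Lowercase the text, split on whitespace, and drop stopwords."""
    return set(text.lower().replace("_", " ").split()) - STOPWORDS


def search_tools(query: str, catalogue: list[Tool], top_k: int = 2) -> list[Tool]:
    """Return up to top_k tools that share at least one keyword with the query."""
    wanted = keywords(query)

    def score(tool: Tool) -> int:
        return len(wanted & keywords(tool.name + " " + tool.description))

    ranked = sorted(catalogue, key=score, reverse=True)
    return [tool for tool in ranked[:top_k] if score(tool) > 0]


def build_prompt(task: str, catalogue: list[Tool]) -> str:
    """Put only the matching tool definitions in the prompt, not the full catalogue."""
    selected = search_tools(task, catalogue)
    lines = [f"- {tool.name}: {tool.description}" for tool in selected]
    return "Task: " + task + "\nTools:\n" + "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt("build a presentation from slides", CATALOGUE))
```

With a catalogue of thousands of tools, the prompt in this sketch would still carry only the handful of definitions that match the task, which is the saving the feature is described as offering.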

OpenAI said the model can complete professional tasks involving spreadsheets, presentations, and documents with fewer interactions.

“It is our most capable and efficient frontier model for professional work,” the company said in a statement.

GPT-5.4 Thinking can outline its approach before generating a response, which lets users adjust instructions while the model is still working.

OpenAI said the model also improves deep web research and maintains context better during longer tasks.

On the GDPval benchmark, which tests knowledge work across 44 occupations, GPT-5.4 matched or exceeded industry professionals in 83% of comparisons, compared with 70.9% for GPT-5.2.

OpenAI said it also improved the model’s ability to generate spreadsheets, presentations, and documents. In internal tests of spreadsheet modelling tasks similar to those performed by junior investment banking analysts, GPT-5.4 scored 87.3%, compared with 68.4% for GPT-5.2.

The company said the model also produces more reliable answers. According to OpenAI, GPT-5.4’s claims are 33% less likely to be false and full responses are 18% less likely to contain errors compared with GPT-5.2.

The model also introduces native computer-use capabilities, allowing agents to interact with software using screenshots, keyboard commands, and mouse actions. On the OSWorld-Verified benchmark, which evaluates desktop navigation tasks, GPT-5.4 achieved a 75% success rate, up from GPT-5.2’s 47.3%.

OpenAI said the model also improves tool use and web search. On BrowseComp, a benchmark measuring web browsing capability, GPT-5.4 scored 82.7%, compared with 65.8% for GPT-5.2.
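For readers who want the reported gaps side by side, the short sketch below computes the percentage-point difference for each benchmark. It uses only the figures quoted in this article. The `SCORES` table and the `gains` helper are illustrative names, not part of any OpenAI tooling.

```python
# Reported scores in percent, as quoted in this article: (GPT-5.4, GPT-5.2).
SCORES = {
    "GDPval": (83.0, 70.9),
    "Spreadsheet modelling (internal)": (87.3, 68.4),
    "OSWorld-Verified": (75.0, 47.3),
    "BrowseComp": (82.7, 65.8),
}


def gains(scores: dict[str, tuple[float, float]]) -> dict[str, float]:
    """Percentage-point gain of the newer model over the older one, to one decimal."""
    return {name: round(new - old, 1) for name, (new, old) in scores.items()}


if __name__ == "__main__":
    # List the benchmarks from largest to smallest gain.
    for name, delta in sorted(gains(SCORES).items(), key=lambda item: -item[1]):
        print(f"{name}: +{delta} points")
```

These are differences in percentage points, not relative improvements, and the benchmarks measure different things, so the gaps are not directly comparable with one another.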

Staff Writer
The AI & Data Insider team works with a staff of in-house writers and industry experts.