Every May, Google gathers developers, partners and press for I/O, its annual developer conference in Mountain View that has become the company’s main stage for signalling where its platforms are headed next. Last year’s show was framed as the start of the “Gemini era”; this year Sundar Pichai went further, welcoming attendees to the “agentic Gemini era” and making it clear that Google now sees agents—not chatbots—as the organising idea for its AI roadmap.
Google used I/O 2026 to shift its AI narrative from “smarter chatbots” to “governed agents” that can plan, write and ship real work across code, cloud and Workspace — with new guardrails to match. For enterprise IT and data leaders, the real story sits in how Gemini 3.5 Flash, Antigravity 2.0, Managed Agents and Gemini Spark fit together as an end‑to‑end agent stack rather than as a series of isolated feature drops.
From Chat Windows to Operating Systems
Midway through the developer keynote, Google set Antigravity 2.0 a brutal task: build a new operating system from scratch that could boot Doom. Over roughly 12 hours, 93 autonomous sub‑agents working inside Antigravity wrote the core of the OS, generated around 2.6 billion tokens and, after a live fix for missing keyboard drivers, finally got the game running.
It was theatre, but not just theatre. The demo illustrated how Google now expects AI to work in the enterprise: not as a single assistant replying to prompts, but as teams of agents coordinating over long‑running tasks inside controlled sandboxes. At the same time, Google’s model story has been tuned for that world — Gemini 3.5 Flash is explicitly billed as its “most impressive model yet for agentic workflows”, combining a 1 million‑token context window, up to 64k output tokens and frontier‑level coding and tool‑use performance at Flash‑series latency.
The subtext for CIOs and CDOs is clear: Google is betting that the next wave of AI adoption will be less about a clever chatbot in the corner of your browser, and more about fleets of governed agents building, testing and operating systems on your behalf.
Gemini 3.5 Flash: Engine for Agentic Work
Gemini 3.5 Flash is the new anchor of this strategy. Google DeepMind describes it as “best for frontier performance across agents and coding”, with support for text, image, audio, video and PDF inputs and advanced tool use including function calling, search as a tool and code execution.
Independent and Google‑published benchmarks position Flash as competitive with much larger frontier models on agent‑oriented workloads such as Terminal‑Bench 2.1 (agentic terminal coding) and SWE‑Bench Pro (diverse software tasks), while running at significantly higher token throughput. TechCrunch notes Google’s claim that Flash can generate nearly 300 tokens per second and is roughly four times faster than prior frontier models, with an optimised variant reported as 12× faster at the same quality. That speed matters when you expect dozens of agents to operate in parallel for hours, as in the operating system demo or in complex enterprise workflows.
Crucially, Flash is not confined to a single product. It is now available through the Gemini app, Gemini API, Gemini Enterprise, the Gemini Enterprise Agent Platform, Google AI Studio, Google Antigravity and Android Studio, and it underpins AI Mode in Search globally. In other words, the same engine now drives consumer chat, developer tooling, enterprise agent platforms and internal experiments, which helps explain why Google is comfortable betting its agent story on this tier.
Antigravity 2.0 and Managed Agents: The Harness
If Flash is the engine, Antigravity 2.0 is the chassis. At I/O 2026 Google upgraded Antigravity from an internal experiment to what it called an “unabashedly agent‑first” development environment: a standalone desktop application and CLI designed around orchestrating teams of agents with terminals, tools and sandboxes baked in.
The Antigravity blog describes how agents are given access to secure sandboxes where they can run code and interact with Git and other tools, while credential masking and hardened Git policies reduce the risk of agents exfiltrating secrets or force‑pushing bad changes. The OS demo was designed to show how this harness behaves at scale: dozens of agents spinning up, each tackling a different subsystem, then reconciling their work into a coherent kernel and userland.
For teams that don’t want to manage their own agent infrastructure, Google introduced Managed Agents in the Gemini API and on the Gemini Enterprise Agent Platform. With a single API call, developers can now spin up a fully managed agent that runs in a remote Linux sandbox, can execute code, call tools, browse and access files, and is governed by the same controls as other Agent Platform workloads. Managed Agents is tightly integrated with AI Studio and cloud runtimes: a project can be prototyped in AI Studio, then exported into Antigravity for deeper agent orchestration or deployed via Cloud Run with managed agents handling long‑running tasks.
For enterprises, the combination matters because it covers two levels of abstraction. Antigravity gives advanced teams a highly customisable environment to build sophisticated agentic applications, while Managed Agents offers a higher‑level abstraction where the runtime, sandboxing and much of the security posture are handled by Google’s platform.
Gemini Spark: Agents that Don’t Wait for Prompts
Where Antigravity and Managed Agents are aimed at developers and platform teams, Gemini Spark brings agents directly into everyday work. Announced as a “24/7 personal AI agent”, Spark runs on Gemini 3.5 Flash and Antigravity’s harness, but presents as a user‑facing assistant that can take multi‑step actions across Gmail, Calendar, Drive, Docs, Sheets, Slides, YouTube and Maps.
Spark runs on dedicated virtual machines in Google Cloud, so tasks continue even when the user’s laptop or phone is switched off, and Google emphasises that it is designed to check with the user before taking major actions, such as sending emails or making bookings. Users can email Spark directly via a dedicated address, configure recurring schedules and define reusable “skills” — pre‑approved patterns of behaviour Spark can apply without being micromanaged each time.
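The confirm‑before‑acting behaviour described above can be pictured with a minimal Python sketch. It is an illustration only, not Spark's implementation: the `execute` function, the `MAJOR_ACTIONS` set and the `confirm` callback are hypothetical names, and the two major actions listed are simply the examples Google gave.

```python
from typing import Callable

# Hypothetical set: the two examples of "major actions" cited by Google.
MAJOR_ACTIONS = {"send_email", "make_booking"}


def execute(action: str, payload: str, confirm: Callable[[str], bool]) -> str:
    """Run minor actions directly; ask the user before major ones."""
    if action in MAJOR_ACTIONS and not confirm(f"Allow {action}? {payload}"):
        return "skipped"  # user declined, so nothing is carried out
    # A real agent would call the relevant tool here; the sketch only reports the outcome.
    return "done"


def decline(prompt: str) -> bool:
    """Stand-in for a user who rejects every request."""
    return False


print(execute("draft_reply", "Re: Q3 numbers", decline))  # minor action, runs without asking
print(execute("send_email", "Re: Q3 numbers", decline))   # major action, blocked by the user
```

The design choice worth noting is that the approval check sits in front of the action rather than inside it, so the list of actions that need sign‑off can change without touching the tools themselves.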
Coverage from outlets such as Wired notes that Spark goes significantly beyond passive assistants: it can scan email and financial statements, reconcile calendars, escalate anomalies and proactively suggest actions, creating productivity opportunities while raising new questions about how much autonomy to give agents over highly personal or financial data. On the enterprise side, Google’s Cloud blog positions Spark as part of the Gemini Enterprise and Workspace stack, where the same agentic behaviour operates within corporate data boundaries and identity controls.
Governance Built in: Sandboxes, Gateways and Provenance
For tech leaders, the differentiator isn’t just that Google has more agents; it is how those agents are being controlled.
On the infrastructure side, Google’s Cloud recap emphasises that agent workloads run in isolated, fully managed runtimes. In the enterprise version of Spark, each task executes in its own virtual machine, and an Agent Gateway enforces data loss prevention, logs activity and ensures that credentials remain encrypted and are not directly exposed to the agent. Antigravity brings its own controls through terminal sandboxing, credential masking and strict Git rules, which become critical when agents are allowed to refactor codebases or modify infrastructure as code.
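To make those controls concrete, here is a minimal Python sketch of a gateway that masks credentials, blocks force pushes and keeps an audit log. It is a toy model under stated assumptions, not Google's Agent Gateway or Antigravity's sandbox: the `Gateway` class, the `SECRET` pattern and the helper names are all hypothetical, and a production system would enforce far richer policies.

```python
import re
import shlex
from dataclasses import dataclass, field

# Hypothetical pattern: matches assignments such as API_KEY=..., token: ..., password=...
SECRET = re.compile(r"(?i)(api[_-]?key|token|password)(\s*[=:]\s*)[^\s'\"]+")


def mask_secrets(text: str) -> str:
    """Replace secret values with *** so they never reach the log."""
    return SECRET.sub(r"\1\2***", text)


def is_force_push(command: str) -> bool:
    """True for `git push` invocations carrying -f or any --force* flag."""
    args = shlex.split(command)
    return args[:2] == ["git", "push"] and any(
        a == "-f" or a.startswith("--force") for a in args[2:]
    )


@dataclass
class Gateway:
    """Toy policy gateway: decides, then records a masked audit entry."""

    audit_log: list[str] = field(default_factory=list)

    def submit(self, agent_id: str, command: str) -> bool:
        allowed = not is_force_push(command)
        verdict = "ALLOW" if allowed else "DENY"
        self.audit_log.append(f"{agent_id} {verdict} {mask_secrets(command)}")
        return allowed


gateway = Gateway()
gateway.submit("agent-7", "deploy --env staging API_KEY=abc123")
gateway.submit("agent-7", "git push --force origin main")
for entry in gateway.audit_log:
    print(entry)
```

Every request is logged whether or not it is allowed, and masking happens before the entry is written, which is the property auditors tend to ask about first.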
On the content side, Google is extending SynthID, its watermarking and provenance technology, across more media surfaces and partnering with other model vendors so their outputs can be tagged with SynthID and C2PA credentials. Alongside this, a new AI Content Detection API on the Agent Platform lets organisations detect AI‑generated content from both Google’s models and popular third‑party models, with an eye on compliance workflows and misinformation controls.
Security‑focused agents are part of the story too. CodeMender, originally developed in Google DeepMind, is now integrated into the Agent Platform as an AI security agent that autonomously identifies vulnerabilities, recommends fixes, tests them and applies patches with human sign‑off. Combined with the sandboxing and content provenance layers, this points towards a future where enterprises have agents doing work and other agents — plus gateways — watching them, which is exactly where regulators and model risk teams are starting to focus.
How an Enterprise Might Actually Use this Stack
Taken together, the pieces presented at I/O 2026 start to look less like a collection of demos and more like an operating model for agentic enterprise AI.
A few concrete patterns emerge from Google’s examples and early adopter stories:
- Software delivery and migrations
Platform teams could use Antigravity and the Android agent tooling to orchestrate large‑scale refactors (for example, migrating React Native or web apps to native Kotlin, or performing multi‑repo API changes), with agents handling the bulk of the code changes and tests inside sandboxes, and humans reviewing and merging.
- Knowledge work and sales/operations support
Gemini Spark, wired into Workspace, can keep a salesperson’s inbox triaged, assemble account dossiers from emails, Docs and Sheets, and draft status updates or proposals, only asking for approval at key decision points. For internal operations, Spark and Managed Agents could handle recurring reconciliation tasks, report preparation and escalation workflows across line‑of‑business systems.
- Data and content governance pipelines
Content teams and compliance officers could use SynthID and AI Content Detection via the Agent Platform to automatically label generated media and flag unlabelled synthetic content entering their systems, giving them an audit trail that may become mandatory in EU and UK regimes. Security teams can deploy CodeMender as a constant companion to CI/CD, scanning changesets and suggesting mitigations before releases are approved.
- Internal science and R&D agents
Google’s own “Gemini for Science” prototypes use multi‑agent setups to synthesise literature, generate hypotheses and explore candidate solutions with systems like AlphaEvolve and ERA. Enterprises with strong R&D functions could adopt similar patterns, with governed agents generating and ranking research ideas while humans retain accountability for what progresses.
For CIOs and CDOs, these are the kinds of end‑to‑end workflows that matter more than any one benchmark number.
What’s Still Missing, and Questions for Boards
Despite the impressive stack, there are gaps and open questions that enterprise buyers will need to interrogate.
First, complexity and cognitive load. Between the Gemini app, Spark, Antigravity, Managed Agents, Workspace add‑ons and Search‑level agents, there is a real risk of fragmentation: different teams experimenting with different agent surfaces without a clear organisational view of where agents should live, who owns them and how they are governed. Google’s integration story is strong on paper, but operating models and internal policies will need to catch up.
Second, economic clarity. Analyses like Latent Space’s I/O recap point out that Gemini 3.5 Flash, positioned as “frontier performance at Flash speed”, may also come with a higher price tag than previous Flash tiers, potentially even higher than Gemini 3.1 Pro for some workloads. If enterprises are to run dozens of long‑lived agents per workflow, that changes the cost profile dramatically and needs to be weighed against rival models and hybrid stacks.
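A back‑of‑the‑envelope calculation shows why token volume dominates that cost profile. The Python sketch below takes the figures quoted for the operating system demo (93 agents, roughly 12 hours, around 2.6 billion tokens) and applies a range of blended prices. The prices are hypothetical placeholders, not published rates, and the single blended rate ignores any difference between input and output token pricing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentRun:
    agents: int          # number of parallel agents
    hours: float         # wall-clock duration of the run
    total_tokens: float  # tokens across all agents

    def tokens_per_agent_hour(self) -> float:
        return self.total_tokens / (self.agents * self.hours)

    def cost_usd(self, usd_per_million_tokens: float) -> float:
        return self.total_tokens / 1_000_000 * usd_per_million_tokens


# Figures quoted for the operating system demo.
demo = AgentRun(agents=93, hours=12, total_tokens=2.6e9)

# Hypothetical blended prices in USD per million tokens, NOT published rates.
SCENARIOS = {"low": 0.50, "mid": 2.00, "high": 8.00}

print(f"{demo.tokens_per_agent_hour():,.0f} tokens per agent-hour")
for name, price in SCENARIOS.items():
    print(f"{name}: ${demo.cost_usd(price):,.0f}")
```

At the hypothetical mid price of $2 per million tokens, a single run of that size comes to $5,200. The absolute number matters less than the sensitivity: cost scales linearly with both token volume and unit price, so a pricier tier multiplies directly across every long‑lived agent.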
Third, governance in real organisations. Google’s technical story — sandboxes, gateways, provenance, security agents — is thoughtful, but enterprises still have to integrate these into existing IAM, SOC, internal audit and model risk processes. Questions like “Who is allowed to create an agent with production access?”, “How are agent decisions logged and reviewed?” and “What evidence do we show regulators?” are not solved by the platform alone.
For boards, CIOs and CDOs, I/O 2026 therefore raises sharper questions than previous cycles did:
- If we accept that the future is “many governed agents”, whose stack do we trust to run those agents on our systems?
- How do we balance vendor‑integrated stacks like Google’s against model‑agnostic, multi‑cloud strategies?
- And how do we ensure that the guardrails demonstrated on stage translate into enforceable controls in our own environments?
Those questions, more than the novelty of the OS demo, are what will determine whether Google’s I/O 2026 vision for governed agents lands in boardrooms over the next 12–24 months.
