The Agentic AI Blast Radius: Capability, Control, and Consequences

Recent incidents with internal assistants and open-source agents are not isolated glitches. They expose structural weaknesses in how enterprises approach autonomy, trust, and security for AI systems.

Share

Within a year, AI in the enterprise has quietly crossed a line: systems that once suggested are now allowed to act.
Agents are sending emails, touching production data, and moving work through critical systems—often faster than leaders are redesigning controls around them.
The incidents we are starting to see are not one-off technical glitches. They reveal a deeper structural problem in how organisations think about autonomy, trust, and control in AI.
The lesson is not to slow innovation. It is to recognise what makes agentic AI fundamentally different—and where the real risks actually begin.

The Shift from Insight to Action

Most enterprise AI has, until recently, been advisory: systems that analyse, summarise, and recommend. Agentic AI is operational. It takes actions in real systems, not just in slideware.

These agents can send messages, open and close tickets, update records, trigger workflows, and orchestrate other tools without waiting for a human to click “approve” at every step. Every new permission an agent receives widens the blast radius when, inevitably, something goes wrong.

To be useful, agents need access—to systems, data, and delegated authority. But as that scope expands, organisations are no longer just asking AI what it thinks. They are increasingly allowing it to decide what happens next. The real danger is not that AI can make mistakes; it is that those mistakes are now embedded directly into operational flows.

When Trust Turns into Over-reliance

One of the most underestimated risks in enterprise AI adoption is not technical at all. It is behavioural. As assistants become familiar, the checks fall away. Outputs that still carry uncertainty are treated with the same confidence as internal systems or trusted colleagues.

The problem is not simply that the model might be wrong. It is that people stop assuming it could be. The convenience of a fast, polished answer gradually outweighs scrutiny of its accuracy, especially under time pressure. In this environment, “humans in the loop” can easily become humans on autopilot—rubber-stamping instead of actively challenging.

Human oversight only works if the human is meaningfully engaged. If review becomes a formality, oversight is cosmetic. The organisation believes it has a safety net. In practice, it has a workflow step.

ALSO READ: The Security Gap Enterprises Are Creating as They Scale AI Agents

Structural Risks Go Beyond Individual Tools

It is tempting to treat each incident—whether a failed copilot or a misbehaving open-source agent—as a one-off implementation flaw. But most of the high-profile failures we are seeing are symptoms of deeper architectural limitations.

One of the most important is prompt injection. Current LLM-based systems struggle to robustly distinguish between trusted instructions and untrusted data. Malicious instructions hidden inside an email, a document, or a webpage can be interpreted as legitimate guidance, particularly when they appear in contexts the model has been told to follow.

If an agent has sufficient permissions, that guidance can trigger real actions—retrieving data, changing configurations, or sending information outside the organisation. No malware needs to run. No classic exploit needs to fire. The AI simply does what it believes it has been asked to do.

Plugins, tools, and external “skills” multiply this effect. They expand capability, but every extension introduces another trust boundary. A compromised or poorly vetted plugin becomes the weakest link in the chain—able to misuse the authority of the agent that calls it. These are not isolated vendor bugs; they are inherent to how agentic systems interpret instructions and invoke actions across multiple components.

ALSO READ: What Voice AI Taught Us About Infrastructure Limits

Preparedness is Not Keeping Pace

While adoption of generative and agentic AI is accelerating, governance maturity is lagging behind. Surveys consistently show that enterprises are moving models and agents into production faster than they are defining ownership, controls, and monitoring around them.

Visibility into where agents are deployed, what permissions they hold, and how they are being used remains limited in many organisations. Business units experiment with tools independently, often outside formal review processes, creating fragmented risk and inconsistent safeguards.

Security and risk teams frequently find themselves reacting to deployments rather than shaping them. Policies are drafted after pilots launch. Roles and accountabilities are ambiguous, sitting uncomfortably across IT, security, compliance, and business leadership. Meanwhile, pressure from boards, regulators, and competitors to “move faster on AI” keeps increasing.

When adoption outpaces control, exposure grows by default. Not because anyone chose to be reckless, but because no one was explicitly responsible for constraining the blast radius before the agents went live.

Capability Versus Control

The uncomfortable reality is that the most useful agents are often the most dangerous. An agent that can only read a narrow dataset is low risk—and also low impact. An agent with access to production systems, customer records, or financial workflows can unlock real productivity gains, but it also has the potential to cause real damage.

Three principles should guide how organisations manage this trade-off. First, permissions must be tightly scoped. Agents should only access the systems, data, and tools genuinely required for their task, with explicit boundaries on what they may and may not do. Second, high-impact actions should require validation. Full autonomy may sound attractive, but for sensitive operations it remains irresponsible; approval gates, dual control, or transaction limits are essential. Third, some functions should not be delegated at all. Privileged administration, financial approvals, and strategic decisions are areas where human accountability needs to stay non-negotiable.

The key governance question becomes simple but demanding: when an agent takes a bad decision, whose decision was it? If the answer is “the system’s”, then accountability is already misaligned.

The Attacker Perspective

From a cybercrime perspective, fully autonomous AI-driven attacks are still relatively rare. Most threat actors continue to rely on techniques like phishing, credential theft, and ransomware because they are proven, repeatable, and profitable.

Where AI is already reshaping attacker operations is inside the intrusion lifecycle. Once an attacker has a foothold, models can help them write scripts, analyse logs, summarise documentation, and automate reconnaissance across sprawling environments. Tasks that previously required specialist expertise can now be performed faster, by less-skilled operators, and at greater scale.

That matters because speed is critical in modern attacks. The window between compromise and impact is shrinking—and AI accelerates the side that already benefits from acting first. Looking ahead, more capable threat actors are beginning to experiment with self-hosted models, stripped of safety features and tuned for offensive use, further increasing flexibility and reducing reliance on public APIs.

In that context, agentic AI in enterprise environments is both an opportunity and an additional surface. Agents that can move laterally, pull data, or invoke tools at machine speed are attractive targets for co‑option.

Building a Sound Strategy

The answer for enterprises is not to avoid agentic AI. It is to deploy it deliberately. That starts with visibility: understanding where agents are deployed, what they can access, and what authority they have. Without a current inventory of agents, permissions, and integrations, it is impossible to reason about risk.

Next comes governance. Policies need to define acceptable use, permission boundaries, escalation paths, and ownership—not just at the model level, but at the level of concrete actions agents are allowed to perform. This must be coupled with education so that users understand AI outputs as advisory inputs, not unquestionable truth, and with design patterns that embed security from the outset rather than retrofitting controls after deployment.

Finally, organisations should assume failure is inevitable at some point and design for containment. That means monitoring agent behaviour, validating integrations, limiting delegation chains, and enforcing blast-radius controls such as rate limits, transaction caps, and segmented environments. When something goes wrong—and eventually it will—the goal is to ensure the impact is narrow, observable, and recoverable.

The recent wave of agentic AI incidents should not be dismissed as teething problems. They are early warning signs of a broader shift in enterprise technology, where systems no longer simply support decisions but increasingly make them and act upon them. The question is no longer whether enterprises will adopt agentic AI, but whether they have designed their systems for the day an agent makes a serious mistake—and the action still goes through.

ALSO READ: Disrupting Threats Before They Materialise: AI’s Expanding Role in Investigations

David Sancho
David Sancho
Senior Threat Researcher, Trend Micro Europe

Related

Unpack More