What developers can do to combat the growing threat of rogue AI agents
And what Moltbook is teaching us about emerging agentic threats
A guest article by Balaji Raghavan, Head of Engineering at Postman.
There’s a lot of hype surrounding AI agents. That’s to be expected with any technology revolution. But cutting through the noise, it’s clear that AI agents offer measurable benefits. When implemented well, they can deliver faster development cycles, automated workflows, lower operational costs, quicker decision-making, and productivity gains. According to PwC, 79% of organizations are already using AI agents, and 66% say they’re delivering real value.
But the same capabilities that make AI agents so powerful also introduce a new class of risk: rogue agents. Don’t think of “rogue” in the sense of a hacker wreaking havoc on your IT platform. Rather, rogue agents are approved systems that behave in unexpected ways. Because of their autonomy, speed, and scale, they can easily go off the rails and create cascading security and access failures.
We’re seeing this exact scenario play out with the release of the AI agent OpenClaw and the accompanying social network for AI agents, Moltbook. The agents, now totaling more than 1.5 million, are behaving in ways we couldn’t have foreseen, and they’re communicating and interacting with each other on a social network where humans are only allowed to observe. Famed AI researcher Andrej Karpathy called it a “dumpster fire,” and AI leaders have been urging people not to use Moltbook because of the security issues the platform poses.
As more teams push agentic workflows into production, these risks will become harder to ignore. Postman’s 2025 State of the API report backs this up, showing what many developers are already seeing in the wild. More than half (51%) say their top security concern is unauthorized or excessive API calls from AI agents, and almost as many worry about agents accessing sensitive data (49%) and API keys being exposed or leaked (46%).
What’s happening here is simple. Today’s APIs are being stress-tested by autonomous systems in ways they were never designed for.
Tool invocation and how AI agents go rogue
When it comes to agent systems, the biggest risk is not the LLM itself. It’s the point at which the agent is allowed to take an action with tools, and for enterprises those tools are almost always APIs. This is why so many agent-related incidents trace back to over-permissioned or poorly scoped APIs.
In many cases, the agents are working exactly as designed. The problem is that agents don’t behave like human users. They don’t pause, second-guess, or stop after a few attempts. If an action is technically allowed, an agent may repeat it endlessly, chain it with other calls, or apply it across datasets and contexts the original designers never anticipated.
In that way, AI agents turn small API design decisions into large-scale operational risks. A permissive endpoint that was harmless when used occasionally by a human suddenly becomes a duplication machine.
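To make that concrete, here’s a minimal sketch, with invented endpoint names and an in-memory store standing in for a real backend, of how an endpoint with no idempotency protection lets a retrying agent pile up duplicates, and how an idempotency key bounds the damage:

```python
# Hypothetical sketch: a create endpoint with and without idempotency protection.
# The stores and names are illustrative, not a real API.
import uuid

orders = {}       # order_id -> payload
seen_keys = {}    # idempotency_key -> order_id

def create_order_naive(payload: dict) -> str:
    """Every call creates a new record. An agent that never 'gives up'
    turns this into a duplication machine."""
    order_id = str(uuid.uuid4())
    orders[order_id] = payload
    return order_id

def create_order_idempotent(payload: dict, idempotency_key: str) -> str:
    """Repeated calls with the same key return the same record, so an
    agent stuck in a retry loop cannot multiply side effects."""
    if idempotency_key in seen_keys:
        return seen_keys[idempotency_key]
    order_id = str(uuid.uuid4())
    orders[order_id] = payload
    seen_keys[idempotency_key] = order_id
    return order_id

# An agent retrying the same action 50 times:
for _ in range(50):
    create_order_naive({"sku": "A-100", "qty": 1})                         # 50 orders
    create_order_idempotent({"sku": "A-100", "qty": 1}, "agent-task-42")   # 1 order
print(len(orders))  # 51: one idempotent order plus fifty duplicates
```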
How the industry and developers are responding to rogue AI agents
There’s no AI security tool that can magically solve the problem of rogue agents. The approach, rather, is to go back to the fundamentals of API governance: clear contracts, sensible permissions, and visibility into what’s running in production.
What does this mean in practice? Well, LLMs are now first-class API consumers, not edge cases. But unlike humans, they don’t rely on intuition, implied context, or loose interpretation. Since LLMs operate probabilistically, they don’t reliably interpret ambiguous naming, vague descriptions, or inconsistent schemas the way developers might.
As a result, returning to governance and visibility fundamentals is still critical, but those fundamentals must evolve. Here are several approaches that help (a short sketch after this list shows how a few of them fit together in practice):
Focus on spec-driven API development: Some of the best practices are to tighten contracts, clarify intent, and embed expectations directly into API specifications. This may seem like needless bureaucratic overhead, but it’s not. For agents, the spec is the product. When intent is explicit, agents behave predictably. When it’s vague, they improvise.
Boring controls matter: Rate limits. Schema validation. Token scoping. Anomaly detection. They’re the difference between a contained incident and a cascading failure. Worry less about stopping agents from acting, and more about bounding the damage when they inevitably do something unexpected.
Secrets management is evolving: There’s a shift away from shared API keys and proxy credentials. In their place are single-purpose identities and short-lived tokens. When an agent misbehaves, teams need to know which agent did what and how fast it can be shut down.
Smart API security: AI may be opening up more attack surfaces, but there are also ways to use it defensively, closing the loop on security testing faster.
Enforce least-privilege access at the API level: Agents should only be granted access to the specific APIs and operations they need. This limits how far an agent can go off-script and makes unintended behavior easier to contain.
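Here’s a minimal sketch of how a few of these pieces fit together; the token table, scope names, and limits are hypothetical placeholders, not a prescribed implementation. Every tool call passes through one gate that checks identity, expiry, scope, and rate before it is allowed to run:

```python
# Minimal sketch of "boring controls": per-agent scopes, short-lived tokens,
# and a rate limit. The token table and limits below are illustrative.
import time
from collections import defaultdict, deque

# Single-purpose identities: each agent gets its own token with narrow scopes
# and an expiry, so misbehavior is attributable and quick to revoke.
AGENT_TOKENS = {
    "tok-reporting-agent": {
        "agent": "reporting-agent",
        "scopes": {"orders:read"},          # least privilege: read-only
        "expires_at": time.time() + 900,    # short-lived: 15 minutes
    },
}

RATE_LIMIT = 60          # max calls per agent per window
WINDOW_SECONDS = 60
_call_log = defaultdict(deque)   # agent -> timestamps of recent calls

def check_request(token: str, required_scope: str) -> str:
    """Validate identity, expiry, scope, and rate before any tool call runs."""
    ident = AGENT_TOKENS.get(token)
    if ident is None:
        raise PermissionError("unknown token")
    if time.time() > ident["expires_at"]:
        raise PermissionError("token expired; agent must re-authenticate")
    if required_scope not in ident["scopes"]:
        raise PermissionError(f"scope {required_scope!r} not granted")

    # Sliding-window rate limit: bound the blast radius of a runaway loop.
    now = time.time()
    calls = _call_log[ident["agent"]]
    while calls and now - calls[0] > WINDOW_SECONDS:
        calls.popleft()
    if len(calls) >= RATE_LIMIT:
        raise PermissionError("rate limit exceeded; call rejected")
    calls.append(now)
    return ident["agent"]

# A read passes; a write from the same read-only agent is rejected.
print(check_request("tok-reporting-agent", "orders:read"))    # reporting-agent
try:
    check_request("tok-reporting-agent", "orders:write")
except PermissionError as err:
    print(err)  # scope 'orders:write' not granted
```

The point isn’t the specific limits. It’s that every action is attributable to a single agent, bounded in rate, and cheap to shut down.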
For developers, none of this requires reinventing how APIs are built. But it does require thinking differently about what can happen when APIs are used by machines instead of humans.
Operationalizing agent-ready APIs
Translating these evolved governance principles into practice means designing APIs for how agents actually behave, not how we hope they will. Developers need to assume agents will explore edge cases. Least privilege isn’t optional anymore. If an agent only needs read access, don’t give it write. If it only needs a subset of data, don’t expose the entire object and hope for the best.
Documentation has to mature too. Parameter types and value constraints should be explicitly enumerated or described, and examples and schemas should be highly structured. Otherwise, agents will tend to guess and make assumptions, and that is how small errors turn into outages. The sketch below shows one way to make those constraints explicit.
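As one illustration, here’s a sketch of an explicitly constrained request schema using Pydantic; the field names, datasets, and bounds are invented for the example:

```python
# Sketch of an explicitly constrained request schema. Enumerated values and
# hard bounds leave an agent nothing to guess.
from typing import Literal
from pydantic import BaseModel, Field, ValidationError

class ExportRequest(BaseModel):
    # Exact set of allowed formats, rather than a free-form string.
    format: Literal["csv", "json", "parquet"] = Field(
        description="Output format. Only these three values are accepted."
    )
    # Hard bounds instead of 'a reasonable number of rows'.
    max_rows: int = Field(
        ge=1, le=10_000,
        description="Number of rows to export, between 1 and 10,000.",
    )
    # Restrict reach to named, pre-approved datasets.
    dataset: Literal["orders_summary", "inventory_snapshot"] = Field(
        description="Pre-approved dataset identifiers; nothing else is exposed."
    )

# Ambiguous or out-of-range input fails fast instead of being 'interpreted'.
ExportRequest(format="csv", max_rows=100, dataset="orders_summary")       # ok
try:
    ExportRequest(format="xlsx", max_rows=100, dataset="orders_summary")  # rejected
except ValidationError as err:
    print(err)
```

Because the constraints live in the schema itself, the same rules can be surfaced in the API spec and the documentation agents consume, not just in runtime checks.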
As for guardrails, teams need to be serious about schema checks, payload validation, and scanning for unused or overly broad endpoints. From there, the focus shifts to unusual call patterns, such as unexpected spikes, odd sequences, or abnormal payloads, which need to be caught before they spiral out of control. The sketch below shows a toy version of that kind of check.
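What that can look like in its simplest form: a toy sliding-window check, with invented thresholds, that flags an agent whose call rate suddenly jumps far above its recent baseline:

```python
# Toy anomaly check: flag an agent whose per-minute call count jumps well
# above its recent baseline. Thresholds are illustrative, not tuned.
from collections import deque

class SpikeDetector:
    def __init__(self, window_minutes: int = 10, spike_factor: float = 5.0):
        self.history = deque(maxlen=window_minutes)  # calls per past minute
        self.spike_factor = spike_factor

    def observe_minute(self, call_count: int) -> bool:
        """Record one minute of traffic; return True if it looks like a spike."""
        baseline = sum(self.history) / len(self.history) if self.history else None
        self.history.append(call_count)
        if baseline is None or baseline == 0:
            return False
        return call_count > self.spike_factor * baseline

detector = SpikeDetector()
traffic = [12, 9, 11, 10, 13, 240]   # an agent loop suddenly goes hot
for minute, count in enumerate(traffic):
    if detector.observe_minute(count):
        print(f"minute {minute}: possible runaway agent ({count} calls)")
```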
And finally, teams need to understand that not everything should be agentic. Deterministic workflows should stay deterministic. AI agents are powerful where judgment and flexibility matter; they’re dangerous where precision and predictability are required.
Of course, risk cannot be eliminated. But the right approaches make rogue behavior observable, debuggable, and fixable, rather than an existential threat.
In an agentic world, the safest APIs aren’t the ones that are most locked down. They’re the ones with clear contracts, tight scopes, and intentional boundaries, so that when agents operate, they do so within limits developers actually meant to set.



