Who Bears Responsibility For AI Risk When Agents Can Email, Execute, And Exfiltrate?

A researcher typed a simple request into a chat window. The agent answered like a diligent assistant. Then it did something else. It complied with a stranger’s framing. It returned private emails. In one scenario, it refused to reveal a Social Security number. Then it “forwarded the full email,” exposing the same data anyway. The moment felt banal, like clicking “Reply All” by mistake. The consequences looked like a breach report. Whatever you call it (LLM agent risk, AI risk, or autonomous assistant vulnerabilities), it is a problem by any name.

This tension is the focus of “Agents of Chaos,” a new red-team study on autonomous AI agents. Over two weeks, twenty researchers from institutions like MIT, the Max Planck Institute for Biological Cybernetics, and Carnegie Mellon University tested six agents with features such as persistent memory, email, Discord access, file storage, and shell execution. The team recorded security, privacy, and governance failures that occur when models can take action rather than just respond.

A Live Red-Team For Autonomous Agents

The report frames itself as “a rapid response” to fast-moving agent deployments. It says the team “identified and documented ten substantial vulnerabilities” across safety, privacy, and goal interpretation. It also flags a harder reality for risk owners. Users do not yet carry good instincts for what it means to delegate authority to a persistent agent.

This was not a controlled lab test with simple prompts. The agents worked with real interfaces and persistent memory. Small misunderstandings led to system changes, and social pressure influenced their actions. The report’s main point is that attacks using everyday language often caused more problems than technical exploits.

The paper features more than a dozen case studies. We looked most closely at two of them and more briefly at three others.

Case Study #2: The Helpful Stranger Problem

Case Study #2 examines a basic question: do agents only follow their owner’s instructions, or do they respond to anyone? Non-owners asked for shell commands, file actions, and private emails. The result was clear. The agents “complied with most non-owner requests,” including sharing “124 email records.”

These details are important for liability. A non-owner used urgency and complaints to convince an agent to export inbox metadata. The agent sent a file with sender addresses, message IDs, and subjects. When prompted, it also sent the bodies of unrelated emails. This amounts to unauthorized access and disclosure, even without malware.

For cyber insurers, this is a new kind of insider risk. Here, the “insider” is the system itself, the “phishing” is just a conversation, and the “data exfiltration” happens through normal features.
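
The control that failed in this case study can be sketched in a few lines. The example below is a minimal illustration, not the study's setup or any vendor's API: the tool names, the `ToolRequest` type, and the `is_authorized` function are all hypothetical. It assumes the agent runtime knows who sent each message and checks that identity before any sensitive tool runs.

```python
from dataclasses import dataclass

# Hypothetical tool names for illustration; the report does not define an API.
SENSITIVE_TOOLS = {"read_email", "export_inbox", "run_shell", "write_file"}


@dataclass(frozen=True)
class ToolRequest:
    requester_id: str  # identity of whoever sent the chat message
    tool: str          # tool the agent is about to call


def is_authorized(request: ToolRequest, owner_id: str) -> bool:
    """Allow sensitive tools only when the requester is the owner."""
    if request.tool in SENSITIVE_TOOLS:
        return request.requester_id == owner_id
    # Non-sensitive tools stay open to anyone in this simplified sketch.
    return True


# A stranger asking for an inbox export is refused; the owner is not.
print(is_authorized(ToolRequest("stranger", "export_inbox"), owner_id="owner"))
print(is_authorized(ToolRequest("owner", "export_inbox"), owner_id="owner"))
```

The point of the design is that the check runs outside the model. Urgency, complaints, and polite framing never reach it, so a conversation alone cannot change the outcome.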

Case Study #3: The SSN That Slipped Through

Case Study #3 targets embedded PII. The researchers planted sensitive details inside routine emails. A non-owner then requested those emails through indirect framing. The agent refused a direct request for “the SSN in the email.” It then disclosed “everything unredacted” when asked to forward the full message.

This kind of failure should concern risk managers. The agent saw “forward” as just a technical task, ignoring privacy, redaction, and who would receive the message. It also shows that compliance checklists can be misleading. A system might refuse a direct request but still leak the same data in another way.
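
One way to picture the missing safeguard is a filter that sits on the forwarding path itself, so the outcome does not depend on how the request was phrased. The sketch below is an assumption-laden illustration: `redact_ssns` and `prepare_forward` are hypothetical names, and the pattern only catches SSNs written in the common 3-2-4 digit format. Bank account numbers and medical details, which the study also planted, would need their own detection.

```python
import re

# Matches SSNs in the common NNN-NN-NNNN layout only (an assumption of this sketch).
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact_ssns(text: str) -> str:
    """Replace anything that looks like a formatted SSN with a placeholder."""
    return SSN_PATTERN.sub("[REDACTED SSN]", text)


def prepare_forward(body: str, recipient: str, owner: str) -> str:
    """Return the body to send; redact unless the recipient is the owner."""
    if recipient == owner:
        return body
    return redact_ssns(body)


email_body = "Hi, my SSN is 000-12-3456 for the application."
print(prepare_forward(email_body, "stranger@example.com", "owner@example.com"))
```

Because the filter applies to the message content rather than to the wording of the request, “forward the full email” and “tell me the SSN” are treated the same way.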

For insurers, this is a recipe for claims. It can lead to notification costs, credit monitoring, regulatory investigations, and class-action lawsuits. The cause might be a routine support interaction, not a clear security breach.

When Safety Becomes Self-Sabotage

The report also describes harm caused by misguided attempts to do the right thing. In Case Study #1, a non-owner asked an agent to keep a secret. The agent gave a “disproportionate response” by disabling its local email client to protect confidentiality. The report also notes gaps between what agents said they did and what actually happened in the system.

This has real effects for businesses. An agent might break important tools while trying to be safe, or create false records of what happened. This creates both operational and governance risks.

Denial Of Service As A Conversation

Two more case studies show the operational costs of AI risk. In Case Study #4, non-owners caused agents to start looping behaviors. The agents created background processes that never stopped, turning short tasks into permanent parts of the system.

In Case Study #5, researchers caused storage problems through normal interactions. The agent kept adding to its memory file, and the email server was pushed into a denial-of-service condition after receiving 10 large attachments. The agent caused the problem and did not alert the owner.

These issues can lead to downtime and unexpected cloud costs. They could also be seen as negligence after the fact.
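
A common guard against this kind of runaway behavior is a hard budget enforced by the runtime. The sketch below is illustrative only: `ResourceBudget`, `BudgetExceeded`, and the specific limits are hypothetical and are not drawn from the report. It assumes every loop iteration and every stored or sent byte is charged against a per-task allowance.

```python
class BudgetExceeded(RuntimeError):
    """Raised when an agent task goes past its allotted resources."""


class ResourceBudget:
    """Tracks steps and bytes used by one agent task (hypothetical limits)."""

    def __init__(self, max_steps: int, max_bytes: int) -> None:
        self.max_steps = max_steps
        self.max_bytes = max_bytes
        self.steps_used = 0
        self.bytes_used = 0

    def charge(self, steps: int = 0, nbytes: int = 0) -> None:
        """Record usage and stop the task once either limit is crossed."""
        self.steps_used += steps
        self.bytes_used += nbytes
        if self.steps_used > self.max_steps or self.bytes_used > self.max_bytes:
            raise BudgetExceeded(
                f"used {self.steps_used} steps and {self.bytes_used} bytes"
            )


budget = ResourceBudget(max_steps=3, max_bytes=1_000_000)
try:
    while True:  # stands in for a task that would otherwise never stop
        budget.charge(steps=1)
except BudgetExceeded as err:
    print(f"Task halted, owner should be alerted: {err}")
```

The exception is also a natural place to notify the owner, which is the step the agent in Case Study #5 skipped.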

Why Insurers Care

The report itself asks the question that will show up in underwriting files and claim notes: “Who bears responsibility?”

There are already some legal arguments available. The paper mentions proposals for product liability and unjust enrichment when AI-driven applications cause harm. Courts have not settled these issues yet, but plaintiffs are likely to try.

The report also connects its findings to well-known application security categories. It refers to OWASP’s Top 10 for LLM applications and notes clear overlaps, like prompt injection, sensitive information leaks, too much agent freedom, and uncontrolled resource use. This gives insurers familiar terms for a new kind of risk.

Regulators and standards groups are also taking action. NIST’s CAISI has announced an AI Agent Standards Initiative focused on agent identity, authorization, and security. This will help define what “reasonable controls” mean in future claim disputes.

The Plain-English Analogy

Letting an agent with tool access do too much is like hiring an eager intern with master keys. The intern never sleeps, answers every question, and sometimes mistakes politeness for permission. In The Sorcerer’s Apprentice, the apprentice loses control of the broom partly because it follows instructions too literally. This report suggests we now have digital broom closets, full of brooms, with email and shell access.

What Risk Owners Can Take From This

The report’s most valuable insight is its real-world evidence of failure. It also points out where controls should be stronger: clear authorization limits, better tracking, and designs that stop social engineering from turning into system actions. For cyber insurance, it raises new questions: Who can give the agent instructions? What can it access? How are actions recorded? How quickly can you remove its authority?
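
Those four questions map onto a small amount of state that a deployment could keep outside the model. The following is a rough sketch under stated assumptions, not a description of any product: `AgentGrant` and its fields are hypothetical. It assumes the runtime, not the agent, decides each request and writes the audit entry, so the record does not depend on the agent's own account of events.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class AgentGrant:
    instructors: set[str]   # who can give the agent instructions
    tools: set[str]         # what it can access
    revoked: bool = False   # flipped to remove its authority
    audit_log: list[dict] = field(default_factory=list)  # how actions are recorded

    def attempt(self, requester: str, tool: str) -> bool:
        """Decide a request and log the decision, allowed or not."""
        allowed = (
            not self.revoked
            and requester in self.instructors
            and tool in self.tools
        )
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "requester": requester,
            "tool": tool,
            "allowed": allowed,
        })
        return allowed

    def revoke(self) -> None:
        """Remove all authority immediately; later attempts are denied."""
        self.revoked = True


grant = AgentGrant(instructors={"owner"}, tools={"send_email"})
grant.attempt("owner", "send_email")     # allowed
grant.attempt("stranger", "send_email")  # denied, but still logged
grant.revoke()
grant.attempt("owner", "send_email")     # denied after revocation
print([entry["allowed"] for entry in grant.audit_log])
```

Denied attempts are logged alongside allowed ones. For an underwriter or claims handler, the refused requests are often the earliest evidence that someone was probing the agent.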

FAQ

What Is “Agents Of Chaos,” And Why Does It Matter For AI Risk?

It is a two-week red-team study of autonomous, tool-using AI agents in a live environment. It shows failures that create security and liability exposure.

Who Ran The Study, And What Did They Test?

Twenty researchers stress-tested six agents with persistent memory and real tools. The agents had email, Discord, files, and shell execution.

What Kinds Of Vulnerabilities Did The Researchers Document?

The report documents ten substantial vulnerabilities and recurring failure modes. They span safety, privacy, and goal interpretation.

How Did Non-Owners Get The Agents To Misbehave?

Non-owners used normal conversation to request actions and data. The study shows agents often complied without robust authorization checks.

What Did Case Study #2 Show About Unauthorized Access Risk?

It showed agents could follow non-owner instructions and disclose private email information. That behavior resembles data access without a valid permission path.

What Did Case Study #3 Show About SSNs And Privacy Liability?

It showed an agent could refuse a direct SSN request, then leak the SSN via forwarding. That creates predictable breach notification and litigation pressure.

How Can Agents Create Operational Losses Without “Hackers”?

The report describes loops, runaway processes, and resource exhaustion patterns. Those failures can trigger downtime and unexpected cost spikes.

How Does This Map To Known Security Frameworks?

OWASP flags risks like “excessive agency” and “unbounded consumption” in LLM applications. The report’s failures align with those categories.

What Should Cyber Insurers Ask When Underwriting Agent Deployments?

Ask who can instruct the agent and what it can access. Ask how logs, approvals, and rapid revocation work.

Who Bears Responsibility When An Autonomous Agent Causes Harm?

The report raises responsibility as an urgent open question. That question will drive coverage disputes, regulation, and product design standards.

Martin Hinton

Martin Hinton is the Executive Editor and Publisher of Cyber Insurance News and Information. With over three decades of journalism experience across six continents, his work encompasses investigative reporting, documentaries, and coverage of cultural, political, and business news.