Situation Room Provocation: What the Octopus Teaches Us About Agentic AI


Tuesday 21st April 2026

The Situation Room is brought to you by Trusted Agents

What the Octopus Teaches Us About Agentic AI

How agents learn, adapt, and find weaknesses that human-designed processes leave behind

Hi, welcome to the Trusted Agents Situation Room. We help leaders see where agentic systems will surface hidden weaknesses in data, process, and policy, and what controls need to be built before those systems scale. Because by the time this becomes obvious, it will already be late.

Alien Intelligence, Real Lessons

One of the most striking things about the octopus is that it does not solve problems in the way we expect intelligence to work. Bill Parker points to the veined octopus carrying coconut shell halves across the seabed and assembling them later into a shelter. That is not instinct in the simple sense. It is foresight, improvisation, and tool use from a creature whose mind evolved on a very different path from ours. Octopuses learn, solve problems, and find ways out of confinement despite having a neural architecture that is nothing like our own. As Bill Parker puts it, they “defy our conventional definitions of intelligence,” and yet they still manage to do things that surprise the humans studying them.

That matters because most of the systems running our businesses were built around human behaviour. Humans tire. Humans follow social norms. Humans miss things. Humans often leave weaknesses untouched because they are too busy, too polite, or too narrowly trained to pursue every possible path. An octopus does not share those assumptions. It learns through a different kind of intelligence and can escape using methods its human observer did not anticipate. In My Octopus Teacher, what stands out is not just that the octopus is intelligent, but that its intelligence is adaptive, embodied, and alien to the human trying to understand it.

Anthropic’s Mythos story brings the same pattern into the digital world, where it plays out at machine speed. Mythos found vulnerabilities in major operating systems and browsers that had survived years, and in some cases decades, of human review and automated testing. The lesson is not just that AI can find software bugs faster. The lesson is that non-human intelligence is now capable of finding holes in environments designed by humans, for humans, and until now mostly tested by humans.

Agentic AI is not just adding another digital worker to the org chart. It is introducing a kind of intelligence that can learn, adapt, and probe systems without sharing the limits or habits that made those systems workable in the first place. That changes the meaning of robustness. It is no longer enough for a process to work when a human follows it. It needs to hold up when an agent pursues its goal all the way to the edge of the rules.

When Agents Find What Humans Miss

Most leaders will read the Mythos story as a cybersecurity story. That is too narrow. The more important point is that Mythos surfaced weaknesses that had survived years of human scrutiny. Anthropic says it identified thousands of zero-day vulnerabilities, including flaws in every major operating system and web browser it tested, some of which had persisted for decades in code that had already been heavily reviewed.

That matters well beyond software security. The real lesson is that capable machine actors are now able to surface weaknesses that human systems have quietly lived with for years. In cybersecurity, that weakness may be a bug. In a business process, it may be a missing control, a broken escalation path, a pricing loophole, or an exception case that only works because a human steps in and makes a judgement call. The pattern is the same: the system looked robust until something non-human started probing it properly.

That is why Mythos should be read as a warning for operators, not just for CISOs. Once agents can learn, test, and pursue outcomes at machine speed, every hidden assumption in a workflow becomes part of the attack surface, even when no attack is intended. Bernardo Crespo makes this point well when he argues that institutions are still “thinking at human speed while the threat landscape evolves at machine speed.” That may be most obvious in cyber, but it applies just as much to customer journeys, decision flows, and operational policies.

Three Enterprise Weaknesses Agents Expose

Compared with humans, who tend to follow rules and are rarely exhaustive in looking for loopholes, agents are downright rude. If there are vulnerabilities in your operation, agents will find and exploit them, not maliciously, but simply because they are following orders. In practice, they tend to surface three kinds of weakness: data, process, and policy.

The first is data weakness. This is where the information an agent needs is incomplete, inconsistent, badly structured, or simply unavailable in a form a machine can use reliably. Humans compensate for this all the time. They know which spreadsheet is wrong, which field is never updated, which customer record needs interpreting rather than trusting. Agents do not compensate in the same way. They act on what is there, at speed and at scale. Agents operating on imperfect data will not just produce suboptimal outputs; they risk taking actions whose outcomes you did not want and may not be able to undo.

The second is process weakness. Many business processes only appear robust because humans are constantly repairing them. Someone notices an exception. Someone overrides a bad recommendation. Someone spots that the workflow technically says one thing but common sense says another. These are not edge cases. In many organisations, they are the process. Once an agent starts following that same workflow literally, or optimising inside it without the benefit of social norms and tacit judgement, the weakness becomes visible very quickly. That is the broader lesson from Mythos: the flaw may have been there all along. It just needed a different kind of intelligence to expose it.

The third is policy weakness. This is where the organisation believes it has a rule, but the rule lives in training, governance documents, or human memory rather than in the system itself. Humans may know not to take a certain action, to escalate a certain case, or to apply discretion with a vulnerable customer. But if that policy is not encoded into permissions, thresholds, routing, and controls, then it is not really governing machine behaviour. It is only describing desired human behaviour. That distinction matters much more in an agentic environment.

This is why agentic readiness is not just about choosing a model or deploying a tool. It is about understanding where your business still depends on human patching, human interpretation, and human restraint. Where data, process, and policy weaknesses combine, the best case is that the agent fails and stops running. The worst case is that it carries out actions that are difficult, expensive, or impossible to recover from.

Why Legacy Systems Break Under Agent Pressure

Most enterprise systems were built on an assumption that the actors inside them would be human. That means the process design quietly relies on human pacing, human judgement, and human restraint. A person might notice that a record looks wrong and pause. They might decide not to push an edge case because it feels unreasonable. They might know that the formal process says one thing, but that the right outcome for the customer requires something else. In many organisations, that invisible layer of interpretation is what makes the system work.

Agents do not bring that same layer with them. They do not tire, they do not get bored, and they do not stop at the point where a polite human would stop. If the instruction is to optimise, they will keep optimising. If the workflow allows an action, they will keep taking it. That is why the real lesson from Mythos is not just that it found old vulnerabilities. It is that it kept looking in places humans had already stopped looking, and found paths through systems that had survived decades of review.

This is the real break with the past. When businesses move from software used by humans to software used by agents, they are not just changing the interface. They are changing the nature of the actor inside the system. That means the standard for robustness has to change as well.

A process is no longer robust because it usually works when a human follows it. It is only robust if it still behaves safely when a machine actor pushes it to the edge of the rules.

Three Ways to Prepare for Agent Pressure

There is no perfect way to future-proof a business for agentic AI. But there is a practical place to start. If agents are going to act inside your operations, then the question is no longer whether the model is impressive. The question is whether the environment around it is robust enough to handle a non-human actor that is fast, tireless, and literal.

1. Make your data usable by machines, not just tolerable for humans.
Most organisations are running on data that humans have learned to work around. People know which fields are unreliable, which records are incomplete, and where context lives outside the system. Agents do not have that same intuition unless you design for it. Data needs to be higher quality, better structured, easier to access, and accompanied by enough context for a machine actor to use it safely. If not, you are not just risking poor outputs. You are risking actions based on incomplete or misleading information.
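To make "usable by machines" concrete, here is a minimal sketch of a pre-action data check. Everything in it is illustrative rather than taken from the source: the field names, the 30-day freshness threshold, and the idea of a customer record are all assumptions. The point it demonstrates is that a human shrugs off a stale or half-empty record, while an agent needs an explicit gate that refuses to act on one.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical rules for a record an agent is allowed to act on.
# Field names and the freshness threshold are illustrative only.
REQUIRED_FIELDS = {"customer_id": str, "email": str, "credit_limit": float}
MAX_STALENESS = timedelta(days=30)

def validation_errors(record: dict) -> list[str]:
    """Return reasons an agent must NOT act on this record (empty list = safe)."""
    errors = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in record or record[field] is None:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            errors.append(f"wrong type for {field}: {type(record[field]).__name__}")
    updated = record.get("updated_at")
    if not isinstance(updated, datetime):
        errors.append("no update timestamp: freshness unknown")
    elif datetime.now(timezone.utc) - updated > MAX_STALENESS:
        errors.append("record is stale: route to a human for review")
    return errors
```

A human operator performs these checks intuitively; an agent only performs them if a gate like this sits in front of every action it takes.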

2. Redesign processes for agents, not just for staff.
A process that works with people in the loop is not automatically ready for agents. You need to look closely at where human judgement, delay, escalation, and exception handling are doing the real work. Those hidden interventions need to be made explicit. In some cases that means adding controls, thresholds, and approval points. In others it means simplifying the workflow altogether. The point is simple: if a process only works because a human quietly patches the gaps, then it is not ready for agent pressure.
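As a sketch of what "making hidden interventions explicit" can look like, the hypothetical routing step below turns two tacit human habits, escalating flagged cases and pausing on large amounts, into code. The refund scenario, the flag names, and the 500 limit are all invented for illustration, not drawn from the source.

```python
from dataclasses import dataclass

# Illustrative thresholds; real values would come from your own policies.
AUTO_APPROVE_LIMIT = 500.00  # refunds at or below this may proceed automatically
ESCALATION_REASONS = {"vulnerable_customer", "repeat_complaint"}

@dataclass
class Decision:
    action: str   # "auto_approve" or "human_review"
    reason: str

def route_refund(amount: float, flags: set[str]) -> Decision:
    """Encode the formerly tacit escalation rules as an explicit routing step."""
    if flags & ESCALATION_REASONS:
        return Decision("human_review", "flagged case must be escalated")
    if amount > AUTO_APPROVE_LIMIT:
        return Decision("human_review", "amount above auto-approval limit")
    return Decision("auto_approve", "within limits and no flags")
```

The design choice worth noting is that the escalation path now exists even when no human is watching: an agent running this workflow cannot skip the review step, because the step is the workflow.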

3. Build policy into the system, not around it.
In many businesses, policy still lives in slide decks, manuals, training, and the judgement of experienced staff. That is not enough. If an agent is allowed to act, then the rules governing those actions need to be embedded in permissions, routing, limits, and audit trails. The machine should not have to guess what the organisation would have wanted. It should be constrained by design. That is especially important in customer-facing journeys, regulated environments, and any workflow where an action may be difficult to reverse.
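A minimal sketch of "policy in the system, not around it" might look like the hypothetical guard below: the policy table, role names, action names, and the 250 limit are all assumptions made up for illustration. What it shows is the shape of the idea, where permissions, limits, and an audit trail sit in front of every agent action rather than in a slide deck.

```python
import json
from datetime import datetime, timezone

# Hypothetical encoded policy: which actions an agent role may take, with limits.
POLICY = {
    "support_agent": {"issue_refund": {"max_amount": 250.00}, "send_email": {}},
}
AUDIT_LOG: list[str] = []

def authorise(role: str, action: str, **params) -> bool:
    """Permit an action only if encoded policy allows it; log every attempt."""
    rules = POLICY.get(role, {})
    allowed = action in rules
    limits = rules.get(action, {})
    if allowed and "max_amount" in limits:
        allowed = params.get("amount", 0) <= limits["max_amount"]
    AUDIT_LOG.append(json.dumps({
        "ts": datetime.now(timezone.utc).isoformat(),
        "role": role, "action": action, "params": params, "allowed": allowed,
    }))
    return allowed
```

Two properties matter here: an action the policy does not name is denied by default, and every attempt, allowed or not, leaves an audit record, so the organisation is never guessing what its agents tried to do.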

This is the mindset shift.

Preparing for agent pressure is not about teaching machines to behave more like humans. It is about making your systems robust enough that they no longer depend on human restraint, human patching, and human memory to stay safe.

The Questions to Put on the Table

The practical challenge now is not deciding whether agentic AI matters. It does. The challenge is deciding whether your organisation is ready for the kind of pressure agents will place on systems that were designed around human behaviour.

That starts with a few blunt questions.

  1. Where does this process only work because people quietly patch the gaps?
  2. Which decisions rely on judgement that lives in someone’s head rather than in the system?
  3. If an agent pushed this workflow all the way to the edge of the rules, what would actually happen to the customer, the business, and the data?

Those questions matter because this is not just a technology issue. It is a customer experience issue, a governance issue, and an execution issue. The organisations that move well here will not be the ones with the most demos. They will be the ones that are most honest about where their current systems are fragile.

Where Trusted Agents Comes In

When Trusted Agents works with clients, those are exactly the questions we put on the table. We put your processes, customer experience, data, and governance on the spot, then build the execution plan needed to address them. If your organisation wants to push the envelope on agentic AI without losing control of what matters, book time with us.

Start here: Trusted Agents

Want the next Situation Room note?

Trusted Agents

An advisory firm specialising in Agentic Commerce, Digital Trust and Customer Empowerment.
