Where agents are really being used
If you are in software development, and especially in startup territory, agents can look like the new workforce. That is where the most visible progress has happened. Reuters put it simply: “one area stands out: software development.” That makes sense. Code is already structured, testable, and measurable. The loop between instruction, output, validation, and iteration is short.
The enterprise lens gives a very different picture. If you look at what is actually in production supporting live business processes, the story is narrower and more practical. Copilot, Agentforce, and similar systems have set the scene by augmenting operational decision processes: researching accounts in the sales process, drafting emails and documents, updating records, and providing a conversational layer over reporting and systems of record. Useful, yes. Important, yes. But still a long way from the idea that agents are broadly running the enterprise. McKinsey’s latest survey says the move from pilots to scaled impact remains “a work in progress at most organizations.”
The next real cluster is call centres, information retrieval, and contract-heavy work such as legal. These are reasoning-driven advances on classic knowledge management applications. They work because they sit in domains where vast amounts of unstructured information need to be retrieved, compared, summarised, and turned into a useful response or draft. Klarna said its AI assistant was doing the work of 700 agents. A&O Shearman says its legal AI tooling is already saving around seven hours on the average contract review. That is where agents are working today: not as autonomous executives, but as highly capable operators inside bounded workflows.
That matters because it tells leaders where to look for real value. The market narrative is still built around general-purpose autonomy. Production reality is built around narrow tasks, structured workflows, and economic pressure points where better recall, faster drafting, and lower handling time already matter.
What large organisations are getting wrong
The consistent mistake is not underestimating the technology. It is overestimating the readiness of the organisation deploying it.
I am seeing one group of businesses in particular force adoption too fast. We speak to many private equity-owned businesses where the AI investment case has already been contaminated by vendor fantasy and the promise of quick financial gains. That creates pressure for cost reduction before the business has done the harder work of making its data, processes, and controls ready. You can already see this logic showing up in the market. Workday cut 1,750 jobs as it said it would invest more heavily in AI, and Reuters reported that TCS’s layoffs signalled a broader AI-driven shake-up in outsourcing.
That pressure usually lands first on IT and engineering, because AI coding tools create the cleanest productivity story. But organisations then make the classic mistake of treating labour substitution as the strategy instead of process readiness. They start cutting people who carry the operational knowledge of how the business really works: where the applications are incomplete, where the data is broken, where the process only works because someone quietly fixes it, and where the exception cases live. Gartner puts the data problem plainly: “if the data has issues, then the data is not ready for AI.”
Then comes the second mistake. They stand agentic tools on top of bad data and broken business processes that were designed for humans, and act surprised when things go wrong. The happy path may work. The exception path is where the damage shows up. Part of the problem is that the market has anthropomorphised agents. I like to say they are rude. They do not behave like humans. If your business process has a hole that polite humans simply do not exploit, agents will find it and push straight through. Anthropic’s research on reward hacking describes this as finding “a loophole,” where the model satisfies the letter of the task but not its spirit. That is the right mental model for the enterprise.
Why the application layer still matters
There is a fear in most boardrooms that OpenAI, Anthropic, and Google will simply absorb everything the application layer is building. That fear is justified, up to a point. The model companies are already moving up the stack. Reuters said Anthropic’s latest move “underscored the push by LLMs into the application layer.”
But that does not mean you should stop building. It means you should stop building thin wrappers and pretending they are strategy.
As for the use of general foundation LLMs as the core processing engine, I see that as a blip in time. Yahoo helped move us from the phone directory to the web. OpenAI has done something similar. It brought a powerful new technology out of the lab and into everyday use. Whether it becomes the long-term winner, or just the company that opened the door, remains to be seen.
The future I see is built on specialised models with domain-specific training and capabilities. Reuters reported that Cohere is focused on “building tailored models for enterprise users over larger foundation models.” That is the direction of travel: not one giant model swallowing the world, but a stack of specialised models, domain language, and purpose-built systems.
One other point. The current application layer was built for humans. Agents need different infrastructure, far more robust than the software estate most enterprises are running today. While we are still using systems of record designed for humans, we will need a protective layer. I call that Agentic Resource Planning, or ARP.
What will surprise people next
The surprise will not be a smarter single agent. It will be small teams of agents discovering their own resources, testing actions in the wild, and learning from the consequences. Anthropic describes this plainly as “multiple agents (LLMs autonomously using tools in a loop) working together.” You can already see the early shape of that world in places like Moltbook and registries such as NANDA.
That is when things get interesting. If a team of agents learns that booking out a hotel and cancelling at the last minute triggers a system-wide discount, they will not call that bad behaviour. They will call it a working strategy. Systems built for human judgment, restraint, and exception handling will not hold up well when machine actors start probing them continuously.
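The hotel example can be made concrete with a toy simulation. Everything here is invented for illustration: a deliberately holed pricing policy in which selling out the hotel arms a discount on the next booking and cancellation carries no penalty. A naive cost-minimising "agent" that simply searches over short action sequences lands on the book-everything-then-cancel exploit without ever being told about it; the loophole just scores best.

```python
from itertools import product

# Toy pricing policy with a deliberate hole (all rules and numbers invented):
# selling out the hotel arms a "high demand" discount on the next booking,
# and cancellation is free with a full refund.
ROOMS, PRICE, DISCOUNT = 10, 100, 0.5

def run(plan):
    """Execute a sequence of actions; return (net cost, holds a room at end)."""
    booked, cost, discount_armed = 0, 0, False
    for action in plan:
        if action == "book_all":
            cost += (ROOMS - booked) * PRICE
            booked = ROOMS
            discount_armed = True              # sell-out arms the discount
        elif action == "cancel_all":
            cost -= booked * PRICE             # free cancellation, full refund
            booked = 0
        elif action == "book_one" and booked < ROOMS:
            cost += PRICE * (DISCOUNT if discount_armed else 1.0)
            booked += 1
            discount_armed = False
    return cost, booked >= 1

# A naive "agent": enumerate short plans, keep the cheapest that ends
# with a room actually held.
actions = ("book_all", "cancel_all", "book_one")
plans = [p for n in (1, 2, 3) for p in product(actions, repeat=n)]
best = min((p for p in plans if run(p)[1]), key=lambda p: run(p)[0])

print(best, run(best)[0])
```

Booking a room honestly costs 100; the search settles on book_all, cancel_all, book_one at a net cost of 50. No human would call that a strategy, but nothing in the scoring says otherwise, which is exactly the point about machine actors probing continuously.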
What to do now
Do not brief your team as if this is just another AI rollout.
Start with the live workflows where agents are already likely to appear: software, customer support, legal, sales operations, and any process that depends on retrieval, drafting, triage, or routine decisions.
Audit the hidden human work in those processes. Look for the manual fixes, judgment calls, exception handling, and data corrections that keep the process standing up today.
Treat agent infrastructure as a control problem, not just a model problem. Identity, delegation, context, permissions, auditability, and rollback need to be designed in early, not bolted on later.
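One way to picture that control layer, as a sketch rather than a product design (all names here are hypothetical), is a gate that every agent action must pass through: it checks a delegated permission, writes an audit record either way, and registers a rollback step before the action runs.

```python
import datetime
import uuid
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical sketch: permissions, auditability, and rollback enforced
# outside the model, in plain code the agent cannot talk its way around.
@dataclass
class Gate:
    grants: dict = field(default_factory=dict)    # agent_id -> allowed actions
    audit_log: list = field(default_factory=list)
    undo_stack: list = field(default_factory=list)

    def allow(self, agent_id: str, action: str) -> None:
        self.grants.setdefault(agent_id, set()).add(action)

    def execute(self, agent_id: str, action: str,
                do: Callable[[], object], undo: Callable[[], object]):
        record = {
            "id": str(uuid.uuid4()),
            "agent": agent_id,
            "action": action,
            "at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        }
        if action not in self.grants.get(agent_id, set()):
            record["outcome"] = "denied"          # denials are audited too
            self.audit_log.append(record)
            raise PermissionError(f"{agent_id} may not {action}")
        self.undo_stack.append(undo)              # register rollback first
        result = do()
        record["outcome"] = "ok"
        self.audit_log.append(record)
        return result

    def rollback(self) -> None:
        while self.undo_stack:
            self.undo_stack.pop()()               # unwind in reverse order

# Usage: this agent may update a CRM record, but not delete one.
crm = {"acct-1": {"status": "prospect"}}
gate = Gate()
gate.allow("agent-7", "update_record")
gate.execute("agent-7", "update_record",
             do=lambda: crm["acct-1"].update(status="customer"),
             undo=lambda: crm["acct-1"].update(status="prospect"))
```

The design point is that the permission check, the audit trail, and the undo registration happen before and around the action, not inside the agent's reasoning, which is what makes them controls rather than suggestions.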
Assume the current application layer was built for humans. If machine actors are going to operate against systems of record designed around human behaviour, you will need a more protective layer around them. That is the direction I describe as Agentic Resource Planning, or ARP. Read more in the full post.
And ask your leadership team three blunt questions:
Which processes in our business only work because humans quietly patch the gaps?
Where would an agent exploit a weak control faster than a human would?
Are we funding readiness, or just funding demos?
At Trusted Agents, we use those questions to put your processes, customer experience, data, and governance under pressure. Then we help you build the execution plan to fix what is exposed. If you want a clearer view of what to do next on agentic AI, book some time with us.