ARCHITECTURE · June 5, 2026

Agents are arriving where a mistake is a lawsuit

This week Experian shipped an 'Agent OS' for lending — agents that decide credit, flag fraud, determine who's eligible. These are the rooms where a hallucination isn't an awkward chatbot reply; it's a denied loan, a wrong medical authorization, a court date. And one number sets the stakes: AI healthcare denials are overturned 80%+ of the time on appeal — but fewer than 1% of people appeal. Here's why regulated domains are where the whole agent argument becomes law.

Agents are moving into the rooms where a mistake isn't an awkward chatbot reply. On June 2, Experian launched an Agent Operating System for financial services — agents that run the lending lifecycle: deciding credit, flagging fraud, determining who's eligible. This is the real test of every principle I've been writing about all year, because here a confident wrong answer has a victim and a court date.

Why the stakes are different — one number

In a consumer chatbot, a hallucination is embarrassing. In a high-stakes domain, it's a harm that usually sticks. Consider healthcare prior authorization: when an AI-driven denial is appealed, it's overturned more than 80% of the time, yet fewer than 1% of patients ever appeal. Sit with that. A confidently wrong agent in a consequential domain doesn't just make mistakes; it makes mistakes that mostly hold, because the person harmed rarely fights back. That's the weight these systems carry the moment they decide a loan or a treatment instead of drafting an email.
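
To make that asymmetry concrete, here is a back-of-envelope sketch in Python. The batch of 10,000 denials is a made-up number for illustration, and the two rates are the rounded figures quoted above, so treat the result as a rough bound rather than data.

```python
def denials_reversed(total_denials: int, appeal_rate: float, overturn_rate: float) -> float:
    """Expected number of denials that end up reversed on appeal."""
    return total_denials * appeal_rate * overturn_rate


# Illustrative inputs only: 10,000 is a hypothetical batch size; the rates
# are the rounded figures from the text (under 1% appeal, over 80% overturned).
total = 10_000
reversed_count = denials_reversed(total, appeal_rate=0.01, overturn_rate=0.80)
standing = total - reversed_count

print(f"reversed on appeal: {reversed_count:.0f} of {total}")
print(f"left standing: {standing:.0f} ({standing / total:.1%})")
```

Under those assumptions, only about 80 of the 10,000 denials are ever reversed. That does not mean the rest were all wrong, since people who appeal may not be representative of those who don't. It does mean that almost none of them are ever checked.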

The good news: this is where my whole argument becomes law

Regulated industries can't run agents the way a startup runs a chatbot — and the rules they're forced to follow are, almost exactly, the engineering I keep preaching. The wall the hype hits here is built from the right bricks.

Grounding and an audit trail are mandatory, not optional. Financial AI must keep records sufficient to reconstruct a decision: the input data, the model version, the reasoning steps, the compliance rules applied, and any human review. An agent that "just decided," with no traceable basis, is simply not deployable. The thing I say you should do, regulators make you do.
A human must be able to overrule consequential actions. Regulated deployments require "four-eyes" checkpoints on the writes that matter: a payment change, an eligibility decision, a patient record. That's exactly the shift from approving everything to owning the policy and the high-impact calls, drawn by consequence and backed by law.
Compliance covers the workflow, not just the answer. A multi-step agent that reads from one system and writes into another trips segregation-of-duties and record-integrity rules. That's the action-surface point — scope what an agent can do, not just what it says — turned into regulation.
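
The three requirements above can be sketched together in a few lines of Python. This is a minimal illustration, not Experian's design or any regulator's schema: every name here (`DecisionRecord`, `ALLOWED_ACTIONS`, `HIGH_IMPACT`, `AUDIT_LOG`, `execute`) is hypothetical, and the record fields simply mirror the items listed above.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# Hypothetical action surface: anything not listed here is refused outright.
ALLOWED_ACTIONS = {"read_credit_file", "flag_fraud", "set_eligibility"}
# Hypothetical subset of writes that need a second pair of eyes.
HIGH_IMPACT = {"set_eligibility"}


@dataclass(frozen=True)
class DecisionRecord:
    """Fields intended to let someone reconstruct a decision afterwards."""
    action: str
    input_data: dict
    model_version: str
    reasoning_steps: tuple
    rules_applied: tuple
    human_reviewer: Optional[str]
    timestamp: str


AUDIT_LOG: list = []  # in-memory stand-in for an append-only store


def execute(action, input_data, model_version, reasoning_steps,
            rules_applied, approved_by=None):
    """Allow an action only if it is in scope and, where required, approved."""
    if action not in ALLOWED_ACTIONS:
        raise PermissionError(f"action {action!r} is outside the agent's scope")
    if action in HIGH_IMPACT and approved_by is None:
        raise PermissionError(f"action {action!r} requires human approval")
    record = DecisionRecord(
        action=action,
        input_data=dict(input_data),
        model_version=model_version,
        reasoning_steps=tuple(reasoning_steps),
        rules_applied=tuple(rules_applied),
        human_reviewer=approved_by,
        timestamp=datetime.now(timezone.utc).isoformat(),
    )
    AUDIT_LOG.append(record)  # log first; the real side effect would follow here
    return record
```

In this sketch, calling `execute("set_eligibility", ...)` without `approved_by` raises `PermissionError` and writes nothing, while the same call with a reviewer name returns a record that names that reviewer. The ordering is the design choice: the scope check and the approval check both run before any record is written or any side effect would occur.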

Experian's system leads with that layer — identity, governance, explainability, human oversight — before the capability. So does every serious regulated deployment. That ordering isn't bureaucracy; it's the same lesson as asking the right first question: the trust layer comes first, or nothing safely ships.

The reframe for everyone, not just banks

Here's the part to take even if you'll never touch a lending model. The rules regulators impose on finance and healthcare are just the good engineering you should already be doing — with someone to enforce it. If your agent can't explain why it did what it did, can't be overruled by a human on the actions that matter, and treats a high-stakes write like a throwaway chat, it doesn't belong in a bank. It also doesn't belong in your product. The only difference is the bank has a regulator to make it stop, and you have to be your own.

So borrow the discipline. The standard that just became law in lending is the same standard that makes any agent worth trusting: grounded in a real source of truth, auditable after the fact, overrulable on the actions that count, and scoped to what it may do. You don't need a regulator to build to it. You just need to decide your users deserve the same protection a borrower does.

The unsexy truth

The most advanced agent deployments of this year will not be the cleverest. They'll be the most auditable. In the domains where mistakes have victims, the boring governance layer comes first — and that's precisely why those agents will ship while flashier ones stall in pilot. The place to learn what "production-grade agent" really means was never the demo. It's the room where a wrong answer is a lawsuit. Build like you're already in that room, because the bar that just became law in finance is the bar that makes an agent trustworthy anywhere.
