22 August 2025

When AI Should Ask Permission (and When It Shouldn't)

By Asgeir Albretsen5 min read

ai-agentsdesigntrustapproval

Asking for approval isn't automatically safe. The design question is which actions earn a prompt, and what happens when you get that wrong in both directions.

In October 2013, Target's security software caught the breach. Alerts fired. A team in Bengaluru flagged them to headquarters. The malware was identified. The warning was sent. Nobody acted on it.

Not because they missed it. Because they were buried. The system had been configured to monitor everything and flag everything, so "everything" had become noise. By the time the attackers exfiltrated 70 million credit card records, the real signal was indistinguishable from the routine hum of too many low-stakes warnings, too many times the system had cried wolf.

Most people take the wrong lesson from this. They assume the answer is better alerts, more specific alerts. That's partly right. But there's a second lesson that AI tool designers are currently repeating: if you ask for permission too often, you train people not to give it.

The approval paradox

The instinct when building AI tools that write to your notes, update your contacts, or propose task edits is to ask. Every time. The logic feels responsible: more oversight means more safety.

But research from security operations shows that 63% of security alerts go uninvestigated. Not because the people are careless. Because attention is finite, and when you exhaust it on low-stakes prompts, there's nothing left for the ones that matter. One study found that for every repeated reminder of the same alert type, user attention dropped 30%.

The approval fatigue this creates is worse than no approval at all. Users click through without reading. They build a muscle memory of accepting that fires automatically. Then something genuinely important slips past that same reflex.

This is the approval paradox: an AI that asks for permission for everything is safer in theory and less safe in practice.

What actually earns a prompt

There's a working heuristic, and it isn't complicated: ask when the action is irreversible, broad in scope, or low-confidence.

Reversibility is the most important axis. If an action can be undone, if a previous version is kept and a diff is available, then requiring explicit approval is mostly friction. The approval isn't protecting anything that couldn't be recovered. But a file permanently deleted, a message sent, an external API call made: those can't be walked back. The approval gate is load-bearing there.

Scope matters for similar reasons. An AI that reads your notes to draft a summary is doing something with no side effects. An AI that rewrites a contact record for someone you've known fifteen years is touching something you've built over time. Even if the edit is technically correct, the scope warrants a pause.

Confidence is the third lever, and the most underused one. An agent that knows it's uncertain, that has flagged ambiguity in the instruction or found two plausible interpretations, should ask. Not because the action is risky. Because the right thing to do is genuinely unclear. Asking in that case is information, not ceremony.

The policy layer

What doesn't work: binary models. Either the AI asks for everything or it asks for nothing. Both are failures.

What works is something closer to a policy, a set of rules that determine when prompts appear, grounded in the actual risk profile of each action. Reads can be auto-approved. Non-destructive writes with version history can be auto-approved. Destructive operations need explicit sign-off. Writes to high-stakes entities, like person records or financial documents, need review. Low-confidence actions in ambiguous situations: always ask.

The policy has to be legible, and this is the part most tools skip. Users need to understand why some actions prompt and others don't. Otherwise the whole model breaks. When "approve this?" appears for adding a tag to a note, users learn to treat the prompt as meaningless. Then when it appears for something that actually matters, it's buried in the noise they trained themselves to ignore.

In Harbor, this landed as a patch-review model. AI writes are proposed as reviewable diffs, and the categories of change that require review versus auto-apply are grounded in reversibility and entity type. Reading a note and proposing additions: low stakes. Deleting a person record you've built over years: a different problem entirely. The policy makes that boundary visible rather than hiding it in default behavior.

The question nobody asks

Most conversations about AI safety focus on preventing the AI from doing bad things. That's the right concern. But the adjacent problem, building approval systems so noisy that users stop paying attention, gets almost no scrutiny.

The cost of asking too much isn't zero. It's the atrophied attention that no longer engages when something genuinely requires a human in the loop. It's the trained reflex that clicks through, the user who stopped reading approvals six months ago because they never meant anything.

Getting this right isn't about removing oversight. It's about making oversight legible enough that it stays meaningful. The question isn't how many times to ask. It's which times matter.

Asgeir Albretsen is the founder of Harbor.

← All posts