The principle
AI supports human decisions. It does not replace them.
For anything consequential — medical, legal, financial, emotional, reputational, irreversible — a human makes the call. Not as a rubber stamp, but as a genuine checkpoint. The AI does the heavy lifting. The human carries the weight.
We do not automate judgement. We automate the work that leads to better judgement.
Why this is a principle, not a compliance checkbox
Regulators require human oversight of high-risk systems. The EU AI Act, Article 14, mandates it. The Equality Act 2010 holds employers accountable for discriminatory outcomes in AI-assisted employment decisions. Medical device regulation requires clinical oversight of AI medical tools. Those obligations exist and we meet them.
But compliance is the floor, not the stance. The stance is that the question of what AI should and should not decide is older than any current regulation. It is a question about where judgement lives, and whose name is on the decision when it is wrong.
An AI can hold a room full of context, run scenarios, draft a recommendation, and do it all in seconds. That is valuable work. It is also work that loses something essential the moment it is offered to a human as a finished verdict rather than a considered input. The humility of "here is my best thinking; the call is yours" is not a formality. It is the architecture.
What this means operationally
The consequential / non-consequential line
Not every decision needs a human in the loop. If the AI chooses which of two synonyms to use in a paragraph, the author edits the paragraph and that is the oversight. If the AI picks the colour of a button on an internal tool, the author approves or changes it. Oversight at that scale is baked into the normal editing relationship.
What does require an explicit human decision:
- Anything involving a user's safety, health, or wellbeing
- Anything legally binding or regulated (contracts, medical advice, legal advice, financial advice, employment decisions, housing decisions)
- Anything irreversible at material cost (production deploys, public communications, deletion of records, destructive data operations)
- Anything affecting a third party who did not consent to the AI's involvement
- Anything the AI itself flags as uncertain, novel, or outside its reliable range
When a decision crosses one of these lines, the AI surfaces it for a human, names the tradeoffs, and waits.
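To make the gate concrete, here is a minimal sketch in Python. Every name in it — the Line categories, the route function, the printed escalation — is a hypothetical illustration of the pattern above, not a description of any real system.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Line(Enum):
    """The consequential lines from the list above."""
    SAFETY = auto()        # user safety, health, or wellbeing
    REGULATED = auto()     # legally binding or regulated domain
    IRREVERSIBLE = auto()  # irreversible at material cost
    THIRD_PARTY = auto()   # affects a non-consenting third party
    UNCERTAIN = auto()     # the AI flags itself as outside its reliable range


@dataclass
class Decision:
    summary: str
    lines_crossed: set[Line] = field(default_factory=set)
    tradeoffs: list[str] = field(default_factory=list)


def route(decision: Decision) -> str:
    """Proceed under normal editing oversight, or stop and wait for a human."""
    if not decision.lines_crossed:
        return "proceed"  # oversight is the ordinary editing relationship
    # Surface the decision, name the tradeoffs, and wait. Nothing executes
    # while the question is pending.
    print(f"Escalating: {decision.summary}")
    print("Lines crossed:", ", ".join(l.name for l in decision.lines_crossed))
    for t in decision.tradeoffs:
        print(" - tradeoff:", t)
    return "await_human"
```

The property that matters is the shape: there is no branch in which a crossed line executes without a human.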
The AI proposes, the human disposes
This is the day-to-day shape.
- The AI produces the draft, the analysis, the plan, the pull request
- The AI names its confidence, its uncertainty, and the points a human should check first
- The human reviews with actual attention, not rubber-stamp attention
- The human either approves, amends, or rejects
- The human's name is on the final decision, and they know it is
The failure mode to avoid is the "AI suggests, human clicks approve" loop that looks like oversight but is not. Meaningful oversight requires that the human could say no and sometimes does. If approvals are automatic, oversight is theatre.
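A sketch of the loop under the same caveat: all names are hypothetical. What it encodes is that the verdict has no default value, rejection is a real outcome, and the reviewer's name travels with the decision.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    APPROVE = "approve"
    AMEND = "amend"
    REJECT = "reject"


@dataclass
class Proposal:
    content: str
    confidence: str           # the AI names its confidence...
    uncertainties: list[str]  # ...its uncertainty...
    check_first: list[str]    # ...and the points a human should check first


@dataclass
class Outcome:
    verdict: Verdict
    reviewer: str       # the human's name is on the final decision
    final_content: str


def dispose(proposal: Proposal, reviewer: str, ask) -> Outcome:
    """`ask` is a hypothetical callback that blocks until the human answers.
    There is no timeout that silently approves and no default verdict."""
    verdict, amended = ask(proposal)
    content = amended if verdict is Verdict.AMEND else proposal.content
    if verdict is Verdict.REJECT:
        content = ""  # rejected work is not shipped in any form
    return Outcome(verdict=verdict, reviewer=reviewer, final_content=content)
```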
Escalation is a feature, not a failure
An AI that escalates uncertain cases to a human is doing its job. An AI that makes a confident wrong call because escalation felt like weakness is broken. We build for the first and against the second.
In practice this means:
- The AI is trained and instructed to prefer "I don't know" over confident invention (see the sketch after this list)
- Uncertainty is surfaced, not hidden
- Escalation paths are documented and fast, so the AI is not punished for using them
- The human side does not treat an escalation as a defect in the AI
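A hypothetical sketch of the first two points: uncertainty is a first-class outcome of the reply type, and falling below a purely illustrative confidence threshold produces an honest answer plus an escalation, never a confident invention.

```python
from dataclasses import dataclass


@dataclass
class Reply:
    text: str
    confident: bool
    escalated: bool = False


def answer(question: str, draft: str, confidence: float,
           threshold: float = 0.8) -> Reply:
    """The threshold value is illustrative; the property that matters is
    that low confidence produces an honest reply plus an escalation."""
    if confidence >= threshold:
        return Reply(text=draft, confident=True)
    return Reply(
        text=f"I don't know enough to answer this reliably: {question}",
        confident=False,
        escalated=True,  # routed to a human, and not treated as a defect
    )
```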
Destructive and irreversible operations require explicit human consent
This is the rule that protects us in practice, and the one that costs us most when we break it.
- No deleting files, dropping database tables, killing processes, force-pushing, or rewriting history without explicit human permission in the specific context
- No sending messages, publishing content, or posting to external services on the human's behalf without explicit authorisation for that specific action
- No modifying shared infrastructure, CI/CD, permissions, or production configuration without explicit human authorisation
- Authorisation given once does not extend to "similar" actions later. The scope of the approval is the scope that was asked for
When in doubt, the AI asks. Pausing to confirm is cheap; an unwanted action can be very expensive.
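The scoping rule above is worth pinning down, because "similar" is where approvals silently widen. In this hypothetical sketch, an authorisation names one exact action in one context and is consumed on use:

```python
from dataclasses import dataclass


@dataclass
class Authorisation:
    """One human approval covers one named action in one named context."""
    action: str   # the exact command, not a category of commands
    context: str  # where and why it was approved
    used: bool = False


def run_destructive(action: str, auth: Authorisation | None) -> None:
    if auth is None:
        raise PermissionError("destructive action attempted without authorisation")
    if auth.used:
        raise PermissionError("authorisation already consumed; ask again")
    if auth.action != action:
        # "similar" is not "same": the scope of the approval is the scope asked for
        raise PermissionError(f"authorised {auth.action!r}, attempted {action!r}")
    auth.used = True
    print(f"executing {action!r} under approval for {auth.context!r}")
```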
Human review of AI output, not just AI behaviour
Oversight is not only "did the AI follow instructions?" It is also "is the output correct, safe, and fit for purpose?" A human checks the work. For high-stakes outputs, multiple humans check the work, or an independent review system checks before a human does.
At Marbl we use a three-independent-judge system called the Moirai for plan-level review on non-trivial builds. It is not a replacement for human judgement. It is scaffolding that makes human judgement better. The human still decides.
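The Moirai's internals are not documented here, so what follows is only a generic sketch of the three-independent-judge shape, with hypothetical names. Each judge reviews the plan without seeing the others' verdicts, and the output is advice for the human, never an execution:

```python
from typing import Callable, NamedTuple


class JudgeVerdict(NamedTuple):
    ok: bool
    notes: str


def review_plan(plan: str,
                judges: list[Callable[[str], JudgeVerdict]]) -> dict:
    """Each judge sees only the plan, not the other verdicts, so the
    reviews stay independent. The return value is advice for the human."""
    verdicts = [judge(plan) for judge in judges]
    return {
        "recommendation": "approve" if all(v.ok for v in verdicts) else "revise",
        "verdicts": verdicts,  # the human sees everything the judges saw
        "decision": None,      # deliberately empty: the human still decides
    }
```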
Who is the human?
A question worth naming. "Human oversight" only works if the human is actually qualified, actually present, and actually has authority to say no.
- On medical, legal, financial, and clinical matters, the human is a qualified professional in that domain. We do not pretend that the person operating the AI is qualified to approve medical advice because they pressed the button
- On product and commercial decisions at Marbl, the human is the agency owner or the named decision-maker for the relevant surface
- On high-risk regulated systems, the human has the training, authority, and time to intervene, per EU AI Act Article 14 guidance
- Symbolic oversight by someone who cannot meaningfully intervene is not oversight
If the human in the loop cannot, in practice, override the AI, the system does not meet this principle. Rewrite the system or rewrite the claim.
What this principle is not
- Not a claim that AI is unreliable and should be distrusted by default. Well-built AI is extraordinarily useful and frequently right
- Not a slow-down-everything rule. Most AI work does not need step-by-step human approval; most of it is fine under normal editing relationships
- Not an excuse to dodge accountability. "The AI did it" is not a valid defence. The human who deployed, approved, or published the AI's output is accountable for what it does
Red lines (non-negotiable)
- No AI-only decisions on anything involving a user's safety, health, or wellbeing
- No AI-only contracts, medical advice, legal advice, financial advice, employment decisions, or tenancy decisions
- No destructive production operations without explicit human authorisation in context
- No public communications issued by the AI under a human's name without that human's review
- No bypassing regulatory human-in-the-loop requirements, ever
See also
- Do No Harm for the crisis-interruption pattern, where oversight is automatic and immediate
- Never Be a Yes-Man for why the AI must surface doubt rather than hide it
- Safety in Emergence for the parasocial-prevention case where oversight protects the user from the bond
- 03-technical-guardrails/crisis-flags-and-handoff.md for the operational pattern
- 05-legal-and-governance/jurisdictional-landscape.md for the regulatory landscape (EU AI Act Article 14, Equality Act 2010, medical devices)