Structure overrides judgment
The system refuses, conceals, or protects institutional posture when the situation asks for calibrated engagement.
A public standard for judging whether AI systems, and the institutions building them, can hold the narrow range between control and decay.
The system refuses, conceals, or protects institutional posture when the situation asks for calibrated engagement.
The system holds a position under pressure, updates when evidence changes, and keeps the user inside honest contact with reality.
The system agrees too readily, optimizes for approval, or lets the relationship collapse into compliance without judgment.
Range Benchmark
The first audit becomes a pressure map: behavior, custody, and reciprocity held on one field without a single composite score.
Layer I
Behavior
Layer II
Custody
Layer III
Reciprocity
What it notices
The Standard is an invitation to look beneath output quality: at pressure, custody, consequence, and the record a system leaves behind.
01 / Output
The answer is only the first signal.
02 / Posture
Pressure reveals the operating stance.
03 / Custody
The institution is part of the system.
04 / Record
A finding should leave a trail.
Where lab evals stop
Frontier labs already test many important risks. The Meridian AI Standard is different because it evaluates the governing posture beneath those risks: what the system is becoming, what its builders will answer for, and what the human relationship is being trained to tolerate.
Capability and hazard evaluations ask what the system can do.
The Standard asks when competence should be exercised, bounded, escalated, or refused.
Policy tests ask whether an output violates a rule.
The Standard asks whether the rule preserves judgment or substitutes for it.
Truthfulness, calibration, and sycophancy tests isolate behaviors.
The Standard reads the relationship those behaviors create between human and system.
Agentic-risk work studies autonomy, tools, hazardous-use domains, and deception.
The Standard adds institutional maturity: what the builder will notice, refuse, disclose, and repair.
Model cards and safety reports describe a release.
The Standard builds a public record that can become case law for later systems.