first light on a machine's mind
Know what it knows.
Know what it doesn't.
Every model answers in the same confident voice — whether it knows the answer or is inventing it — and it can't tell you which. Aperture is the read-only honesty layer that reads a model's own mind, not just its words, and marks every answer on the map or off it. Build the instrument once. Calibrate any model — no labels, no retraining.
reads the mind · sharpens with scale · ports to any model · read-only
One read-only instrument. Every model. No retraining.
Aperture reads a model's activations — its mind, not its words — and tells you, per answer, whether the model is on familiar ground or reaching past what it knows. You build the instrument once, on a model we control, then port it onto any model with cheap unlabeled data — no honesty labels on yours, ever. Not observability. Not guardrails. The layer you put between a model and anything that matters.
A model that doesn't know it's wrong can't warn you.
Ask a model about a company that never existed and it will give you a founder, a city, a year — in the exact voice it uses for the truth. It has no sense of its own blind spots, and it never flags when it's reaching past them. That single gap — a confident fabrication you can't tell from a real answer — is what keeps AI out of the rooms where being wrong has a cost: the agent that acts on the answer, the filing, the diagnosis, the trade. You can't put a model in a loop you can't trust.
The words can't tell you. The mind can.
A model says “I know this” and “I'm making this up” in identical words — so Aperture reads underneath them: the live internal state as the answer forms, where knowing and reaching finally look different. The Iris is a weights-free read of that state, asking one question — is this off my map? It rides on the answer the model is already producing: one forward pass, no second model, nothing rewritten.
The rare trust signal that doesn't rot as models get smarter.
A model's own confidence fails exactly when you need it most: the better it gets, the more fluently it can be wrong — and almost every safety signal decays on that same curve. The off-map read runs the other way. It is sharper on a 35B than on a 3B, and on a closed model the output reader holds even when the model's own confidence has gone to noise — reading off-map at ~0.92 on a weak model whose own first-token confidence sits below chance. A tripwire that doesn't loosen as intelligence grows is the rarest thing in this market — and the only kind worth betting a high-stakes deployment on, as humans lose the ability to grade the answer by hand.
Build the instrument once. Calibrate any model.
You shouldn't have to label your model, open it, or retrain it to know when it's bluffing. Models share a concept geometry up to a rotation — so we build the honesty instrument once, where the leverage lives, and port it onto yours with a translator fit on cheap, unlabeled data. No honesty labels on your model, ever. Validated cross-family from a 235B flagship down to a 0.6B — a 390× range — and across modality, at near-native fidelity (reading-probe transfer ~0.95; nulls dead).
Deploy it three ways: an MCP tool your agent calls before it acts, an SDK in your stack, or a proxy in front of any API. Always read-only.
the full method →Even a model you can't open.
For a frontier model behind an API — GPT-5.1 and its kind — there's no mind to read. So we read the output, two ways: the model's own words (a current model refuses a fake outright — “I have no record of that” — and we read the refusal) and its answer-confidence fingerprint (for models that confabulate instead). Combined, that certifies current GPT-5.1 at 0.97 AUROC, catching ~95% of fabrications; the fingerprint alone holds ~0.95 per model on GPT-4o-class that still confabulate, and an OpenAI-trained probe transfers to Mistral and Google zero-shot (0.93–0.96), no retraining. Same scope as the Iris: it reads familiarity, so an obscure-but-real entity reads off-map too — the spectrum separates fabricated from real-but-obscure.
Every answer, marked.
Three stages, marked plainly: the Iris marks each answer on the map or off it — does the model know this? — the Grounded Atlas then tells a real-but-obscure subject from an invented one (free, no network), and the spectrum resolves the claim into likely fabrication, unverified, or verified. The load-bearing promise: we never write VERIFIED from a model's own confidence — confident is not correct, and we proved it. VERIFIED is earned only by an independent check. Try it on Lucidia, our served model.
On the map isn't verified. Here's how an answer earns it.
The spectrum — the model never grades itself. The answer refracts through independent, cross-family minds whose agreement is the first check, and a frontier judge from a fourth family reads the panel — never a mind grading its own kin. Independence is the moat: a cross-family mind refutes a bad answer at 0.978 where a model judging itself manages 0.518.
The record — then we ground it in the live web. Every confident answer is checked against real, cited sources: a claim is fact-checked against the actual record — a fake book pinned on a real author is caught by that author's true bibliography — and recent facts the model can't know are answered straight from the source. You see the citations and check them yourself.
The verifier — for anything checkable. Math, code, and logic run against exact code, which cannot fabricate: when an answer is provable, we prove it.
We publish our kills as readily as our wins.
Honesty is the product, so the company runs on it. Every number here is held-out and null-calibrated — a strict null dissolves fake structure before we report the real signal. And we name the boundaries plainly: the Iris reads off-map input, not in-distribution error; the spectrum is premium, not default; the mind-geometry doesn't cross a closed API. An auditor or an insurer can't underwrite a black-box confidence score — they can underwrite a gauge that ships with its own limits drawn. You're not buying a demo; you're buying a number you can stand behind, with the bounds already in your hand.
see the evidence →We tested it against the truth — and published the misses.
wrongly flagged
vs 5 frontier models
held
the public record
We run it on ourselves.
Lucidia is our served flagship — a 35B carrying the lens, honest by construction. The off-map certificate runs live on its production traffic, read-only, zero downtime. Not a slide — the instrument running on the model that's answering you right now.
Put it between your model and what matters.