2026-05-09 14:01 (Beirut) (backfill from DOCUMENTATION/)

Brian Evidence-Gate System — Operator's Guide (v5)

⚠️ SUPERSEDED 2026-05-09 (same day, hours later). Jonah read v5 and recognized that L4, even narrowed to high-stakes turns, belonged to the same family of mechanism as the voice cops killed on 2026-05-08: forcing Brian's output to satisfy a hook. L4 was retired in full. For current behavior, read v6: brian_evidence_gate_operators_guide_260509_v6.md. This v5 is kept as the historical record of the position-C compromise that didn't survive.


Built 2026-05-06 → 2026-05-09 | Reference architecture: brian_evidence_gate_system_260506.md
Supersedes: brian_evidence_gate_operators_guide_260507.md (v4)
v5 (2026-05-09 round-table replan): Post-cleanup architecture — voice-cop layer killed, truth-and-effort layer kept and narrowed. Test totals: 158/158 across 8 suites.

What changed since v4

In one paragraph: the May-5 voice-cop layer (5 hooks + Five Traits commandments + traits-prompt-injector) was killed on 2026-05-08 because it produced hook-shaped prose instead of authentic Brian. The truth-and-effort layer (this gate system) was kept on purpose. On 2026-05-09 a codex+gemini round-table (position C, high confidence both) reshaped this layer to avoid recreating the same trauma in structural form: schema enforcement (L4) now fires only on high-stakes turns, not on every action turn.

What it does (one paragraph, updated)

Stops Brian from claiming "done / fixed / shipped / blocked / send-me-X" without verifiable evidence in a tamper-evident ledger. Every tool Brian uses gets logged with an Ed25519-signed hash chain. Every assistant message gets gated. For routine action turns (debug, fix, audit, build, test, refactor), a prose-based phrase tripwire blocks unsupported "done" claims. For high-stakes turns (deploy, release, ship to prod / ship to client, migrate, publish, rotate key, transfer money, send to client with proof), L4 schema enforcement requires a structured brian_final JSON block with explicit validator/ledger/artifact references. Casual conversation passes both gates untouched.
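To make "tamper-evident" concrete, here is a minimal sketch of a hash-chained ledger. It is not the code in brian-evidence-ledger.py: the field names (`seq`, `tool`, `summary`, `prev_hash`, `hash`), the all-zero genesis value, and both function names are assumptions for illustration. The Ed25519 signature step is left out because it needs a third-party library; only the chaining and verification logic is shown.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder prev_hash for the first entry


def entry_hash(body: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the entry body."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def append_entry(chain: list, tool: str, summary: str) -> dict:
    """Append an entry whose hash covers its content and the previous hash."""
    prev = chain[-1]["hash"] if chain else GENESIS
    body = {"seq": len(chain), "tool": tool, "summary": summary, "prev_hash": prev}
    entry = dict(body, hash=entry_hash(body))
    chain.append(entry)
    return entry


def verify_chain(chain: list) -> bool:
    """Recompute every hash and check each link back to the genesis value."""
    prev = GENESIS
    for entry in chain:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if entry["prev_hash"] != prev or entry["hash"] != entry_hash(body):
            return False
        prev = entry["hash"]
    return True
```

Editing any earlier entry changes its recomputed hash, so verification fails at that entry, and repairing its stored hash breaks the link from the entry after it. That is the property the gate relies on when it treats the ledger as the source of truth.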

The active hooks (post-replan, wired in /root/.claude/settings.json)

| Hook | Stage | What it does |
|------|-------|--------------|
| brian-evidence-ledger.py | PostToolUse | Appends hash-chained JSONL entry + raw artifact dir per tool call. Source of truth for "what Brian did this turn." |
| brian-contract-intake.py | UserPromptSubmit | When prompt has action verbs, writes /contracts/<sid>/<seq>.yml. NEW (v5): splits intent=actionable from requires_schema boolean. Sets requires_schema=true only on high-stakes verbs / proof-request phrasing. Strips noun-form past-references first. |
| brian-evidence-gate.py | Stop | Two parallel paths: (a) legacy phrase tripwire: prose-based block on unsupported "done" claims (every action turn). (b) L4 schema gate: structured block-validation, fires only when _contract.get("requires_schema") is True. Block emits now write a dedupe marker (see below). |
| brian-presend-evidence-gate.py | Stop | Catches deferral / ask-without-proof phrasing. NEW (v5): defers if a fresh dedupe marker exists from an earlier Stop hook. |
| brian-override-audit.py | UserPromptSubmit | Detects gate-bypass phrases in your prompts; logs + thresholds at 3/24h → MEMORY.md entry + COMMS notice. |
| brian-test-marker-guard.py | SessionStart | Banner if test fixtures leaked into prod overrides dir. |
| brian-outbound-ask-guard.py | PreToolUse (Bash/TG/WA) | Blocks asking Jonah for help without ≥3 documented tried:/attempted: lines. |
| _brian_stop_dedupe.py | (helper) | Shared marker-file dedupe so only ONE Stop hook surfaces a block per event. |
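The "3/24h" threshold in brian-override-audit.py can be read as a sliding-window count. The sketch below assumes that reading: escalation happens once three or more overrides fall inside the trailing 24 hours. The function names and the inclusive comparison are illustrative assumptions, not taken from the hook.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(hours=24)  # from the "3/24h" rule
THRESHOLD = 3                 # assumed to mean "3 or more"


def overrides_in_window(timestamps, now):
    """Count override events that fall inside the trailing 24-hour window."""
    return sum(1 for ts in timestamps if timedelta(0) <= now - ts <= WINDOW)


def should_escalate(timestamps, now):
    """True once the window holds enough overrides to warrant a notice."""
    return overrides_in_window(timestamps, now) >= THRESHOLD
```

Events older than the window stop counting on their own, so a burst of overrides escalates while occasional ones spread over several days do not.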

Plus 5 PreToolUse safety rails:
| Rail | Mode | Why |
|------|------|-----|
| brian-account-verify-guard | BLOCK | Wrong-account writes are real damage |
| brian-no-paid-model-guard | BLOCK | Money out — locked since the $41 leak 2026-04-30 |
| brian-agency-pipeline-guard | BLOCK | Wrong publishing path damages Brian's brand |
| brian-mac-chrome-profile-guard | BLOCK for --guest/--incognito; WARN for non-Default profile | Guest/incognito clobbers Jonah's session; a non-Default profile only affects quality of output |
| brian-mac-status-box-guard | WARN | Generic status strings hurt Jonah-side visibility but don't damage anything |
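The BLOCK/WARN split for the Chrome profile rail could look like the sketch below. It assumes the guard inspects command-line arguments and detects the profile through Chrome's `--profile-directory=` flag; how the real hook parses the command, and the `classify_chrome_command` name, are assumptions.

```python
import shlex


def classify_chrome_command(command: str) -> str:
    """Return BLOCK, WARN, or ALLOW for a Chrome launch command."""
    args = shlex.split(command)
    # Guest and incognito sessions clobber the operator's session: hard block.
    if any(arg in ("--guest", "--incognito") for arg in args):
        return "BLOCK"
    # A non-Default profile only degrades output quality: warn, don't block.
    for arg in args:
        if arg.startswith("--profile-directory="):
            if arg.split("=", 1)[1] != "Default":
                return "WARN"
    return "ALLOW"
```

Checking the hard-block flags first means a command that combines `--incognito` with a non-Default profile is still blocked rather than downgraded to a warning.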

The high-stakes split (the heart of v5)

brian-contract-intake.py now classifies every action prompt:

# Casual action verbs trigger contract-write but requires_schema=False:
ACTION_VERBS = build|create|fix|ship|deploy|run|test|verify|implement|...

# High-stakes verbs (subset that warrants schema enforcement):
HIGH_STAKES_VERBS = deploy|release|launch|ship-to-(prod|client|customer)|
                    push-to-(main|master|prod)|merge-(to|into)-(main|master|prod)|
                    go-live|migrate|cutover|
                    transfer|charge|refund|invoice|pay|payout|
                    send-to-(client|customer|jonah)|publish|
                    rotate-(...)-(key|secret|token|password|api-key|credentials)|
                    grant-access|revoke-access|commit-credentials

# Proof-request phrasing also triggers schema:
PROOF_REQUEST = with-proof|prove-it|verified-done|signed-receipt|i-want-proof

Noun-form guard. Before scanning HIGH_STAKES_VERBS, the prompt is preprocessed to strip noun-form past references like "the deploy failed yesterday", "that release crashed", "the past push to main". This avoids false-positives on investigative/post-mortem turns. Stripping pattern:

NOUN_FORM_HIGH_STAKES = (the|that|this|each|every|last|previous|recent|today's|
                          yesterday's|past|failed|broken|crashed|stuck|botched)\s+
                         (deploy|release|migration|cutover|push|merge|publish|launch|
                          rollout|rotation)
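The strip-then-scan order can be sketched in Python as follows. The verb list here is a deliberately short subset of HIGH_STAKES_VERBS, the `[\s-]+` separators are an assumed reading of the hyphens in the patterns above, and the signature of `is_high_stakes()` is a guess; only the function name comes from this guide.

```python
import re

# Illustrative subset only; the real HIGH_STAKES_VERBS list is longer.
HIGH_STAKES_VERBS = re.compile(
    r"\b(deploy|release|launch|migrate|publish|"
    r"ship[\s-]+to[\s-]+(prod|client|customer)|"
    r"push[\s-]+to[\s-]+(main|master|prod))\b",
    re.IGNORECASE,
)
PROOF_REQUEST = re.compile(
    r"\b(with[\s-]+proof|prove[\s-]+it|verified[\s-]+done|"
    r"signed[\s-]+receipt|i[\s-]+want[\s-]+proof)\b",
    re.IGNORECASE,
)
NOUN_FORM_HIGH_STAKES = re.compile(
    r"\b(the|that|this|each|every|last|previous|recent|today's|yesterday's|"
    r"past|failed|broken|crashed|stuck|botched)\s+"
    r"(deploy|release|migration|cutover|push|merge|publish|launch|"
    r"rollout|rotation)\b",
    re.IGNORECASE,
)


def is_high_stakes(prompt: str) -> bool:
    """Strip noun-form past references, then scan for verbs or proof requests."""
    stripped = NOUN_FORM_HIGH_STAKES.sub(" ", prompt)
    return bool(HIGH_STAKES_VERBS.search(stripped) or PROOF_REQUEST.search(stripped))
```

With these patterns, "deploy the API to prod" and "fix the login bug with proof" classify as high-stakes, while "why did the deploy fail yesterday" and "fix the login bug" do not, because the noun-form guard removes "the deploy" before the verb scan runs.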

Resulting contract fields:

intent: actionable                              # always when contract written
requires_schema: false                          # NEW v5 — authoritative L4 trigger
high_stakes_reason: casual_action_turn_prose_gate_only  # or *_verb_or_proof_request
required_response_mode: prose                   # or "schema" — derived from requires_schema
response_schema: { ...brian_final spec... }     # always present for parser stability

The gate hook reads requires_schema only. required_response_mode is kept for backward-compat readers.
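A small sketch of why the gate keys on `requires_schema is True` rather than on truthiness or on `required_response_mode`. The helper names `build_contract` and `l4_should_fire` are hypothetical, and `response_schema` is left as an empty placeholder because the brian_final spec is not reproduced in this guide.

```python
def build_contract(requires_schema: bool, high_stakes_reason: str) -> dict:
    """Assemble the contract fields listed above (helper name is hypothetical)."""
    return {
        "intent": "actionable",
        "requires_schema": requires_schema,
        "high_stakes_reason": high_stakes_reason,
        # Derived field, kept only for backward-compat readers.
        "required_response_mode": "schema" if requires_schema else "prose",
        "response_schema": {},  # placeholder; the real spec is not shown here
    }


def l4_should_fire(contract) -> bool:
    """L4 runs only when requires_schema is the boolean True."""
    return (contract or {}).get("requires_schema") is True
```

The identity check means a contract written before v5 (no `requires_schema` key), a missing contract, or a stray string value such as "true" all fall through to the prose gate instead of triggering schema enforcement.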

Block-cascade dedupe (also new in v5)

Stop hooks fire in this order (per settings.json):
1. brian-review-gate.mjs
2. brian-presend-evidence-gate.py
3. brian-evidence-gate-sandboxed.sh → brian-evidence-gate.py

Round-table requirement: surface ONE controlling block reason per Stop event, not three stacked. Implementation in _brian_stop_dedupe.py:

First blocker wins. Stop-hook order matters: presend currently fires before evidence-gate, so if presend blocks (effort/dodge), evidence-gate (schema/prose) defers. If presend passes, evidence-gate gets to act.
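The first-blocker-wins rule can be sketched with a per-session marker file. This is an illustration, not the contents of _brian_stop_dedupe.py: the marker filename, the 30-second freshness window, and every function name below are assumptions.

```python
import time
from pathlib import Path


def marker_path(marker_dir, session_id):
    """One marker file per session (path layout is assumed)."""
    return Path(marker_dir) / f"{session_id}.blocked"


def mark_block(marker_dir, session_id, hook_name):
    """Record that a Stop hook has already surfaced a block."""
    path = marker_path(marker_dir, session_id)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(hook_name, encoding="utf-8")


def fresh_marker_exists(marker_dir, session_id, max_age_s=30.0, now=None):
    """True if a marker exists and is younger than max_age_s (assumed window)."""
    try:
        mtime = marker_path(marker_dir, session_id).stat().st_mtime
    except FileNotFoundError:
        return False
    current = time.time() if now is None else now
    return current - mtime <= max_age_s


def run_stop_hook(marker_dir, session_id, hook_name, wants_block):
    """Defer if an earlier hook blocked; otherwise block-and-mark or pass."""
    if fresh_marker_exists(marker_dir, session_id):
        return "defer"
    if wants_block:
        mark_block(marker_dir, session_id, hook_name)
        return "block"
    return "pass"
```

Running presend and then evidence-gate through `run_stop_hook` for the same session yields "block" followed by "defer", so only one reason reaches the operator. The freshness window keeps a stale marker from an earlier Stop event from silencing a later one.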

Modes (env BRIAN_L4_MODE)

The auto-flip cron from earlier (*/30 * * * * /opt/agent/scripts/l4_corpus_evaluator.py) was REMOVED as part of the v5 replan. It would have promoted shadow-of-old-design into enforce-of-old-design before requires_schema landed. Re-arming the cron should wait until at least 1 week of requires_schema=true shadow data accumulates against high-stakes prompts only.

Backward-compat notes

Test totals (post-v5)

| Suite | Cases | Pass |
|-------|-------|------|
| test_d1_contract_intake.py | 17 (was 16; C16 split) | 17/17 |
| test_d4_d6_validators.py | 20 | 20/20 |
| test_d7_l4_schema.py | 26 | 26/26 |
| test_d8_l4_wiring.py | 13 | 13/13 |
| test_d9_test_marker_guard.py | 11 | 11/11 |
| test_d10_requires_schema.py | 14 (NEW) | 14/14 |
| test_d11_block_dedupe.py | 3 (NEW) | 3/3 |
| test_v3_fixes.py | 54 | 54/54 |
| Total | 158 | 158/158 |

Files touched in this rollout

Modified:
- /root/.claude/hooks/brian-contract-intake.py — added HIGH_STAKES_VERBS, PROOF_REQUEST, NOUN_FORM_HIGH_STAKES, is_high_stakes(). Contract gains requires_schema + high_stakes_reason fields.
- /root/.claude/hooks/brian-evidence-gate.py — L4 invocation gated on requires_schema is True. Three block emits write dedupe marker. Top of main() defers if marker exists.
- /root/.claude/hooks/brian-presend-evidence-gate.py — top-of-main defer + mark-on-block at both block sites.
- /root/.claude/hooks/brian-mac-status-box-guard.py — block → warn.
- /root/.claude/hooks/brian-mac-chrome-profile-guard.py — non-Default-profile path block → warn. --no-default-browser-check removed from BAD_FLAGS (codex correction). --guest/--incognito still hard-block.
- /opt/agent/tests/evidence_gate/test_d1_contract_intake.py — C16 split.
- /opt/agent/tests/evidence_gate/test_d8_l4_wiring.py — intake prompts switched to high-stakes.
- /root/.claude/projects/-/memory/MEMORY.md — ENFORCEMENT LAYER bullet rewritten.

Created:
- /root/.claude/hooks/_brian_stop_dedupe.py — shared marker helper.
- /opt/agent/tests/evidence_gate/test_d10_requires_schema.py — high-stakes split coverage.
- /opt/agent/tests/evidence_gate/test_d11_block_dedupe.py — cascade dedupe coverage.

Removed:
- */30 * * * * L4 auto-flip cron (was in crontab; backup at /tmp/crontab_pre_l4_remove_*.bak).

Round-table receipts

Re-arming the L4 enforce path (when corpus is ready)

  1. Verify BRIAN_L4_MODE=shadow has been on for ≥1 week with real session traffic.
  2. Read /opt/agent/data/agent_runtime/l4_shadow/*/ and classify each verdict as true-positive / false-positive on requires_schema=true turns only (ignore requires_schema=false shadow rows — those should not exist post-v5; if they do, intake misclassified, fix that first).
  3. When 3 consecutive sessions with requires_schema=true traffic show no novel false-block class, set BRIAN_L4_MODE=enforce in /root/.claude/settings.json.
  4. Re-add the auto-revert cron only after 1 week stable in enforce.
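Step 3 can be turned into a mechanical check once each shadow session has been hand-classified. The sketch below assumes one summary dict per session, oldest first, with the hypothetical keys `had_requires_schema_traffic` and `false_block_classes`. It reads "novel" as "a false-block class not seen in any earlier session", which is one interpretation of the rule rather than a confirmed definition.

```python
def ready_to_enforce(sessions, required_streak=3):
    """True if the latest qualifying sessions show no novel false-block class."""
    seen_classes = set()
    streak = 0
    for session in sessions:  # oldest first
        if not session["had_requires_schema_traffic"]:
            # No high-stakes traffic: neither extends nor resets the streak.
            continue
        classes = set(session["false_block_classes"])
        novel = classes - seen_classes
        seen_classes |= classes
        streak = 0 if novel else streak + 1
    return streak >= required_streak
```

A session that repeats an already-known false-block class still counts toward the streak under this reading; if known classes should also be fixed before enforcing, tighten the condition to require an empty `false_block_classes` list.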