2026-05-09 14:01 (Beirut) (backfill from DOCUMENTATION/)

Brian Evidence-Gate System — Operator's Guide (v5)

⚠️ SUPERSEDED 2026-05-09 (same day, hours later). Jonah read v5, recognized that L4 — even narrowed to high-stakes — was the same family of mechanism as the voice cops killed 2026-05-08 (forcing Brian's output to satisfy a hook). L4 was retired in full. Read v6 for current behavior: brian_evidence_gate_operators_guide_260509_v6.md. This v5 is kept as the historical record of the position-C compromise that didn't survive.

Built 2026-05-06 → 2026-05-09 | Reference architecture: brian_evidence_gate_system_260506.md
Supersedes: brian_evidence_gate_operators_guide_260507.md (v4)
v5 (2026-05-09 round-table replan): Post-cleanup architecture — voice-cop layer killed, truth-and-effort layer kept and narrowed. Test totals: 158/158 across 8 suites.

What changed since v4

In one paragraph: the May-5 voice-cop layer (5 hooks + Five Traits commandments + traits-prompt-injector) was killed on 2026-05-08 because it produced hook-shaped prose instead of authentic Brian. The truth-and-effort layer (this gate system) was kept on purpose. On 2026-05-09 a codex+gemini round-table (position C, high confidence both) reshaped this layer to avoid recreating the same trauma in structural form: schema enforcement (L4) now fires only on high-stakes turns, not on every action turn.

What it does (one paragraph, updated)

Stops Brian from claiming "done / fixed / shipped / blocked / send-me-X" without verifiable evidence in a tamper-evident ledger. Every tool Brian uses gets logged with an Ed25519-signed hash chain. Every assistant message gets gated. For routine action turns (debug, fix, audit, build, test, refactor): prose-based phrase tripwire blocks unsupported "done" claims. For high-stakes turns (deploy, release, ship to prod / ship to client, migrate, publish, rotate key, transfer money, send to client with proof): L4 schema enforcement requires a structured brian_final JSON block with explicit validator/ledger/artifact references. Casual conversation passes both gates untouched.
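To make "tamper-evident" concrete, here is a minimal sketch of a hash chain, using only the standard library. It is not the code in `brian-evidence-ledger.py`: the field names (`prev`, `payload`, `hash`), the genesis value, and the in-memory list are assumptions for illustration, and the Ed25519 signing step is left out entirely.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous entry's hash together with this entry's payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(ledger: list, payload: dict) -> dict:
    """Append a chained entry; each entry commits to everything before it."""
    prev_hash = ledger[-1]["hash"] if ledger else GENESIS
    entry = {
        "prev": prev_hash,
        "payload": payload,
        "hash": entry_hash(prev_hash, payload),
    }
    ledger.append(entry)
    return entry


def verify_chain(ledger: list) -> bool:
    """Recompute every hash; an edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in ledger:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Because every hash covers the previous one, rewriting an early entry invalidates all later entries unless they are all recomputed, which is what a signature over the chain is meant to prevent.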

The active hooks (post-replan, wired in `/root/.claude/settings.json`)

| Hook | Stage | What it does |
|------|-------|--------------|
| `brian-evidence-ledger.py` | PostToolUse | Appends hash-chained JSONL entry + raw artifact dir per tool call. Source of truth for "what Brian did this turn." |
| `brian-contract-intake.py` | UserPromptSubmit | When prompt has action verbs, writes `/contracts/<sid>/<seq>.yml`. NEW (v5): splits `intent=actionable` from `requires_schema` boolean. Sets `requires_schema=true` only on high-stakes verbs / proof-request phrasing. Strips noun-form past-references first. |
| `brian-evidence-gate.py` | Stop | Two parallel paths: (a) legacy phrase tripwire: prose-based block on unsupported "done" claims (every action turn). (b) L4 schema gate: structured block-validation, fires only when `_contract.get("requires_schema") is True`. Block emits now write a dedupe marker (see below). |
| `brian-presend-evidence-gate.py` | Stop | Catches deferral / ask-without-proof phrasing. NEW (v5): defers if a fresh dedupe marker exists from an earlier Stop hook. |
| `brian-override-audit.py` | UserPromptSubmit | Detects gate-bypass phrases in your prompts and logs each one; at 3 detections within 24h, writes a MEMORY.md entry + COMMS notice. |
| `brian-test-marker-guard.py` | SessionStart | Banner if test fixtures leaked into prod overrides dir. |
| `brian-outbound-ask-guard.py` | PreToolUse (Bash/TG/WA) | Blocks asking Jonah for help without ≥3 documented `tried:`/`attempted:` lines. |
| `_brian_stop_dedupe.py` | (helper) | Shared marker-file dedupe so only ONE Stop hook surfaces a block per event. |

Plus 5 PreToolUse safety rails:
| Rail | Mode | Why |
|------|------|-----|
| brian-account-verify-guard | BLOCK | Wrong-account writes are real damage |
| brian-no-paid-model-guard | BLOCK | Money out — locked since the $41 leak 2026-04-30 |
| brian-agency-pipeline-guard | BLOCK | Wrong publishing path damages Brian's brand |
| brian-mac-chrome-profile-guard | BLOCK / WARN | BLOCK for --guest/--incognito (clobbers Jonah's session); WARN for non-Default profile (quality-of-output) |
| brian-mac-status-box-guard | WARN | Generic status strings hurt Jonah-side visibility but don't damage anything |
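As one illustration of the hooks listed above, the ≥3 `tried:`/`attempted:` threshold of the outbound-ask guard could be checked roughly as follows. This is a sketch under assumptions: it assumes the effort lines sit in the outbound message itself, one per line, and the function name is hypothetical rather than taken from `brian-outbound-ask-guard.py`.

```python
import re

# Assumed shape of a documented-effort line: starts with "tried:" or "attempted:".
EFFORT_LINE = re.compile(r"^\s*(tried|attempted):", re.IGNORECASE | re.MULTILINE)
MIN_EFFORT_LINES = 3  # threshold stated in the hook table


def has_enough_documented_effort(message: str) -> bool:
    """True when the message carries at least three effort lines."""
    return len(EFFORT_LINE.findall(message)) >= MIN_EFFORT_LINES
```

A guard built on this would block the outbound ask when the function returns False and let it through otherwise.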

The high-stakes split (the heart of v5)

brian-contract-intake.py now classifies every action prompt:

# Casual action verbs trigger contract-write but requires_schema=False:
ACTION_VERBS = build|create|fix|ship|deploy|run|test|verify|implement|...

# High-stakes verbs (subset that warrants schema enforcement):
HIGH_STAKES_VERBS = deploy|release|launch|ship-to-(prod|client|customer)|
                    push-to-(main|master|prod)|merge-(to|into)-(main|master|prod)|
                    go-live|migrate|cutover|
                    transfer|charge|refund|invoice|pay|payout|
                    send-to-(client|customer|jonah)|publish|
                    rotate-(...)-(key|secret|token|password|api-key|credentials)|
                    grant-access|revoke-access|commit-credentials

# Proof-request phrasing also triggers schema:
PROOF_REQUEST = with-proof|prove-it|verified-done|signed-receipt|i-want-proof

Noun-form guard. Before scanning HIGH_STAKES_VERBS, the prompt is preprocessed to strip noun-form past references like "the deploy failed yesterday", "that release crashed", "the past push to main". This avoids false-positives on investigative/post-mortem turns. Stripping pattern:

NOUN_FORM_HIGH_STAKES = (the|that|this|each|every|last|previous|recent|today's|
                          yesterday's|past|failed|broken|crashed|stuck|botched)\s+
                         (deploy|release|migration|cutover|push|merge|publish|launch|
                          rollout|rotation)
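The strip-then-scan order can be sketched as below. The patterns are abbreviated subsets of the lists above, and the separators (`[\s-]+`), word boundaries, and case-insensitivity are assumptions; the actual regexes in `brian-contract-intake.py` may differ.

```python
import re

# Abbreviated subsets of the documented lists; not the hook's real patterns.
HIGH_STAKES_VERBS = re.compile(
    r"\b(deploy|release|launch|migrate|publish|"
    r"ship[\s-]+to[\s-]+(prod|client|customer)|"
    r"push[\s-]+to[\s-]+(main|master|prod))\b",
    re.IGNORECASE,
)
PROOF_REQUEST = re.compile(
    r"\b(with[\s-]+proof|prove[\s-]+it|signed[\s-]+receipt)\b",
    re.IGNORECASE,
)
NOUN_FORM_HIGH_STAKES = re.compile(
    r"\b(the|that|this|last|previous|yesterday's|past|failed)\s+"
    r"(deploy|release|migration|push|merge|publish|launch|rollout)\b",
    re.IGNORECASE,
)


def is_high_stakes(prompt: str) -> bool:
    """Strip noun-form references first, then scan for verbs or proof requests."""
    stripped = NOUN_FORM_HIGH_STAKES.sub(" ", prompt)
    return bool(
        HIGH_STAKES_VERBS.search(stripped) or PROOF_REQUEST.search(stripped)
    )
```

With these patterns, "deploy the new build to staging" classifies as high-stakes, while "why did the deploy fail yesterday" does not, because "the deploy" is removed before the verb scan runs.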

Resulting contract fields:

intent: actionable                              # always when contract written
requires_schema: false                          # NEW v5 — authoritative L4 trigger
high_stakes_reason: casual_action_turn_prose_gate_only  # or *_verb_or_proof_request
required_response_mode: prose                   # or "schema" — derived from requires_schema
response_schema: { ...brian_final spec... }     # always present for parser stability

The gate hook reads requires_schema only. required_response_mode is kept for backward-compat readers.
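A small sketch of why the trigger is strict. The `is True` check mirrors the condition quoted in the hook table; the function names are hypothetical, and only the two fields relevant to the trigger are modeled.

```python
def l4_gate_applies(contract: dict) -> bool:
    """L4 runs only when requires_schema is the boolean True."""
    return contract.get("requires_schema") is True


def derived_response_mode(contract: dict) -> str:
    """required_response_mode is derived from requires_schema, not the reverse."""
    return "schema" if l4_gate_applies(contract) else "prose"
```

A contract that carries only `required_response_mode: schema`, or a string such as `"true"` in `requires_schema`, does not satisfy the check and stays on the prose gate.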

Block-cascade dedupe (also new in v5)

Stop hooks fire in this order (per settings.json):
1. brian-review-gate.mjs
2. brian-presend-evidence-gate.py
3. brian-evidence-gate-sandboxed.sh → brian-evidence-gate.py

Round-table requirement: surface ONE controlling block reason per Stop event, not three stacked. Implementation in _brian_stop_dedupe.py:

- Marker: `/tmp/brian_stop_dedupe/<session_id>.flag`. Content: source / transcript_path / time / pid (4 lines).
- `mark_block_emitted(sid, source, transcript_path)`: call right before `print(json.dumps({"decision":"block",...}))`. Best-effort, never raises.
- `defer_if_already_blocked(sid, source, transcript_path)`: call at top of `main()`. Returns True iff a fresh marker exists for the SAME (session_id, transcript_path) within 5s. Hook should `sys.exit(0)` if True.
- Event identity: keyed on transcript_path so different test runs (and distinct production Stop events) don't collide. Time-window TTL (5s) is a secondary guard against stale markers.

First-blocker wins. Stop-hook order matters: presend currently fires before evidence-gate, so if presend triggers (effort/dodge), evidence-gate (schema/prose) defers. If presend passes, evidence-gate gets to act.
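The marker protocol described in this section can be sketched as follows. This is an approximation, not the contents of `_brian_stop_dedupe.py`: the `marker_dir` keyword is added here only so the example can run against a scratch directory, the on-disk time format is assumed, and `source` is written but not consulted when deciding whether to defer.

```python
import os
import time
from pathlib import Path

MARKER_DIR = Path("/tmp/brian_stop_dedupe")  # marker location from the guide
TTL_SECONDS = 5.0                            # freshness window from the guide


def mark_block_emitted(sid, source, transcript_path, marker_dir=MARKER_DIR):
    """Record that a block was surfaced. Best-effort: never raises."""
    try:
        marker_dir.mkdir(parents=True, exist_ok=True)
        # Four lines: source / transcript_path / time / pid.
        fields = [source, transcript_path, repr(time.time()), str(os.getpid())]
        (marker_dir / f"{sid}.flag").write_text("\n".join(fields) + "\n")
    except Exception:
        pass


def defer_if_already_blocked(sid, source, transcript_path, marker_dir=MARKER_DIR):
    """True iff a fresh marker exists for the same session and transcript."""
    try:
        fields = (marker_dir / f"{sid}.flag").read_text().splitlines()
        marked_transcript, marked_at = fields[1], float(fields[2])
    except Exception:
        return False  # missing or unreadable marker: do not defer
    is_fresh = time.time() - marked_at <= TTL_SECONDS
    return is_fresh and marked_transcript == transcript_path
```

In a Stop hook, `defer_if_already_blocked(...)` would run at the top of `main()` and exit 0 when it returns True, and `mark_block_emitted(...)` would run immediately before the block decision is printed.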

Modes (env `BRIAN_L4_MODE`)

- off (default until corpus stable): L4 wiring is a no-op; legacy prose tripwire only.
- shadow: L4 evaluates and writes its verdict to /opt/agent/data/agent_runtime/l4_shadow/<sid>/<turn_id>.json. Never blocks.
- enforce: L4 evaluates and blocks (exit 2) when the contract has requires_schema: true AND the response lacks/misuses the brian_final block.
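The three modes amount to a small decision table. The sketch below is illustrative only: the function name and return labels are hypothetical, and treating an unrecognized mode value like `off` is an assumption.

```python
import os
from typing import Optional


def l4_action(contract: dict, response_valid: bool,
              mode: Optional[str] = None) -> str:
    """Return "skip", "record", "pass", or "block" for one assistant turn."""
    if mode is None:
        mode = os.environ.get("BRIAN_L4_MODE", "off")
    if mode not in ("shadow", "enforce"):
        return "skip"      # off, or unrecognized (assumed to behave like off)
    if contract.get("requires_schema") is not True:
        return "skip"      # casual action turn: prose tripwire only
    if mode == "shadow":
        return "record"    # write the verdict, never block
    return "pass" if response_valid else "block"  # enforce
```

Only the combination of `enforce`, `requires_schema: true`, and an invalid or missing brian_final block ends in a block.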

The auto-flip cron from earlier (*/30 * * * * /opt/agent/scripts/l4_corpus_evaluator.py) was REMOVED as part of the v5 replan. It would have promoted shadow-of-old-design into enforce-of-old-design before requires_schema landed. Re-arming the cron should wait until at least 1 week of requires_schema=true shadow data accumulates against high-stakes prompts only.

Backward-compat notes

- Old contracts (pre-v5) with required_response_mode == "schema" but no requires_schema field will NOT trigger L4. They fall through to the legacy prose gate. Safe.
- v4 tests that hard-coded required_response_mode == "schema" were updated in this rollout. New test suites D10 (requires_schema) and D11 (block dedupe) cover the new behavior. D1 C16 was split into C16a (casual) and C16b (high-stakes).
- The Mac status-box and Chrome-profile guards no longer block on quality-of-output concerns. The hard_rule_mac_status_box_specific.md and hard_rule_mac_chrome_default_profile.md rules in MEMORY.md remain, but enforcement is now soft. --guest and --incognito still hard-block on Chrome.

Test totals (post-v5)

| Suite | Cases | Pass |
|-------|-------|------|
| `test_d1_contract_intake.py` | 17 (was 16; C16 split) | 17/17 |
| `test_d4_d6_validators.py` | 20 | 20/20 |
| `test_d7_l4_schema.py` | 26 | 26/26 |
| `test_d8_l4_wiring.py` | 13 | 13/13 |
| `test_d9_test_marker_guard.py` | 11 | 11/11 |
| `test_d10_requires_schema.py` | 14 (NEW) | 14/14 |
| `test_d11_block_dedupe.py` | 3 (NEW) | 3/3 |
| `test_v3_fixes.py` | 54 | 54/54 |
| Total | 158 | 158/158 |

Files touched in this rollout

Modified:
- /root/.claude/hooks/brian-contract-intake.py — added HIGH_STAKES_VERBS, PROOF_REQUEST, NOUN_FORM_HIGH_STAKES, is_high_stakes(). Contract gains requires_schema + high_stakes_reason fields.
- /root/.claude/hooks/brian-evidence-gate.py — L4 invocation gated on requires_schema is True. Three block emits write dedupe marker. Top of main() defers if marker exists.
- /root/.claude/hooks/brian-presend-evidence-gate.py — top-of-main defer + mark-on-block at both block sites.
- /root/.claude/hooks/brian-mac-status-box-guard.py — block → warn.
- /root/.claude/hooks/brian-mac-chrome-profile-guard.py — non-Default-profile path block → warn. --no-default-browser-check removed from BAD_FLAGS (codex correction). --guest/--incognito still hard-block.
- /opt/agent/tests/evidence_gate/test_d1_contract_intake.py — C16 split.
- /opt/agent/tests/evidence_gate/test_d8_l4_wiring.py — intake prompts switched to high-stakes.
- /root/.claude/projects/-/memory/MEMORY.md — ENFORCEMENT LAYER bullet rewritten.

Created:
- /root/.claude/hooks/_brian_stop_dedupe.py — shared marker helper.
- /opt/agent/tests/evidence_gate/test_d10_requires_schema.py — high-stakes split coverage.
- /opt/agent/tests/evidence_gate/test_d11_block_dedupe.py — cascade dedupe coverage.

Removed:
- */30 * * * * L4 auto-flip cron (was in crontab; backup at /tmp/crontab_pre_l4_remove_*.bak).

Round-table receipts

- /tmp/round_table_260509_post_cleanup_replan.md: the question.
- /tmp/rt_codex_260509.txt: codex verdict (C, high confidence).
- /tmp/rt_gemini_260509.txt: gemini verdict (C, high confidence).
- /tmp/codex_parallel_260509.txt: codex parallel-test verdict + the false-positive catches that fed the noun-form guard + the chrome BAD_FLAGS narrowing.

Re-arming the L4 enforce path (when corpus is ready)

1. Verify BRIAN_L4_MODE=shadow has been on for ≥1 week with real session traffic.
2. Read /opt/agent/data/agent_runtime/l4_shadow/*/ and classify each verdict as true-positive / false-positive on requires_schema=true turns only. Ignore requires_schema=false shadow rows: those should not exist post-v5. If they do, intake misclassified; fix that first.
3. When 3 consecutive sessions with requires_schema=true traffic show no novel false-block class, set BRIAN_L4_MODE=enforce in /root/.claude/settings.json.
4. Re-add the auto-revert cron only after 1 week stable in enforce.
