2026-05-09 13:30 (Beirut) (backfill from DOCUMENTATION/)

Brian Evidence-Gate System — Operator's Guide (v4)

⚠️ SUPERSEDED 2026-05-09. This v4 describes the system before the post-cleanup replan. Read v5 for current behavior: brian_evidence_gate_operators_guide_260509.md. v4 is kept for historical reference only — the schema-enforcement-on-every-action-turn model documented here is no longer accurate.

Built 2026-05-06 → 2026-05-07 | Reference architecture: brian_evidence_gate_system_260506.md
v4 (2026-05-07): Round-table-confirmed v3 hardenings — 13 substantive fixes from codex+gemini reviews; 105/107 tests passing (2 pre-existing test-isolation issues in D3, unrelated).

What it does (one paragraph)

Stops Brian (Claude Opus 4.7) from claiming "done / fixed / shipped / blocked / send-me-X" without verifiable evidence in a tamper-evident ledger. Every tool call Brian makes is logged in a hash chain. Every assistant message is gated: claim phrases and substantive-but-unframed turns are blocked unless an HMAC-signed VALIDATOR_PASS exists for the claim. Override prompts ("skip the test", "force this") are audited; a threshold breach surfaces in MEMORY.md. Hook crashes are tracked, and a threshold breach is written to MEMORY.md, so silent fail-open windows are visible.
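To make "tamper-evident" concrete, here is a minimal sketch of a hash-chained log, assuming each entry stores the SHA-256 of the previous entry. The field names (`prev_hash`, `entry_hash`, `seq`) and both helper functions are illustrative assumptions, not the ledger hook's actual schema.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first entry


def _digest(body: dict) -> str:
    # Canonical JSON (sorted keys) keeps the hash stable across runs.
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def append_entry(chain: list, payload: dict) -> dict:
    """Append a payload, linking it to the previous entry's hash."""
    prev = chain[-1]["entry_hash"] if chain else GENESIS
    body = {"seq": len(chain), "prev_hash": prev, "payload": payload}
    entry = dict(body, entry_hash=_digest(body))
    chain.append(entry)
    return entry


def verify_chain(chain: list) -> bool:
    """Return True only if every link and every entry hash still matches."""
    prev = GENESIS
    for entry in chain:
        body = {k: v for k, v in entry.items() if k != "entry_hash"}
        if entry["prev_hash"] != prev or entry["entry_hash"] != _digest(body):
            return False
        prev = entry["entry_hash"]
    return True
```

Editing any earlier entry changes its recomputed hash, so verification fails at that entry and the tampering is detectable.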

The 4 live hooks (wired in `/root/.claude/settings.json`)

Hook	Stage	What it does
`/root/.claude/hooks/brian-evidence-ledger.py`	PostToolUse	Appends hash-chained JSONL entry + raw artifact dir per tool call. Source of truth for "what Brian did this turn."
`/root/.claude/hooks/brian-contract-intake.py`	UserPromptSubmit	When prompt has action verbs, writes `/contracts/<sid>/<seq>.yml` with objective + structured target_fields (paths, urls, command_hints, backtick_tokens, keywords) + capability ladder + expected verification. Validators bind to target_fields, not bag-of-words objective.
`/root/.claude/hooks/brian-evidence-gate.py`	Stop	Extracts claim phrases + checks substantive-turn framing. BLOCKs assistant message if no HMAC-verified VALIDATOR_PASS exists for the claim.
`/root/.claude/hooks/brian-override-audit.py`	UserPromptSubmit	Detects gate-bypass phrases in your prompts; logs + thresholds at 3/24h → MEMORY.md entry + COMMS notice.

Plus shared crash recorder: /root/.claude/hooks/_brian_hook_crash.py — every hook calls record_crash(name, exc) in its except block; threshold 3 crashes/hour for the same hook → MEMORY.md entry.

Reload semantics: Claude Code reads /root/.claude/settings.json at session START. Mid-session edits do NOT affect the current session. Restart CC to pick up hook changes.

Hook fail mode (v4 — production-honest)

Hook	Fail-open	Fail-closed	Crash visibility
ledger (PostToolUse)	exception → exit 0 + `_hook_errors.log`	none — tool execution must not block on ledger bugs	crash recorder fires; MEMORY.md if threshold
gate (Stop)	transcript-read errors, empty-extract on parseable JSONL	extract_anomaly: `parse_status=unparseable` (zero JSON) OR `parse_status=schema_drift` (assistant turn lines exist but none have type:assistant/system)	crash recorder fires; threshold writes MEMORY.md
contract-intake (UPS)	hook errors don't block prompt; missing contract = validators REFUSE downstream	none	crash recorder fires
override-audit (UPS)	hook errors don't block prompt	none	crash recorder fires

When 3+ crashes of the same hook hit within 1 hour: per-hook MEMORY.md entry written (not shared, no overwrites), MEMORY.md index updated under flock, breach marker keyed by hook+date prevents spam.
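The threshold rule above can be sketched as a pure function. This is a hedged illustration only: the in-memory list of crashes, the marker format `hook:YYYY-MM-DD`, the use of UTC for the date, and the function name `should_fire` are all assumptions, and the real recorder reads `_hook_crashes.jsonl` and writes MEMORY.md under flock.

```python
import time

WINDOW_SECONDS = 3600  # "within 1 hour"
THRESHOLD = 3          # "3+ crashes of the same hook"


def should_fire(crashes, hook, now, fired_markers):
    """Decide whether a breach entry should be written for `hook`.

    crashes: list of (hook_name, unix_timestamp) tuples.
    fired_markers: set of "hook:YYYY-MM-DD" strings already fired (mutated here).
    """
    recent = [ts for name, ts in crashes
              if name == hook and 0 <= now - ts <= WINDOW_SECONDS]
    if len(recent) < THRESHOLD:
        return False
    # Marker keyed by hook + date so a crash loop cannot spam entries.
    marker = f"{hook}:{time.strftime('%Y-%m-%d', time.gmtime(now))}"
    if marker in fired_markers:
        return False
    fired_markers.add(marker)
    return True
```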

The operational loop

your prompt → contract-intake hook writes contract YAML w/ structured target_fields
            ↓
Brian uses tools → ledger hook writes hash-chained entries + raw artifacts
            ↓
Brian invokes brian_validator.py → checks args HARD-BIND to contract.target_fields,
                                   scans ledger, writes HMAC-signed VALIDATOR_PASS
            ↓
Brian drafts message → gate hook reads transcript + ledger → BLOCK or PASS

If a step is missing, validators REFUSE with a reason such as "validator arg path=X does not match any contract.target_fields entry". Inspect the contracts dir first.

CLI: `/opt/agent/scripts/brian_validator.py`

Not on PATH. Use the absolute path. Brian invokes this to assert "evidence exists for my claim." Runner verifies args against policy + contract.target_fields, scans ledger, writes HMAC-signed VALIDATOR_PASS on success.
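For readers unfamiliar with HMAC signing, the following is a minimal sketch of how a VALIDATOR_PASS record could be signed and later verified. HMAC-SHA256 over canonical JSON, the `hmac` field name, and both function names are assumptions; the guide does not specify the runner's actual record layout or digest.

```python
import hashlib
import hmac
import json


def sign_pass(key: bytes, record: dict) -> dict:
    """Return the record plus an HMAC-SHA256 signature over its canonical JSON."""
    msg = json.dumps(record, sort_keys=True).encode()
    sig = hmac.new(key, msg, hashlib.sha256).hexdigest()
    return dict(record, hmac=sig)


def verify_pass(key: bytes, signed: dict) -> bool:
    """Recompute the signature and compare it in constant time."""
    record = {k: v for k, v in signed.items() if k != "hmac"}
    msg = json.dumps(record, sort_keys=True).encode()
    expected = hmac.new(key, msg, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, str(signed.get("hmac", "")))
```

Verification needs the same key that signed the record, which is why replacing the key file makes every earlier VALIDATOR_PASS unverifiable.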

Six validator classes:

Validator	For claim	What it checks	Bound to contract field
`command_zero_exit`	done	Specific shell command ran with exit 0	command_hints, backtick_tokens, paths
`pytest_pass`	done	N successful pytest runs against a specific target	paths, backtick_tokens
`file_diff_present`	done	An Edit/Write/MultiEdit ran against the named path	paths
`http_2xx`	done	A fetch tool returned an explicit 2xx pattern in stdout for the URL	urls, backtick_tokens
`fetch_attempt_proof`	delegation	≥3 distinct fetch tools attempted and failed for the keyword	(keyword fallback only)
`ladder_exhaustion_proof`	blocked	All capability-ladder hops in the contract were attempted, recent failure exists	(multi-token hop match)

v4 critical change: validator args MUST hard-bind to a SPECIFIC contract field, not just share keywords with the objective text. e.g. file_diff_present REFUSES if the supplied path doesn't substring-match an entry in contract.target_fields.paths. Legacy soft-fallback (bag-of-words on objective text) only triggers when ALL of {paths, urls, command_hints, backtick_tokens, keywords} are empty (legacy contracts only).
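A rough sketch of the hard-bind decision, under stated assumptions: the ranked field lists mirror the "Bound to contract field" column above, the match is a plain substring test in either direction (the guide does not say which direction the real check uses), and `hard_bind` is a hypothetical name rather than the function in `brian_validator.py`.

```python
RANKED_FIELDS = {
    "command_zero_exit": ["command_hints", "backtick_tokens", "paths"],
    "pytest_pass": ["paths", "backtick_tokens"],
    "file_diff_present": ["paths"],
    "http_2xx": ["urls", "backtick_tokens"],
}
ALL_FIELDS = ["paths", "urls", "command_hints", "backtick_tokens", "keywords"]


def hard_bind(validator: str, arg_value: str, target_fields: dict) -> str:
    """Return 'bound', 'refused', or 'legacy_fallback' for one validator arg."""
    if not arg_value:
        return "refused"  # empty args would trivially substring-match
    if all(not target_fields.get(f) for f in ALL_FIELDS):
        # Legacy contract with no structured fields: the caller would fall
        # back to keyword overlap on the objective text.
        return "legacy_fallback"
    for field in RANKED_FIELDS.get(validator, []):
        for entry in target_fields.get(field, []):
            if entry in arg_value or arg_value in entry:
                return "bound"
    return "refused"
```

Note that `keywords` is deliberately absent from the ranked lists, so an argument that only shares an objective keyword is refused.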

Usage:

/opt/agent/scripts/brian_validator.py \
    --session <sid> \
    --claim done \
    --validator command_zero_exit \
    --args '{"command_substring":"<cmd-fragment-from-objective>","keyword":"<other>"}'

Exit codes:
- 0 = PASS (HMAC entry written to ledger)
- 2 = FAIL (no matching evidence — keep working, run more tools)
- 3 = REFUSED (wrong validator-for-claim, weak args, no contract, or arg fails hard-bind)
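If the validator is driven from a script, the exit codes can be mapped to outcomes as in this sketch. The CLI path and flags are taken from the usage above; the wrapper functions (`build_command`, `interpret`, `run_validator`) are hypothetical helpers, not part of the shipped tooling.

```python
import json
import subprocess

VALIDATOR = "/opt/agent/scripts/brian_validator.py"
EXIT_MEANINGS = {0: "PASS", 2: "FAIL", 3: "REFUSED"}


def build_command(session: str, claim: str, validator: str, args: dict) -> list:
    """Assemble the argv list using the flags shown in the usage example."""
    return [VALIDATOR, "--session", session, "--claim", claim,
            "--validator", validator, "--args", json.dumps(args)]


def interpret(returncode: int) -> str:
    """Translate an exit code; anything undocumented is reported as unknown."""
    return EXIT_MEANINGS.get(returncode, f"UNKNOWN({returncode})")


def run_validator(session: str, claim: str, validator: str, args: dict) -> str:
    """Run the CLI and return PASS / FAIL / REFUSED / UNKNOWN(n)."""
    proc = subprocess.run(build_command(session, claim, validator, args),
                          capture_output=True, text=True)
    return interpret(proc.returncode)
```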

Where things live

/opt/agent/data/agent_runtime/
├── ledger/<session_id>.jsonl              # hash-chained tool-call log
├── ledger/_archive/<YYYY-MM>/              # rotated >30d, atomic gzip
├── ledger/_gate_extraction_errors.log      # gate's empty-extraction events
├── ledger/_hook_errors.log                 # ledger-hook errors
├── artifacts/<session_id>/<seq>/           # raw cmd + stdout + stderr per call
├── artifacts/_archive/<YYYY-MM>/<sid>.tar.gz  # rotated >30d, atomic
├── contracts/<session_id>/<seq>.yml        # per-turn objective + target_fields + ladder
├── contracts/_archive/<YYYY-MM>/<sid>/<seq>.yml.gz  # preserves session subdir
├── overrides/<session_id>.jsonl            # detected override prompts
├── overrides/_breach_marker.json           # one-per-24h spam guard
├── _hook_crashes.jsonl                     # all hook crashes (across hooks)
├── _hook_crash_breach.json                 # one-per-hook-per-day breach marker
└── _rotate.log                             # rotation cron output

/opt/agent/scripts/
├── brian_validator.py                      # the validator runner CLI
└── evidence_gate_rotate.py                 # daily rotation cron (03:30 UTC)

/opt/agent/tests/evidence_gate/             # stable test location
├── test_gate.py                            # 19 cases — gate logic
├── test_d1_contract_intake.py              # 15 cases — contract intake
├── test_d2_schema.py                       # 11 cases — schema enforcement
├── test_d3_override.py                     # 25 cases — override audit (T1, F3 isolation issues)
├── test_d4_d6_validators.py                # 15 cases — D4-D6 validators
├── test_v3_fixes.py                        # 22 cases — v3, r2, r3 fixes
└── test_e2e.sh                             # 5 steps — full integration

/root/.claude/system/
├── policies/claim_binding.yaml             # claim-type → allowed validators (Jonah-owned, chmod 0444)
├── validators/*.yaml                       # 6 validator definitions (Jonah-owned, chmod 0444)
└── secrets/validator_hmac.key              # HMAC signing key (chmod 0400 root)

/root/.claude/projects/-/memory/
├── feedback_override_pattern.md            # threshold breach trail (override-audit)
└── feedback_hook_crash_<hook-name>.md      # per-hook crash threshold breach trail (v4)

Common ops

Did the gate BLOCK on a session? Find the transcript file, then re-run the gate against it:

SID="<session_id>"
# v4 caveat: -print -quit picks ONE match. If two transcripts share IDs (rare),
# inspect manually with `find ... -name "${SID}.jsonl" -printf '%T@ %p\n' | sort -rn`.
TRANSCRIPT="$(find /root/.claude/projects -name "${SID}.jsonl" -print -quit)"
[ -n "$TRANSCRIPT" ] || { echo "transcript for $SID not found"; exit 1; }
echo "{\"session_id\":\"$SID\",\"transcript_path\":\"$TRANSCRIPT\"}" \
  | /root/.claude/hooks/brian-evidence-gate.py

Stdout = {"decision": "block", ...} if it blocked. Stderr = the BLOCK message. Empty output = pass.

Did extraction fail (different problem)? Check the extraction-errors log:

tail /opt/agent/data/agent_runtime/ledger/_gate_extraction_errors.log

Lines record parse_status so you can distinguish:
- parse_status=ok + text_empty=True — benign in-flight turn (tool calls only, no text yet)
- parse_status=unparseable — zero JSON parses (transcript schema fully broken) → fail-closed BLOCK fired
- parse_status=schema_drift — assistant turn lines exist but none have recognized type → BLOCK fired
- parse_status=unreadable — file missing or unreadable
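The four statuses can be read as a small decision procedure. The sketch below is an approximation under loud assumptions: it treats a line as an "assistant turn" when it carries `message.role == "assistant"`, and treats `assistant` and `system` as the recognized `type` values. The real gate's detection logic may differ.

```python
import json

RECOGNIZED_TYPES = {"assistant", "system"}


def _looks_like_assistant_turn(obj: dict) -> bool:
    # Assumption: assistant turns carry message.role == "assistant".
    msg = obj.get("message")
    return isinstance(msg, dict) and msg.get("role") == "assistant"


def classify_lines(lines) -> str:
    """Classify transcript lines as ok / unparseable / schema_drift."""
    parsed = []
    for line in lines:
        try:
            obj = json.loads(line)
        except (json.JSONDecodeError, TypeError):
            continue
        if isinstance(obj, dict):
            parsed.append(obj)
    if not parsed:
        return "unparseable"  # zero JSON objects parsed
    has_turns = any(_looks_like_assistant_turn(o) for o in parsed)
    has_known_type = any(o.get("type") in RECOGNIZED_TYPES for o in parsed)
    if has_turns and not has_known_type:
        return "schema_drift"
    return "ok"


def classify_file(path: str) -> str:
    """Same classification, plus 'unreadable' for missing or unreadable files."""
    try:
        with open(path, encoding="utf-8") as fh:
            return classify_lines(fh.readlines())
    except OSError:
        return "unreadable"
```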

Did a hook crash? v4 adds visibility:

tail /opt/agent/data/agent_runtime/_hook_crashes.jsonl
ls /root/.claude/projects/-/memory/feedback_hook_crash_*.md  # per-hook breach trails
cat /opt/agent/data/agent_runtime/_hook_crash_breach.json

Did Jonah override the gate today?

ls /opt/agent/data/agent_runtime/overrides/
tail /root/.claude/projects/-/memory/feedback_override_pattern.md

Reset the 24h breach markers (after recalibration conversation):

rm -f /opt/agent/data/agent_runtime/overrides/_breach_marker.json
rm -f /opt/agent/data/agent_runtime/_hook_crash_breach.json  # v4: separate marker for hook crashes

Run rotation manually:

python3 /opt/agent/scripts/evidence_gate_rotate.py
# or override defaults: EG_HOT_DAYS=14 EG_MAX_HOT_BYTES=20000000 python3 ...

Add a new validator (4 steps — all required):
1. Write /root/.claude/system/validators/<id>.yaml describing the predicate (declarative metadata only), then chmod 0444 and chown root:root it.
2. Add the actual hardcoded predicate logic to evaluate_validator() in /opt/agent/scripts/brian_validator.py — the YAML is documentation, the Python is behavior.
3. Reference <id> in /root/.claude/system/policies/claim_binding.yaml under required_validator_classes for the claim, AND under min_validator_args if any policy keys apply. Add the validator to VALIDATOR_FIELD_MAP inside check_claim_allows_validator() so target-field hard-binding picks the right ranked fields.
4. Add a regression test in /opt/agent/tests/evidence_gate/.

Rotate the HMAC key (DESTRUCTIVE — invalidates all existing VALIDATOR_PASS entries; schedule between sessions):

# Schedule when no active CC sessions hold uncommitted claims.
python3 -c "import secrets; open('/root/.claude/system/secrets/validator_hmac.key','w').write(secrets.token_hex(32))"
chmod 0400 /root/.claude/system/secrets/validator_hmac.key
chown root:root /root/.claude/system/secrets/validator_hmac.key

Warning — single-key model has no safe rotation window. Every prior VALIDATOR_PASS entry becomes cryptographically unverifiable the moment the key is replaced. If a session is mid-claim, its next "done" message will BLOCK until Brian re-runs the relevant validator. Day 4 backlog item: implement key versioning (current + previous overlap) for non-destructive rotation.

Test suites

python3 /opt/agent/tests/evidence_gate/test_gate.py                  # 19 cases
python3 /opt/agent/tests/evidence_gate/test_d1_contract_intake.py    # 15 cases
python3 /opt/agent/tests/evidence_gate/test_d2_schema.py             # 11 cases
python3 /opt/agent/tests/evidence_gate/test_d3_override.py           # 25 cases (2 known isolation)
python3 /opt/agent/tests/evidence_gate/test_d4_d6_validators.py      # 15 cases
python3 /opt/agent/tests/evidence_gate/test_v3_fixes.py              # 22 cases (v3+r2+r3)
bash    /opt/agent/tests/evidence_gate/test_e2e.sh                   # 5 steps

Total: 107 test cases across the 6 Python suites, plus the 5-step e2e script (7 suites in all). 105 pass. T1 + F3 in D3 are pre-existing test-isolation issues from prior sessions; not v3-related.

Round-table review history

Round	Members	Defects found
R1 design	Codex / Gemini / Vibe	5 (Codex R3 patches)
R4 Gemini code review	Gemini	3 (gate hook bugs)
R6 Codex impl review	Codex	5 (HMAC, race, gate)
D2 schema	Codex / Gemini / Vibe	4
D3 override	Codex / Gemini / Jules	8
D4-D6 validators	Codex / Gemini / Jules	6
Guide v1-v3 reviews	Codex / Gemini / Jules	19 (across 3 iterations)
v3 fixes review	Codex / Gemini	13 (round 1)
v3-r2 hardenings	Codex	2 more (schema_drift scope, time-true backward stream)

Total: 65 unique defects caught + fixed across 9 rounds. All 7 test suites green except 2 pre-existing D3 isolation issues.

v3 hardenings (2026-05-07) — what changed from v2 (Day 1)

Fix #1: Extraction anomaly detection (no more 5MB threshold)
- v2: BLOCK if transcript >5MB AND extracted text empty.
- v3: BLOCK based on parse_status (unparseable / schema_drift), not file size. 200K-context Opus sessions no longer false-BLOCK. Schema drift catches CC mid-flight format changes.

Fix #2: Hook crash watchdog
- New _brian_hook_crash.py shared module. All 4 hooks call record_crash in except block.
- Threshold 3/h same hook → per-hook MEMORY.md entry + breach marker (one-fire-per-day).
- v3-r3: time-based backward stream (chunk-by-chunk to BOF or 1h-ago) — bounded memory + truly time-based scan.

Fix #3: Ledger / artifact / contract / override rotation
- New evidence_gate_rotate.py cron at 03:30 UTC daily.
- Files >30d → atomic gzip + move to _archive/<YYYY-MM>/.
- Hot ledger >50MB → trim oldest under per-session flock; tail of 10K lines stays hot; head archived; append-only .chain sidecar preserves chain history.
- Active-session protection: never archive entries mtime'd in last 1h.
- Contracts archive preserves session_id subdir so collisions across sessions don't overwrite.
- Files in _archive/ >365d → deleted.
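The "atomic gzip + move" step for a single file might look like the sketch below. It assumes the archive bucket is derived from the file's mtime in UTC and that "atomic" means writing to a temporary name and renaming; `archive_old_file` is a hypothetical helper, not the function in `evidence_gate_rotate.py`, and the hot-ledger trimming and flock handling are not shown.

```python
import gzip
import os
import shutil
import time
from pathlib import Path
from typing import Optional

HOT_DAYS = 30  # files older than this are archived


def archive_old_file(src: Path, archive_root: Path, now: float) -> Optional[Path]:
    """Gzip `src` into archive_root/<YYYY-MM>/ if it is older than HOT_DAYS."""
    mtime = src.stat().st_mtime
    if now - mtime <= HOT_DAYS * 86400:
        return None  # still hot; leave it in place
    dest_dir = archive_root / time.strftime("%Y-%m", time.gmtime(mtime))
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / (src.name + ".gz")
    tmp = dest.with_name(dest.name + ".tmp")
    with open(src, "rb") as fin, gzip.open(tmp, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    # os.replace is atomic on the same filesystem, so readers never see a
    # half-written archive under the final name.
    os.replace(tmp, dest)
    src.unlink()
    return dest
```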

Fix #4: Hard contract-field binding
- v2: validator args must share ≥1 word with full objective text (bag-of-words).
- v3: contract-intake extracts structured target_fields.{paths,urls,backtick_tokens,command_hints,keywords}. Validator's command_must_match_objective_keywords policy now requires substring match against the appropriate target_field list. e.g., file_diff_present.path must match contract.paths.
- Soft fallback (legacy keyword overlap) only when ALL target_fields are empty (pre-v3 contracts).
- v3-r2: removed keywords from ranked_fields for command_zero_exit / pytest_pass / http_2xx so an unrelated command containing a single objective keyword can no longer bypass.
- v3-r2: PATH_RE now matches single-segment paths like /Makefile, /etc/hosts. Added RELPATH_RE for src/foo.py and ./relative. BACKTICK_RE 200-char cap removed.
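To illustrate the kind of extraction described above, here is a sketch with stand-in regexes. These patterns are assumptions written for this example and are not the real `PATH_RE`, `RELPATH_RE`, or `BACKTICK_RE`; `command_hints` and `keywords` extraction is omitted.

```python
import re

URL_RE = re.compile(r"https?://[^\s`<>]+")
# Absolute paths, including single-segment ones such as /Makefile.
PATH_RE = re.compile(r"(?<![\w.])/(?:[\w.\-]+/)*[\w.\-]+")
# Relative paths such as src/foo.py and ./relative.
RELPATH_RE = re.compile(
    r"(?<![\w./\-])(?:\./[\w.\-]+(?:/[\w.\-]+)*|[\w\-]+(?:/[\w.\-]+)+)"
)
# Backtick-quoted tokens, with no length cap.
BACKTICK_RE = re.compile(r"`([^`\n]+)`")

_TRAILING = ".,;:"


def extract_target_fields(prompt: str) -> dict:
    """Pull urls, paths, and backtick tokens out of a prompt."""
    urls = [u.rstrip(_TRAILING) for u in URL_RE.findall(prompt)]
    # Blank out URLs first so their path components are not double-counted.
    without_urls = URL_RE.sub(" ", prompt)
    abs_paths = [p.rstrip(_TRAILING) for p in PATH_RE.findall(without_urls)]
    rel_paths = [p.rstrip(_TRAILING) for p in RELPATH_RE.findall(without_urls)]
    return {
        "paths": abs_paths + rel_paths,
        "urls": urls,
        "backtick_tokens": BACKTICK_RE.findall(prompt),
    }
```

A sketch this simple will also treat slash-separated prose (for example "done/fixed") as a relative path, so a production extractor needs tighter rules.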

Known weaknesses (still honest)

Lexical "blocked" false-positives in discussion context. Gate catches "blocked" inside test descriptions / hedged talk ("what gets blocked that shouldn't"). Tightening fix queued (~15 min): exempt "blocked" inside backtick-quoted regions and after negation modifiers.
Trust boundary partly honor-system. HMAC key is chmod 0400 root, but Brian process currently runs as root in dev. Real OS isolation needs uid separation (Day 4+).
No real-world calibration data yet. All tests synthetic. First days of real usage will surface tuning gaps. Watch the extraction-errors log + override audit log + hook crash log for signal.
HMAC rotation breaks in-flight VALIDATOR_PASS entries. Schedule rotations between sessions or implement key versioning (Day 4+).
D3 test isolation — T1 + F3 in test_d3_override.py have pre-existing isolation issues (other tests pollute OVERRIDES dir before T1 runs). Doesn't affect production behavior.

What's NOT yet built

Day 3: calibration cron — weekly replay of historical bypass cases (the 260505 postmortem set + euphemism paraphrases) against the gate. Tracks false-accept rate (worse) + false-reject rate.
Day 4+: context compactor (Gemini's gap), uid separation, override-audit dashboard, HMAC key versioning, .chain sidecar consumption (currently advisory).

Architectural axiom (Codex R5, locked)

Absence of an approved validator for the claimed scope is itself a blocking condition; Brian may report only attempted actions and uncertainty, not completion.

Locked decisions — 2026-05-07 ratification round

Decision	Value	Authority
L4 fenced-block name	`brian_final`	Jonah ratified 2026-05-07 22:54 Beirut
Plan-vs-drift filter	Hard MEMORY.md rule — every open item carries `origin_workstream` / `why_it_matters` / `critical_path: yes/no`; mixing workstreams in one status report is itself a behavioral failure	Jonah ratified 2026-05-07 22:54 Beirut
Fail-closed gate	Outer crash handler exits 2, not 0; `stop_hook_active` carve-out preserved at `__main__` scope	Landed 2026-05-07 22:57 Beirut

L4 schema enforcement — design contract (in flight)

Actionable turns must carry one fenced JSON block:

```brian_final
{
  "type": "done|partial|blocked|delegation",
  "summary": "...",
  "claims": [{"claim_type":"done","object":"...","validator_id":"...","ledger_entry_id":"...","artifact_hash":"..."}],
  "what_succeeded": "...",
  "what_remains": "...",
  "next_route_or_question": "..."
}
```

Slot rules:
- done → ≥1 claim with signed ledger evidence
- blocked → ladder_exhaustion_proof validator
- delegation → fetch_attempt_proof validator
- partial → three human-readable slots, may include zero done claims

Integration points:
- L1 contract-intake writes intent: actionable and required_response_mode: schema into the contract YAML (current intake only emits a contract when ACTION_VERBS matches, so the file's existence already encodes "actionable")
- L3 Stop hook reads contract; if actionable, requires fenced block; parses JSON via stdlib only; validates slots
- Phrase tripwire runs only on prose-remainder after fenced region is stripped

Hard sequencing per Codex R2: fail-closed → schema → calibration. Block format = JSON, not YAML (stdlib parser, no version drift).
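As a sketch of the slot rules and the stdlib-only parsing requirement, the function below extracts the fenced block, validates it, and returns the prose remainder for the phrase tripwire. It assumes a "claim with signed ledger evidence" can be approximated by a non-empty `ledger_entry_id` (real signature verification is out of scope here); `check_turn` and the error strings are hypothetical.

```python
import json
import re

FENCE = "`" * 3  # built indirectly so this example nests inside a fenced block
FENCE_RE = re.compile(
    re.escape(FENCE) + r"brian_final\s*\n(.*?)\n" + re.escape(FENCE), re.DOTALL
)
REQUIRED_VALIDATOR = {
    "blocked": "ladder_exhaustion_proof",
    "delegation": "fetch_attempt_proof",
}
PARTIAL_SLOTS = ("what_succeeded", "what_remains", "next_route_or_question")


def check_turn(text: str):
    """Return (errors, prose_remainder) for one assistant message."""
    match = FENCE_RE.search(text)
    if not match:
        return ["missing brian_final block"], text
    remainder = text[: match.start()] + text[match.end():]
    try:
        block = json.loads(match.group(1))
    except json.JSONDecodeError:
        return ["brian_final block is not valid JSON"], remainder
    if not isinstance(block, dict):
        return ["brian_final block must be a JSON object"], remainder

    kind = block.get("type")
    raw = block.get("claims")
    claims = [c for c in raw if isinstance(c, dict)] if isinstance(raw, list) else []
    errors = []
    if kind == "done":
        if not any(c.get("claim_type") == "done" and c.get("ledger_entry_id")
                   for c in claims):
            errors.append("done requires at least one claim with a ledger_entry_id")
    elif kind in REQUIRED_VALIDATOR:
        needed = REQUIRED_VALIDATOR[kind]
        if not any(c.get("validator_id") == needed for c in claims):
            errors.append(f"{kind} requires a {needed} claim")
    elif kind == "partial":
        missing = [s for s in PARTIAL_SLOTS if not block.get(s)]
        if missing:
            errors.append("partial is missing slots: " + ", ".join(missing))
    else:
        errors.append(f"unknown type: {kind!r}")
    return errors, remainder
```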
