⚠️ SUPERSEDED 2026-05-09. This v4 describes the system before the post-cleanup replan. Read v5 for current behavior: brian_evidence_gate_operators_guide_260509.md. v4 is kept for historical reference only; the schema-enforcement-on-every-action-turn model documented here is no longer accurate.
Built 2026-05-06 → 2026-05-07 | Reference architecture: brian_evidence_gate_system_260506.md
v4 (2026-05-07): Round-table-confirmed v3 hardenings — 13 substantive fixes from codex+gemini reviews; 105/107 tests passing (2 pre-existing test-isolation issues in D3, unrelated).
Stops Brian (Claude Opus 4.7) from claiming "done / fixed / shipped / blocked / send-me-X" without verifiable evidence in a tamper-evident ledger. Every tool Brian uses gets logged with a hash chain. Every assistant message gets gated — claim phrases and substantive-but-unframed turns are blocked unless an HMAC-signed VALIDATOR_PASS exists for the claim. Override prompts ("skip the test", "force this") get audited; threshold breach surfaces in MEMORY.md. Hook crashes are tracked + threshold-fired to MEMORY.md so silent fail-open windows are visible.
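To illustrate what "tamper-evident" means for the ledger, here is a minimal in-memory sketch of a hash chain. The field names (seq, prev_hash, payload, hash), the all-zero genesis value, and the use of SHA-256 over canonical JSON are assumptions for illustration; the real ledger hook's entry format is not shown in this guide.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous entry's hash together with this entry's payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(chain: list, payload: dict) -> dict:
    """Append a payload to the in-memory chain and return the stored entry."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    entry = {
        "seq": len(chain),
        "prev_hash": prev_hash,
        "payload": payload,
        "hash": entry_hash(prev_hash, payload),
    }
    chain.append(entry)
    return entry


def verify_chain(chain: list) -> bool:
    """True only if every entry links to and re-hashes against its predecessor."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


chain = []
append_entry(chain, {"tool": "Bash", "exit_code": 0})
append_entry(chain, {"tool": "Edit", "path": "src/foo.py"})
print(verify_chain(chain))            # True
chain[0]["payload"]["exit_code"] = 1  # tamper with an earlier entry
print(verify_chain(chain))            # False
```

Editing any earlier entry changes its hash, which no longer matches what the next entry recorded, so the break is detectable without trusting the writer.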
Hooks (registered in /root/.claude/settings.json)

| Hook | Stage | What it does |
|---|---|---|
| /root/.claude/hooks/brian-evidence-ledger.py | PostToolUse | Appends hash-chained JSONL entry + raw artifact dir per tool call. Source of truth for "what Brian did this turn." |
| /root/.claude/hooks/brian-contract-intake.py | UserPromptSubmit | When prompt has action verbs, writes /contracts/<sid>/<seq>.yml with objective + structured target_fields (paths, urls, command_hints, backtick_tokens, keywords) + capability ladder + expected verification. Validators bind to target_fields, not bag-of-words objective. |
| /root/.claude/hooks/brian-evidence-gate.py | Stop | Extracts claim phrases + checks substantive-turn framing. BLOCKs assistant message if no HMAC-verified VALIDATOR_PASS exists for the claim. |
| /root/.claude/hooks/brian-override-audit.py | UserPromptSubmit | Detects gate-bypass phrases in your prompts; logs + thresholds at 3/24h → MEMORY.md entry + COMMS notice. |

Plus shared crash recorder: /root/.claude/hooks/_brian_hook_crash.py — every hook calls record_crash(name, exc) in its except block; threshold 3 crashes/hour for the same hook → MEMORY.md entry.
Reload semantics: Claude Code reads /root/.claude/settings.json at session START. Mid-session edits do NOT affect the current session. Restart CC to pick up hook changes.
| Hook | Fail-open | Fail-closed | Crash visibility |
|---|---|---|---|
| ledger (PostToolUse) | exception → exit 0 + _hook_errors.log | none (tool execution must not block on ledger bugs) | crash recorder fires; MEMORY.md if threshold |
| gate (Stop) | transcript-read errors, empty-extract on parseable JSONL | extract_anomaly: parse_status=unparseable (zero JSON) OR parse_status=schema_drift (assistant turn lines exist but none have type:assistant/system) | crash recorder fires; threshold writes MEMORY.md |
| contract-intake (UPS) | hook errors don't block prompt; missing contract = validators REFUSE downstream | none | crash recorder fires |
| override-audit (UPS) | hook errors don't block prompt | none | crash recorder fires |
When 3+ crashes of the same hook hit within 1 hour: per-hook MEMORY.md entry written (not shared, no overwrites), MEMORY.md index updated under flock, breach marker keyed by hook+date prevents spam.
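The threshold rule can be sketched as a pure function. This is an illustration only: the crash-record shape ({"hook", "ts"}), the marker key format, and the choice of UTC for the "date" part of the marker are assumptions, and should_fire_breach is a hypothetical name, not the API of _brian_hook_crash.py.

```python
import time

WINDOW_SECONDS = 3600  # "within 1 hour"
THRESHOLD = 3          # "3+ crashes of the same hook"


def should_fire_breach(crashes, hook, now, fired_markers):
    """Decide whether a per-hook MEMORY.md breach entry is due.

    crashes: iterable of {"hook": str, "ts": float} records (assumed shape).
    fired_markers: set of "hook:YYYY-MM-DD" keys that already fired.
    """
    recent = [
        c for c in crashes
        if c["hook"] == hook and now - c["ts"] <= WINDOW_SECONDS
    ]
    if len(recent) < THRESHOLD:
        return False
    # Marker keyed by hook + date, so one hook fires at most once per day.
    marker = f"{hook}:{time.strftime('%Y-%m-%d', time.gmtime(now))}"
    if marker in fired_markers:
        return False
    fired_markers.add(marker)
    return True
```

Keying the marker by hook and date is what keeps a crash loop in one hook from flooding MEMORY.md while still letting a different hook raise its own entry the same day.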
your prompt → contract-intake hook writes contract YAML w/ structured target_fields
↓
Brian uses tools → ledger hook writes hash-chained entries + raw artifacts
↓
Brian invokes brian_validator.py → checks args HARD-BIND to contract.target_fields,
scans ledger, writes HMAC-signed VALIDATOR_PASS
↓
Brian drafts message → gate hook reads transcript + ledger → BLOCK or PASS
If a step is missing, validators REFUSE with a reason such as "validator arg path=X does not match any contract.target_fields entry". Inspect the contracts dir first.
/opt/agent/scripts/brian_validator.py

Not on PATH. Use the absolute path. Brian invokes this to assert "evidence exists for my claim." The runner verifies args against policy + contract.target_fields, scans the ledger, and writes an HMAC-signed VALIDATOR_PASS on success.
Six validator classes:
| Validator | For claim | What it checks | Bound to contract field |
|---|---|---|---|
| command_zero_exit | done | Specific shell command ran with exit 0 | command_hints, backtick_tokens, paths |
| pytest_pass | done | N successful pytest runs against a specific target | paths, backtick_tokens |
| file_diff_present | done | An Edit/Write/MultiEdit ran against the named path | paths |
| http_2xx | done | A fetch tool returned an explicit 2xx pattern in stdout for the URL | urls, backtick_tokens |
| fetch_attempt_proof | delegation | ≥3 distinct fetch tools attempted and failed for the keyword | (keyword fallback only) |
| ladder_exhaustion_proof | blocked | All capability-ladder hops in the contract were attempted, recent failure exists | (multi-token hop match) |

v4 critical change: validator args MUST hard-bind to a SPECIFIC contract field, not just share keywords with the objective text. For example, file_diff_present REFUSES if the supplied path doesn't substring-match an entry in contract.target_fields.paths. The legacy soft fallback (bag-of-words on objective text) triggers only when ALL of {paths, urls, command_hints, backtick_tokens, keywords} are empty, which happens only with legacy contracts.
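A rough sketch of the hard-bind decision for the four field-bound validators follows. The field lists are copied from the table above; everything else is assumed for illustration: check_hard_bind and RANKED_FIELDS are hypothetical names, the match is tried in both substring directions, and an empty argument is treated as a refusal. The real runner may differ on all three.

```python
# Field lists copied from the validator table; the dict name is hypothetical.
RANKED_FIELDS = {
    "command_zero_exit": ["command_hints", "backtick_tokens", "paths"],
    "pytest_pass": ["paths", "backtick_tokens"],
    "file_diff_present": ["paths"],
    "http_2xx": ["urls", "backtick_tokens"],
}
ALL_FIELDS = ["paths", "urls", "command_hints", "backtick_tokens", "keywords"]


def check_hard_bind(validator, arg_value, target_fields):
    """Return "bound", "refused", or "legacy_fallback" for one validator arg."""
    if not arg_value:
        return "refused"  # assumption: an empty arg can never bind
    if all(not target_fields.get(f) for f in ALL_FIELDS):
        return "legacy_fallback"  # no structured fields at all: legacy contract
    for field in RANKED_FIELDS.get(validator, []):
        for entry in target_fields.get(field, []):
            # Assumption: a substring match in either direction counts.
            if arg_value in entry or entry in arg_value:
                return "bound"
    return "refused"


fields = {
    "paths": ["/opt/agent/scripts/brian_validator.py"],
    "keywords": ["deploy"],
}
print(check_hard_bind("file_diff_present", "/etc/hosts", fields))  # refused
print(check_hard_bind("command_zero_exit", "deploy", fields))      # refused
```

The second call shows the point of the change: sharing a keyword with the objective is not enough, because keywords is not among the fields that command_zero_exit binds to.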
Usage:
/opt/agent/scripts/brian_validator.py \
--session <sid> \
--claim done \
--validator command_zero_exit \
--args '{"command_substring":"<cmd-fragment-from-objective>","keyword":"<other>"}'
Exit codes:
- 0 = PASS (HMAC entry written to ledger)
- 2 = FAIL (no matching evidence — keep working, run more tools)
- 3 = REFUSED (wrong validator-for-claim, weak args, no contract, or arg fails hard-bind)
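If you script around the runner, the exit codes above map cleanly to labels. The wrapper below is a sketch: run_validator and interpret_exit are hypothetical helper names, only the flags shown under Usage are passed, and returning stderr as the explanation is an assumption about where the runner prints its reasons.

```python
import subprocess

VALIDATOR = "/opt/agent/scripts/brian_validator.py"  # absolute path; not on PATH
EXIT_MEANINGS = {0: "PASS", 2: "FAIL", 3: "REFUSED"}


def interpret_exit(code: int) -> str:
    """Translate a runner exit code into the label used in this guide."""
    return EXIT_MEANINGS.get(code, f"UNKNOWN({code})")


def run_validator(session_id, claim, validator, args_json):
    """Invoke the runner with the documented flags and label the outcome."""
    proc = subprocess.run(
        [
            VALIDATOR,
            "--session", session_id,
            "--claim", claim,
            "--validator", validator,
            "--args", args_json,
        ],
        capture_output=True,
        text=True,
    )
    # Assumption: any human-readable reason is on stderr.
    return interpret_exit(proc.returncode), proc.stderr.strip()
```

Passing the arguments as a list avoids shell quoting problems with the JSON in --args.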
/opt/agent/data/agent_runtime/
├── ledger/<session_id>.jsonl # hash-chained tool-call log
├── ledger/_archive/<YYYY-MM>/ # rotated >30d, atomic gzip
├── ledger/_gate_extraction_errors.log # gate's empty-extraction events
├── ledger/_hook_errors.log # ledger-hook errors
├── artifacts/<session_id>/<seq>/ # raw cmd + stdout + stderr per call
├── artifacts/_archive/<YYYY-MM>/<sid>.tar.gz # rotated >30d, atomic
├── contracts/<session_id>/<seq>.yml # per-turn objective + target_fields + ladder
├── contracts/_archive/<YYYY-MM>/<sid>/<seq>.yml.gz # preserves session subdir
├── overrides/<session_id>.jsonl # detected override prompts
├── overrides/_breach_marker.json # one-per-24h spam guard
├── _hook_crashes.jsonl # all hook crashes (across hooks)
├── _hook_crash_breach.json # one-per-hook-per-day breach marker
└── _rotate.log # rotation cron output
/opt/agent/scripts/
├── brian_validator.py # the validator runner CLI
└── evidence_gate_rotate.py # daily rotation cron (03:30 UTC)
/opt/agent/tests/evidence_gate/ # stable test location
├── test_gate.py # 19 cases — gate logic
├── test_d1_contract_intake.py # 15 cases — contract intake
├── test_d2_schema.py # 11 cases — schema enforcement
├── test_d3_override.py # 25 cases — override audit (T1, F3 isolation issues)
├── test_d4_d6_validators.py # 15 cases — D4-D6 validators
├── test_v3_fixes.py # 22 cases — v3, r2, r3 fixes
└── test_e2e.sh # 5 steps — full integration
/root/.claude/system/
├── policies/claim_binding.yaml # claim-type → allowed validators (Jonah-owned, chmod 0444)
├── validators/*.yaml # 6 validator definitions (Jonah-owned, chmod 0444)
└── secrets/validator_hmac.key # HMAC signing key (chmod 0400 root)
/root/.claude/projects/-/memory/
├── feedback_override_pattern.md # threshold breach trail (override-audit)
└── feedback_hook_crash_<hook-name>.md # per-hook crash threshold breach trail (v4)
Did the gate BLOCK on a session? Find the transcript file, then re-run the gate against it:
SID="<session_id>"
# v4 caveat: -print -quit picks ONE match. If two transcripts share IDs (rare),
# inspect manually with `find ... -name "${SID}.jsonl" -printf '%T@ %p\n' | sort -rn`.
TRANSCRIPT="$(find /root/.claude/projects -name "${SID}.jsonl" -print -quit)"
[ -n "$TRANSCRIPT" ] || { echo "transcript for $SID not found"; exit 1; }
echo "{\"session_id\":\"$SID\",\"transcript_path\":\"$TRANSCRIPT\"}" \
| /root/.claude/hooks/brian-evidence-gate.py
Stdout = {"decision": "block", ...} if it blocked. Stderr = the BLOCK message. Empty output = pass.
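When replaying many sessions, it can help to classify the gate's stdout programmatically. This sketch relies only on the two documented cases (a JSON object with decision "block", or empty output); gate_verdict is a hypothetical helper, and anything else is reported as "unknown" rather than guessed at.

```python
import json


def gate_verdict(stdout: str) -> str:
    """Map the gate hook's stdout to "pass", "block", or "unknown"."""
    text = stdout.strip()
    if not text:
        return "pass"  # empty output means the gate let the message through
    try:
        payload = json.loads(text)
    except json.JSONDecodeError:
        return "unknown"  # non-JSON output is outside what this guide documents
    if isinstance(payload, dict) and payload.get("decision") == "block":
        return "block"
    return "unknown"
```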
Did extraction fail (different problem)? Check the extraction-errors log:
tail /opt/agent/data/agent_runtime/ledger/_gate_extraction_errors.log
Lines record parse_status so you can distinguish:
- parse_status=ok + text_empty=True — benign in-flight turn (tool calls only, no text yet)
- parse_status=unparseable — zero JSON parses (transcript schema fully broken) → fail-closed BLOCK fired
- parse_status=schema_drift — assistant turn lines exist but none have recognized type → BLOCK fired
- parse_status=unreadable — file missing or unreadable
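The four statuses can be approximated with a short classifier. This is a simplified sketch, not the gate's code: it treats "no parsed object has type assistant or system" as schema drift, which is looser than the real check on assistant turn lines, and classify_transcript / classify_file are hypothetical names.

```python
import json

RECOGNIZED_TYPES = {"assistant", "system"}


def classify_transcript(lines):
    """Return "ok", "unparseable", or "schema_drift" for an iterable of JSONL lines."""
    parsed = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if isinstance(obj, dict):
            parsed.append(obj)
    if not parsed:
        return "unparseable"  # zero usable JSON objects
    if not any(obj.get("type") in RECOGNIZED_TYPES for obj in parsed):
        return "schema_drift"  # JSON parses, but no recognized turn type
    return "ok"


def classify_file(path):
    """Classify a transcript file, reporting "unreadable" on any OS error."""
    try:
        with open(path, "r", encoding="utf-8", errors="replace") as handle:
            return classify_transcript(handle)
    except OSError:
        return "unreadable"
```

Note that a benign in-flight turn still classifies as ok here; the text_empty flag in the log is a separate signal.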
Did a hook crash? v4 adds visibility:
tail /opt/agent/data/agent_runtime/_hook_crashes.jsonl
ls /root/.claude/projects/-/memory/feedback_hook_crash_*.md # per-hook breach trails
cat /opt/agent/data/agent_runtime/_hook_crash_breach.json
Did Jonah override the gate today?
ls /opt/agent/data/agent_runtime/overrides/
tail /root/.claude/projects/-/memory/feedback_override_pattern.md
Reset the 24h breach markers (after recalibration conversation):
rm -f /opt/agent/data/agent_runtime/overrides/_breach_marker.json
rm -f /opt/agent/data/agent_runtime/_hook_crash_breach.json # v4: separate marker for hook crashes
Run rotation manually:
python3 /opt/agent/scripts/evidence_gate_rotate.py
# or override defaults: EG_HOT_DAYS=14 EG_MAX_HOT_BYTES=20000000 python3 ...
Add a new validator (4 steps — all required):
1. Write /root/.claude/system/validators/<id>.yaml describing the predicate (declarative metadata only), then chmod 0444 and chown root:root it.
2. Add the actual hardcoded predicate logic to evaluate_validator() in /opt/agent/scripts/brian_validator.py — the YAML is documentation, the Python is behavior.
3. Reference <id> in /root/.claude/system/policies/claim_binding.yaml under required_validator_classes for the claim, AND under min_validator_args if any policy keys apply. Add the validator to VALIDATOR_FIELD_MAP inside check_claim_allows_validator() so target-field hard-binding picks the right ranked fields.
4. Add a regression test in /opt/agent/tests/evidence_gate/.
Rotate the HMAC key (DESTRUCTIVE — invalidates all existing VALIDATOR_PASS entries; schedule between sessions):
# Schedule when no active CC sessions hold uncommitted claims.
python3 -c "import secrets; open('/root/.claude/system/secrets/validator_hmac.key','w').write(secrets.token_hex(32))"
chmod 0400 /root/.claude/system/secrets/validator_hmac.key
chown root:root /root/.claude/system/secrets/validator_hmac.key
Warning — single-key model has no safe rotation window. Every prior VALIDATOR_PASS entry becomes cryptographically unverifiable the moment the key is replaced. If a session is mid-claim, its next "done" message will BLOCK until Brian re-runs the relevant validator. Day 4 backlog item: implement key versioning (current + previous overlap) for non-destructive rotation.
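The reason rotation is destructive is easy to demonstrate with the standard library. The sketch below assumes HMAC-SHA256 over canonical JSON and an illustrative entry shape; sign_pass and verify_pass are hypothetical names, and the real runner's signed payload is not documented here.

```python
import hashlib
import hmac
import json
import secrets


def sign_pass(key: bytes, entry: dict) -> str:
    """Return a hex HMAC-SHA256 signature over the canonical JSON of entry."""
    body = json.dumps(entry, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, body, hashlib.sha256).hexdigest()


def verify_pass(key: bytes, entry: dict, signature: str) -> bool:
    """Constant-time check that signature was produced with key for entry."""
    return hmac.compare_digest(sign_pass(key, entry), signature)


old_key = secrets.token_hex(32).encode("ascii")
entry = {"kind": "VALIDATOR_PASS", "claim": "done", "validator": "pytest_pass"}
signature = sign_pass(old_key, entry)
print(verify_pass(old_key, entry, signature))  # True

new_key = secrets.token_hex(32).encode("ascii")  # simulates a rotation
print(verify_pass(new_key, entry, signature))  # False: old entries no longer verify
```

With key versioning, verification would try the current key and then the previous one, which is the overlap window the backlog item describes.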
python3 /opt/agent/tests/evidence_gate/test_gate.py # 19 cases
python3 /opt/agent/tests/evidence_gate/test_d1_contract_intake.py # 15 cases
python3 /opt/agent/tests/evidence_gate/test_d2_schema.py # 11 cases
python3 /opt/agent/tests/evidence_gate/test_d3_override.py # 25 cases (2 known isolation)
python3 /opt/agent/tests/evidence_gate/test_d4_d6_validators.py # 15 cases
python3 /opt/agent/tests/evidence_gate/test_v3_fixes.py # 22 cases (v3+r2+r3)
bash /opt/agent/tests/evidence_gate/test_e2e.sh # 5 steps
Total: 107 test cases across the 6 Python suites, plus the 5-step e2e script (7 suites in all). 105 of the 107 pass. T1 + F3 in D3 are pre-existing test-isolation issues from prior sessions; not v3-related.
| Round | Members | Defects found |
|---|---|---|
| R1 design | Codex / Gemini / Vibe | 5 (Codex R3 patches) |
| R4 Gemini code review | Gemini | 3 (gate hook bugs) |
| R6 Codex impl review | Codex | 5 (HMAC, race, gate) |
| D2 schema | Codex / Gemini / Vibe | 4 |
| D3 override | Codex / Gemini / Jules | 8 |
| D4-D6 validators | Codex / Gemini / Jules | 6 |
| Guide v1-v3 reviews | Codex / Gemini / Jules | 19 (across 3 iterations) |
| v3 fixes review | Codex / Gemini | 13 (round 1) |
| v3-r2 hardenings | Codex | 2 more (schema_drift scope, time-true backward stream) |
Total: 65 unique defects caught + fixed across 9 rounds. All 7 test suites green except 2 pre-existing D3 isolation issues.
Fix #1: Extraction anomaly detection (no more 5MB threshold)
- v2: BLOCK if transcript >5MB AND extracted text empty.
- v3: BLOCK based on parse_status (unparseable / schema_drift), not file size. 200K-context Opus sessions no longer false-BLOCK. Schema drift catches CC mid-flight format changes.
Fix #2: Hook crash watchdog
- New _brian_hook_crash.py shared module. All 4 hooks call record_crash in except block.
- Threshold 3/h same hook → per-hook MEMORY.md entry + breach marker (one-fire-per-day).
- v3-r3: time-based backward stream (chunk-by-chunk to BOF or 1h-ago) — bounded memory + truly time-based scan.
Fix #3: Ledger / artifact / contract / override rotation
- New evidence_gate_rotate.py cron at 03:30 UTC daily.
- Files >30d → atomic gzip + move to _archive/<YYYY-MM>/.
- Hot ledger >50MB → trim oldest under per-session flock; tail of 10K lines stays hot; head archived; append-only .chain sidecar preserves chain history.
- Active-session protection: never archive entries mtime'd in last 1h.
- Contracts archive preserves session_id subdir so collisions across sessions don't overwrite.
- Files in _archive/ >365d → deleted.
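The age-based rules above reduce to a small decision function. This sketch covers only the age rules (not the 50MB hot-ledger trim), assumes age is measured from file mtime, and uses a hypothetical name, rotation_action; hot_days mirrors the EG_HOT_DAYS override mentioned earlier.

```python
DAY = 86400
ARCHIVE_MAX_DAYS = 365  # archived files older than this are deleted
ACTIVE_WINDOW = 3600    # never archive anything touched in the last hour


def rotation_action(age_seconds, in_archive, hot_days=30):
    """Return "keep", "archive", or "delete" for a file of the given mtime age."""
    if in_archive:
        return "delete" if age_seconds > ARCHIVE_MAX_DAYS * DAY else "keep"
    if age_seconds <= ACTIVE_WINDOW:
        return "keep"  # active-session protection wins over any hot_days setting
    if age_seconds > hot_days * DAY:
        return "archive"
    return "keep"


print(rotation_action(31 * DAY, in_archive=False))         # archive
print(rotation_action(600, in_archive=False, hot_days=0))  # keep
print(rotation_action(366 * DAY, in_archive=True))         # delete
```

The active-session check only matters when hot_days is overridden to a very small value, but keeping it explicit makes the guarantee independent of configuration.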
Fix #4: Hard contract-field binding
- v2: validator args must share ≥1 word with full objective text (bag-of-words).
- v3: contract-intake extracts structured target_fields.{paths,urls,backtick_tokens,command_hints,keywords}. Validator's command_must_match_objective_keywords policy now requires substring match against the appropriate target_field list. e.g., file_diff_present.path must match contract.paths.
- Soft fallback (legacy keyword overlap) only when ALL target_fields are empty (pre-v3 contracts).
- v3-r2: removed keywords from ranked_fields for command_zero_exit / pytest_pass / http_2xx so an unrelated command containing a single objective keyword can no longer bypass.
- v3-r2: PATH_RE now matches single-segment paths like /Makefile, /etc/hosts. Added RELPATH_RE for src/foo.py and ./relative. BACKTICK_RE 200-char cap removed.
The HMAC key is chmod 0400 root, but the Brian process currently runs as root in dev, so file permissions alone do not keep Brian from reading it. Real OS isolation needs uid separation (Day 4+).

Absence of an approved validator for the claimed scope is itself a blocking condition; Brian may report only attempted actions and uncertainty, not completion.
| Decision | Value | Authority |
|---|---|---|
| L4 fenced-block name | brian_final | Jonah ratified 2026-05-07 22:54 Beirut |
| Plan-vs-drift filter | Hard MEMORY.md rule: every open item carries origin_workstream / why_it_matters / critical_path: yes/no; mixing workstreams in one status report is itself a behavioral failure | Jonah ratified 2026-05-07 22:54 Beirut |
| Fail-closed gate | Outer crash handler exits 2, not 0; stop_hook_active carve-out preserved at __main__ scope | Landed 2026-05-07 22:57 Beirut |

Actionable turns must carry one fenced JSON block:
```brian_final
{
"type": "done|partial|blocked|delegation",
"summary": "...",
"claims": [{"claim_type":"done","object":"...","validator_id":"...","ledger_entry_id":"...","artifact_hash":"..."}],
"what_succeeded": "...",
"what_remains": "...",
"next_route_or_question": "..."
}
```
Slot rules:
- done → ≥1 claim with signed ledger evidence
- blocked → ladder_exhaustion_proof validator
- delegation → fetch_attempt_proof validator
- partial → three human-readable slots, may include zero done claims
Integration points:
- L1 contract-intake writes intent: actionable and required_response_mode: schema into the contract YAML (current intake only emits a contract when ACTION_VERBS matches, so the file's existence already encodes "actionable")
- L3 Stop hook reads contract; if actionable, requires fenced block; parses JSON via stdlib only; validates slots
- Phrase tripwire runs only on prose-remainder after fenced region is stripped
Hard sequencing per Codex R2: fail-closed → schema → calibration. Block format = JSON, not YAML (stdlib parser, no version drift).
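As a sketch of the Stop-hook side, the block can be extracted and checked with the standard library alone. The function names are hypothetical, signature verification of the cited ledger entries is out of scope, and requiring a ledger_entry_id on a done claim is an assumed stand-in for "signed ledger evidence".

```python
import json
import re

FENCE = "`" * 3  # built programmatically so this example can sit inside a fence
BLOCK_RE = re.compile(
    re.escape(FENCE) + r"brian_final[ \t]*\n(.*?)\n" + re.escape(FENCE), re.DOTALL
)
TYPES = {"done", "partial", "blocked", "delegation"}
REQUIRED_VALIDATOR = {
    "blocked": "ladder_exhaustion_proof",
    "delegation": "fetch_attempt_proof",
}
PARTIAL_SLOTS = ("what_succeeded", "what_remains", "next_route_or_question")


def extract_final(message):
    """Return (payload, prose_remainder); payload is None if no usable block."""
    match = BLOCK_RE.search(message)
    if match is None:
        return None, message
    # The phrase tripwire would run on this remainder only.
    remainder = message[: match.start()] + message[match.end():]
    try:
        payload = json.loads(match.group(1))
    except json.JSONDecodeError:
        return None, remainder
    return (payload if isinstance(payload, dict) else None), remainder


def slot_errors(payload):
    """Check the slot rules for one parsed block and return a list of problems."""
    kind = payload.get("type")
    raw = payload.get("claims")
    claims = [c for c in raw if isinstance(c, dict)] if isinstance(raw, list) else []
    if kind not in TYPES:
        return [f"unknown type: {kind!r}"]
    errors = []
    if kind == "done":
        ok = any(
            c.get("claim_type") == "done" and c.get("ledger_entry_id") for c in claims
        )
        if not ok:
            errors.append("done requires at least one claim with a ledger_entry_id")
    elif kind in REQUIRED_VALIDATOR:
        needed = REQUIRED_VALIDATOR[kind]
        if not any(c.get("validator_id") == needed for c in claims):
            errors.append(f"{kind} requires a {needed} claim")
    else:  # partial: three human-readable slots, zero done claims allowed
        errors.extend(
            f"partial requires {slot}" for slot in PARTIAL_SLOTS if not payload.get(slot)
        )
    return errors
```

Using json.loads keeps the parser in the standard library, matching the stated reason for choosing JSON over YAML.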