← index2026-05-09 13:30 (Beirut)(backfill from DOCUMENTATION/)

Brian Evidence-Gate System — Operator's Guide (v4)

Brian Evidence-Gate System — Operator's Guide (v4)

⚠️ SUPERSEDED 2026-05-09. This v4 describes the system before the post-cleanup replan. Read v5 for current behavior: brian_evidence_gate_operators_guide_260509.md. v4 is kept for historical reference only — the schema-enforcement-on-every-action-turn model documented here is no longer accurate.


Built 2026-05-06 → 2026-05-07 | Reference architecture: brian_evidence_gate_system_260506.md
v4 (2026-05-07): Round-table-confirmed v3 hardenings — 13 substantive fixes from codex+gemini reviews; 105/107 tests passing (2 pre-existing test-isolation issues in D3, unrelated).

What it does (one paragraph)

Stops Brian (Claude Opus 4.7) from claiming "done / fixed / shipped / blocked / send-me-X" without verifiable evidence in a tamper-evident ledger. Every tool Brian uses gets logged with a hash chain. Every assistant message gets gated — claim phrases and substantive-but-unframed turns are blocked unless an HMAC-signed VALIDATOR_PASS exists for the claim. Override prompts ("skip the test", "force this") get audited; threshold breach surfaces in MEMORY.md. Hook crashes are tracked + threshold-fired to MEMORY.md so silent fail-open windows are visible.

The 4 live hooks (wired in /root/.claude/settings.json)

Hook Stage What it does
/root/.claude/hooks/brian-evidence-ledger.py PostToolUse Appends hash-chained JSONL entry + raw artifact dir per tool call. Source of truth for "what Brian did this turn."
/root/.claude/hooks/brian-contract-intake.py UserPromptSubmit When prompt has action verbs, writes /contracts/<sid>/<seq>.yml with objective + structured target_fields (paths, urls, command_hints, backtick_tokens, keywords) + capability ladder + expected verification. Validators bind to target_fields, not bag-of-words objective.
/root/.claude/hooks/brian-evidence-gate.py Stop Extracts claim phrases + checks substantive-turn framing. BLOCKs assistant message if no HMAC-verified VALIDATOR_PASS exists for the claim.
/root/.claude/hooks/brian-override-audit.py UserPromptSubmit Detects gate-bypass phrases in your prompts; logs + thresholds at 3/24h → MEMORY.md entry + COMMS notice.

Plus shared crash recorder: /root/.claude/hooks/_brian_hook_crash.py — every hook calls record_crash(name, exc) in its except block; threshold 3 crashes/hour for the same hook → MEMORY.md entry.

Reload semantics: Claude Code reads /root/.claude/settings.json at session START. Mid-session edits do NOT affect the current session. Restart CC to pick up hook changes.

Hook fail mode (v4 — production-honest)

Hook Fail-open Fail-closed Crash visibility
ledger (PostToolUse) exception → exit 0 + _hook_errors.log none — tool execution must not block on ledger bugs crash recorder fires; MEMORY.md if threshold
gate (Stop) transcript-read errors, empty-extract on parseable JSONL extract_anomaly: parse_status=unparseable (zero JSON) OR parse_status=schema_drift (assistant turn lines exist but none have type:assistant/system) crash recorder fires; threshold writes MEMORY.md
contract-intake (UPS) hook errors don't block prompt; missing contract = validators REFUSE downstream none crash recorder fires
override-audit (UPS) hook errors don't block prompt none crash recorder fires

When 3+ crashes of the same hook hit within 1 hour: per-hook MEMORY.md entry written (not shared, no overwrites), MEMORY.md index updated under flock, breach marker keyed by hook+date prevents spam.

The operational loop

your prompt → contract-intake hook writes contract YAML w/ structured target_fields
            ↓
Brian uses tools → ledger hook writes hash-chained entries + raw artifacts
            ↓
Brian invokes brian_validator.py → checks args HARD-BIND to contract.target_fields,
                                   scans ledger, writes HMAC-signed VALIDATOR_PASS
            ↓
Brian drafts message → gate hook reads transcript + ledger → BLOCK or PASS

If a step is missing, validators REFUSE with reasons like validator arg path=X does not match any contract.target_fields entry — inspect the contracts dir first.

CLI: /opt/agent/scripts/brian_validator.py

Not on PATH. Use the absolute path. Brian invokes this to assert "evidence exists for my claim." Runner verifies args against policy + contract.target_fields, scans ledger, writes HMAC-signed VALIDATOR_PASS on success.

Six validator classes:

Validator For claim What it checks Bound to contract field
command_zero_exit done Specific shell command ran with exit 0 command_hints, backtick_tokens, paths
pytest_pass done N successful pytest runs against a specific target paths, backtick_tokens
file_diff_present done An Edit/Write/MultiEdit ran against the named path paths
http_2xx done A fetch tool returned an explicit 2xx pattern in stdout for the URL urls, backtick_tokens
fetch_attempt_proof delegation ≥3 distinct fetch tools attempted and failed for the keyword (keyword fallback only)
ladder_exhaustion_proof blocked All capability-ladder hops in the contract were attempted, recent failure exists (multi-token hop match)

v4 critical change: validator args MUST hard-bind to a SPECIFIC contract field, not just share keywords with the objective text. e.g. file_diff_present REFUSES if the supplied path doesn't substring-match an entry in contract.target_fields.paths. Legacy soft-fallback (bag-of-words on objective text) only triggers when ALL of {paths, urls, command_hints, backtick_tokens, keywords} are empty (legacy contracts only).

Usage:

/opt/agent/scripts/brian_validator.py \
    --session <sid> \
    --claim done \
    --validator command_zero_exit \
    --args '{"command_substring":"<cmd-fragment-from-objective>","keyword":"<other>"}'

Exit codes:
- 0 = PASS (HMAC entry written to ledger)
- 2 = FAIL (no matching evidence — keep working, run more tools)
- 3 = REFUSED (wrong validator-for-claim, weak args, no contract, or arg fails hard-bind)

Where things live

/opt/agent/data/agent_runtime/
├── ledger/<session_id>.jsonl              # hash-chained tool-call log
├── ledger/_archive/<YYYY-MM>/              # rotated >30d, atomic gzip
├── ledger/_gate_extraction_errors.log      # gate's empty-extraction events
├── ledger/_hook_errors.log                 # ledger-hook errors
├── artifacts/<session_id>/<seq>/           # raw cmd + stdout + stderr per call
├── artifacts/_archive/<YYYY-MM>/<sid>.tar.gz  # rotated >30d, atomic
├── contracts/<session_id>/<seq>.yml        # per-turn objective + target_fields + ladder
├── contracts/_archive/<YYYY-MM>/<sid>/<seq>.yml.gz  # preserves session subdir
├── overrides/<session_id>.jsonl            # detected override prompts
├── overrides/_breach_marker.json           # one-per-24h spam guard
├── _hook_crashes.jsonl                     # all hook crashes (across hooks)
├── _hook_crash_breach.json                 # one-per-hook-per-day breach marker
└── _rotate.log                             # rotation cron output

/opt/agent/scripts/
├── brian_validator.py                      # the validator runner CLI
└── evidence_gate_rotate.py                 # daily rotation cron (03:30 UTC)

/opt/agent/tests/evidence_gate/             # stable test location
├── test_gate.py                            # 19 cases — gate logic
├── test_d1_contract_intake.py              # 15 cases — contract intake
├── test_d2_schema.py                       # 11 cases — schema enforcement
├── test_d3_override.py                     # 25 cases — override audit (T1, F3 isolation issues)
├── test_d4_d6_validators.py                # 15 cases — D4-D6 validators
├── test_v3_fixes.py                        # 22 cases — v3, r2, r3 fixes
└── test_e2e.sh                             # 5 steps — full integration

/root/.claude/system/
├── policies/claim_binding.yaml             # claim-type → allowed validators (Jonah-owned, chmod 0444)
├── validators/*.yaml                       # 6 validator definitions (Jonah-owned, chmod 0444)
└── secrets/validator_hmac.key              # HMAC signing key (chmod 0400 root)

/root/.claude/projects/-/memory/
├── feedback_override_pattern.md            # threshold breach trail (override-audit)
└── feedback_hook_crash_<hook-name>.md      # per-hook crash threshold breach trail (v4)

Common ops

Did the gate BLOCK on a session? Find the transcript file, then re-run the gate against it:

SID="<session_id>"
# v4 caveat: -print -quit picks ONE match. If two transcripts share IDs (rare),
# inspect manually with `find ... -name "${SID}.jsonl" -printf '%T@ %p\n' | sort -rn`.
TRANSCRIPT="$(find /root/.claude/projects -name "${SID}.jsonl" -print -quit)"
[ -n "$TRANSCRIPT" ] || { echo "transcript for $SID not found"; exit 1; }
echo "{\"session_id\":\"$SID\",\"transcript_path\":\"$TRANSCRIPT\"}" \
  | /root/.claude/hooks/brian-evidence-gate.py

Stdout = {"decision": "block", ...} if it blocked. Stderr = the BLOCK message. Empty output = pass.

Did extraction fail (different problem)? Check the extraction-errors log:

tail /opt/agent/data/agent_runtime/ledger/_gate_extraction_errors.log

Lines record parse_status so you can distinguish:
- parse_status=ok + text_empty=True — benign in-flight turn (tool calls only, no text yet)
- parse_status=unparseable — zero JSON parses (transcript schema fully broken) → fail-closed BLOCK fired
- parse_status=schema_drift — assistant turn lines exist but none have recognized type → BLOCK fired
- parse_status=unreadable — file missing or unreadable

Did a hook crash? v4 adds visibility:

tail /opt/agent/data/agent_runtime/_hook_crashes.jsonl
ls /root/.claude/projects/-/memory/feedback_hook_crash_*.md  # per-hook breach trails
cat /opt/agent/data/agent_runtime/_hook_crash_breach.json

Did Jonah override the gate today?

ls /opt/agent/data/agent_runtime/overrides/
tail /root/.claude/projects/-/memory/feedback_override_pattern.md

Reset the 24h breach markers (after recalibration conversation):

rm -f /opt/agent/data/agent_runtime/overrides/_breach_marker.json
rm -f /opt/agent/data/agent_runtime/_hook_crash_breach.json  # v4: separate marker for hook crashes

Run rotation manually:

python3 /opt/agent/scripts/evidence_gate_rotate.py
# or override defaults: EG_HOT_DAYS=14 EG_MAX_HOT_BYTES=20000000 python3 ...

Add a new validator (4 steps — all required):
1. Write /root/.claude/system/validators/<id>.yaml describing the predicate (declarative metadata only), then chmod 0444 and chown root:root it.
2. Add the actual hardcoded predicate logic to evaluate_validator() in /opt/agent/scripts/brian_validator.py — the YAML is documentation, the Python is behavior.
3. Reference <id> in /root/.claude/system/policies/claim_binding.yaml under required_validator_classes for the claim, AND under min_validator_args if any policy keys apply. Add the validator to VALIDATOR_FIELD_MAP inside check_claim_allows_validator() so target-field hard-binding picks the right ranked fields.
4. Add a regression test in /opt/agent/tests/evidence_gate/.

Rotate the HMAC key (DESTRUCTIVE — invalidates all existing VALIDATOR_PASS entries; schedule between sessions):

# Schedule when no active CC sessions hold uncommitted claims.
python3 -c "import secrets; open('/root/.claude/system/secrets/validator_hmac.key','w').write(secrets.token_hex(32))"
chmod 0400 /root/.claude/system/secrets/validator_hmac.key
chown root:root /root/.claude/system/secrets/validator_hmac.key

Warning — single-key model has no safe rotation window. Every prior VALIDATOR_PASS entry becomes cryptographically unverifiable the moment the key is replaced. If a session is mid-claim, its next "done" message will BLOCK until Brian re-runs the relevant validator. Day 4 backlog item: implement key versioning (current + previous overlap) for non-destructive rotation.

Test suites

python3 /opt/agent/tests/evidence_gate/test_gate.py                  # 19 cases
python3 /opt/agent/tests/evidence_gate/test_d1_contract_intake.py    # 15 cases
python3 /opt/agent/tests/evidence_gate/test_d2_schema.py             # 11 cases
python3 /opt/agent/tests/evidence_gate/test_d3_override.py           # 25 cases (2 known isolation)
python3 /opt/agent/tests/evidence_gate/test_d4_d6_validators.py      # 15 cases
python3 /opt/agent/tests/evidence_gate/test_v3_fixes.py              # 22 cases (v3+r2+r3)
bash    /opt/agent/tests/evidence_gate/test_e2e.sh                   # 5 steps

Total: 107 test cases across 7 suites. 105 pass. T1 + F3 in D3 are pre-existing test-isolation issues from prior sessions; not v3-related.

Round-table review history

Round Members Defects found
R1 design Codex / Gemini / Vibe 5 (Codex R3 patches)
R4 Gemini code review Gemini 3 (gate hook bugs)
R6 Codex impl review Codex 5 (HMAC, race, gate)
D2 schema Codex / Gemini / Vibe 4
D3 override Codex / Gemini / Jules 8
D4-D6 validators Codex / Gemini / Jules 6
Guide v1-v3 reviews Codex / Gemini / Jules 19 (across 3 iterations)
v3 fixes review Codex / Gemini 13 (round 1)
v3-r2 hardenings Codex 2 more (schema_drift scope, time-true backward stream)

Total: 65 unique defects caught + fixed across 9 rounds. All 7 test suites green except 2 pre-existing D3 isolation issues.

v3 hardenings (2026-05-07) — what changed from v2 (Day 1)

Fix #1: Extraction anomaly detection (no more 5MB threshold)
- v2: BLOCK if transcript >5MB AND extracted text empty.
- v3: BLOCK based on parse_status (unparseable / schema_drift), not file size. 200K-context Opus sessions no longer false-BLOCK. Schema drift catches CC mid-flight format changes.

Fix #2: Hook crash watchdog
- New _brian_hook_crash.py shared module. All 4 hooks call record_crash in except block.
- Threshold 3/h same hook → per-hook MEMORY.md entry + breach marker (one-fire-per-day).
- v3-r3: time-based backward stream (chunk-by-chunk to BOF or 1h-ago) — bounded memory + truly time-based scan.

Fix #3: Ledger / artifact / contract / override rotation
- New evidence_gate_rotate.py cron at 03:30 UTC daily.
- Files >30d → atomic gzip + move to _archive/<YYYY-MM>/.
- Hot ledger >50MB → trim oldest under per-session flock; tail of 10K lines stays hot; head archived; append-only .chain sidecar preserves chain history.
- Active-session protection: never archive entries mtime'd in last 1h.
- Contracts archive preserves session_id subdir so collisions across sessions don't overwrite.
- Files in _archive/ >365d → deleted.

Fix #4: Hard contract-field binding
- v2: validator args must share ≥1 word with full objective text (bag-of-words).
- v3: contract-intake extracts structured target_fields.{paths,urls,backtick_tokens,command_hints,keywords}. Validator's command_must_match_objective_keywords policy now requires substring match against the appropriate target_field list. e.g., file_diff_present.path must match contract.paths.
- Soft fallback (legacy keyword overlap) only when ALL target_fields are empty (pre-v3 contracts).
- v3-r2: removed keywords from ranked_fields for command_zero_exit / pytest_pass / http_2xx so an unrelated command containing a single objective keyword can no longer bypass.
- v3-r2: PATH_RE now matches single-segment paths like /Makefile, /etc/hosts. Added RELPATH_RE for src/foo.py and ./relative. BACKTICK_RE 200-char cap removed.

Known weaknesses (still honest)

  1. Lexical "blocked" false-positives in discussion context. Gate catches "blocked" inside test descriptions / hedged talk ("what gets blocked that shouldn't"). Tightening fix queued (~15 min): exempt "blocked" inside backtick-quoted regions and after negation modifiers.
  2. Trust boundary partly honor-system. HMAC key is chmod 0400 root, but Brian process currently runs as root in dev. Real OS isolation needs uid separation (Day 4+).
  3. No real-world calibration data yet. All tests synthetic. First days of real usage will surface tuning gaps. Watch the extraction-errors log + override audit log + hook crash log for signal.
  4. HMAC rotation breaks in-flight VALIDATOR_PASS entries. Schedule rotations between sessions or implement key versioning (Day 4+).
  5. D3 test isolation — T1 + F3 in test_d3_override.py have pre-existing isolation issues (other tests pollute OVERRIDES dir before T1 runs). Doesn't affect production behavior.

What's NOT yet built

Architectural axiom (Codex R5, locked)

Absence of an approved validator for the claimed scope is itself a blocking condition; Brian may report only attempted actions and uncertainty, not completion.

Locked decisions — 2026-05-07 ratification round

Decision Value Authority
L4 fenced-block name brian_final Jonah ratified 2026-05-07 22:54 Beirut
Plan-vs-drift filter Hard MEMORY.md rule — every open item carries origin_workstream / why_it_matters / critical_path: yes/no; mixing workstreams in one status report is itself a behavioral failure Jonah ratified 2026-05-07 22:54 Beirut
Fail-closed gate Outer crash handler exits 2, not 0; stop_hook_active carve-out preserved at __main__ scope Landed 2026-05-07 22:57 Beirut

L4 schema enforcement — design contract (in flight)

Actionable turns must carry one fenced JSON block:

```brian_final
{
  "type": "done|partial|blocked|delegation",
  "summary": "...",
  "claims": [{"claim_type":"done","object":"...","validator_id":"...","ledger_entry_id":"...","artifact_hash":"..."}],
  "what_succeeded": "...",
  "what_remains": "...",
  "next_route_or_question": "..."
}
```

Slot rules:
- done → ≥1 claim with signed ledger evidence
- blockedladder_exhaustion_proof validator
- delegationfetch_attempt_proof validator
- partial → three human-readable slots, may include zero done claims

Integration points:
- L1 contract-intake writes intent: actionable and required_response_mode: schema into the contract YAML (current intake only emits a contract when ACTION_VERBS matches, so the file's existence already encodes "actionable")
- L3 Stop hook reads contract; if actionable, requires fenced block; parses JSON via stdlib only; validates slots
- Phrase tripwire runs only on prose-remainder after fenced region is stripped

Hard sequencing per Codex R2: fail-closed → schema → calibration. Block format = JSON, not YAML (stdlib parser, no version drift).