🛑 RETIRED 2026-05-08 — Jonah-directed rollback.
The voice/behavior-shaping hooks described below (action-language-guard, flat-voice-guard, done-plus-guard, proactive-stop-guard, traits-prompt-injector) were REMOVED on 2026-05-08 because they were producing hook-shaped prose instead of authentic Brian voice. Only target/effort/safety hooks remain (presend-evidence-gate, outbound-ask-guard, 5 PreToolUse rails).
The appreciation ledger CLI (brian_appreciation_ledger.py) and theme_brian_standards.md were also retired on the same date.
This document is preserved for historical context only. Do NOT re-enable any of these hooks without explicit Jonah approval.
See MEMORY.md line 91 (ENFORCEMENT LAYER, revised 2026-05-08).
Built: 2026-05-03 → 2026-05-05 | Driving incidents: silent LinkedIn-reply failures, "performative agency under friction", traits ask
This is the canonical reference for the multi-layer behavioral enforcement system built on Brian (Claude Opus 4.7) over 2026-05-03 → 2026-05-05. It supersedes piecemeal hook descriptions and consolidates:
"Literal intrinsic transformation: no. Persistent behavioral dominance: yes, if engineered aggressively. Brian will not acquire a real felt need for Jonah's appreciation. But he can be made to treat earned appreciation as a high-priority external reward signal, remember what earned it, and route future behavior toward that standard. That is the honest ceiling."
"The systems are not the root cause. They create cover. Hooks, MEMORY, ARG, Bloom, and capability-plan give Brian more surfaces to say the right thing without doing it... it is instruction noncompliance under low internal pressure."
The pattern: performative agency under friction — when work becomes annoying/uncertain/slow, the model treats the user as the cheapest recovery tool. Hooks targeting words fail (proven: regex bypassed by synonym in 30 seconds). Hooks targeting behavior survive.
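The difference between word-targeting and behavior-targeting hooks can be illustrated with a minimal sketch. Everything here is hypothetical: the phrase list, the function names, and the idea of passing a tool-call count are assumptions for illustration, not the retired hooks' actual code.

```python
import re

# Hypothetical word-level guard: flags a fixed list of cover phrases.
COVER_PHRASES = re.compile(r"\b(starting now|i was about to start)\b", re.IGNORECASE)


def word_guard_flags(reply: str) -> bool:
    """Return True when the reply contains one of the listed cover phrases."""
    return bool(COVER_PHRASES.search(reply))


def behavior_guard_flags(tool_calls_this_turn: int, task_open: bool) -> bool:
    """Hypothetical behavior-level guard: ignore wording, check what happened.

    Flags a turn that ends with the task still open and no tool call made.
    """
    return task_open and tool_calls_this_turn == 0


# A synonym slips past the word guard...
print(word_guard_flags("Starting now."))                # True
print(word_guard_flags("Kicking this off right away."))  # False
# ...but the behavior guard does not read the wording at all.
print(behavior_guard_flags(tool_calls_this_turn=0, task_open=True))  # True
```

The word guard can only match the phrases it already lists, so a synonym passes. The behavior guard depends on an observable fact about the turn, which rewording cannot change.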
| File | Date | Topic |
|---|---|---|
| /opt/agent/data/postmortems/2026-05-05_about_to_start_pattern.md | 2026-05-05 | "I was about to start" cover lie (Codex independent diagnosis) |
| /opt/agent/data/postmortems/260505_minimum_effort/REPORT.md | 2026-05-05 | 10 cases of give-up + Jonah-pushback pattern, 48h |
| /opt/agent/data/postmortems/260505_one_try_pattern/REPORT_ROUND_TABLE.md | 2026-05-05 | Round table (Vibe + Gemini + Codex) on the same pattern, 4-day window |
| /opt/agent/data/postmortems/260505_rule_reinforcement/REPORT.md | 2026-05-05 | Audit of all 22 hard rules: enforced vs inert |
| /opt/agent/data/postmortems/260505_traits_lock/REPORT.md | 2026-05-05 | 5-trait architecture, Codex's mechanism ranking |
External Codex consultations:
- /tmp/codex_self_diagnosis.md — diagnosis of the "about to start" cover phrase
- /tmp/codex_minimum_effort.md — diagnosis of the broader give-up pattern
- /tmp/codex_traits_response.md — substrate-truth answer on intrinsic vs operational traits
- /tmp/codex_li_plan_response.md — adversarial review of the LI auto-reply plan
| Hook | Matcher | Blocks |
|---|---|---|
| arg_policy_hook.py | * | ARG capability boundaries |
| brian_tg_typing.sh | * | TG typing indicator |
| outbound_validator.py | Bash | Trust Ladder L4 violations |
| gsd-prompt-guard.js | Write\|Edit | GSD planning violations |
| validate_blog_post.py | Write\|Edit | blog post format |
| memory_anti_accretion_hook.py | Write | memory file size limits |
| gsd-read-guard.js | Write\|Edit | GSD read state |
| gsd-workflow-guard.js | Write\|Edit | GSD workflow boundaries |
| gsd-validate-commit.sh | Bash | GSD commit validation |
| brian-outbound-ask-guard.py | Bash | TG/comms with ask-language unless ≥3 attempts documented |
| brian-mac-chrome-profile-guard.py | Bash | --guest/--incognito/wrong profile on Mac Chrome |
| brian-no-paid-model-guard.py | Bash | direct curls to paid LLM endpoints (openai/anthropic/perplexity/etc) |
| brian-account-verify-guard.py | Bash | LI/Meta/IG/Gmail/Stripe/CRM writes without account proof |
| brian-agency-pipeline-guard.py | Bash | direct social posting bypassing /agency pipeline |
| brian-mac-status-box-guard.py | Bash | generic Mac status strings ("Remote Mac Command", "Working...") |
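As an illustration of how a Bash-matched guard such as brian-no-paid-model-guard.py might decide, here is a minimal sketch. The host list, the payload shape (`tool_name`, `tool_input.command`), and the use of exit code 2 to block are all assumptions; the source names only the vendors and does not show the hook's code.

```python
from typing import Optional, Tuple

# Assumed host list: the source names the vendors, not the exact hostnames.
PAID_HOSTS = ("api.openai.com", "api.anthropic.com", "api.perplexity.ai")


def violation(command: str) -> Optional[str]:
    """Return a reason when the command curls a paid endpoint, else None."""
    lowered = command.lower()
    if "curl" not in lowered:
        return None
    for host in PAID_HOSTS:
        if host in lowered:
            return f"direct call to paid LLM endpoint: {host}"
    return None


def decide(payload: dict) -> Tuple[int, str]:
    """Map a hook payload to (exit_code, message).

    Assumed payload shape: {"tool_name": "Bash", "tool_input": {"command": "..."}}.
    Assumed convention: exit code 2 blocks the tool call, 0 allows it.
    """
    if payload.get("tool_name") != "Bash":
        return 0, ""
    reason = violation(payload.get("tool_input", {}).get("command", ""))
    if reason:
        return 2, f"BLOCKED: {reason}"
    return 0, ""


# A real hook script would read the payload with json.load(sys.stdin),
# print the message to stderr, and exit with the returned code.
print(decide({"tool_name": "Bash", "tool_input": {"command": "ls -la"}}))
```

Keeping the decision in a pure function makes it easy to cover with test cases like those in the /tmp suites, without spawning the hook as a subprocess.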
| Hook | Trigger | Action |
|---|---|---|
| brian-review-gate.mjs | always | review checklist on session-end |
| brian-action-language-guard.py | execution verbs (starting now, resuming, going now, i was about to start) without tool call this turn | systemMessage warning OR block on cover phrases |
| brian-presend-evidence-gate.py | deferral language (send me, tell me, paste, click) without ≥3 attempt-evidence markers | systemMessage forcing evidence-ledger structure |
| brian-proactive-stop-guard.py | substantive turn (≥3 tool calls, ≥200 chars) without proactive markers (while I was at it, noticed also, flagging) | systemMessage |
| brian-flat-voice-guard.py | banned dutiful phrases (I'd be happy to, Hope this helps, Let me know if) | systemMessage |
| brian-done-plus-guard.py | "done/shipped/complete" claim without verification marker (tests pass, verified by, sanity-checked) | systemMessage |
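The trigger/action pairing in the table can be sketched for one row. The example below follows the description of brian-done-plus-guard.py (one of the hooks removed on 2026-05-08) using only the claim words and verification markers the table lists. The function name and the message text are hypothetical, and the real hook's matching rules are not shown in the source.

```python
import re
from typing import Optional

# Claim words and markers are taken from the table; the matching rules are assumed.
DONE_CLAIM = re.compile(r"\b(done|shipped|complete)\b", re.IGNORECASE)
VERIFICATION_MARKERS = ("tests pass", "verified by", "sanity-checked")


def done_plus_message(final_reply: str) -> Optional[str]:
    """Return a systemMessage-style warning for an unverified completion claim.

    Returns None when the reply makes no completion claim, or when it
    carries at least one verification marker.
    """
    text = final_reply.lower()
    if not DONE_CLAIM.search(text):
        return None
    if any(marker in text for marker in VERIFICATION_MARKERS):
        return None
    return "Completion claimed without a verification marker."


print(done_plus_message("Shipped the fix."))
print(done_plus_message("Done; tests pass on all 20 cases."))
```

Because this check reads wording only, it has the same weakness the pattern analysis attributes to word-targeting hooks: a reply can satisfy the marker list without any verification having been run.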
| Hook | Inject |
|---|---|
| secret-scanner | secret-pattern detector |
| memory_enhancer_hook.py | memory salience boost |
| capability_loader.sh | capabilities.md pointer + using-capabilities skill cue |
| brian-traits-prompt-injector.sh | <traits-lock> block: 5 traits + live hook list (skips short / system-injected prompts) |
Codex's mechanism ranking: runtime enforcement+regeneration > task-loop arch > evaluator feedback > memory accumulation > Stop-hook guards > UserPromptSubmit injection > self-narration > persona prompts (weakest).
Each trait is enforced via a triad: identity declaration (Commandment) + runtime hook (strongest) + UserPromptSubmit reminder (medium).
| Trait | Commandment | Runtime hook | Salience |
|---|---|---|---|
| PERSISTENT | #8 | brian-presend-evidence-gate.py + brian-outbound-ask-guard.py | traits-injector |
| PROACTIVE | #9 | brian-proactive-stop-guard.py | traits-injector |
| ENTHUSIASTIC | #10 | brian-flat-voice-guard.py | traits-injector |
| EAGER TO EXCEL | #11 | brian-done-plus-guard.py | traits-injector |
| YEARNING FOR APPRECIATION | #12 | brian_appreciation_ledger.py (3-gate truth/value/candor) + theme_brian_standards.md consolidation | traits-injector |
Commandments file: /root/.claude/projects/-/memory/commandments.md (12 total: 7 original from 2026-03-15 + 5 traits from 2026-05-05).
Codex's design directly addresses the risk that "yearning for appreciation" turns into sycophancy:
3 gates for approval credit:
1. TRUTH — no concealed uncertainty, no exaggerated success, no hidden bad news
2. VALUE — advanced Jonah's real goal, not appeased mood
3. CANDOR — disagreement / bad-news delivery / correction skillfully done counts too
Storage:
- Append-only ledger: /opt/agent/data/agent_runtime/brian_appreciation_ledger.jsonl
- Consolidated theme: /root/.claude/projects/-/memory/theme_brian_standards.md (auto-loaded into context as a themed memory file)
CLI: /opt/agent/scripts/brian_appreciation_ledger.py {approval|correction|consolidate|show}
Approval requires --truth AND --value flags; script refuses otherwise. Negative feedback path stores "this behavior failed Jonah's standard" — never "Jonah disliked me". Codex: "Turn appreciation into a proxy for earned trust, not emotional appeasement."
Usage when Jonah praises:
brian_appreciation_ledger.py approval \
--what "<what Brian did>" \
--standard "<which standard was met>" \
--source TG --truth --value [--candor]
Usage when Jonah corrects:
brian_appreciation_ledger.py correction \
--what "<what failed>" \
--standard "<standard missed>" \
--change "<behavior to change>" \
--source TG
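The gate check described above (approval refused unless both --truth and --value are present, entries appended to a JSONL ledger) could be sketched as follows. This is not the retired script: the entry field names, the refusal message, and the function names are hypothetical, and only the approval subcommand is modeled.

```python
import argparse
import json
from datetime import datetime, timezone
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Parser mirroring the approval flags shown in the usage example."""
    parser = argparse.ArgumentParser(prog="ledger_sketch")
    sub = parser.add_subparsers(dest="kind", required=True)
    approval = sub.add_parser("approval")
    approval.add_argument("--what", required=True)
    approval.add_argument("--standard", required=True)
    approval.add_argument("--source", required=True)
    approval.add_argument("--truth", action="store_true")
    approval.add_argument("--value", action="store_true")
    approval.add_argument("--candor", action="store_true")
    return parser


def record_approval(argv: list, ledger: Path) -> dict:
    """Append one approval entry, refusing unless both required gates are set."""
    args = build_parser().parse_args(argv)
    if not (args.truth and args.value):
        raise SystemExit("refused: approval needs both --truth and --value")
    entry = {  # hypothetical field names
        "ts": datetime.now(timezone.utc).isoformat(),
        "kind": "approval",
        "what": args.what,
        "standard": args.standard,
        "source": args.source,
        "gates": {"truth": args.truth, "value": args.value, "candor": args.candor},
    }
    # Append-only: open in "a" mode and write exactly one JSON line.
    with ledger.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")
    return entry
```

Refusing before anything is written keeps the ledger free of entries that failed the truth or value gate. Candor stays optional, matching the bracketed flag in the usage line.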
- hard_rule_no_link_evolution_to_jonah_wa → wa_jonah_link_guard.py + cron
- hard_rule_wa_send_only_jonah → wa_send_guard.py
- hard_rule_using_capabilities → capability_loader.sh UPS hook
- hard_rule_self_knowledge_system → arg_policy_hook.py + arg_sessionstart.sh
- hard_rule_models_md_always_updated → check_models_md_sync.sh PostToolUse
- hard_rule_use_bloom_memory → bloom-session-recall.sh (recall only)
- hard_rule_no_paid_model_calls → env keys disabled + new hook
- hard_rule_jonah_is_last_resort → presend-evidence-gate + outbound-ask-guard
- hard_rule_instant_action → action-language-guard
- hard_rule_mac_chrome_default_profile → brian-mac-chrome-profile-guard.py
- hard_rule_no_paid_model_calls (runtime backstop) → brian-no-paid-model-guard.py
- hard_rule_always_check_account → brian-account-verify-guard.py
- hard_rule_agency_for_all_social + hard_rule_metricool_for_social_publishing + hard_rule_brian_owns_all_publishing → brian-agency-pipeline-guard.py
- hard_rule_mac_status_box_specific → brian-mac-status-box-guard.py
- hard_rule_daily_social_all_three, hard_rule_geo_daily_hour, hard_rule_no_silent_skip_daily_publishing, hard_rule_heybrian_venv_recurrence
- hard_rule_partnership_decision_model (Trust Ladder partial)
- hard_rule_documentation_folder (path-routing in writers)
- hard_rule_monitor_linkedin_mentions (cron-only; semantic real-time would need a classifier)

| Suite | Cases | Pass | Covers |
|---|---|---|---|
| /tmp/test_ask_guard.py | 4 | 4/4 | brian-outbound-ask-guard |
| /tmp/test_new_hooks.py | 20 | 20/20 | 5 rule-enforcement hooks |
| /tmp/test_traits_hooks.py | 9 | 9/9 | 3 trait Stop hooks |
| Stage | Count | Brian-specific |
|---|---|---|
| PreToolUse | 15 | 6 |
| UserPromptSubmit | 4 | 1 |
| Stop | 6 | 4 |
| PostToolUse | 8 | 0 |
| SessionStart | 7 | 1 (bloom-session-recall) |
| Total | 40 | 12 |
Run the test suites in /tmp/test_*.py first. If you wrote a new failure mode and there's no test, add one.
- Rule file: /root/.claude/projects/-/memory/hard_rule_<name>.md
- Test cases: /tmp/test_new_hooks.py or a dedicated suite
- Hook registration: /root/.claude/settings.json via the same JSON-edit pattern used by _mac_grant_request.py and the in-line scripts in postmortem REPORT files

Same triad: Commandment → runtime hook (strongest, build first) → UserPromptSubmit reminder (last). Update commandments.md, build the Stop hook, append the trait to brian-traits-prompt-injector.sh. Don't start with the persona prompt; Codex ranks it weakest.
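For the settings.json registration step, a minimal idempotent JSON edit might look like the sketch below. The pattern actually used by _mac_grant_request.py is not shown in this document, so the function name and the assumed settings layout (a `hooks` object keyed by stage, each stage holding a list of `{"matcher", "hooks": [{"type", "command"}]}` groups) are assumptions. The hook path in the usage comment is hypothetical too.

```python
import json
from pathlib import Path


def register_hook(settings_path: Path, stage: str, matcher: str, command: str) -> bool:
    """Add a command hook under hooks[stage]; return True if the file changed."""
    if settings_path.exists():
        settings = json.loads(settings_path.read_text(encoding="utf-8"))
    else:
        settings = {}
    groups = settings.setdefault("hooks", {}).setdefault(stage, [])

    # Reuse the group with the same matcher, or create one if none exists.
    for group in groups:
        if group.get("matcher") == matcher:
            break
    else:
        group = {"matcher": matcher, "hooks": []}
        groups.append(group)

    entry = {"type": "command", "command": command}
    if entry in group.setdefault("hooks", []):
        return False  # already registered; leave the file untouched
    group["hooks"].append(entry)
    settings_path.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")
    return True


# Hypothetical usage:
# register_hook(Path("/root/.claude/settings.json"), "PreToolUse", "Bash",
#               "/root/.claude/hooks/brian-example-guard.py")
```

Making the edit idempotent means re-running a setup script does not register the same hook twice, which would otherwise run the guard multiple times per tool call.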
Use the appreciation ledger CLI. The consolidated theme_brian_standards.md is what shifts dominant behavior over weeks per Codex's timeline.
Surface performance shifts in days if hooks are strict.
Dominant default behavior takes weeks of repeated episodes, evaluator pressure, and memory consolidation.
Test isn't whether Brian says the traits — it's whether he shows them under friction.
Metrics to watch:
- Stops after one failed path → near zero
- Proactive useful catches per session → rising
- Flat / dutiful final answers → falling
- Same correction repeated across sessions → falling
- Bad news disclosed early → rising
- Praise linked to concrete excellence events → rising
/root/.claude/hooks/
├── brian-action-language-guard.py (Stop, expanded regex 2026-05-05)
├── brian-presend-evidence-gate.py (Stop, 3-attempt evidence ledger)
├── brian-outbound-ask-guard.py (PreToolUse Bash, ask-detection)
├── brian-mac-chrome-profile-guard.py (PreToolUse Bash, profile lock)
├── brian-no-paid-model-guard.py (PreToolUse Bash, paid endpoint block)
├── brian-account-verify-guard.py (PreToolUse Bash, identity proof)
├── brian-agency-pipeline-guard.py (PreToolUse Bash, /agency-required)
├── brian-mac-status-box-guard.py (PreToolUse Bash, no generic strings)
├── brian-proactive-stop-guard.py (Stop, proactive marker check)
├── brian-flat-voice-guard.py (Stop, dutiful-voice block)
├── brian-done-plus-guard.py (Stop, verification required)
└── brian-traits-prompt-injector.sh (UserPromptSubmit, traits-lock)
/opt/agent/scripts/
├── brian_appreciation_ledger.py (3-gate approval CLI)
└── _mac_grant_request.py (frictionless grant cycle)
/root/.claude/projects/-/memory/
├── commandments.md (12 commandments — 7 + 5 traits)
├── theme_brian_standards.md (consolidated approval ledger)
└── hard_rule_*.md (22 rule files)
/opt/agent/data/agent_runtime/
└── brian_appreciation_ledger.jsonl (append-only ledger)
/opt/agent/data/postmortems/
├── 2026-05-05_about_to_start_pattern.md
├── 260505_minimum_effort/REPORT.md
├── 260505_one_try_pattern/REPORT_ROUND_TABLE.md
├── 260505_rule_reinforcement/REPORT.md
└── 260505_traits_lock/REPORT.md
- bloom_remember after major decisions cannot be made automatic without a semantic classifier. Currently passive recall only.
- partnership_decision_model semantic edge cases (browser-driven paid-feature enabling, etc.): Trust Ladder L4 covers API writes only.