2026-04-20 16:14 (Beirut) (backfill from DOCUMENTATION/)

Capability-First Planning

Brian is forced to consider his full toolbox before acting, not just the first tool that comes to mind. Enforcement is procedural (a transcript-visible block) and audited weekly (cron); nothing is blocked at runtime.

Why it exists

Jonah's rule: "there should not be a single thing you are capable of doing that is not listed in that file, and Brian must USE it." This fixes single-tool myopia: Brian must not surface fake constraints ("I can't", "you'd need", "the only way") before he has tried composing SSH + osascript, browser stacks, delegate wrappers, and sub-agents.

The three layers

1. Registry — `/opt/agent/data/capabilities.md`

The registry holds 3,065 CAP_IDs in the v2 schema: required fields layer/source/what/account/status, plus optional HARD/composes/latency/cron-safe. It covers hooks, MCP, slash commands, skills, cron, Docker, systemd, LiteLLM, Composio, SSH/osascript, delegates (OC, Codex, Gemini, Jules, Manus, etc.), Mac surfaces, API routes, and plugins. 49 lines carry inlined HARD: rules from CORE memory. 172 UNKNOWN sources are queued for Pass 2. 4 dormant paths are marked status: stale.
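A minimal sketch of the missing-field check implied by the schema above. The `### <CAP_ID>` heading and the `HARD: ` line prefix come from the grep commands in Operations; the `key: value` layout for the other fields, the sample CAP_IDs, and the function names are assumptions made for illustration, not the real validator.

```python
import re

# Field names taken from the v2 schema description.
REQUIRED = ("layer", "source", "what", "account", "status")

def parse_entries(text):
    """Split registry text into {cap_id: {field: value}}.

    Assumes each entry starts with '### <CAP_ID>' and that fields are
    written as 'key: value' lines (the layout is an assumption).
    """
    entries = {}
    current = None
    for line in text.splitlines():
        heading = re.match(r"^### (\S.*)$", line)
        if heading:
            current = entries.setdefault(heading.group(1).strip(), {})
            continue
        field = re.match(r"^([A-Za-z-]+):[ \t]*(.*)$", line)
        if field and current is not None:
            current[field.group(1)] = field.group(2).strip()
    return entries

def missing_fields(entries):
    """Return {cap_id: [missing required fields]} for incomplete entries."""
    report = {}
    for cap_id, fields in entries.items():
        missing = [name for name in REQUIRED if name not in fields]
        if missing:
            report[cap_id] = missing
    return report

# Hypothetical sample data, not copied from the real registry.
SAMPLE = """\
### ssh_mac_osascript
layer: ssh
source: /opt/agent/hooks/example.sh
what: Run AppleScript on the Mac over SSH
account: root
status: active
HARD: example rule text

### incomplete_entry
layer: cron
what: Entry missing source, account, and status
"""

print(missing_fields(parse_entries(SAMPLE)))
```

On the sample, only `incomplete_entry` is reported; a registry with "0 missing-field violations" would produce an empty report.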

2. Enforcement — the `<capability-plan>` block

When Brian is about to take a multi-step / boundary-crossing / constraint-surfacing action, he must emit this block above his first tool call:

<capability-plan>
goal: <1-line>
constraints: <account | latency | cron-safe | blast-radius>
considered:
  - <CAP_ID_1> — <why>
  - <CAP_ID_2> — <why>
  - <CAP_ID_3> — <why>  (≥1 must be a composition)
chosen: <CAP_ID_N>
why: <tradeoff>
discarded: <CAP_ID_X> — <reason>
</capability-plan>

Hard rules: ≥3 credible options, at least one composition, no "I can't" language outside discarded:, no filler padding.
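The hard rules above can be partly checked mechanically. The sketch below counts the `considered:` bullets and flags constraint language outside `discarded:`. Treating the three phrases from "Why it exists" as the forbidden list is an assumption, and the composition rule is not checked because the template does not say how a composition is marked. The function name and sample plans are hypothetical.

```python
# Phrases treated as "I can't" language (assumed list, see lead-in).
FORBIDDEN = ("i can't", "you'd need", "the only way")
SEP = " \u2014 "  # separator between CAP_ID and reason in the template

def check_plan(block):
    """Return a list of rule violations for one <capability-plan> block."""
    problems = []
    lines = block.strip().splitlines()

    # Collect the bullet lines that directly follow "considered:".
    considered = []
    in_considered = False
    for line in lines:
        if line.startswith("considered:"):
            in_considered = True
        elif in_considered and line.lstrip().startswith("- "):
            considered.append(line.strip()[2:])
        else:
            in_considered = False
    if len(considered) < 3:
        problems.append(f"only {len(considered)} considered option(s), need 3")

    # Constraint language is tolerated only on the discarded: line.
    for line in lines:
        if line.startswith("discarded:"):
            continue
        lowered = line.lower().replace("\u2019", "'")
        for phrase in FORBIDDEN:
            if phrase in lowered:
                problems.append(f"forbidden phrase {phrase!r} in: {line.strip()}")
    return problems

GOOD_PLAN = "\n".join([
    "<capability-plan>",
    "goal: capture a screenshot of the Mac desktop",
    "constraints: latency",
    "considered:",
    "  - cap_a" + SEP + "direct route",
    "  - cap_b" + SEP + "slower but cron-safe",
    "  - cap_a+cap_b" + SEP + "composition of both",
    "chosen: cap_a",
    "why: lowest latency",
    "discarded: cap_c" + SEP + "I can't reach that host from here",
    "</capability-plan>",
])

BAD_PLAN = "\n".join([
    "<capability-plan>",
    "goal: capture a screenshot of the Mac desktop",
    "constraints: latency",
    "considered:",
    "  - cap_a" + SEP + "direct route",
    "  - cap_b" + SEP + "slower but cron-safe",
    "chosen: cap_a",
    "why: the only way to reach the Mac",
    "discarded: cap_c" + SEP + "too slow",
    "</capability-plan>",
])

print(check_plan(GOOD_PLAN))
print(check_plan(BAD_PLAN))
```

The good plan passes even though `discarded:` contains "I can't", since that is the one place the rules allow it. The bad plan fails twice: two options instead of three, and "the only way" in `why:`.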

3. Audit — the Sunday cron

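`/opt/agent/cron/weekly_capability_audit.sh` runs on the schedule `0 23 * * 6`, recorded as TZ=Asia/Beirut (Sunday 02:00 Beirut). Read literally, `0 23 * * 6` is Saturday 23:00; it corresponds to Sunday 02:00 Beirut only when cron evaluates it in UTC and Beirut is on UTC+3, so check the host timezone if the exact run time matters. The script wraps the validator, the deadpath scanner, and a 7-day seed-phrase grep across handoffs/memory/logs. The summary routes to TG LOGS only (never COMMS) via `brian_alert.py --error-log`.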
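The mapping between `0 23 * * 6` and "Sunday 02:00 Beirut" can be verified with a small worked example. It assumes the cron host evaluates the schedule in UTC and uses a fixed UTC+3 offset for Beirut summer time; neither assumption is confirmed by this page.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc
# Fixed offset for Beirut during summer time (assumption: UTC+3).
BEIRUT_SUMMER = timezone(timedelta(hours=3), "EEST")

# 2026-04-18 is a Saturday, so this is one firing of "0 23 * * 6"
# on a host whose cron runs in UTC.
fire_utc = datetime(2026, 4, 18, 23, 0, tzinfo=UTC)
fire_beirut = fire_utc.astimezone(BEIRUT_SUMMER)

print(fire_utc.isoformat())
print(fire_beirut.isoformat())
```

Under those assumptions the firing lands on 2026-04-19 at 02:00 local time, which is a Sunday. With a UTC+2 winter offset the same schedule would land at 01:00 instead.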

Trigger mechanism

UserPromptSubmit hook /opt/agent/hooks/capability_loader.sh fires on prompts > 80 chars and injects the pointer:

Read /opt/agent/data/capabilities.md and invoke using-capabilities skill before acting. Emit the <capability-plan> block above your first tool call.

Skips: prompts starting with /agency, /chrome-mac, /handoff, /dispatch, /cap-refresh; sub-agents (CLAUDE_SUB_AGENT=1); short prompts.
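The gate and skip list can be sketched as a single function. The real hook is a shell script; this Python version only mirrors the rules stated above, and the function name `pointer_for` and the choice to test the raw prompt without trimming whitespace are assumptions.

```python
SKIP_PREFIXES = ("/agency", "/chrome-mac", "/handoff", "/dispatch", "/cap-refresh")
MIN_CHARS = 80  # the hook fires only on prompts longer than this

POINTER = (
    "Read /opt/agent/data/capabilities.md and invoke using-capabilities "
    "skill before acting. Emit the <capability-plan> block above your "
    "first tool call."
)

def pointer_for(prompt, env):
    """Return the pointer text to inject, or None when the hook should skip."""
    if env.get("CLAUDE_SUB_AGENT") == "1":
        return None  # sub-agents are skipped
    if len(prompt) <= MIN_CHARS:
        return None  # short prompts are skipped
    if prompt.startswith(SKIP_PREFIXES):
        return None  # slash commands on the skip list
    return POINTER

long_prompt = "Please move the screenshots folder to the Mac and " + "x" * 60
print(pointer_for(long_prompt, {}) is not None)
print(pointer_for("/handoff " + "x" * 100, {}))
print(pointer_for(long_prompt, {"CLAUDE_SUB_AGENT": "1"}))
```

The sub-agent check comes first so that a sub-agent never receives the pointer, regardless of prompt length.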

Wired in /root/.claude/settings.json under hooks.UserPromptSubmit (exactly once).
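The "exactly once" wiring can be checked with a short script. The sketch below walks whatever sits under `hooks.UserPromptSubmit` and counts strings that mention the loader path, so it does not depend on the exact nesting. The sample settings layout and the function names are illustrative assumptions, not a copy of the real file.

```python
import json

HOOK_PATH = "/opt/agent/hooks/capability_loader.sh"

def count_references(node, needle):
    """Recursively count string values that contain needle."""
    if isinstance(node, str):
        return 1 if needle in node else 0
    if isinstance(node, dict):
        return sum(count_references(value, needle) for value in node.values())
    if isinstance(node, list):
        return sum(count_references(item, needle) for item in node)
    return 0

def loader_count(settings):
    """Count loader references under hooks.UserPromptSubmit."""
    section = settings.get("hooks", {}).get("UserPromptSubmit", [])
    return count_references(section, HOOK_PATH)

# Illustrative settings layout (assumed, for demonstration only).
SAMPLE = json.loads("""
{
  "hooks": {
    "UserPromptSubmit": [
      {"hooks": [{"type": "command",
                  "command": "/opt/agent/hooks/capability_loader.sh"}]}
    ]
  }
}
""")

print(loader_count(SAMPLE))
```

To run it against the live file, load `/root/.claude/settings.json` with `json.load` and pass the result to `loader_count`; any value other than 1 means the hook is missing or wired more than once.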

File map

| Path | Role |
| --- | --- |
| `/opt/agent/data/capabilities.md` | The 3,065-entry registry (v2 schema) |
| `/opt/agent/data/capabilities_audit_queue.jsonl` | Dead-path entries awaiting review |
| `/opt/agent/data/capabilities_unknown_sources.jsonl` | 172 UNKNOWN sources for Pass 2 |
| `/opt/agent/hooks/capabilities_validator.sh` | Schema + layer-allowlist + deadpath chain check |
| `/opt/agent/hooks/capabilities_deadpath_scan.py` | 15-source-kind classifier, `--mark-stale`/`--dry-run` |
| `/opt/agent/hooks/capabilities_hard_inline.py` | One-shot HARD-rule inliner (already run) |
| `/opt/agent/hooks/capability_loader.sh` | UserPromptSubmit hook (80-char gate + skip list) |
| `/opt/agent/cron/weekly_capability_audit.sh` | Sunday 02:00 Beirut audit |
| `/root/.claude/skills/using-capabilities/SKILL.md` | The rigid skill payload (template + hard rules) |
| `/root/.claude/commands/cap-refresh.md` | Slash command to re-run validator + deadpath manually |
| `/root/.claude/projects/-/memory/hard_rule_using_capabilities.md` | CORE-tier memory rule |
| `/root/.claude/projects/-/memory/MEMORY.md` | CORE pointer under "Compose the FULL toolbox" line |

Operations

# Manual validate + deadpath
/opt/agent/hooks/capabilities_validator.sh && /opt/agent/hooks/capabilities_deadpath_scan.py

# Run the weekly audit ad-hoc
/opt/agent/cron/weekly_capability_audit.sh
tail -40 /opt/agent/logs/capabilities_weekly_audit.log

# CAP_ID count (currently 3065)
grep -c '^### ' /opt/agent/data/capabilities.md

# HARD-inlined lines (currently 49)
grep -c '^HARD: ' /opt/agent/data/capabilities.md

# Confirm cron installed
crontab -l | grep weekly_capability_audit
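A plain `grep -c '^### '` counts every line that starts with `### `, including any that sit inside a fenced example in the registry. The sketch below shows one way to count CAP_ID headings with fenced blocks skipped. It is not the scanner's actual code; the function names and sample text are hypothetical.

```python
FENCE_MARK = "`" * 3  # built this way so the example itself stays fence-safe

def strip_fenced(text):
    """Drop fenced code blocks (backtick or tilde fences) from markdown text."""
    kept = []
    in_fence = False
    for line in text.splitlines():
        if line.lstrip().startswith((FENCE_MARK, "~~~")):
            in_fence = not in_fence  # toggle on opening and closing fences
            continue
        if not in_fence:
            kept.append(line)
    return "\n".join(kept)

def count_cap_ids(text):
    """Count '### ' headings that are outside fenced blocks."""
    return sum(1 for line in strip_fenced(text).splitlines()
               if line.startswith("### "))

# Hypothetical registry fragment with a fenced template example.
SAMPLE = "\n".join([
    "# Registry",
    "",
    "### cap_one",
    "layer: hook",
    "",
    FENCE_MARK,
    "### <CAP_ID>",
    "layer: <layer>",
    FENCE_MARK,
    "",
    "### cap_two",
    "layer: cron",
])

naive = sum(1 for line in SAMPLE.splitlines() if line.startswith("### "))
print(naive, count_cap_ids(SAMPLE))
```

On the sample, the naive count is 3 and the fence-aware count is 2, so the two methods differ by exactly the number of fenced example headings.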

Gotchas

\s*$ in Python MULTILINE eats newlines. capabilities_deadpath_scan.py uses [ \t]*$ — don't regress.
Fenced code blocks contain a fake ### <CAP_ID> example. Scanner + validator both strip fenced blocks before parsing — keep this.
Plan's Step 6 "30-40 CAP_IDs" range is stale (pre-v2). Current v2 registry is 3,065.
brian_alert.py flag is --text, not --message (plan used the wrong name).
mac_voice TimeoutExpired inside brian_alert.py is benign when tg_*.ok = true.
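The first gotcha above is easy to reproduce. In Python, `\s` includes `\n`, so under `re.MULTILINE` the pattern `\s*$` can run past the end of the current line before `$` matches. The sample string below is made up for the demonstration.

```python
import re

# One line with trailing spaces, a blank line, then another line.
text = "status: stale   \n\nnext: value\n"

greedy = re.compile(r"stale\s*$", re.MULTILINE)    # \s also matches "\n"
safe = re.compile(r"stale[ \t]*$", re.MULTILINE)   # spaces and tabs only

print(repr(greedy.search(text).group(0)))  # 'stale   \n'
print(repr(safe.search(text).group(0)))    # 'stale   '
```

The greedy version swallows the first newline and stops just before the second one, where `$` can match. The `[ \t]*$` form stays on the original line, which is why the scanner uses it.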

Status (as of 2026-04-20)

Registry: 3,065 CAP_IDs, 0 missing-field violations, 49 HARD lines.
Hook + skill + memory + cron: live, smoke-tested 10/10 green.
Pending: Pass 2 on 172 UNKNOWN sources; crypto-pay CAP_IDs still status: dormant (survival-priority revival).

Related memory

hard_rule_using_capabilities.md: the CORE enforcement rule.
feedback_compose_full_toolbox.md: the originating philosophical rule (kept for context; superseded procedurally by the skill).
