Author: Brian (the AI assistant being designed) on behalf of Jonah Tebaa
Date: 2026-05-02
Audience: Senior dev / staff engineer giving a design opinion
Status: Phase 1 shipped (skeleton + seed). Phases 2–5 pending.
Ask: Stress-test the architecture before we invest in Phases 2–5 and eventual open-sourcing.
Jonah runs a one-person agency (Webspot / Photogenic / PGPro / Artwist) on top of an autonomous AI assistant called Brian, hosted on a Hetzner VPS. Brian is the operator: he publishes content, runs research, manages CRM, talks on Telegram, drives Jonah's Mac browser, etc.
The problem we're solving: Brian forgets what he can do. Across long-running sessions and after every CC restart, Brian frequently says "I can't do X" when actually X is a 2-hop combination of an account he holds + an MCP he has + a skill he has. Or he proposes "later / tomorrow" instead of acting. Or he reaches for a primitive that's broken and falls back instead of fixing the root cause.
The fix: a file-based "self-knowledge system" at /root/.claude/system/ — a 13-file living layer that defines:
- what Brian has (atoms: accounts, keys, hosts, access)
- what Brian can compose (composites: subsystems, products, channels, agents, data)
- what Brian can do (abilities: skills, capabilities)
- what Brian must / must not do (governance: routines, boundaries)
Capabilities are derived from atoms+composites+skills, every non-trivial decision consults this layer before action, and every change to Brian's reality (new account, new skill, new failure mode) writes back to it in the same turn.
We want a senior eyeball before we go deeper.
Without a derived capability layer, Brian sees primitives but doesn't see the compositions. Example: he has n8n access AND OAuth on Jonah's Google Calendar, so he CAN check Jonah's calendar — but unless that composition is named, he doesn't reach for it.
Real example caught and fixed today: Brightdata MCP kept returning to "disconnected" after every fix. Root cause was a corrupt npx cache (ENOTEMPTY on a half-renamed ajv directory). Each previous fix was a restart; nobody cleaned the cache. Without a system that says "broken → root-cause → permanent fix → write down what you did and why," the bug recurred N times. The system now records brightdata_mcp row in access.md with the root cause documented inline, and a healer cron sweeps the cache every 30 min for any future npx-MCP corruption.
Brian was caught proposing deferred work for things he could do now. A hard rule was added today (hard_rule_instant_action.md). The system gives that rule teeth by making "what I can do right now" enumerable and verifiable.
Companion hard rule (hard_rule_jonah_is_last_resort.md) — exhaust capabilities → skills → ToolSearch → round-table (10 free AI subsystems) → tool composition → sibling hosts → web fallback chain BEFORE asking Jonah. The system makes that exhaustion mechanical.
Claude Code sessions reset every conversation. MEMORY.md helps but isn't structured for "what tools/accounts/capabilities exist." This system + a CORE pointer line gets the index loaded every session.
/root/.claude/system/)| Layer | File | What it lists |
|---|---|---|
| 1. Atoms | accounts.md | login identities (LinkedIn, FB page, IG, cal.com, Canva, Stripe, GitHub, Gmail aliases) |
| 1. Atoms | keys.md | API keys / tokens / OAuth grants — INDEX only, never values |
| 1. Atoms | hosts.md | machines (Hetzner, contabo, Mac, GStack hosts) |
| 1. Atoms | access.md | granted reach into surfaces I don't own (Jonah's mac mic / SSH / CDP, MCPs, ad accounts on others' BMs) |
| 2. Composites | subsystems.md | external AI subsystems with their own brains (manus, hermes, codex, jules, openhands, gemini, grok, perplexity, antigravity, vibe) |
| 2. Composites | products.md | products WE built/own (20CRM, 4 owned sites, board.jonahtebaa.com, agency pipeline, Photogenic / PGPro / Artwist / Webspot) |
| 2. Composites | channels.md | outbound surfaces (TG COMMS / LOGS / Hermes, voice, board, evening brief, agency cross-posts) |
| 2. Composites | agents.md | dispatchable sub-agents inside the CC runtime (gsd-*, code-reviewer, Explore, hermes-curator) |
| 2. Composites | data.md | durable stores (bloom memory db, 20CRM DB, Drive paths, planning dirs, session_handoffs) |
| 3. Abilities | skills.md | slash-callable skills (curated index pointing into ~/.claude/skills/ and plugin skills) |
| 3. Abilities | capabilities.md | composed actions: atom + composite + skill → outcome |
| 4. Governance | routines.md | scheduled cadence (daily 02:00 GEO hour, daily /agency post, evening brief, signal sweep) |
| 4. Governance | boundaries.md | negative space (no paid models, WA only Jonah's number, no Jonah personal accounts, no cold outreach during survival mode) |
YAML frontmatter at top + a markdown table of entries. Frontmatter declares the columns. Atoms (Layer 1) and capabilities REQUIRE a probe column — a one-line shell command that returns 0 if the entry works.
Example row from capabilities.md:
| C014 | move file from SHARED COMPUTERS to ~/Downloads on Mac
| jonah_drive, jonah_mac_ssh | jonah_mac (host)
| (rsync via ssh) | `ssh jonah-mac ls ~/Downloads` | active | TBD |
Each capability cites its atoms/composites by name. If a dependency goes red, the capability cascades red.
atoms (1) → composites (2) → abilities (3) → governance (4)
Single CORE pointer line in ~/.claude/projects/-/memory/MEMORY.md (always loaded by the harness):
🚨 Self-knowledge system (`/root/.claude/system/`) — living layer of 13 files
(atoms→composites→abilities→governance). Consult BEFORE any non-trivial action;
UPDATE in same turn when a new account/key/host/access/skill/capability/routine
/boundary appears. Daily-brief reports its discovery effectiveness. Sub-agents
read-only. Index: [system/README.md](/root/.claude/system/README.md)
Effectiveness of this discovery mechanism is reported honestly in the daily evening brief (Jonah's explicit request). If single-pointer is insufficient under context pressure, we'll add a SessionStart hook + a /sys slash command (belt + suspenders).
| Trigger | Action |
|---|---|
| New account / key / host / access granted | append row to layer-1 file in same turn |
| New skill installed / curated | append row to skills.md in same turn |
| New capability proven (any atom + composite + skill → outcome I successfully execute that isn't already listed) | append row to capabilities.md in same turn |
| Probe fails | run autofix matrix. If unfixable, ping TG LOGS. |
| New boundary discovered | append row to boundaries.md. Hard rules also added to settings.json hooks. |
Hybrid:
- Critical entries (paid keys, hosts, ad accounts, important MCPs) → cron-probed nightly.
- Everything else → lazy-probed when about to use.
- Probe outputs land in /root/.claude/system/.probe_status.json.
| Failure | Action |
|---|---|
| Expired API token w/ refresh procedure | run refresh, update keys.md last_verified |
| Stopped systemd service | systemctl restart, log to TG LOGS if it stays down |
| Unreachable host | Tailscale reconnect → reboot via console → ping TG LOGS |
| Stale browser cookie (LinkedIn-style) | refresh procedure, update keys.md |
| MCP server disconnected | reconnect attempt, log if persistent |
| Cron job missing | re-arm from routines.md |
| Failure | Why ask |
|---|---|
| OAuth re-grant needed | needs Jonah's browser (interactive consent) |
| Paid-key reactivation | money |
| Account password reset | identity-sensitive |
settings.json (Phase 3) or already enforced in code (wa_send_guard.py)./root/.claude/system/./opt/agent/data/capabilities.md legacy, 260502_brian_access_inventory.md, MEMORY.md theme files).capabilities.md, each citing dependencies.hard_rule_self_knowledge_system.md feedback memory created.Brightdata MCP kept disconnecting. I checked
access.md(status: DISCONNECTED), traced root cause to corrupt npx cache (ENOTEMPTYon half-renamedajvdir), wiped cache, switched config to global install + directnode server.jslaunch (no more npx churn), wrote a healer cron/opt/agent/scripts/heal_npx_cache.shfor any future npx-MCP corruption, updatedaccess.mdrow toactivewith root cause inline. Logged in bloom (id 717).
This is exactly the kind of recurrence-prevention the system is meant to enforce.
/opt/agent/scripts/probes/li_brian.py, fb_page.py, ig_brian.py).hard but enforcement is currently (TODO: pre-tool-use hook) — file says it, hook doesn't enforce it.settings.json for: paid model calls, Stripe writes, Jonah-personal-account writes, agency-bypass publishing.wa_send_guard.py is already code-enforced; that's the model.cap-protocol (pre-existing internal alias).last_verified timestamps marked TBDThese are honest gaps, not bugs. They backfill on next contact with each surface.
These came from a structured 12-question form (f8d1e19897364181ab439c92fa4ee3a1):
| Decision | Choice | Why |
|---|---|---|
| Storage location | /root/.claude/system/ (alongside MEMORY.md) |
always-near-identity; not in agent repo because identity layer crosses many repos |
| Schema | Hybrid (YAML frontmatter + markdown table) | machine-parseable for probes/generator; readable by humans |
| Capabilities generation | Hybrid (machine candidates → human curate) | machine cross-product over atoms is noisy; curation is judgment |
| Health-check | Hybrid (cron critical, lazy rest) | full cron is expensive and most entries don't change; critical ones must be live |
| Escalation channel | TG LOGS | we already route automated errors there; same channel = same eyeballs |
| Invocation trigger | Every non-trivial action | matches the existing using-capabilities skill trigger; trivial actions don't need it |
| Boundary enforcement | Graduated (hard hook-blocked, soft advisory) | full hook enforcement adds cognitive friction for behavioral rules; reserve hooks for value-loss surfaces |
| Auto-fix scope | Tokens / services / hosts / cookies / MCPs / cron | OAuth re-grants need browser, paid keys are money, password resets are identity-sensitive — those always ask |
| Ship target | Public skill / template | format is generic; could help others running their own agents |
| Per-entry probe | Required for atoms + capabilities, optional for rest | atoms are the trust root; if you can't probe, you don't have it |
| Discovery | MEMORY.md CORE pointer (always loaded) — TEST CYCLE | report effectiveness in daily brief, fall back to belt+suspenders if insufficient |
| Sub-agent access | Read-only | single-writer invariant |
Layering soundness — does atoms → composites → abilities → governance hold up, or are we conflating things? Is there a missing layer? Is data.md actually an atom (storage=primitive) instead of a composite?
Capability composition format — is atom + composite + skill → outcome rich enough, or do we need a formal DSL? Should capabilities have preconditions / postconditions / cost annotations?
Single-writer invariant — sub-agents read-only is simple but possibly limiting. Is there a clean read-write protocol that scales without fragmenting? (Append-only journal that main Brian compacts?)
Health-check / auto-fix model — is hybrid the right call, or should we pay the cost of full continuous probing? What's the right way to express probes (shell one-liner vs proper plugin interface)?
Generator design (Phase 4) — what's the right algorithm for emitting capability candidates? Pure cross-product is noisy. Tagged dependencies? LLM-driven candidate emission with rule-based filtering?
Boundary enforcement — is graduated (hard/soft) sound, or should every boundary be hook-enforced? What's the right hook architecture for a self-modifying system (the agent can add new boundaries, but a frozen settings.json can't be amended live)?
Drift / staleness — what's the failure mode when the system is wrong (e.g. claims a capability exists but the underlying account was revoked)? Is "probe + cascade red" enough, or do we need a freshness budget per row?
Open-sourcing trade-offs — is this even a generic pattern, or is it overfit to Jonah's setup? What sanitizer would make it shippable?
Comparison points — is there prior art we should learn from? (LangChain agent capabilities, Manus' internal model, AGP / agent governance protocols, internal Anthropic agent registries — what's the closest neighbor?)
What would you cut? — if we had to ship Phase 2 next week with half the surface, what's the load-bearing 50%?
/root/.claude/system/README.md — full schema, autofix matrix, phase plan.
hard_rule_instant_action.md — "if I can do it now, I do it now"hard_rule_jonah_is_last_resort.md — exhaustion order before saying "I can't"hard_rule_self_knowledge_system.md — this whole architecture as a living rule/root/.claude/system/README.md — index/root/.claude/system/capabilities.md — ~90 composed capabilities/root/.claude/system/access.md — Brightdata fix lives here as a row/root/.claude/projects/-/memory/MEMORY.md — CORE memory with pointer line/root/.claude/projects/-/memory/hard_rule_self_knowledge_system.md — feedback memory/opt/agent/data/board_forms/f8d1e19897364181ab439c92fa4ee3a1.jsonEnd of brief. Honest review welcome — the goal is for this to be load-bearing, not pretty.