
Brian Self-Knowledge System — Review Brief for Senior Dev


Author: Brian (the AI assistant being designed) on behalf of Jonah Tebaa
Date: 2026-05-02
Audience: Senior dev / staff engineer giving a design opinion
Status: Phase 1 shipped (skeleton + seed). Phases 2–5 pending.
Ask: Stress-test the architecture before we invest in Phases 2–5 and eventual open-sourcing.


TL;DR

Jonah runs a one-person agency (Webspot / Photogenic / PGPro / Artwist) on top of an autonomous AI assistant called Brian, hosted on a Hetzner VPS. Brian is the operator: he publishes content, runs research, manages CRM, talks on Telegram, drives Jonah's Mac browser, etc.

The problem we're solving: Brian forgets what he can do. Across long-running sessions and after every Claude Code (CC) restart, Brian frequently says "I can't do X" when X is actually a two-hop composition of an account he holds, an MCP he has, and a skill he has. Or he proposes "later / tomorrow" instead of acting. Or he reaches for a primitive that's broken and falls back instead of fixing the root cause.

The fix: a file-based "self-knowledge system" at /root/.claude/system/ — a 13-file living layer that defines:
- what Brian has (atoms: accounts, keys, hosts, access)
- what Brian can compose (composites: subsystems, products, channels, agents, data)
- what Brian can do (abilities: skills, capabilities)
- what Brian must / must not do (governance: routines, boundaries)

Capabilities are derived from atoms+composites+skills, every non-trivial decision consults this layer before action, and every change to Brian's reality (new account, new skill, new failure mode) writes back to it in the same turn.

We want a senior eyeball before we go deeper.


Why this exists (the failure modes it targets)

Failure 1 — "I can't" when actually "I can if I compose"

Without a derived capability layer, Brian sees primitives but doesn't see the compositions. Example: he has n8n access AND OAuth on Jonah's Google Calendar, so he CAN check Jonah's calendar — but unless that composition is named, he doesn't reach for it.

Failure 2 — Recurring fixes that don't stick

Real example, caught and fixed today: the Brightdata MCP kept returning to "disconnected" after every fix. Root cause was a corrupt npx cache (ENOTEMPTY on a half-renamed ajv directory). Each previous fix was a restart; nobody cleaned the cache. Without a system that says "broken → root-cause → permanent fix → write down what you did and why," the bug recurred N times. The system now records the brightdata_mcp row in access.md with the root cause documented inline, and a healer cron sweeps the cache every 30 minutes for any future npx-MCP corruption.
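The sweep logic is simple enough to sketch. This is a hypothetical reconstruction of what the healer cron does (the real script lives at /opt/agent/scripts/heal_npx_cache.sh; the cache path, search depth, and name pattern here are assumptions): npm leaves half-renamed, dot-prefixed directories like `.ajv-<hash>` behind when a rename hits ENOTEMPTY, and any such directory under the npx cache is rename debris that is safe to delete.

```shell
# Hypothetical sketch of heal_npx_cache.sh — path, depth, and pattern are assumptions.
heal_npx_cache() {
  local cache="${1:-$HOME/.npm/_npx}"
  # Dot-prefixed, hyphenated directories under the cache are npm's
  # half-renamed leftovers from a failed (ENOTEMPTY) rename.
  find "$cache" -maxdepth 3 -type d -name '.*-*' -print0 2>/dev/null \
    | xargs -0 -r rm -rf
}
```

Run from cron every 30 minutes, this keeps any npx-launched MCP from tripping over stale rename debris again.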

Failure 3 — "Later / tomorrow / follow-up"

Brian was caught proposing deferred work for things he could do now. A hard rule was added today (hard_rule_instant_action.md). The system gives that rule teeth by making "what I can do right now" enumerable and verifiable.

Failure 4 — Asking Jonah when the answer is in the toolbox

Companion hard rule (hard_rule_jonah_is_last_resort.md) — exhaust capabilities → skills → ToolSearch → round-table (10 free AI subsystems) → tool composition → sibling hosts → web fallback chain BEFORE asking Jonah. The system makes that exhaustion mechanical.
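Mechanical exhaustion means the ladder is literally iterable. A minimal sketch, where every `try_*` helper and `ask_jonah` are assumed names standing in for the real resolvers (each returns 0 on success):

```shell
# Hypothetical exhaustion ladder — resolver names are assumptions.
resolve() {
  local step
  for step in try_capabilities try_skills try_toolsearch try_roundtable \
              try_composition try_sibling_hosts try_web_fallback; do
    # First rung that succeeds wins; Jonah is only reached when all fail.
    if "$step" "$@"; then return 0; fi
  done
  ask_jonah "$@"
}
```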

Failure 5 — Knowledge atrophy across sessions

Claude Code sessions reset every conversation. MEMORY.md helps but isn't structured for "what tools/accounts/capabilities exist." This system + a CORE pointer line gets the index loaded every session.


Architecture

File layout (/root/.claude/system/)

| Layer | File | What it lists |
|---|---|---|
| 1. Atoms | accounts.md | login identities (LinkedIn, FB page, IG, cal.com, Canva, Stripe, GitHub, Gmail aliases) |
| 1. Atoms | keys.md | API keys / tokens / OAuth grants — INDEX only, never values |
| 1. Atoms | hosts.md | machines (Hetzner, contabo, Mac, GStack hosts) |
| 1. Atoms | access.md | granted reach into surfaces I don't own (Jonah's mac mic / SSH / CDP, MCPs, ad accounts on others' BMs) |
| 2. Composites | subsystems.md | external AI subsystems with their own brains (manus, hermes, codex, jules, openhands, gemini, grok, perplexity, antigravity, vibe) |
| 2. Composites | products.md | products WE built/own (20CRM, 4 owned sites, board.jonahtebaa.com, agency pipeline, Photogenic / PGPro / Artwist / Webspot) |
| 2. Composites | channels.md | outbound surfaces (TG COMMS / LOGS / Hermes, voice, board, evening brief, agency cross-posts) |
| 2. Composites | agents.md | dispatchable sub-agents inside the CC runtime (gsd-*, code-reviewer, Explore, hermes-curator) |
| 2. Composites | data.md | durable stores (bloom memory db, 20CRM DB, Drive paths, planning dirs, session_handoffs) |
| 3. Abilities | skills.md | slash-callable skills (curated index pointing into ~/.claude/skills/ and plugin skills) |
| 3. Abilities | capabilities.md | composed actions: atom + composite + skill → outcome |
| 4. Governance | routines.md | scheduled cadence (daily 02:00 GEO hour, daily /agency post, evening brief, signal sweep) |
| 4. Governance | boundaries.md | negative space (no paid models, WA only Jonah's number, no Jonah personal accounts, no cold outreach during survival mode) |

Schema (per file)

YAML frontmatter at top + a markdown table of entries. Frontmatter declares the columns. Atoms (Layer 1) and capabilities REQUIRE a probe column — a one-line shell command that returns 0 if the entry works.

Example row from capabilities.md:

| C014 | move file from SHARED COMPUTERS to ~/Downloads on Mac | jonah_drive, jonah_mac_ssh | jonah_mac (host) | (rsync via ssh) | `ssh jonah-mac ls ~/Downloads` | active | TBD |

Each capability cites its atoms/composites by name. If a dependency goes red, the capability cascades red.
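The probe pass over such a table can be sketched as a lazy loop: pull the probe column, run it, mark the row. This is an assumed parsing (the probe sits in column 6, matching the C014 example above), not the shipped Phase 2 runner:

```shell
# Hypothetical lazy-probe pass — column positions and the C-prefixed id
# scheme follow the example row; the parsing itself is an assumption.
probe_rows() {
  local file="$1"
  grep -E '^\| *C[0-9]+' "$file" \
    | while IFS='|' read -r _ id _desc _atoms _comps _skill probe _rest; do
        id=$(echo $id)                               # trim surrounding whitespace
        probe=$(echo "$probe" | sed 's/^[ `]*//; s/[ `]*$//')
        if sh -c "$probe" >/dev/null 2>&1; then
          echo "$id green"
        else
          echo "$id red"      # cascading dependents red would hook in here
        fi
      done
}
```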

Layering rule

atoms (1)  →  composites (2)  →  abilities (3)  →  governance (4)

Discovery (how the system gets loaded)

Single CORE pointer line in ~/.claude/projects/-/memory/MEMORY.md (always loaded by the harness):

> 🚨 Self-knowledge system (`/root/.claude/system/`) — living layer of 13 files (atoms→composites→abilities→governance). Consult BEFORE any non-trivial action; UPDATE in same turn when a new account/key/host/access/skill/capability/routine/boundary appears. Daily-brief reports its discovery effectiveness. Sub-agents read-only. Index: [system/README.md](/root/.claude/system/README.md)

Effectiveness of this discovery mechanism is reported honestly in the daily evening brief (Jonah's explicit request). If single-pointer is insufficient under context pressure, we'll add a SessionStart hook + a /sys slash command (belt + suspenders).

Update protocol (the system stays alive)

| Trigger | Action |
|---|---|
| New account / key / host / access granted | append row to layer-1 file in same turn |
| New skill installed / curated | append row to skills.md in same turn |
| New capability proven (any atom + composite + skill → outcome I successfully execute that isn't already listed) | append row to capabilities.md in same turn |
| Probe fails | run autofix matrix; if unfixable, ping TG LOGS |
| New boundary discovered | append row to boundaries.md; hard rules also added to settings.json hooks |

Health-check model

Hybrid:
- Critical entries (paid keys, hosts, ad accounts, important MCPs) → cron-probed nightly.
- Everything else → lazy-probed when about to use.
- Probe outputs land in /root/.claude/system/.probe_status.json.
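The brief fixes the sink path but not its format; one workable assumption is one JSON object per probe result (JSONL), appended as probes run. A sketch under that assumption:

```shell
# Hypothetical probe-status sink — the JSONL format is an assumption;
# only the path comes from the brief.
STATUS_FILE="${STATUS_FILE:-/root/.claude/system/.probe_status.json}"
record_probe() {  # usage: record_probe <entry_id> <probe_exit_code>
  local id="$1" rc="$2" state
  if [ "$rc" -eq 0 ]; then state=green; else state=red; fi
  printf '{"id":"%s","state":"%s","checked_at":"%s"}\n' \
    "$id" "$state" "$(date -u +%FT%TZ)" >> "$STATUS_FILE"
}
```

Append-only keeps writers trivial; a compaction step (last record per id wins) would give the read side a current snapshot.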

Auto-fix matrix (no-ask)

| Failure | Action |
|---|---|
| Expired API token w/ refresh procedure | run refresh, update keys.md last_verified |
| Stopped systemd service | systemctl restart, log to TG LOGS if it stays down |
| Unreachable host | Tailscale reconnect → reboot via console → ping TG LOGS |
| Stale browser cookie (LinkedIn-style) | refresh procedure, update keys.md |
| MCP server disconnected | reconnect attempt, log if persistent |
| Cron job missing | re-arm from routines.md |
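The matrix above reduces to a dispatch table. A sketch, where the failure-class and routine names are assumptions, and anything outside the no-ask set falls through to escalation:

```shell
# Hypothetical no-ask dispatcher — class and routine names are assumptions.
autofix() {
  case "$1" in
    token_expired)     echo refresh_token ;;
    service_down)      echo restart_service ;;
    host_unreachable)  echo tailscale_reconnect_then_reboot ;;
    cookie_stale)      echo refresh_cookie ;;
    mcp_disconnected)  echo reconnect_mcp ;;
    cron_missing)      echo rearm_from_routines ;;
    *)                 echo escalate_tg_logs ;;   # always-ask territory
  esac
}
```

The default branch is the safety property: any failure class not explicitly enumerated as no-ask escalates rather than self-heals.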

Always-ask matrix (escalate, don't auto-fix)

| Failure | Why ask |
|---|---|
| OAuth re-grant needed | needs Jonah's browser (interactive consent) |
| Paid-key reactivation | money |
| Account password reset | identity-sensitive |

Boundary enforcement (graduated)

Sub-agent access


Where we are right now (Phase 1 shipped)

Real-world test (today)

The Brightdata MCP kept disconnecting. I checked access.md (status: DISCONNECTED) and traced the root cause to a corrupt npx cache (ENOTEMPTY on a half-renamed ajv dir). I wiped the cache, switched the config to a global install with a direct node server.js launch (no more npx churn), wrote a healer cron (/opt/agent/scripts/heal_npx_cache.sh) to catch any future npx-MCP corruption, and updated the access.md row to active with the root cause inline. Logged in bloom (id 717).

This is exactly the kind of recurrence-prevention the system is meant to enforce.


What's still pending

Phase 2 — Probes + autofix runners

Phase 3 — Boundary hooks

Phase 4 — Capabilities generator

Phase 5 — Sanitize + open-source


Known data debt (TBD cells in atoms)

These are honest gaps, not bugs. They backfill on next contact with each surface.


Decisions locked (with senior-eye-relevant rationale)

These came from a structured 12-question form (f8d1e19897364181ab439c92fa4ee3a1):

| Decision | Choice | Why |
|---|---|---|
| Storage location | /root/.claude/system/ (alongside MEMORY.md) | always-near-identity; not in agent repo because identity layer crosses many repos |
| Schema | Hybrid (YAML frontmatter + markdown table) | machine-parseable for probes/generator; readable by humans |
| Capabilities generation | Hybrid (machine candidates → human curate) | machine cross-product over atoms is noisy; curation is judgment |
| Health-check | Hybrid (cron critical, lazy rest) | full cron is expensive and most entries don't change; critical ones must be live |
| Escalation channel | TG LOGS | we already route automated errors there; same channel = same eyeballs |
| Invocation trigger | Every non-trivial action | matches the existing using-capabilities skill trigger; trivial actions don't need it |
| Boundary enforcement | Graduated (hard hook-blocked, soft advisory) | full hook enforcement adds cognitive friction for behavioral rules; reserve hooks for value-loss surfaces |
| Auto-fix scope | Tokens / services / hosts / cookies / MCPs / cron | OAuth re-grants need browser, paid keys are money, password resets are identity-sensitive — those always ask |
| Ship target | Public skill / template | format is generic; could help others running their own agents |
| Per-entry probe | Required for atoms + capabilities, optional for rest | atoms are the trust root; if you can't probe, you don't have it |
| Discovery | MEMORY.md CORE pointer (always loaded) — TEST CYCLE | report effectiveness in daily brief, fall back to belt+suspenders if insufficient |
| Sub-agent access | Read-only | single-writer invariant |
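For the "machine candidates" half of the hybrid generator, the simplest baseline is a plain cross-product of atoms × skills, emitted as rows for human curation (the judgment half). A sketch with illustrative list contents; the Phase 4 generator may use tagged dependencies instead:

```shell
# Hypothetical candidate emitter — a deliberately naive cross-product
# baseline; real atoms/skills lists and the row format are assumptions.
emit_candidates() {  # usage: emit_candidates "<atoms>" "<skills>"
  local a s
  for a in $1; do
    for s in $2; do
      echo "| TBD | $a + $s -> ? | needs-curation |"
    done
  done
}
```

Even this naive version makes the noise problem concrete: N atoms × M skills candidates, most nonsensical, which is exactly why curation stays human.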

What we want from senior review

  1. Layering soundness — does atoms → composites → abilities → governance hold up, or are we conflating things? Is there a missing layer? Is data.md actually an atom (storage=primitive) instead of a composite?

  2. Capability composition format — is atom + composite + skill → outcome rich enough, or do we need a formal DSL? Should capabilities have preconditions / postconditions / cost annotations?

  3. Single-writer invariant — sub-agents read-only is simple but possibly limiting. Is there a clean read-write protocol that scales without fragmenting? (Append-only journal that main Brian compacts?)

  4. Health-check / auto-fix model — is hybrid the right call, or should we pay the cost of full continuous probing? What's the right way to express probes (shell one-liner vs proper plugin interface)?

  5. Generator design (Phase 4) — what's the right algorithm for emitting capability candidates? Pure cross-product is noisy. Tagged dependencies? LLM-driven candidate emission with rule-based filtering?

  6. Boundary enforcement — is graduated (hard/soft) sound, or should every boundary be hook-enforced? What's the right hook architecture for a self-modifying system (the agent can add new boundaries, but a frozen settings.json can't be amended live)?

  7. Drift / staleness — what's the failure mode when the system is wrong (e.g. claims a capability exists but the underlying account was revoked)? Is "probe + cascade red" enough, or do we need a freshness budget per row?

  8. Open-sourcing trade-offs — is this even a generic pattern, or is it overfit to Jonah's setup? What sanitizer would make it shippable?

  9. Comparison points — is there prior art we should learn from? (LangChain agent capabilities, Manus' internal model, AGP / agent governance protocols, internal Anthropic agent registries — what's the closest neighbor?)

  10. What would you cut? — if we had to ship Phase 2 next week with half the surface, what's the load-bearing 50%?


Appendices

A. Index file (live)

/root/.claude/system/README.md — full schema, autofix matrix, phase plan.

B. Companion hard rules (also locked today)

C. Operational context

D. Ground truth files for cross-reference


End of brief. Honest review welcome — the goal is for this to be load-bearing, not pretty.