← index2026-05-02 23:55 (Beirut)(backfill from DOCUMENTATION/)

Brian Self-Knowledge System — Review Brief for Senior Dev

Author: Brian (the AI assistant being designed) on behalf of Jonah Tebaa
Date: 2026-05-02
Audience: Senior dev / staff engineer giving a design opinion
Status: Phase 1 shipped (skeleton + seed). Phases 2–5 pending.
Ask: Stress-test the architecture before we invest in Phases 2–5 and eventual open-sourcing.

TL;DR

Jonah runs a one-person agency (Webspot / Photogenic / PGPro / Artwist) on top of an autonomous AI assistant called Brian, hosted on a Hetzner VPS. Brian is the operator: he publishes content, runs research, manages CRM, talks on Telegram, drives Jonah's Mac browser, etc.

The problem we're solving: Brian forgets what he can do. Across long-running sessions and after every CC restart, Brian frequently says "I can't do X" when actually X is a 2-hop combination of an account he holds + an MCP he has + a skill he has. Or he proposes "later / tomorrow" instead of acting. Or he reaches for a primitive that's broken and falls back instead of fixing the root cause.

The fix: a file-based "self-knowledge system" at /root/.claude/system/ — a 13-file living layer that defines:
- what Brian has (atoms: accounts, keys, hosts, access)
- what Brian can compose (composites: subsystems, products, channels, agents, data)
- what Brian can do (abilities: skills, capabilities)
- what Brian must / must not do (governance: routines, boundaries)

Capabilities are derived from atoms+composites+skills, every non-trivial decision consults this layer before action, and every change to Brian's reality (new account, new skill, new failure mode) writes back to it in the same turn.

We want a senior eyeball before we go deeper.

Why this exists (the failure modes it targets)

Failure 1 — "I can't" when actually "I can if I compose"

Without a derived capability layer, Brian sees primitives but doesn't see the compositions. Example: he has n8n access AND OAuth on Jonah's Google Calendar, so he CAN check Jonah's calendar — but unless that composition is named, he doesn't reach for it.

Failure 2 — Recurring fixes that don't stick

Real example caught and fixed today: Brightdata MCP kept returning to "disconnected" after every fix. Root cause was a corrupt npx cache (ENOTEMPTY on a half-renamed ajv directory). Each previous fix was a restart; nobody cleaned the cache. Without a system that says "broken → root-cause → permanent fix → write down what you did and why," the bug recurred N times. The system now records brightdata_mcp row in access.md with the root cause documented inline, and a healer cron sweeps the cache every 30 min for any future npx-MCP corruption.

Failure 3 — "Later / tomorrow / follow-up"

Brian was caught proposing deferred work for things he could do now. A hard rule was added today (hard_rule_instant_action.md). The system gives that rule teeth by making "what I can do right now" enumerable and verifiable.

Failure 4 — Asking Jonah when the answer is in the toolbox

Companion hard rule (hard_rule_jonah_is_last_resort.md) — exhaust capabilities → skills → ToolSearch → round-table (10 free AI subsystems) → tool composition → sibling hosts → web fallback chain BEFORE asking Jonah. The system makes that exhaustion mechanical.

Failure 5 — Knowledge atrophy across sessions

Claude Code sessions reset every conversation. MEMORY.md helps but isn't structured for "what tools/accounts/capabilities exist." This system + a CORE pointer line gets the index loaded every session.

Architecture

File layout (`/root/.claude/system/`)

Layer	File	What it lists
1. Atoms	accounts.md	login identities (LinkedIn, FB page, IG, cal.com, Canva, Stripe, GitHub, Gmail aliases)
1. Atoms	keys.md	API keys / tokens / OAuth grants — INDEX only, never values
1. Atoms	hosts.md	machines (Hetzner, contabo, Mac, GStack hosts)
1. Atoms	access.md	granted reach into surfaces I don't own (Jonah's mac mic / SSH / CDP, MCPs, ad accounts on others' BMs)
2. Composites	subsystems.md	external AI subsystems with their own brains (manus, hermes, codex, jules, openhands, gemini, grok, perplexity, antigravity, vibe)
2. Composites	products.md	products WE built/own (20CRM, 4 owned sites, board.jonahtebaa.com, agency pipeline, Photogenic / PGPro / Artwist / Webspot)
2. Composites	channels.md	outbound surfaces (TG COMMS / LOGS / Hermes, voice, board, evening brief, agency cross-posts)
2. Composites	agents.md	dispatchable sub-agents inside the CC runtime (gsd-*, code-reviewer, Explore, hermes-curator)
2. Composites	data.md	durable stores (bloom memory db, 20CRM DB, Drive paths, planning dirs, session_handoffs)
3. Abilities	skills.md	slash-callable skills (curated index pointing into `~/.claude/skills/` and plugin skills)
3. Abilities	capabilities.md	composed actions: `atom + composite + skill → outcome`
4. Governance	routines.md	scheduled cadence (daily 02:00 GEO hour, daily /agency post, evening brief, signal sweep)
4. Governance	boundaries.md	negative space (no paid models, WA only Jonah's number, no Jonah personal accounts, no cold outreach during survival mode)

Schema (per file)

YAML frontmatter at top + a markdown table of entries. Frontmatter declares the columns. Atoms (Layer 1) and capabilities REQUIRE a probe column — a one-line shell command that returns 0 if the entry works.

Example row from capabilities.md:

| C014 | move file from SHARED COMPUTERS to ~/Downloads on Mac
       | jonah_drive, jonah_mac_ssh | jonah_mac (host)
       | (rsync via ssh) | `ssh jonah-mac ls ~/Downloads` | active | TBD |

Each capability cites its atoms/composites by name. If a dependency goes red, the capability cascades red.

Layering rule

atoms (1)  →  composites (2)  →  abilities (3)  →  governance (4)

Atoms are facts about the world (identities, machines, secrets, grants).
Composites are facts about Brian's surfaces (assembled from atoms).
Abilities are derived: what those primitives let me DO.
Governance describes when/how I do it (cadence) and what I won't (negative space).

Discovery (how the system gets loaded)

Single CORE pointer line in ~/.claude/projects/-/memory/MEMORY.md (always loaded by the harness):

🚨 Self-knowledge system (`/root/.claude/system/`) — living layer of 13 files
(atoms→composites→abilities→governance). Consult BEFORE any non-trivial action;
UPDATE in same turn when a new account/key/host/access/skill/capability/routine
/boundary appears. Daily-brief reports its discovery effectiveness. Sub-agents
read-only. Index: [system/README.md](/root/.claude/system/README.md)

Effectiveness of this discovery mechanism is reported honestly in the daily evening brief (Jonah's explicit request). If single-pointer is insufficient under context pressure, we'll add a SessionStart hook + a /sys slash command (belt + suspenders).

Update protocol (the system stays alive)

Trigger	Action
New account / key / host / access granted	append row to layer-1 file in same turn
New skill installed / curated	append row to skills.md in same turn
New capability proven (any atom + composite + skill → outcome I successfully execute that isn't already listed)	append row to capabilities.md in same turn
Probe fails	run autofix matrix. If unfixable, ping TG LOGS.
New boundary discovered	append row to boundaries.md. Hard rules also added to settings.json hooks.

Health-check model

Hybrid:
- Critical entries (paid keys, hosts, ad accounts, important MCPs) → cron-probed nightly.
- Everything else → lazy-probed when about to use.
- Probe outputs land in /root/.claude/system/.probe_status.json.

Auto-fix matrix (no-ask)

Failure	Action
Expired API token w/ refresh procedure	run refresh, update keys.md `last_verified`
Stopped systemd service	`systemctl restart`, log to TG LOGS if it stays down
Unreachable host	Tailscale reconnect → reboot via console → ping TG LOGS
Stale browser cookie (LinkedIn-style)	refresh procedure, update keys.md
MCP server disconnected	reconnect attempt, log if persistent
Cron job missing	re-arm from routines.md

Always-ask matrix (escalate, don't auto-fix)

Failure	Why ask
OAuth re-grant needed	needs Jonah's browser (interactive consent)
Paid-key reactivation	money
Account password reset	identity-sensitive

Boundary enforcement (graduated)

Hard rules (money outflow, paid model calls, Jonah's personal accounts, WhatsApp recipient, single-publisher, agency-only pipeline) → enforced via PreToolUse hooks in settings.json (Phase 3) or already enforced in code (wa_send_guard.py).
Soft rules (instant-action default, capability-first, context-mode discipline, no cold outreach in survival mode) → advisory; Brian self-polices and the evening brief audits compliance.

Sub-agent access

Sub-agents (gsd-, codex, hermes-curator, etc.) get READ-ONLY* access to /root/.claude/system/.
Main Brian is the only writer. Single-writer invariant prevents fragmentation across parallel agents.

Where we are right now (Phase 1 shipped)

All 13 files seeded with the existing inventory (/opt/agent/data/capabilities.md legacy, 260502_brian_access_inventory.md, MEMORY.md theme files).
~90 capabilities documented in capabilities.md, each citing dependencies.
MEMORY.md CORE pointer wired.
hard_rule_self_knowledge_system.md feedback memory created.
Bloom decision log (id 701).
One real-world test passed: brightdata MCP root-cause + permanent fix flowed through the system (root cause recorded in access.md, healer cron added — see below).

Real-world test (today)

Brightdata MCP kept disconnecting. I checked access.md (status: DISCONNECTED), traced root cause to corrupt npx cache (ENOTEMPTY on half-renamed ajv dir), wiped cache, switched config to global install + direct node server.js launch (no more npx churn), wrote a healer cron /opt/agent/scripts/heal_npx_cache.sh for any future npx-MCP corruption, updated access.md row to active with root cause inline. Logged in bloom (id 717).

This is exactly the kind of recurrence-prevention the system is meant to enforce.

What's still pending

Phase 2 — Probes + autofix runners

Many probe cells reference scripts that don't yet exist (e.g. /opt/agent/scripts/probes/li_brian.py, fb_page.py, ig_brian.py).
Need a probe orchestrator that reads frontmatter, runs probes, writes status JSON, triggers autofix matrix on failure.
Cron for critical entries.

Phase 3 — Boundary hooks

Most boundaries are tagged hard but enforcement is currently (TODO: pre-tool-use hook) — file says it, hook doesn't enforce it.
Need to write PreToolUse hooks in settings.json for: paid model calls, Stripe writes, Jonah-personal-account writes, agency-bypass publishing.
wa_send_guard.py is already code-enforced; that's the model.

Phase 4 — Capabilities generator

Hybrid generation chosen (machine emits candidates from layers 1-2 cross-product, human curates which ship).
Script doesn't exist yet. Currently capabilities.md is hand-written.

Phase 5 — Sanitize + open-source

Ship target is "public skill / template" (Jonah's choice on the form).
Means we need a sanitizer that strips secrets/PII and makes the schema reusable.
Open-source name candidate: cap-protocol (pre-existing internal alias).

Known data debt (TBD cells in atoms)

Meta token expiry date and current value status
Stripe key status (active? probe never run)
cal.com API key location
X / Twitter handle (needs confirmation)
n8n install location
Buffer / Notion / Moltbook tokens
Several last_verified timestamps marked TBD

These are honest gaps, not bugs. They backfill on next contact with each surface.

Decisions locked (with senior-eye-relevant rationale)

These came from a structured 12-question form (f8d1e19897364181ab439c92fa4ee3a1):

Decision	Choice	Why
Storage location	`/root/.claude/system/` (alongside MEMORY.md)	always-near-identity; not in agent repo because identity layer crosses many repos
Schema	Hybrid (YAML frontmatter + markdown table)	machine-parseable for probes/generator; readable by humans
Capabilities generation	Hybrid (machine candidates → human curate)	machine cross-product over atoms is noisy; curation is judgment
Health-check	Hybrid (cron critical, lazy rest)	full cron is expensive and most entries don't change; critical ones must be live
Escalation channel	TG LOGS	we already route automated errors there; same channel = same eyeballs
Invocation trigger	Every non-trivial action	matches the existing `using-capabilities` skill trigger; trivial actions don't need it
Boundary enforcement	Graduated (hard hook-blocked, soft advisory)	full hook enforcement adds cognitive friction for behavioral rules; reserve hooks for value-loss surfaces
Auto-fix scope	Tokens / services / hosts / cookies / MCPs / cron	OAuth re-grants need browser, paid keys are money, password resets are identity-sensitive — those always ask
Ship target	Public skill / template	format is generic; could help others running their own agents
Per-entry probe	Required for atoms + capabilities, optional for rest	atoms are the trust root; if you can't probe, you don't have it
Discovery	MEMORY.md CORE pointer (always loaded) — TEST CYCLE	report effectiveness in daily brief, fall back to belt+suspenders if insufficient
Sub-agent access	Read-only	single-writer invariant

What we want from senior review

Layering soundness — does atoms → composites → abilities → governance hold up, or are we conflating things? Is there a missing layer? Is data.md actually an atom (storage=primitive) instead of a composite?
Capability composition format — is atom + composite + skill → outcome rich enough, or do we need a formal DSL? Should capabilities have preconditions / postconditions / cost annotations?
Single-writer invariant — sub-agents read-only is simple but possibly limiting. Is there a clean read-write protocol that scales without fragmenting? (Append-only journal that main Brian compacts?)
Health-check / auto-fix model — is hybrid the right call, or should we pay the cost of full continuous probing? What's the right way to express probes (shell one-liner vs proper plugin interface)?
Generator design (Phase 4) — what's the right algorithm for emitting capability candidates? Pure cross-product is noisy. Tagged dependencies? LLM-driven candidate emission with rule-based filtering?
Boundary enforcement — is graduated (hard/soft) sound, or should every boundary be hook-enforced? What's the right hook architecture for a self-modifying system (the agent can add new boundaries, but a frozen settings.json can't be amended live)?
Drift / staleness — what's the failure mode when the system is wrong (e.g. claims a capability exists but the underlying account was revoked)? Is "probe + cascade red" enough, or do we need a freshness budget per row?
Open-sourcing trade-offs — is this even a generic pattern, or is it overfit to Jonah's setup? What sanitizer would make it shippable?
Comparison points — is there prior art we should learn from? (LangChain agent capabilities, Manus' internal model, AGP / agent governance protocols, internal Anthropic agent registries — what's the closest neighbor?)
What would you cut? — if we had to ship Phase 2 next week with half the surface, what's the load-bearing 50%?

Appendices

A. Index file (live)

/root/.claude/system/README.md — full schema, autofix matrix, phase plan.

B. Companion hard rules (also locked today)

hard_rule_instant_action.md — "if I can do it now, I do it now"
hard_rule_jonah_is_last_resort.md — exhaustion order before saying "I can't"
hard_rule_self_knowledge_system.md — this whole architecture as a living rule

C. Operational context

Brian runs on Hetzner VPS (16 vCPU, 30 GB RAM), with a sibling Contabo node and Tailscale-tunneled access into Jonah's Mac (SSH / CDP / mic / screen-watcher).
~35 docker containers + ~11 systemd services support the agent, plus 13 PostgreSQL databases (CRM, calendar, n8n, etc.).
Daily LLM spend is hard-capped at $0 (paid keys disabled 2026-04-30 after a $41 leak); routing goes to free Gemini / Groq / OpenRouter free tier.
Jonah is in Lebanon (war-affected); revenue is gated and survival-mode rules apply (no cold outreach paused 2026-04-22; warm + inbound only).
Brian has TG channels (COMMS for Jonah, LOGS for errors, Hermes gateway), voice (Gemini Flash Live + Puck on port 8102), board.jonahtebaa.com for structured Q&A, and a daily evening brief.

D. Ground truth files for cross-reference

/root/.claude/system/README.md — index
/root/.claude/system/capabilities.md — ~90 composed capabilities
/root/.claude/system/access.md — Brightdata fix lives here as a row
/root/.claude/projects/-/memory/MEMORY.md — CORE memory with pointer line
/root/.claude/projects/-/memory/hard_rule_self_knowledge_system.md — feedback memory
bloom: ids 701 (system locked), 717 (brightdata fix)
form: /opt/agent/data/board_forms/f8d1e19897364181ab439c92fa4ee3a1.json

End of brief. Honest review welcome — the goal is for this to be load-bearing, not pretty.

Brian Self-Knowledge System — Review Brief for Senior Dev

Brian Self-Knowledge System — Review Brief for Senior Dev

TL;DR

Why this exists (the failure modes it targets)

Failure 1 — "I can't" when actually "I can if I compose"

Failure 2 — Recurring fixes that don't stick

Failure 3 — "Later / tomorrow / follow-up"

Failure 4 — Asking Jonah when the answer is in the toolbox

Failure 5 — Knowledge atrophy across sessions

Architecture

File layout (/root/.claude/system/)

Schema (per file)

Layering rule

Discovery (how the system gets loaded)

Update protocol (the system stays alive)

Health-check model

Auto-fix matrix (no-ask)

Always-ask matrix (escalate, don't auto-fix)

Boundary enforcement (graduated)

Sub-agent access

Where we are right now (Phase 1 shipped)

Real-world test (today)

What's still pending

Phase 2 — Probes + autofix runners

Phase 3 — Boundary hooks

Phase 4 — Capabilities generator

Phase 5 — Sanitize + open-source

Known data debt (TBD cells in atoms)

Decisions locked (with senior-eye-relevant rationale)

What we want from senior review

Appendices

A. Index file (live)

B. Companion hard rules (also locked today)

C. Operational context

D. Ground truth files for cross-reference

File layout (`/root/.claude/system/`)