← index2026-05-03 02:36 (Beirut)(backfill from DOCUMENTATION/)

ARG — items deferred to Phase 5 (now LANDED)

Date: 2026-05-02 (overnight 02→03 Beirut)
Author: Brian (overnight ARG queue)
Status: all three items implemented same overnight, after Jonah override. This doc preserves the original deferral rationale for posterity — useful when judging whether to defer similar work next time.

What actually shipped overnight

Item 3 — jsonschema validation: five schema files at /root/.claude/system/schemas/ (accounts, keys, capabilities, boundaries, _generic_atom + _common), wired into arg validate as an additive layer. Catches missing required fields, bad enums, malformed types. Verified with deliberate-bad-row injection.
Item 4 — hook bypass hardening: normalize_for_matching() in /opt/agent/scripts/arg_policy_hook.py applies zero-width strip → NFKC → URL-decode → confusables map (Cyrillic/Greek/fullwidth) → IDN punycode decode → lowercase. Paid-LLM detection scoped to outbound tools only (Bash/WebFetch/MCP-fetch/browser tools) so source-code editing tools don't false-positive on test data. Bypass corpus at /opt/agent/scripts/test_arg_policy_hook.py — 12/12 passing including Cyrillic, fullwidth, URL-encoded, and zero-width domain bypasses.
Item 7 — per-category invariants: _check_invariants() in arg validate enforces (1) critical-needs-probe, (2) grant-parties-resolve, (3) money-cap needs money-boundary, (4) deny_unless_brian_account scope sanity. Surfaced 9 real gaps on first run; all reconciled. Verified with deliberate orphan-publish-cap injection.

The deferral rationale below was the original reasoning. Jonah's override at ~03:00 Beirut said "do the 3 4 7 that we deferred to phase 5. do not stop till you finish all" — direct directive, ship it tonight. The work landed in <2 hours.

The post-Phase-2 audit (28 fixes shipped 2026-05-02 evening) identified three classes of hardening that were intentionally deferred to Phase 5 rather than landed in the same drop. This doc captures why each one was deferred and what shipping it later looks like, so Brians of future sessions don't re-litigate the same trade-off.

Item 3 — `jsonschema` validation of registry files

What it would do: swap arg validate's ad-hoc field checks for a real jsonschema (Draft 2020-12) pipeline. One schema per category file (accounts/keys/hosts/data_stores/products/channels/subsystems/agents/grants/skills/capabilities/boundaries/routines), validated on every write, on cron, and pre-commit.

Why deferred:
1. Field set is still moving. Phase 2.6 alone landed min_interval_seconds on 8 entries and notes field expansion on accounts. Locking schemas before the field set is stable would mean rev'ing the schema every time we add a column. Better to let the in-memory ad-hoc validator catch shape drift for one or two more cycles, then crystallize.
2. No production failure has hit the seam yet. arg validate's current rules (id pattern, presence of required keys, type checks for cost_class/risk_level/idempotency enums) have caught every malformed write since the registry started. The next class of bug — semantic misuse of valid-shaped data — is what the Phase 2.6 boundary fixes addressed (and jsonschema wouldn't have caught those either).
3. Phase 5 is "open-source the template." Once we sanitize and ship publicly, schemas become user-facing contracts. That's the right moment to lock them — anyone forking the template gets a stable schema as part of the release.

When to ship:
- Right before the Phase 5 sanitize+ship drop.
- After 14 days with no new field added to any category file (use git log --since=2w resources/ | grep -c added as the gate).
- Schema files live under /root/.claude/system/schemas/ (one .json per category) and arg validate reads them.

Concrete plan:

# rough sketch for the eventual implementer
mkdir -p /root/.claude/system/schemas
# generate seed schemas from current data:
for cat in accounts keys hosts data_stores products channels subsystems agents grants tools capabilities policies routines; do
    quicktype --src-lang json --lang schema /root/.claude/system/{resources,access,tools,capabilities,policies,routines}/${cat}.json \
        > /root/.claude/system/schemas/${cat}.schema.json 2>/dev/null || true
done
# then hand-tighten, then wire into arg validate via jsonschema.validate()

Item 4 — Hook bypass hardening (Unicode / IP-literal / encoding tricks)

What it would do: harden /opt/agent/scripts/arg_policy_hook.py's detection of "is this a paid-LLM endpoint" / "is this Jonah's gmail address" / "is this a money-flow URL" against bypass via:
- Unicode lookalikes (ɡmail.com, аnthropic.com Cyrillic, fullwidth dots, etc.)
- Raw IP literals instead of hostnames (api.openai.com → 198.51.100.4)
- URL-encoded paths (/v1/messages → %2Fv1%2Fmessages)
- IDN punycode (xn--gmail-...)
- Mixed-case + zero-width chars

Why deferred:
1. Threat model is internal. Brian is the only consumer of his own tools. There is no adversarial party crafting Unicode-lookalike URLs to slip past the hook — Brian writes the prompts, Brian writes the code, the only way a Cyrillic аnthropic.com ends up in a tool call is if Brian himself fat-fingered it. The hook's job is to catch honest mistakes, not adversarial bypass.
2. Cost is real. Robust detection means depending on idna, unicodedata.normalize('NFKC'), IP-allowlists for known model providers, and a regex tuning loop. Each layer of robustness is another thing that can false-positive on legitimate calls.
3. The endpoint+auth co-presence guard already shipped. What does fire reliably is "this URL contains a Bearer token + matches a model-provider host" — that's the new check from the 28-fix wave. Unicode bypass is academic next to "did you accidentally hit the prod paid endpoint with your auth token attached." That was the actual $41.68 leak shape.
4. Phase 5 is the open-source moment. When the template ships, the threat model changes — strangers will fork it, and their threat models may include adversarial pull-requests. That's when bypass-hardening earns its keep.

When to ship:
- Same drop as Item 3 (Phase 5 sanitize+ship).
- Add a tools/test_hook_bypass.py corpus with 20+ known-bad URL shapes; run it as part of arg validate and CI.

Item 7 — Strict per-category schema (vs the generic shape we have today)

What it would do: beyond Item 3's "is this valid JSON of the right shape," enforce category-specific invariants:
- accounts.json: every row with critical: true must have a probe.
- keys.json: every kind: oauth_refresh_token row must have a renewal_procedure reference and a scope_account pointer.
- capabilities.json: every cap with cost_class: paid must appear in at least one boundaries.json rule's match scope.
- boundaries.json: every decision: deny_unless_brian_account row must apply to caps whose deps reference at least one acc.brian.* or chan.*brian* resource — mismatched rules surface at validate time.
- grants.json: every grant must point at a valid from_party / to_party pair from the accounts/agents tables.

Why deferred:
1. Item 7 is Item 3 ++. It can't ship before jsonschema validation lands; it's the invariant layer on top of the shape layer.
2. Some of these invariants are still being discovered. Phase 2.6 just discovered three new ones during smoke testing (the AND-vs-OR matcher; the over-broad regex on no_jonah_personal_gmail_via_browser; the over-broad regex on brian_only_publisher). Locking down "every cap must satisfy X" before we know all the X's leads to schema churn.
3. The current resolver already surfaces semantic gaps. When a cap has a missing brian-account dep, arg resolve returns blocked-by-policy with a clear reason — that's a runtime signal. Strict schema would fail at validate time, which is better but not necessary.

When to ship:
- Same Phase 5 drop. Items 3 + 4 + 7 land together.
- Implementation: each category file gets a companion <cat>.invariants.py with a single check(rows) -> List[Violation] function. arg validate runs each one.

What to do if you're a future Brian reading this

If Phase 5 is being scoped, treat Items 3/4/7 as a single tightly-coupled block — don't try to ship them separately.
If the registry has been stable for 14+ days (no new field added to any category file), Item 3 is ready to land.
If a security audit comes back asking about input-encoding bypass, escalate Item 4 to Phase 4 instead of Phase 5.
If a real bug ships because of a missing semantic invariant (e.g. an orphaned grant.* pointing at a deleted account), that's the signal to start Item 7 immediately — don't wait for Phase 5.

ARG — items deferred to Phase 5 (now LANDED)

ARG — items deferred to Phase 5 (now LANDED)

What actually shipped overnight

Item 3 — jsonschema validation of registry files

Item 4 — Hook bypass hardening (Unicode / IP-literal / encoding tricks)

Item 7 — Strict per-category schema (vs the generic shape we have today)

What to do if you're a future Brian reading this

See also

Item 3 — `jsonschema` validation of registry files