| Phase | Date | What |
|---|---|---|
| 1 | 2026-05-02 morning | Skeleton + seed JSON files |
| 2 | 2026-05-02 evening | Structured registry + probes + resolver + journal + inbox + hook + belt+suspenders discovery |
| 2.5 | 2026-05-02 evening (overnight) | 28 audit fixes; ARG cron live (probe / autofix / heal-npx / journal-rotate); flap detection; endpoint+auth co-presence in policy hook; sanitizer idempotence-test mode |
| 2.6 | 2026-05-02 → 03 overnight | Smoke-test bug-fix wave (3 boundary semantics fixes); 23 new probes wired; Phase C atom audit; cleanup |
| 5-Item-3 | 2026-05-03 ~03:00 Beirut | jsonschema validation wired into arg validate (5 schema files + common refs) |
| 5-Item-4 | 2026-05-03 ~03:15 Beirut | Hook bypass hardening (NFKC + confusables + URL-decode + IDN + zero-width); outbound-tool scoping; 12-case bypass corpus |
| 5-Item-7 | 2026-05-03 ~03:30 Beirut | Per-category invariants (critical-needs-probe, grant-parties-resolve, money-cap-needs-money-policy, deny_unless_brian_account scope sanity) |
| Docs | 2026-05-03 ~03:40 Beirut | This documentation set written to DOCUMENTATION/ARG/ |

Phases 1, 2, 2.5, and 2.6 have shipped, along with Phase 5 items 3, 4, and 7. Smoke tests pass. Status was 51/0/6/210 at handoff start and 71/0/13/183 after all overnight work; the 7 additional reds are genuine gaps surfaced by probes (Meta rate limit, Reddit credentials gap, Mac CDP transient).
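
To make the item 4 hardening concrete, here is a minimal sketch of the kind of canonicalization a policy hook could apply before matching rules. It is not the shipped hook: the function names, the order of the steps, the decode-round limit, and the tiny confusables table are all assumptions made for illustration.

```python
import unicodedata
from urllib.parse import unquote

# Zero-width / invisible characters often used to split a blocked keyword.
ZERO_WIDTH = dict.fromkeys(map(ord, "\u200b\u200c\u200d\u2060\ufeff"))

# Illustrative confusables table (Cyrillic look-alikes -> ASCII).
# Assumption: a real table would be much larger.
CONFUSABLES = str.maketrans({
    "\u0430": "a",  # Cyrillic a
    "\u0435": "e",  # Cyrillic ie
    "\u043e": "o",  # Cyrillic o
    "\u0440": "p",  # Cyrillic er
    "\u0441": "c",  # Cyrillic es
    "\u0456": "i",  # Cyrillic Byelorussian-Ukrainian i
})


def canonicalize(text: str, max_decode_rounds: int = 3) -> str:
    """Reduce text to a canonical form before rule matching (hypothetical helper)."""
    # URL-decode repeatedly so double-encoded input (%2561 -> %61 -> a) is caught.
    for _ in range(max_decode_rounds):
        decoded = unquote(text)
        if decoded == text:
            break
        text = decoded
    text = unicodedata.normalize("NFKC", text)  # fold fullwidth and compatibility forms
    text = text.translate(ZERO_WIDTH)           # drop zero-width characters
    text = text.translate(CONFUSABLES)          # map look-alikes to ASCII
    return text.casefold()


def canonical_host(host: str) -> str:
    """Canonicalize a hostname, decoding punycode (IDN) labels to Unicode first."""
    host = canonicalize(host)
    try:
        host = host.encode("ascii").decode("idna")
    except UnicodeError:
        return host  # already non-ASCII, or not valid punycode
    # Re-run so look-alikes hidden inside punycode labels are folded too.
    return canonicalize(host)
```

Under these assumptions, rules would be matched against `canonicalize(...)` or `canonical_host(...)` output rather than raw input, so that each entry in a bypass corpus reduces to the same string as its plain-ASCII equivalent.
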
Goals:
- Expand boundary hook coverage (more rules with hard enforcement).
- Make autofix runners idempotency-aware (currently a flat list; they could be grouped per class with retry-window heuristics).
- Add an arg remove subcommand for clean row deletion (today this requires a manual JSON edit).
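
One possible shape for the per-class autofix idea is sketched below. The failure-class names, the window lengths, the attempt limits, and the `should_run` helper are all hypothetical; the only point is that a non-idempotent fix is never retried automatically, while idempotent ones are throttled by a per-class window.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class RetryPolicy:
    idempotent: bool    # safe to re-run without side effects piling up?
    window: timedelta   # minimum gap between attempts of the same fix
    max_attempts: int   # attempts allowed before a human has to look


# Hypothetical failure classes and values, for illustration only.
POLICIES = {
    "transient": RetryPolicy(idempotent=True, window=timedelta(minutes=5), max_attempts=5),
    "rate_limit": RetryPolicy(idempotent=True, window=timedelta(hours=1), max_attempts=3),
    "rotate_credentials": RetryPolicy(idempotent=False, window=timedelta(days=1), max_attempts=1),
}


def should_run(failure_class: str, attempts: list, now: datetime) -> bool:
    """Decide whether an autofix may run.

    attempts holds the datetimes of earlier attempts for this capability/fix pair.
    """
    policy = POLICIES.get(failure_class)
    if policy is None:
        return False  # unknown class: do nothing rather than guess
    if attempts and not policy.idempotent:
        return False  # a non-idempotent fix is never retried automatically
    if len(attempts) >= policy.max_attempts:
        return False
    # First attempt, or enough time has passed since the most recent one.
    return not attempts or now - max(attempts) >= policy.window
```

A runner built this way would call `should_run` before each fix and append to the attempt list afterwards, which keeps the retry decision separate from the fix itself.
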
Goals:
- Miner is wired but conservative. Move it from "drop into inbox with approval_required=True" to "auto-promote when N+ successful resolves, K+ days steady, no policy_block events."
- Reverse-direction miner: surface caps in registry that have NEVER resolved successfully (dead caps).
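
The two miner goals above can be read as simple predicates over per-capability statistics. The sketch below assumes a hypothetical `CapStats` record and default thresholds for N and K; it also interprets "K+ days steady" as days since the first successful resolve, which is one reading among several and is not confirmed by the current miner.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class CapStats:
    """Hypothetical per-capability counters a miner might derive from the journal."""
    successful_resolves: int
    first_success: Optional[date]   # None if the capability has never resolved
    policy_block_events: int


def should_auto_promote(stats: CapStats, today: date,
                        min_resolves: int = 5, min_days: int = 3) -> bool:
    """Auto-promote only with N+ resolves, K+ steady days, and zero policy blocks."""
    if stats.first_success is None:
        return False
    steady_days = (today - stats.first_success).days
    return (
        stats.successful_resolves >= min_resolves
        and steady_days >= min_days
        and stats.policy_block_events == 0
    )


def is_dead_cap(stats: CapStats) -> bool:
    """Reverse direction: a registry entry that has never resolved successfully."""
    return stats.successful_resolves == 0
```

Anything that fails `should_auto_promote` would stay on the existing path (inbox entry with approval_required=True), so the predicate only ever widens what is promoted without review.
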
The big one. Items 3 (jsonschema) + 4 (bypass-hardening) + 7 (invariants) already landed; remaining:
- Round-table review (in flight as of this doc — see ROUND_TABLE_REVIEW.md once complete).
- Pick license (MIT confirmed by Jonah 2026-05-03).
- Write a 1-page README aimed at outsiders.
- Push sanitized tree to a public repo.
- Optional: a 5-min Loom-style walkthrough video.
Sanitizer is idempotent. Schemas exist. Threat model documented. Most of the work for Phase 5 is presentation, not code.
| Item | Why not now |
|---|---|
| IP-literal allowlist for paid providers | Threat model doesn't justify the maintenance burden of keeping IP lists fresh. |
| Adversarial DNS denylist (*.anthropic-mirror.example) | Same reason: internal threat model only. |
| Multi-writer support | Would require locking + transactions; single-writer invariant is a feature, not a limitation, at current scale. |
| Vector-search over events journal | Current arg events grep is fast (gzip-aware). At >100k events/day, revisit. |
| Web UI | A markdown view + CLI is fine for one-agent scale. |
| ARG-as-MCP-server | Would let other Claude Code projects share the registry. Defer until at least 2 projects need it. |

ARG is "done" when:
- ✅ Brian has a deterministic answer to "can I do X?" before attempting X.
- ✅ Brian's hard rules are enforced at runtime, not just documented.
- ✅ The registry survives a fresh session start without re-explanation.
- ✅ Sub-agents can propose changes without breaking the single-writer invariant.
- ✅ Probes verify reality on a cadence; flap detection prevents storms.
- ✅ Bypass attempts via Unicode tricks fail.
- ⏳ The system can be sanitized and shipped as a template another agent could fork.
The last criterion is what Phase 5 closes.