Date: 2026-05-03 ~03:50 Beirut
Reviewers dispatched: codex (OpenAI Codex CLI), hermes (Nous), gemini (Google), jules, antigravity
Reviewers usable: codex (full review), hermes (partial; some hallucinations)
Reviewers failed: gemini (workspace sandbox blocked file access), jules (CLI invocation mismatch — fed prompt as command), antigravity (refused root with no sandbox flag)
Raw outputs: /tmp/arg_review/{codex,hermes,gemini,jules,antigravity}.md
Codex: major-rework-needed before OSS publication.
Hermes: fix-P1-then-ship.
Brian's read on the consolidated set: fix-P0-and-key-P1-then-ship. Codex's P0 list is real and credible — every finding has a checkable file:line. Most are not deep architectural problems, just gaps that didn't surface in private use because there's only one operator. Estimated remediation: 2-3 hours.
P0 #1: sanitizer expansion
Where: arg_sanitize.py:67, resources/keys.json:103, system/README.md:168706014602602403
Issue: README form IDs (f8d1e19897364181ab439c92fa4ee3a1, 05c47416a40c474b8914d970f33c3c06) and other long opaque IDs survive into public output.

P0 #2: sanitize the docs
Where: arg_sanitize.py:31, this entire DOCUMENTATION/ARG/ folder, system/README.md
Issue: The 11 doc files we just wrote contain hostnames, private paths (/root/..., /opt/agent/...), names ("Brian", "Jonah"), a Meta app ID, IPs (89.167.17.183, 173.249.30.224), and infrastructure details.

P0 #3: fail-closed policy hook
Where: arg_policy_hook.py:117 (load_rules())
Issue: load_rules() returns [] on any read/parse error, and evaluate() then allows everything. A malformed boundaries.json silently disables runtime enforcement, exactly the opposite of what the doc claims.

P0 #4: actor check in CLI mutators
Where: bin/arg:975 (cmd_add), arg_policy_hook.py:239, 03_cli_reference.md:21
Issue: CLI mutators such as arg add don't check actor identity. The hook only blocks Write/Edit/NotebookEdit, so a sub-agent can run Bash → arg add and mutate the registry.
Fix: Check CLAUDE_ARG_MAIN=1 (or equivalent) inside every mutating CLI subcommand. The hook should additionally block Bash invocations of arg add|inbox accept|inbox reject from non-main actors.

P0 #5: probe-all freshness + min_interval_seconds (codex)
Where: bin/arg:364, 367, 253, 03_cli_reference.md:28
Issue: probe-all --critical selects every critical row, not only freshness-budget-expired rows. It passes only {"probe": probe} into run_probe(), so the rate-limit cache (min_interval_seconds) can't see last_probe. Rate-limited LI/Meta probes get re-run every cron cycle, which is exactly what tripped Meta's app limit during overnight Phase 2.6 testing.
Fix: Pass last_probe into the run_probe snapshot. Either fully wire the probe_status.json cache or remove the documentation claim.

P0 #6: money-capable MCP detection
Where: arg_policy_hook.py:181, 190, grants.json:235 (grant.stripe_mcp)
Issue: matches_money() only runs for tools in OUTBOUND_TOOLS (Bash/WebFetch/MCP-fetch/browser). Stripe/payment MCP tool names like mcp__stripe__create_payment_intent aren't in that set, so a future Stripe MCP charge wouldn't hit the boundary.
Fix: Detect money-capable MCP tool names (mcp__stripe__*, mcp__plugin_stripe_stripe__*), or replace HTTP-only outbound scoping with a generic tool-name + payload matcher.

P1 #7: generic boundary matching OR doc revision
Where: arg_policy_hook.py:214
Issue: Adding a severity: hard, decision: deny rule to JSON doesn't make it runtime-enforced. The hook ignores it unless code is updated.
Fix: Implement generic boundary matching on match/decision clauses, or document explicitly that only named built-in hook rules are enforced (and which ones).

P1 #8: probe allowlist hardening
Where: bin/arg:216, 233, 269
Issue: _probe_command_allowed() accepts any command starting with an allowed prefix (test, echo, /opt/agent/scripts/probes/, etc.) and then runs it via bash -c. A probe command can start with an allowed prefix and append arbitrary shell. Hermes also flagged a generic shell-injection concern.
Fix: Harden the probe allowlist (/opt/agent/scripts/probes/ with no spaces/operators).

P1 #9: miner event-shape fix
Where: bin/arg:156, 510, arg_capability_miner.py:68
Issue: arg resolve writes state under the standardized event extra field, but the miner checks top-level ev["state"]. The miner silently misses every current capability_resolved event, so Phase 4 is effectively broken.
Fix: Have the miner read extra.state. Align doc examples with the actual event shape.

P1 #10: gzip-aware grep + mine
Where: bin/arg:626, arg_capability_miner.py:42, 07_observability.md:27
Issue: Grep and mining only read *.ndjson. After 7-day rotation, old events vanish from grep/mining.
Fix: Match both *.ndjson and *.ndjson.gz. Open gz files with gzip.open().

P1 #11: risk_at_least uses equality (codex)
Where: bin/arg:558, schemas/boundaries.schema.json:24
Issue: risk_at_least: high won't match a critical cap.
Fix: Use an ordinal scale (low<medium<high<critical) and compare with >=.

P1 #12: IDN normalization order + test case
Where: arg_policy_hook.py:82, 84, test_arg_policy_hook.py

P1 #13: sanitizer expansion
Where: arg_sanitize.py:32, 139
Issue: The sanitizer output contains no bin/arg, no schemas, no hook scripts, no probe stubs. QUICKSTART tells users to "install arg from /bin/arg in the upstream repo", but the output tree doesn't contain it.
Fix: Either ship the runtime (bin/arg, schemas/, sanitized hook + probe stubs) or relabel sanitizer output as "registry data only" and adjust QUICKSTART.

P2 #14
Where: bin/arg:791
Issue: If jsonschema or the referencing Python package is missing, validation returns no errors and continues as if the schema layer passed. This contradicts the doc claim that jsonschema is one of three validation layers.
Fix: Require an explicit --no-jsonschema opt-out flag.

P2 #15
Issue: Hardcoded /opt/agent/... and /root/.claude/... paths everywhere. Forks on macOS / non-Linux distros need to globally find-and-replace.
Fix: Use os.environ.get("ARG_HOME", "/root/.claude/system") and os.environ.get("AGENT_ROOT", "/opt/agent") with sensible defaults, documented in QUICKSTART.

P2 #16
Where: schemas/*.schema.json
Issue: schema_version: "1.0" exists in JSON envelopes, but nothing enforces compatibility across version bumps. If a Phase 5+ rev breaks shape, there is no migration path.
Fix: Add a check to arg validate that warns when schema_version doesn't match the schema's expected value.

Hermes cited "/root/.claude/system/logs/agent.log"; that path doesn't exist and was hallucinated. The real concern (the event journal could leak in the OSS template) is covered by P0 #2.

arg remove subcommand: already documented as Phase 5++ deferred.

Order, with rough effort:

| Order | Item | Effort | Why first |
|---|---|---|---|
| 1 | P0 #3 — fail-closed policy hook | 15 min | bug, easy fix, biggest safety win |
| 2 | P0 #5 — probe-all freshness + min_interval cache | 30 min | prevents future probe-storm reds |
| 3 | P0 #4 — actor check in CLI mutators | 20 min | closes sub-agent escape hatch |
| 4 | P0 #6 — money-capable MCP detection | 20 min | future-proofs Stripe MCP wire-up |
| 5 | P1 #9 — miner event-shape fix | 15 min | Phase 4 currently broken; tiny fix |
| 6 | P1 #10 — gzip-aware grep + mine | 15 min | doc claim mismatch |
| 7 | P1 #11 — risk_at_least ordinal | 10 min | semantic correctness |
| 8 | P1 #12 — IDN normalization order + test case | 20 min | bypass corpus completeness |
| 9 | P1 #8 — probe allowlist hardening | 30 min | shell-injection surface narrows |
| 10 | P0 #1 + P1 #13 — sanitizer expansion | 45 min | the actual ship-blocker |
| 11 | P0 #2 — sanitize the docs (or write fresh OSS docs) | 60 min | last step before push |
| 12 | P1 #7 — generic boundary matching OR doc revision | 45 min | clarity |
| 13 | P2 #14, #15, #16 — polish | 30 min | optional |
Total: ~6 hours to ship-ready state. Most items are 15-30 minutes each because the codebase is small and the bugs are surgical.
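
The first item in the order above (P0 #3) is small enough to sketch. The following is a minimal illustration of a fail-closed loader, not the actual arg_policy_hook.py code: the {"rules": [...]} file shape, the "tools" and "decision" keys, the DENY_ALL sentinel, and both function signatures are assumptions made for the example.

```python
import json
from pathlib import Path

# Hypothetical sentinel: a single rule that denies every tool.
DENY_ALL = [{"id": "fail_closed", "decision": "deny", "reason": "boundaries unreadable"}]


def load_rules(path):
    """Return the rule list, or a deny-all rule set if the file cannot be trusted."""
    try:
        data = json.loads(Path(path).read_text(encoding="utf-8"))
        rules = data["rules"]  # assumed envelope shape: {"rules": [...]}
        if not isinstance(rules, list):
            raise ValueError("'rules' must be a list")
        return rules
    except (OSError, ValueError, KeyError, TypeError):
        # json.JSONDecodeError subclasses ValueError, so malformed JSON lands here.
        # Fail closed: an unreadable file must not turn into "no rules".
        return list(DENY_ALL)


def evaluate(rules, tool_name):
    """Return "deny" if any deny rule covers the tool, otherwise "allow"."""
    for rule in rules:
        if rule.get("decision") != "deny":
            continue
        tools = rule.get("tools")  # a missing "tools" key means "every tool"
        if tools is None or tool_name in tools:
            return "deny"
    return "allow"
```

With this shape, a missing or malformed boundaries file denies every call instead of allowing everything, so a broken config is noticed immediately rather than silently disabling enforcement.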
Do NOT sanitize+ship yet. Fix the 6 P0s + the top 4 P1s (#9, #10, #11, #12 — they're cheap and meaningful). The remaining P1s (#7, #8, #13) and the P2/NIT items can either land in a follow-up patch or be noted as "known limitations" in the OSS README.
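
Two of the cheap P1s named above (#10 and #11) can be sketched in a few lines. This is an illustration under assumptions, not the code in bin/arg: the function names, the log directory layout, and the one-JSON-object-per-line event format are hypothetical; only the low<medium<high<critical ordering, the *.ndjson / *.ndjson.gz patterns, and gzip.open() come from the findings.

```python
import glob
import gzip
import json
import os

# Ordinal scale from the P1 #11 fix: low < medium < high < critical.
RISK_ORDER = {"low": 0, "medium": 1, "high": 2, "critical": 3}


def risk_at_least(cap_risk, threshold):
    """True when cap_risk is at or above threshold on the ordinal scale.

    Unknown labels raise KeyError rather than being silently treated as low.
    """
    return RISK_ORDER[cap_risk] >= RISK_ORDER[threshold]


def iter_events(log_dir):
    """Yield events from live *.ndjson files and rotated *.ndjson.gz files."""
    paths = sorted(
        glob.glob(os.path.join(log_dir, "*.ndjson"))
        + glob.glob(os.path.join(log_dir, "*.ndjson.gz"))
    )
    for path in paths:
        opener = gzip.open if path.endswith(".gz") else open
        # "rt" opens both plain and gzip-compressed files in text mode.
        with opener(path, "rt", encoding="utf-8") as handle:
            for line in handle:
                line = line.strip()
                if line:  # skip blank lines
                    yield json.loads(line)
```

Under this sketch, risk_at_least("critical", "high") is true where an equality check would be false, and events keep showing up in grep/mining after rotation compresses them.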
Round-table consensus: ARG's design is sound. The implementation has cracks that only appeared because the system was built fast and tested by one operator. Surfacing them BEFORE shipping is exactly why we ran the round-table.