2026-05-03 06:54 (Beirut) (backfill from DOCUMENTATION/)

Round-Table Review — ARG, pre-sanitize

Date: 2026-05-03 ~03:50 Beirut
Reviewers dispatched: codex (OpenAI Codex CLI), hermes (Nous), gemini (Google), jules, antigravity
Reviewers usable: codex (full review), hermes (partial; some hallucinations)
Reviewers failed: gemini (workspace sandbox blocked file access), jules (CLI invocation mismatch — fed prompt as command), antigravity (refused root with no sandbox flag)
Raw outputs: /tmp/arg_review/{codex,hermes,gemini,jules,antigravity}.md

Overall verdict

Codex: major-rework-needed before OSS publication.
Hermes: fix-P1-then-ship.
Brian's read on the consolidated set: fix-P0-and-key-P1-then-ship. Codex's P0 list is real and credible: every finding has a checkable file:line. Most are not deep architectural problems, just gaps that didn't surface in private use because there's only one operator. Estimated remediation: about 6 hours for the full list, per the itemized remediation plan.

P0 — must fix before sanitize+ship

1. Sanitizer misses account/opaque IDs (codex)

Where: arg_sanitize.py:67, resources/keys.json:103, system/README.md:168
Issue: Sanitizer redacts only two specific numeric IDs (Brian's FB page + LinkedIn member). The Meta app ID 706014602602403, README form IDs (f8d1e19897364181ab439c92fa4ee3a1, 05c47416a40c474b8914d970f33c3c06), and other long opaque IDs survive into public output.
Fix: Add a generic pass for known account/app/form-ID fields and long opaque numeric/hex IDs. Run a leak-denylist scan after sanitization.
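A minimal sketch of the generic pass and the post-sanitize scan, not the actual arg_sanitize.py code. The thresholds (12+ digit runs, 32-character hex strings), the placeholder tokens, and the function names are assumptions chosen for illustration; the sample IDs in the usage note are dummies.

```python
import re

# Assumed heuristics: long digit runs and 32-char hex strings are opaque IDs.
LONG_NUMERIC = re.compile(r"\b\d{12,}\b")
LONG_HEX = re.compile(r"\b[0-9a-fA-F]{32}\b")


def redact_opaque_ids(text: str) -> str:
    """Replace long hex and numeric identifiers with placeholder tokens."""
    # Hex first, so an all-digit 32-char string is labeled as a hex ID.
    text = LONG_HEX.sub("<REDACTED_HEX_ID>", text)
    return LONG_NUMERIC.sub("<REDACTED_NUMERIC_ID>", text)


def leak_scan(text: str, denylist: list[str]) -> list[str]:
    """Return every denylisted literal still present after sanitization."""
    return [needle for needle in denylist if needle in text]
```

Usage: run `redact_opaque_ids()` over each output file, then call `leak_scan()` with the known-sensitive literals and fail the release if it returns anything. Short IDs fall below the assumed thresholds, which is why the denylist scan is kept as a second line of defense.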

2. ARG docs are outside the sanitizer (codex)

Where: arg_sanitize.py:31, this entire DOCUMENTATION/ARG/ folder
Issue: Sanitizer only processes registry subdirs + system/README.md. The 11 doc files we just wrote contain hostnames, private paths (/root/..., /opt/agent/...), names ("Brian", "Jonah"), Meta app ID, IPs (89.167.17.183, 173.249.30.224), and infrastructure details.
Fix: Either include the docs in sanitizer input or explicitly exclude them from the OSS release package and replace with sanitized-from-scratch versions.

3. Policy hook fails open if boundaries can't load (codex)

Where: arg_policy_hook.py:117 (load_rules())
Issue: load_rules() returns [] on any read/parse error. evaluate() then allows everything. A malformed boundaries.json silently disables runtime enforcement — exactly the opposite of what the doc claims.
Fix: Fail closed for outbound/dangerous tools when policy load fails. Emit a hook-error event to TG LOGS so the operator notices.
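One way the fail-closed behavior could look, as a sketch under assumptions rather than the hook's real code: `PolicyLoadError`, `PROTECTED_TOOLS`, and the `(decision, reason)` return shape are hypothetical, and rule matching itself is elided.

```python
import json


class PolicyLoadError(Exception):
    """Raised when the boundaries file cannot be read or parsed."""


# Hypothetical stand-in for the hook's outbound/dangerous tool set.
PROTECTED_TOOLS = {"Bash", "WebFetch"}


def load_rules(path: str):
    """Load boundaries, raising instead of returning [] on failure."""
    try:
        with open(path, encoding="utf-8") as fh:
            return json.load(fh)
    except (OSError, json.JSONDecodeError) as exc:
        raise PolicyLoadError(f"cannot load {path}: {exc}") from exc


def evaluate(tool_name: str, path: str) -> tuple[str, str]:
    """Deny protected tools when policy is unavailable; matching is elided."""
    try:
        rules = load_rules(path)
    except PolicyLoadError as exc:
        if tool_name in PROTECTED_TOOLS:
            return ("deny", f"policy unavailable: {exc}")
        return ("allow", "policy unavailable; tool is not protected")
    return ("allow", f"{len(rules)} rules loaded")
```

The deny reason is the natural payload for the hook-error event, so the operator sees why outbound tools stopped working.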

4. Sub-agents can bypass read-only ARG via CLI (codex)

Where: bin/arg:975 (cmd_add), arg_policy_hook.py:239, 03_cli_reference.md:21
Issue: Docs claim sub-agents are denied arg add, but the CLI mutators don't check actor identity. The hook only blocks Write/Edit/NotebookEdit — a sub-agent can run Bash → arg add and mutate the registry.
Fix: Enforce CLAUDE_ARG_MAIN=1 (or equivalent) inside every mutating CLI subcommand. Hook should additionally block Bash invocations of arg add|inbox accept|inbox reject from non-main actors.
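A sketch of both halves of this fix. `require_main_actor`, `NotMainActor`, and `bash_mutates_registry` are hypothetical names, and the `cmd_add` body is a stub; only the `CLAUDE_ARG_MAIN=1` check and the three blocked subcommands come from the finding.

```python
import os
import re


class NotMainActor(PermissionError):
    """Raised when a non-main actor calls a mutating subcommand."""


def require_main_actor(environ=None) -> None:
    """Guard to call at the top of every mutating CLI subcommand."""
    env = os.environ if environ is None else environ
    if env.get("CLAUDE_ARG_MAIN") != "1":
        raise NotMainActor("registry mutation requires CLAUDE_ARG_MAIN=1")


def cmd_add(args, environ=None):
    require_main_actor(environ)
    return f"added {args}"  # stub: the real mutation is elided


# Hook side: detect Bash commands that invoke a mutating arg subcommand.
MUTATING_ARG = re.compile(r"\barg\s+(?:add|inbox\s+(?:accept|reject))\b")


def bash_mutates_registry(command: str) -> bool:
    return MUTATING_ARG.search(command) is not None
```

An environment variable can be set by whoever launches the command, so the CLI check is only as strong as the hook check beside it; that is presumably why the fix asks for both.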

5. Critical probe cron ignores freshness AND `min_interval_seconds` (codex)

Where: bin/arg:364, 367, 253, 03_cli_reference.md:28
Issue: probe-all --critical selects every critical row, not only freshness-budget-expired rows. It passes only {"probe": probe} into run_probe(), so the rate-limit cache (min_interval_seconds) can't see last_probe. Rate-limited LI/Meta probes get re-run every cron cycle — exactly what tripped Meta's app limit during overnight Phase 2.6 testing.
Fix: Filter targets by freshness before probing. Pass last_probe into the run_probe snapshot. Either fully wire probe_status.json cache or remove the documentation claim.

6. Money hook misses money-capable MCP tools (codex)

Where: arg_policy_hook.py:181, 190, grants.json:235 (grant.stripe_mcp)
Issue: matches_money() only runs for tools in OUTBOUND_TOOLS (Bash/WebFetch/MCP-fetch/browser). Stripe/payment MCP tool names like mcp__stripe__create_payment_intent aren't in that set — a future Stripe MCP charge wouldn't hit the boundary.
Fix: Either treat money-capable MCP/plugin tools as protected (mcp__stripe__*, mcp__plugin_stripe_stripe__*), or replace HTTP-only outbound scoping with a generic tool-name + payload matcher.
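A sketch of the first option, using glob patterns on the tool name. The two patterns are the ones proposed in the fix; `OUTBOUND_TOOLS` here is an abbreviated stand-in and `needs_money_check` is a hypothetical helper name.

```python
from fnmatch import fnmatchcase

# Patterns taken from the proposed fix; extend per payment provider.
MONEY_TOOL_PATTERNS = ("mcp__stripe__*", "mcp__plugin_stripe_stripe__*")

# Abbreviated stand-in for the hook's existing outbound set.
OUTBOUND_TOOLS = {"Bash", "WebFetch"}


def is_money_capable(tool_name: str) -> bool:
    """True if the tool name matches a known money-capable pattern."""
    return any(fnmatchcase(tool_name, pat) for pat in MONEY_TOOL_PATTERNS)


def needs_money_check(tool_name: str) -> bool:
    """Decide whether matches_money() should run for this tool call."""
    return tool_name in OUTBOUND_TOOLS or is_money_capable(tool_name)
```

`fnmatchcase` is used instead of `fnmatch` so matching stays case-sensitive on every platform.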

P1 — should fix before ship

7. Hook is not actually data-driven (codex)

Where: arg_policy_hook.py:214
Issue: Hook loops over boundary rows but enforces only hard-coded rule IDs. Adding a new severity: hard, decision: deny rule to JSON doesn't make it runtime-enforced — the hook ignores it unless code is updated.
Fix: Implement generic matching for supported match/decision clauses. OR document explicitly that only named built-in hook rules are enforced (and which ones).

8. Probe command allowlist is shell-prefix based (codex + hermes)

Where: bin/arg:216, 233, 269
Issue: _probe_command_allowed() accepts any command starting with an allowed prefix (test, echo, /opt/agent/scripts/probes/, etc.) and then runs it via bash -c. A probe command can therefore start with an allowed prefix and append arbitrary shell. Hermes also flagged a generic shell-injection concern.
Fix: Store probes as argv arrays. Or restrict execution to wrapper scripts in a trusted directory (e.g. only allow paths starting /opt/agent/scripts/probes/ with no spaces/operators).
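A sketch combining both options: parse the command into an argv list, require the script to live under the trusted directory, and execute without a shell. `probe_argv`, `run_probe_command`, and the metacharacter set are assumptions; this stricter variant also drops the `test` and `echo` prefixes.

```python
import posixpath
import shlex
import subprocess

PROBE_DIR = "/opt/agent/scripts/probes/"
SHELL_METACHARS = set(";|&$`<>(){}\n\\")


def probe_argv(command: str) -> list[str]:
    """Turn a stored probe command into a validated argv list."""
    if any(ch in SHELL_METACHARS for ch in command):
        raise ValueError("shell operators are not allowed in probe commands")
    argv = shlex.split(command)
    if not argv:
        raise ValueError("empty probe command")
    # normpath collapses '..' so a traversal cannot escape PROBE_DIR.
    script = posixpath.normpath(argv[0])
    if not script.startswith(PROBE_DIR):
        raise ValueError(f"probe must live under {PROBE_DIR}")
    return [script, *argv[1:]]


def run_probe_command(command: str, timeout: int = 30):
    """Run the probe without a shell, so arguments are never interpreted."""
    return subprocess.run(
        probe_argv(command), shell=False,
        capture_output=True, text=True, timeout=timeout,
    )
```

The path check is lexical only; symlinks inside the probe directory are not resolved in this sketch.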

9. Capability miner reads the wrong event shape (codex)

Where: bin/arg:156, 510, arg_capability_miner.py:68
Issue: arg resolve writes state under the standardized event's extra field, but the miner checks top-level ev["state"]. The miner silently misses every current capability_resolved event, so Phase 4 is effectively broken.
Fix: Read both top-level legacy fields AND extra.state. Align doc examples with actual event shape.
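A small sketch of the dual read. `event_state` is a hypothetical helper name, and the sample events in the usage note assume the shape described in the issue.

```python
def event_state(ev: dict):
    """Prefer the current extra.state; fall back to the legacy top-level state."""
    extra = ev.get("extra")
    if isinstance(extra, dict) and "state" in extra:
        return extra["state"]
    return ev.get("state")
```

Usage: `event_state({"kind": "capability_resolved", "extra": {"state": "green"}})` and `event_state({"kind": "capability_resolved", "state": "green"})` both yield `"green"`, so journals written before and after the event-shape change are mined the same way.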

10. Events grep + miner not gzip-aware (codex)

Where: bin/arg:626, arg_capability_miner.py:42, 07_observability.md:27
Issue: Doc claims gzipped journals are searched transparently. Both code paths only glob *.ndjson. After 7-day rotation, old events vanish from grep/mining.
Fix: Glob both *.ndjson and *.ndjson.gz. Open gz files with gzip.open().
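A sketch of a shared reader that both grep and the miner could use. `iter_events` is a hypothetical name, and one JSON object per line is assumed from the .ndjson extension.

```python
import gzip
import json
from pathlib import Path


def iter_events(journal_dir: str):
    """Yield events from plain and gzipped NDJSON journals alike."""
    root = Path(journal_dir)
    paths = sorted([*root.glob("*.ndjson"), *root.glob("*.ndjson.gz")])
    for path in paths:
        # Rotated journals end in .gz; open them in text mode via gzip.
        opener = gzip.open if path.suffix == ".gz" else open
        with opener(path, "rt", encoding="utf-8") as fh:
            for line in fh:
                line = line.strip()
                if line:
                    yield json.loads(line)
```

Paths are sorted by name, so event order across files depends on the journal naming scheme, which this sketch does not assume.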

11. `risk_at_least` uses equality (codex)

Where: bin/arg:558, schemas/boundaries.schema.json:24
Issue: Field name says "at_least" but code compares exact strings. A rule with risk_at_least: high won't match a critical cap.
Fix: Define an ordinal map (low<medium<high<critical), compare >=.
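The ordinal comparison is small enough to sketch directly. The four levels and their order come from the fix; the function name mirrors the field and is otherwise an assumption.

```python
RISK_ORDER = {"low": 0, "medium": 1, "high": 2, "critical": 3}


def risk_at_least(cap_risk: str, threshold: str) -> bool:
    """True if the capability's risk is at or above the rule's threshold."""
    return RISK_ORDER[cap_risk] >= RISK_ORDER[threshold]
```

An unknown risk label raises `KeyError` here rather than silently failing to match, which seems the safer default for a policy check.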

12. IDN hardening claim is not covered (codex)

Where: arg_policy_hook.py:82, 84, test_arg_policy_hook.py
Issue: Punycode is decoded AFTER the confusables pass, so decoded Unicode lookalikes are never remapped. The test docstring says IDN is covered, but the corpus has no punycode case.
Fix: Decode IDN before confusable mapping (or run normalization twice). Add a failing-then-passing punycode test case.
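A sketch of the corrected order: decode punycode labels first, then normalize and remap confusables. The `CONFUSABLES` table is a three-character illustrative subset, and `normalize_host` is a hypothetical name rather than the hook's actual function.

```python
import unicodedata

# Tiny illustrative subset: Cyrillic a, e, o mapped to their Latin lookalikes.
CONFUSABLES = {"\u0430": "a", "\u0435": "e", "\u043e": "o"}


def normalize_host(host: str) -> str:
    """Decode IDN labels BEFORE confusable mapping so lookalikes get remapped."""
    labels = []
    for label in host.lower().split("."):
        if label.startswith("xn--"):
            try:
                label = label[4:].encode("ascii").decode("punycode")
            except UnicodeError:
                pass  # leave malformed labels as-is for later rules to judge
        labels.append(label)
    decoded = unicodedata.normalize("NFKC", ".".join(labels))
    return "".join(CONFUSABLES.get(ch, ch) for ch in decoded)
```

For the test corpus, a punycode case can be generated rather than hand-typed, for example `"xn--" + "p\u0430ypal".encode("punycode").decode("ascii") + ".com"`, which should normalize to the same string as the plain Latin spelling.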

13. Sanitized template is not runnable (codex)

Where: arg_sanitize.py:32, 139
Issue: Sanitizer ships JSON + README + QUICKSTART, but no bin/arg, no schemas, no hook scripts, no probe stubs. QUICKSTART tells users to "install arg from /bin/arg in the upstream repo" — but the output tree doesn't contain it.
Fix: Either ship a complete runnable template (include bin/arg, schemas/, sanitized hook + probe stubs) or relabel sanitizer output as "registry data only" and adjust QUICKSTART.

P2 — could fix before ship

14. jsonschema validation silently disables itself (codex)

Where: bin/arg:791
Issue: If the jsonschema or referencing Python package is missing, validation returns no errors and continues as if the schema layer had passed. This contradicts the doc claim that jsonschema is one of three validation layers.
Fix: Fail with actionable dependency error, OR require explicit --no-jsonschema opt-out flag.
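A sketch of the dependency guard. `check_schema_deps` and `MissingDependency` are hypothetical names; `allow_skip` stands in for the proposed --no-jsonschema flag.

```python
import importlib.util


class MissingDependency(RuntimeError):
    """Raised when the schema layer cannot run and was not explicitly skipped."""


def check_schema_deps(allow_skip: bool = False,
                      required=("jsonschema", "referencing")) -> bool:
    """Return True if schema validation can run, False if explicitly skipped."""
    missing = [name for name in required
               if importlib.util.find_spec(name) is None]
    if not missing:
        return True
    if allow_skip:
        return False  # caller should report that the schema layer was skipped
    raise MissingDependency(
        "schema validation needs: " + ", ".join(missing)
        + " (install them, or pass --no-jsonschema to skip this layer)"
    )
```

The return value lets the caller print "schema layer skipped" instead of a pass, so a skipped layer never looks like a clean one.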

15. Hardcoded paths break OS portability (hermes — verified credible)

Where: Across CLI + hook + probe scripts
Issue: Hardcoded /opt/agent/... and /root/.claude/... paths everywhere. Forks on macOS / non-Linux distros need to globally find-and-replace.
Fix: For the OSS release, use os.environ.get("ARG_HOME", "/root/.claude/system") and os.environ.get("AGENT_ROOT", "/opt/agent") with sensible defaults. Document both variables in QUICKSTART.

NIT

16. Schema versioning (hermes — verified credible)

Where: schemas/*.schema.json
Issue: Field schema_version: "1.0" exists in JSON envelopes but no enforcement of compatibility across version bumps. If a Phase 5+ rev breaks shape, no migration path.
Fix: Add a version-compat guard in arg validate that warns when schema_version doesn't match the schema's expected value.
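A sketch of the guard as a warning collector. `version_warnings` is a hypothetical name, and exact string equality against "1.0" is the simplest possible compatibility rule, assumed here for illustration.

```python
def version_warnings(doc: dict, expected: str = "1.0") -> list[str]:
    """Return warnings when a JSON envelope's schema_version is absent or unexpected."""
    found = doc.get("schema_version")
    if found is None:
        return ["envelope has no schema_version field"]
    if found != expected:
        return [f"schema_version {found!r} does not match expected {expected!r}"]
    return []
```

Returning warnings instead of raising keeps arg validate usable across a version bump while still making the mismatch visible.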

Findings discarded (hermes)

"PII in policy hook logs at /root/.claude/system/logs/agent.log" — that path doesn't exist; hallucinated. Real concern (event journal could leak in OSS template) is covered by P0 #2.
"Cold-starting requirement for capability_miner" — fabricated; no such requirement exists.
"Single point of failure / circuit breaker pattern in probe resolver" — vague, no concrete code referenced.
"Sanitizer regex only matches UUIDv4" — not how the regex works; superseded by P0 #1.

Findings I intentionally did NOT pursue

Multi-platform / Windows support — out of scope; ARG is for Linux agents.
Adversarial threat model — explicitly out-of-scope per intentional decisions list in the review brief.
arg remove subcommand — already documented as Phase 5++ deferred.

Remediation plan

Order, with rough effort:

Order	Item	Effort	Why first
1	P0 #3 — fail-closed policy hook	15 min	bug, easy fix, biggest safety win
2	P0 #5 — probe-all freshness + min_interval cache	30 min	prevents future probe-storm reds
3	P0 #4 — actor check in CLI mutators	20 min	closes sub-agent escape hatch
4	P0 #6 — money-capable MCP detection	20 min	future-proofs Stripe MCP wire-up
5	P1 #9 — miner event-shape fix	15 min	Phase 4 currently broken; tiny fix
6	P1 #10 — gzip-aware grep + mine	15 min	doc claim mismatch
7	P1 #11 — risk_at_least ordinal	10 min	semantic correctness
8	P1 #12 — IDN normalization order + test case	20 min	bypass corpus completeness
9	P1 #8 — probe allowlist hardening	30 min	shell-injection surface narrows
10	P0 #1 + P1 #13 — sanitizer expansion	45 min	the actual ship-blocker
11	P0 #2 — sanitize the docs (or write fresh OSS docs)	60 min	last step before push
12	P1 #7 — generic boundary matching OR doc revision	45 min	clarity
13	P2 #14, #15, #16 — polish	30 min	optional

Total: ~6 hours to ship-ready state. Most items are 15-30 minutes each because the codebase is small and the bugs are surgical.

Recommendation

Do NOT sanitize+ship yet. Fix the 6 P0s + the top 4 P1s (#9, #10, #11, #12 — they're cheap and meaningful). The remaining P1s (#7, #8, #13) and the P2/NIT items can either land in a follow-up patch or be noted as "known limitations" in the OSS README.

Round-table consensus: ARG's design is sound. The implementation has cracks that only appeared because the system was built fast and tested by one operator. Surfacing them BEFORE shipping is exactly why we ran the round-table.
