Date: 2026-05-10
Round-table: 4 panelists invited; 2 responded (Codex GPT-5.5 via codex CLI, Gemini 2.5 Pro free via brian_dreams key rotation). 2 skipped: DeepSeek (no free OpenRouter route — deepseek/deepseek-chat-v3.1:free and deepseek/deepseek-r1:free both 404; direct API is paid-only) and Qwen (rate-limited twice on qwen/qwen3-coder:free and qwen/qwen3-next-80b-a3b-instruct:free; Llama-3.3-70B fallback also 429). Per Brian's no-paid-model rule, no paid retries attempted.
Raw outputs: /tmp/r_codex.md, /tmp/r_gemini.md, /tmp/r_deepseek.md (skip note), /tmp/r_qwen.md (empty).
Build /opt/agent/webspot_proposal_agent/, a Python service on Hetzner that (a) ingests ~200 existing Webspot proposals from Google Drive PDFs and Canva designs into a canonical JSON schema, (b) uses Qdrant + bge-m3 embeddings to retrieve similar past proposals, sections, and Jonah's edit-diffs as few-shot exemplars, (c) generates new proposals section-by-section via free-tier Gemini 2.5 Pro, and (d) emits an editable Canva design by duplicating Jonah's existing master template and removing/filling pages. Week 1 ships a CLI that takes a brief and returns the 5 most similar past proposals plus relevant sections, which is useful even before generation works. Learning happens only when Jonah explicitly clicks "Final, learn from this", never on every Canva save (this avoids poisoning the corpus with half-finished edits). The style profile lives in jonah_style_profile.yaml and is rebuilt after every 10 finalized proposals or by the weekly cron, whichever comes first.
+------------------------+
| Google Drive folder    |        +-----------------+
| (200 PDF proposals)    |------->| PDF ingest      |
+------------------------+        | PyMuPDF +       |
                                  | pdfplumber +    |
+------------------------+        | ocrmypdf        |
| Canva designs          |------->+--------+--------+
| (Connect API)          |                 |
+------------------------+                 v
+------------------------+        +-----------------+
| Canva master template  |------->| Canonical JSON  |
| (page library source)  |        | per proposal    |
+------------------------+        | (sections+blocks|
                                  | +metadata)      |
                                  +--------+--------+
                                           |
           +-------------------------------+-------+
           |                                       |
           v                                       v
+---------------------+                +-----------------------+
| Qdrant (Docker)     |                | Postgres (diffs)      |
| - proposal_summaries|                | before/after blocks   |
| - proposal_sections |                | + operation tags      |
| - proposal_blocks   |                +-----------------------+
| - edit_diffs        |                            ^
| - style_rules       |                            |
| bge-m3 embeddings   |                            |
| bge-reranker-v2-m3  |             +--------------+--------------+
+----------+----------+             | Style miner (weekly)        |
           |                        | diff-match-patch + Gemini   |
           |                        | -> jonah_style_profile.yaml |
           v                        +--------------+--------------+
+---------------------+                            ^
| Generator pipeline  |<---------------------------+
| brief -> retrieve   |
|       -> outline    |
|       -> sections   |   +--------------------+
|       -> style pass |-->| Canva Connect API  |
| Gemini 2.5 Pro free |   | Duplicate master,  |
+----------+----------+   | fill+remove pages, |
           |              | return edit_url    |
           v              +---------+----------+
+---------------------+             |
| generated_v1.pdf    |<------------+
| snapshot store      |
+----------+----------+
           |
           |  Jonah edits in Canva
           v
+---------------------+
| "Mark final &       |--> export PDF --> diff vs generated_v1
|  learn" button      |               --> store diffs in Qdrant + Postgres
| (Flask UI)          |               --> trigger style-rule promotion
+---------------------+
Consensus (Codex + Gemini): Use Google Drive API (files.list / files.get / files.export) for PDFs and Canva Connect API (GET /designs, GET /designs/{id}/pages) for live designs. Extract text with PyMuPDF (both panelists). OCR fallback via tesseract (Codex: ocrmypdf wrapper; Gemini: pytesseract). Store as nested JSON: {proposal_id, source, metadata, sections[{section_id, title, page_indexes, blocks[{block_id, type, text, bbox, style_hint}]}]}.
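The nested JSON shape above could be modelled with plain dataclasses, as in the minimal sketch below. The field names come from the schema in the consensus paragraph; the class names, the example values for `type` and `source`, and the `proposal_from_dict` helper are assumptions made for illustration, not part of either panelist's response.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass, field


@dataclass
class Block:
    block_id: str
    type: str                      # e.g. "paragraph" or "table" (assumed values)
    text: str
    bbox: list[float] | None = None
    style_hint: str | None = None


@dataclass
class Section:
    section_id: str
    title: str
    page_indexes: list[int] = field(default_factory=list)
    blocks: list[Block] = field(default_factory=list)


@dataclass
class Proposal:
    proposal_id: str
    source: str                    # e.g. "drive_pdf" or "canva_export" (assumed values)
    metadata: dict = field(default_factory=dict)
    sections: list[Section] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialize the whole proposal tree to canonical JSON."""
        return json.dumps(asdict(self), ensure_ascii=False, indent=2)


def proposal_from_dict(raw: dict) -> Proposal:
    """Rebuild a Proposal from the dict produced by json.loads(to_json())."""
    sections = [
        Section(
            section_id=s["section_id"],
            title=s["title"],
            page_indexes=list(s.get("page_indexes", [])),
            blocks=[Block(**b) for b in s.get("blocks", [])],
        )
        for s in raw.get("sections", [])
    ]
    return Proposal(raw["proposal_id"], raw["source"], raw.get("metadata", {}), sections)
```

Keeping the schema as dataclasses rather than free-form dicts means a missing `block_id` or `section_id` fails at ingest time instead of surfacing later during retrieval.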
Dissent:
- Source of truth for Canva content — Codex says page-API JSON exposes mostly metadata/thumbnails so export-to-PDF is the real text source. Gemini says page-API JSON is rich enough for direct extraction.
- Decision: Go with Codex on this — safer to treat PDF export as canonical, fall back to page-API JSON only for element IDs needed to write back.
- Extra extractors Codex adds that Gemini missed: pdfplumber for pricing tables, python-pptx for PPTX exports, BeautifulSoup for HTML standalone exports. Adopt all.
Consensus: Option (a) duplicate the master and toggle/remove pages for the MVP. Both panelists agree.
Dissent:
- Long-term scaling — Codex argues the durable scaling model is (b) page library (recipes like master_cover, scope_website, pricing_retainer, case_study_healthcare) where RAG selects which page recipe to assemble, not full generation. Gemini stays on (a) indefinitely.
- Decision: Ship (a) in week 3, but build the canonical JSON so each section carries a page_recipe_id pointer — that gets the page-library evolution for free in v2.
- Canva Brand Template Autofill — Codex flags this requires Canva Enterprise; if Jonah doesn't have it, fall back to PPTX-via-python-pptx then Canva Design Import API. Open question for Jonah: do we have Canva Enterprise?
SPLIT:
- Codex: Qdrant (Docker, self-hosted), bge-m3 embeddings, bge-reranker-v2-m3. 5 collections: proposal_summaries, proposal_sections, proposal_blocks, edit_diffs, style_rules. Embed at three levels: proposal-summary, section, atomic paragraph/table. Retrieve top-5 proposals + top-12 sections + top-8 diffs.
- Gemini: ChromaDB + Postgres for diffs, MiniLM embeddings. Embed at section + content_block levels.
Decision: Codex's stack. Qdrant scales further than Chroma, bge-m3 outperforms MiniLM for English+multilingual proposal corpora, and the reranker matters for top-k quality. Postgres stays for diff metadata and the audit trail (Gemini's contribution there is correct and adopted).
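To make the retrieval budget concrete (top-5 proposals, top-12 sections, top-8 diffs), here is a small in-memory sketch. It deliberately uses plain cosine similarity over Python lists as a stand-in for Qdrant and bge-m3, so the `retrieve` function, the `collections` layout, and the toy vectors are all assumptions; the real service would query Qdrant collections and rerank the results.

```python
import math

# Retrieval budgets from the Codex proposal: 5 proposals, 12 sections, 8 diffs.
TOP_K = {"proposal_summaries": 5, "proposal_sections": 12, "edit_diffs": 8}


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, collections, top_k=TOP_K):
    """Score every item per collection and keep the best k.

    collections: {name: [(item_id, vector), ...]}
    returns:     {name: [(item_id, score), ...]} sorted by descending score
    """
    results = {}
    for name, k in top_k.items():
        scored = [(item_id, cosine(query_vec, vec)) for item_id, vec in collections.get(name, [])]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        results[name] = scored[:k]
    return results
```

A collection that is missing or empty simply yields an empty list, so the generator can still run in week 1 before any edit diffs exist.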
Diff schema (adopted from Codex):
{"diff_id":"diff_123","proposal_id":"acme_2025","section_type":"pricing","before":"...","after":"...","operation":"rewrite|delete|expand|compress|price_adjust|tone_adjust","tags":["more_confident","specific_deliverables"],"extracted_rule":"...","embedding_text":"..."}
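One way to keep diff records honest before they reach Qdrant or Postgres is to validate them on construction. The sketch below mirrors the fields of the adopted schema; the `EditDiff` class name and the default used for `embedding_text` are assumptions, since the source does not say how that field is built.

```python
from dataclasses import dataclass, field

# The six operation values listed in the adopted diff schema.
OPERATIONS = {"rewrite", "delete", "expand", "compress", "price_adjust", "tone_adjust"}


@dataclass
class EditDiff:
    diff_id: str
    proposal_id: str
    section_type: str
    before: str
    after: str
    operation: str
    tags: list = field(default_factory=list)
    extracted_rule: str = ""
    embedding_text: str = ""

    def __post_init__(self):
        if self.operation not in OPERATIONS:
            raise ValueError(f"unknown operation: {self.operation!r}")
        if not self.embedding_text:
            # Assumption: default embedding text concatenates the before/after pair.
            self.embedding_text = f"{self.section_type}: {self.before} -> {self.after}"
```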
Consensus: Mine diffs with an LLM, store as a structured rubric, refresh on a schedule (not per-edit).
Dissent:
- Format — Codex: YAML (jonah_style_profile.yaml) with global_rules, section_rules, banned_phrases, preferred_phrases, rewrite_checks (regex-driven). Gemini: JSON with tone_adjectives, vocabulary_replacements, structural_preferences.
- Cadence — Codex: rewrite weekly OR every 10 finalized proposals; promote rule only after it appears in 3+ finalized proposals OR Jonah approves. Gemini: bi-weekly OR every 50–100 diffs.
- Decision: Codex's promotion threshold (3 occurrences) and YAML format. Gemini's vocabulary_replacements table is a useful sub-structure — fold it under Codex's rewrite_checks. Rebuild trigger: 10 finalized proposals OR weekly cron, whichever first.
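The promotion threshold and rebuild trigger decided above reduce to two small predicates. This is a sketch under stated assumptions: the function names are hypothetical, diffs are represented as dicts carrying `extracted_rule` and `proposal_id` as in the diff schema, and "weekly" is interpreted as seven or more days since the last rebuild.

```python
from collections import defaultdict

PROMOTION_THRESHOLD = 3   # distinct finalized proposals a rule must appear in
REBUILD_EVERY_N = 10      # finalized proposals between profile rebuilds
REBUILD_EVERY_DAYS = 7    # weekly cron, expressed in days (assumed reading)


def promotable_rules(diffs, approved=frozenset(), threshold=PROMOTION_THRESHOLD):
    """Return rules seen in `threshold`+ distinct proposals, or approved by Jonah.

    Only rules that occur in at least one diff are considered; repeats inside
    the same proposal count once.
    """
    seen = defaultdict(set)
    for d in diffs:
        rule = d.get("extracted_rule")
        if rule:
            seen[rule].add(d["proposal_id"])
    return sorted(r for r, props in seen.items() if len(props) >= threshold or r in approved)


def should_rebuild(finalized_since_last, days_since_last):
    """Rebuild the style profile on 10 finalized proposals or a week, whichever first."""
    return finalized_since_last >= REBUILD_EVERY_N or days_since_last >= REBUILD_EVERY_DAYS
```

Counting distinct proposals rather than raw diff occurrences matters here: one heavily edited proposal should not be able to promote a rule on its own.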
Consensus: Free Gemini 2.5 Pro via /opt/agent/brian_dreams/call_gemini.py key rotation does the heavy lifting. Multi-stage prompt chain: brief → retrieval → outline → section-by-section drafting → style pass → Canva assembly.
Dissent:
- Local fallback — Codex adds Qwen3-32B-AWQ or Qwen3-14B-Q4_K_M via Ollama/llama.cpp as fallback when Gemini quota dies. Gemini doesn't propose a fallback.
- Decision: Adopt Codex's fallback chain — Gemini 2.5 Pro free → Gemini 2.5 Flash free (via call_gemini.py's existing fallback) → Ollama Qwen3-14B. Brian's Hermes gateway (530 skills, installed 2026-05-02) can host the Ollama serving.
- Brief normalization output schema (Codex): JSON with client, industry, project_type, goals, constraints, likely_services, budget_band, required_sections, optional_sections. Adopt.
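The fallback chain adopted above (Gemini 2.5 Pro free, then Gemini 2.5 Flash free, then Ollama Qwen3-14B) can be sketched as an ordered list of callables. Everything named here is hypothetical: `QuotaExhausted`, `generate_with_fallback`, and the backend callables are illustrative, and the actual behavior of call_gemini.py is not shown in the source.

```python
class QuotaExhausted(Exception):
    """Hypothetical signal that a backend hit its free-tier limit."""


def generate_with_fallback(prompt, backends):
    """Try each backend in order and return (name, text) from the first that succeeds.

    backends: ordered list of (name, callable) pairs; each callable takes the
    prompt and either returns text or raises QuotaExhausted.
    """
    errors = []
    for name, call in backends:
        try:
            return name, call(prompt)
        except QuotaExhausted as exc:
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all backends exhausted: " + "; ".join(errors))
```

Only quota errors trigger the next backend in this sketch; any other exception propagates, so a malformed prompt is not silently retried three times.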
Consensus: Don't trust webhooks for content-change diffs. Both agree current Canva webhooks are share/comment/approval-oriented, not "design text changed."
Dissent:
- Codex: Manual finalization gating — Jonah clicks "Mark final and learn"; backend exports PDF/PPTX/HTML, re-extracts, aligns vs generated_v1. Optional nightly polling snapshot but don't auto-learn from it.
- Gemini: Periodic-snapshot polling every 15-30 min for active proposals, daily for older, JSON-diff via jsondiff + diff_match_patch.
- Decision: Codex's pattern (manual gate is the truth signal). Adopt Gemini's diff_match_patch library for the actual text diffing inside the gated finalization.
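The gated diff step amounts to aligning the blocks of generated_v1 with the blocks of the final export and keeping only what changed. The decision above adopts diff_match_patch for this; the sketch below swaps in the standard library's `difflib` instead so that it runs without extra dependencies, and the `block_diffs` name is hypothetical.

```python
import difflib


def block_diffs(generated, final):
    """Align two lists of text blocks and return the changed spans.

    Returns a list of (tag, before, after) tuples, where tag is one of
    difflib's opcodes: "replace", "delete", or "insert".
    """
    matcher = difflib.SequenceMatcher(a=generated, b=final, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged blocks carry no learning signal
        changes.append((tag, "\n".join(generated[i1:i2]), "\n".join(final[j1:j2])))
    return changes
```

The difflib tags are structural only; mapping them onto the schema's operation values (rewrite, expand, tone_adjust, and so on) would still be the diff classifier's job.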
Strong consensus: Manual gating, not auto-learn. Both panelists explicitly warn that auto-ingest poisons the corpus with half-finished edits.
Codex adds (adopt all):
- States: draft_generated → jonah_editing → ready_for_review → final_learn → archived_do_not_learn
- Optional learn-scope checkboxes: "Learn style only / Learn pricing logic / Learn structure / Do not learn from this client-specific language"
- Two-tier: minor examples enter RAG immediately; global style profile only updates when promotion threshold is met.
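The proposal states listed above lend themselves to a small explicit state machine. The state names come from the source; the transition table is an assumption, since the source gives only the linear order. The sketch adds early exits to archived_do_not_learn and a step back from review to editing.

```python
from enum import Enum


class State(str, Enum):
    DRAFT_GENERATED = "draft_generated"
    JONAH_EDITING = "jonah_editing"
    READY_FOR_REVIEW = "ready_for_review"
    FINAL_LEARN = "final_learn"
    ARCHIVED_DO_NOT_LEARN = "archived_do_not_learn"


# Assumed transition table; the source only lists the states in order.
ALLOWED = {
    State.DRAFT_GENERATED: {State.JONAH_EDITING, State.ARCHIVED_DO_NOT_LEARN},
    State.JONAH_EDITING: {State.READY_FOR_REVIEW, State.ARCHIVED_DO_NOT_LEARN},
    State.READY_FOR_REVIEW: {State.JONAH_EDITING, State.FINAL_LEARN, State.ARCHIVED_DO_NOT_LEARN},
    State.FINAL_LEARN: {State.ARCHIVED_DO_NOT_LEARN},
    State.ARCHIVED_DO_NOT_LEARN: set(),
}


def advance(current, target):
    """Move to `target` if the transition is allowed, else raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    return target


def may_learn(state):
    """Only proposals explicitly marked final are eligible for learning."""
    return state is State.FINAL_LEARN
```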
Strong consensus on the week split. See section 6 below for the consolidated roadmap.
Directory: /opt/agent/webspot_proposal_agent/
webspot_proposal_agent/
├── README.md
├── .env.example
├── docker-compose.yml # Qdrant + Postgres
├── requirements.txt
├── ingest/
│ ├── drive_ingest.py # Google Drive API + PDF fetch
│ ├── canva_ingest.py # Canva Connect API + page extraction
│ ├── pdf_parser.py # PyMuPDF + pdfplumber + ocrmypdf
│ ├── section_classifier.py # cover/intro/scope/pricing/terms/etc.
│ └── canonical.py # JSON schema + serializer
├── rag/
│ ├── embed.py # bge-m3 (sentence-transformers)
│ ├── rerank.py # bge-reranker-v2-m3
│ ├── qdrant_client.py # 5 collections: summaries/sections/blocks/diffs/rules
│ └── retrieve.py # top-k similar proposals + sections + diffs
├── generate/
│ ├── brief_parser.py # raw brief -> structured JSON
│ ├── outline.py # which sections, which case studies
│ ├── section_writer.py # per-section prompt chain via Gemini
│ ├── style_pass.py # apply jonah_style_profile.yaml
│ └── canva_assembler.py # duplicate master, fill, hide pages
├── feedback/
│ ├── snapshot.py # generated_v1 capture
│ ├── final_diff.py # diff_match_patch alignment
│ ├── style_miner.py # diffs -> proposed style rules
│ └── ui.py # Flask "Mark final and learn" button
├── data/
│ ├── jonah_style_profile.yaml
│ ├── master_template.json # cached master template structure
│ └── snapshots/ # generated_v1 PDFs per proposal
└── cli.py # entry point
Env vars (add to /opt/agent/core/.env):
- CANVA_CONNECT_TOKEN — OAuth bearer for Canva Connect API
- GOOGLE_DRIVE_OAUTH_* — reuse existing Composio path (per feedback_composio_is_single_google_path.md)
- WEBSPOT_PROPOSAL_FOLDER_ID — Drive folder ID for the 200 PDFs
- WEBSPOT_MASTER_DESIGN_ID — Canva design ID of the master template
- QDRANT_URL=http://localhost:6333
- POSTGRES_DSN=postgres://...@localhost:5432/webspot_proposals
- Gemini keys: already in /opt/agent/core/.env (rotation via call_gemini.py)
Cron schedule:
- */30 * * * * — poll Drive folder for new PDFs, re-ingest if changed (don't auto-learn)
- 0 3 * * 0 — weekly style-rule miner (Sunday 3am Beirut)
- 0 4 * * * — daily snapshot of all final_learn proposals to Qdrant (idempotent)
| Step | Primary | Fallback 1 | Fallback 2 |
|---|---|---|---|
| Brief normalization | Gemini 2.5 Flash free (cheap, fast) | Gemini 2.5 Pro free | Ollama Qwen3-8B local |
| Section drafting | Gemini 2.5 Pro free (call_gemini.py key rotation) | Gemini 2.5 Flash | Ollama Qwen3-14B-Q4_K_M |
| Style miner | Gemini 2.5 Pro free | Gemini 2.5 Flash | — (skip until quota recovers) |
| Diff classifier | Gemini 2.5 Flash free | Ollama Qwen3-8B | regex-only fallback |
| Embeddings | local BAAI/bge-m3 (sentence-transformers, no API) | — | — |
| Reranker | local BAAI/bge-reranker-v2-m3 | — | — |
Key rotation: reuse /opt/agent/brian_dreams/call_gemini.py exactly — same 11-key pool (GEMINI_FREE_API_KEY, _2, GEMINI_API_KEY, GOOGLE_AI_STUDIO_KEY 1-4, GEMINI_KEY_1-4), same Pro→Flash fallback, same /root/.bashrc + /opt/agent/core/.env discovery.
No paid calls anywhere — verified against hard_rule_no_paid_model_calls.md.
Week 1 — Ingestion + Search (smallest useful slice):
- Drive folder → PDF download → PyMuPDF/pdfplumber/ocrmypdf extraction
- Canva Connect API: list designs, export to PDF for text source
- Canonical JSON serializer + section classifier (heuristic, not LLM yet)
- Qdrant in Docker, bge-m3 embeddings
- CLI: webspot-agent search "<brief>" returns top-5 similar proposals + top-12 sections
- Useful even before generation works — Jonah can already use it as a search tool
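The week 1 CLI could start as a thin argparse wrapper around the retrieval layer. In the sketch below the `search` subcommand matches the command shown in the list above, while the `--proposals` and `--sections` flags, the `search_fn` hook, and the output format are assumptions added for illustration.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="webspot-agent")
    sub = parser.add_subparsers(dest="command", required=True)
    search = sub.add_parser("search", help="find past proposals similar to a brief")
    search.add_argument("brief")
    search.add_argument("--proposals", type=int, default=5)   # hypothetical flag
    search.add_argument("--sections", type=int, default=12)   # hypothetical flag
    return parser


def main(argv=None, search_fn=None):
    """Parse arguments, run the search, print a short summary, and return the hits."""
    args = build_parser().parse_args(argv)
    # search_fn is a hypothetical hook standing in for rag/retrieve.py.
    if search_fn is None:
        search_fn = lambda brief, n_proposals, n_sections: {"proposals": [], "sections": []}
    hits = search_fn(args.brief, args.proposals, args.sections)
    for kind in ("proposals", "sections"):
        print(f"{kind}: {len(hits[kind])} hit(s)")
        for item in hits[kind]:
            print(f"  - {item}")
    return hits
```

Passing the search function in as a parameter keeps the CLI testable before Qdrant is running.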
Week 2 — Draft Generator:
- Brief parser → structured JSON
- RAG retrieval with reranker
- Section-by-section drafter via Gemini 2.5 Pro free
- Static hand-written jonah_style_profile.yaml v0 (Jonah dictates 10-15 rules)
- Output: complete proposal in structured Markdown + JSON (no Canva yet)
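For the hand-written v0 style profile, a style pass can be as simple as applying vocabulary replacements and flagging banned phrases. The sketch below holds the profile as an in-memory dict rather than parsing YAML; the key names follow the profile fields discussed in the style-profile decision, but the example phrases and the `style_pass` function are purely illustrative and not taken from Jonah's actual rules.

```python
import re

# Hypothetical in-memory form of jonah_style_profile.yaml; example entries are invented.
PROFILE = {
    "banned_phrases": ["cutting-edge", "synergy"],
    "rewrite_checks": {"vocabulary_replacements": {"utilize": "use", "leverage": "use"}},
}


def style_pass(text, profile=PROFILE):
    """Apply whole-word replacements, then report banned phrases still present."""
    for old, new in profile["rewrite_checks"]["vocabulary_replacements"].items():
        text = re.sub(rf"\b{re.escape(old)}\b", new, text, flags=re.IGNORECASE)
    violations = [
        phrase
        for phrase in profile["banned_phrases"]
        if re.search(re.escape(phrase), text, re.IGNORECASE)
    ]
    return text, violations
```

Replacements are applied automatically while banned phrases are only reported, on the assumption that removing a phrase usually needs a rewrite by the model or by Jonah rather than a blind deletion.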
Week 3 — Canva Output:
- Duplicate master template via Canva Connect API
- Fill text elements via PUT /designs/{id}/pages/{page_id}/elements/{element_id}
- Hide/remove unneeded pages
- Return Canva edit URL
- Snapshot generated_v1.pdf
- First end-to-end deliverable: brief in, editable Canva proposal out
Week 4 — Feedback Loop + Style Miner:
- Flask "Mark final and learn" button
- Final PDF export + diff alignment vs generated_v1
- Diff classifier (Gemini Flash) → operation tags
- Style miner: cluster diffs → propose new rules → Jonah approve/reject UI
- Promote rule when seen in 3+ finalized proposals
- Auto-rewrite jonah_style_profile.yaml via the weekly cron
reference_tg_hm_bot.md.)
- /tmp/r_codex.md (11.8 KB, full response with Qdrant + bge-m3 + page-library scaling argument)
- brian_dreams/call_gemini.py) — /tmp/r_gemini.md (15.9 KB, full response with ChromaDB + JSON style profile + polling round-trip)
- /tmp/r_deepseek.md (skipped: no free OpenRouter route for deepseek-chat-v3.1:free or deepseek-r1:free as of 2026-05-10; direct API is paid)
- /tmp/r_qwen.md (skipped: rate-limited twice, both qwen3-coder:free and qwen3-next-80b-a3b-instruct:free returned 429 from upstream provider Venice; Llama-3.3-70B fallback also 429)
Round-table integrity note: With only 2 of 4 panelists, the synthesis cannot weight by majority on the 5 disputed sub-questions (vector store, embedding granularity, style-profile format, round-trip mechanism, scaling pattern). Decisions above pick the more-defensible answer based on infra maturity (Qdrant > Chroma at this scale), corpus size (bge-m3 > MiniLM), and Brian's existing patterns (manual finalization gating matches Brian's "approve before learn" discipline). If any decision feels wrong on first contact with reality, re-run the round-table with Hermes (530-skill local agent installed 2026-05-02) and a working Qwen route as the additional voices.