Compare commits


1 commit

Author SHA1 Message Date
planner-bot
5c4bce54f6 chore: planner run 2026-04-18
- Updated prerequisite tree: added Nomad+Vault orchestration (S0-S5 done),
  #982 engagement data loss, stuck issues section
- Created vault procurement for #758 (ops branch protection)
- Graph: 226 nodes, 223 edges
2026-04-18 16:21:25 +00:00
3 changed files with 63 additions and 115 deletions


@@ -1,5 +1,5 @@
# Prerequisite Tree
<!-- Last updated: 2026-04-08 -->
<!-- Last updated: 2026-04-18 -->
## Objective: Foundation — Core agent loop (dev → CI → review → merge)
- [x] dev-agent picks up backlog issues (dev/dev-agent.sh exists)
@@ -18,7 +18,7 @@ Status: DONE
## Objective: Foundation — Planner gap analysis against vision
- [x] Planner formula exists (run-planner.toml v4)
- [x] planner-run.sh cron wrapper exists
- [x] Planning runs established and maintaining prerequisite tree (run 1: 2026-04-05, run 2: 2026-04-08)
- [x] Planning runs established and maintaining prerequisite tree (runs 1–4)
Status: DONE
## Objective: Foundation — Multi-project support
@@ -29,7 +29,7 @@ Status: DONE
## Objective: Foundation — Knowledge graph for structural defect detection
- [x] networkx package installed in agents container (#220 — closed)
- [x] build-graph.py exists in lib/
- [x] Graph report generating successfully (165 nodes, 137 edges as of 2026-04-08)
- [x] Graph report generating successfully (226 nodes, 223 edges as of 2026-04-18)
Status: DONE
## Objective: Foundation — Predictor-planner adversarial feedback loop
@@ -45,8 +45,10 @@ Status: DONE
- [x] disinto init re-run stability (#158 — closed)
- [x] disinto init repo creation API endpoint (#164 — closed)
- [x] Prediction labels created during init (#225 — closed)
- [ ] Ops repo migration for existing deployments (#425 — backlog+priority)
Status: BLOCKED — #425 ops repo missing dirs on existing deployments
- [x] Ops repo migration issue filed (#425 — closed)
- [ ] Ops repo branch protection blocks remote writes (#758 — blocked, HUMAN_BLOCKED, blocked-on-vault vault/pending/disinto-ops-branch-protection.md)
- [ ] Re-seed ops repo directories (#820 — backlog+priority, blocked on #758)
Status: BLOCKED — #758 ops repo branch protection needs human admin action
## Objective: Adoption — Built-in Forgejo + Woodpecker CI
- [x] Docker compose with Forgejo + Woodpecker
@@ -54,30 +56,45 @@ Status: BLOCKED — #425 ops repo missing dirs on existing deployments
- [x] WOODPECKER_HOST override fix (#178 — closed)
Status: DONE
## Objective: Adoption — Nomad+Vault orchestration
- [x] Step 0: Nomad+Vault installers (cluster-up.sh, install.sh, vault-init.sh, lib-systemd.sh)
- [x] Step 1: Forgejo on Nomad (forgejo.hcl, deploy.sh, S1.3 wiring, S1.4 CI validation)
- [x] Step 2: Vault policies + secret import (S2.1–S2.6, plus fixes S2-A through S2-G)
- [x] Step 3: Woodpecker on Nomad (S3.1–S3.4 + OAuth + wiring, plus fixes S3-1 through S3-6)
- [x] Step 4: Agents on Nomad (S4.1 agents.hcl + S4.2 wiring, plus fixes S4-1 through S4-7)
- [x] Step 5: Edge + staging + chat + vault-runner on Nomad (S5.1–S5.5, plus fixes S5-1 through S5-7)
- [ ] S5 full cutover: dispatcher Nomad backend + retire docker-compose dispatch (#981 — vision)
Status: READY — Steps 0–5 deployed; cutover to Nomad-only dispatch is vision-level design work
## Objective: Adoption — Landing page communicating value proposition
- [x] Website addressable exists (disinto.ai)
- [ ] Website observability — no engagement measurement (#426 — vision)
Status: BLOCKED — no evidence process connected to website
## Objective: Adoption — Example project demonstrating full lifecycle
- [ ] No example project exists
- [ ] Requires verified bootstrap (#425)
- [ ] No example project exists (#697 — vision+priority)
- [ ] Requires verified bootstrap (blocked on #758/#820)
Status: BLOCKED — depends on bootstrap completion and ops repo migration
## --- ADOPTION MILESTONE: IN PROGRESS ---
## Objective: Ship (Fold 2) — Deploy profiles per artifact type
- [ ] No deploy profiles defined
- [x] CI pipeline working (Woodpecker OAuth fixed)
- [x] Nomad jobspec infrastructure available (Steps 1–5 complete)
Status: BLOCKED — not started, needs design (vision-level)
## Objective: Ship (Fold 2) — Vault-gated fold transitions
- [x] Vault redesign complete (#73-#77 — all closed)
- [x] Vault PR workflow documented (docs/VAULT.md)
- [ ] Vault directories complete in ops repo (#425 — approved/fired/rejected missing)
Status: BLOCKED — #425 ops repo dirs needed for vault workflow
- [x] Vault + Nomad integration (template stanzas, JWT auth, policies)
- [ ] Vault lifecycle directories on remote ops repo (blocked on #758/#820)
Status: BLOCKED — #758/#820 ops repo dirs needed for vault workflow
## Objective: Ship (Fold 2) — Engagement measurement baked into deploy pipelines
- [ ] No engagement measurement exists
- [ ] No observables yet (AGENTS.md confirms)
- [ ] collect-engagement.sh data silently lost (#982 — backlog+priority)
Status: BLOCKED — depends on deploy profiles + website observability (#426)
## Objective: Ship (Fold 2) — Rent-a-human for gated channels
@@ -100,3 +117,7 @@ Status: BLOCKED — depends on Ship milestone
## Objective: Learn (Fold 3) — Audience variation + signal detection
- [ ] Requires observables
Status: BLOCKED — depends on Ship milestone
## --- STUCK ISSUES ---
- #850 (Compose duplicate service detection) — 4+ consecutive ci_exhausted failures, needs human re-scope or different approach
- #758 (Ops repo branch protection) — HUMAN_BLOCKED since 2026-04-08, vault procurement pending
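
The tree's layout (a `## Objective:` heading, `- [x]` / `- [ ]` items, then a `Status:` line) is regular enough to tally mechanically. The following is a minimal sketch under that assumption; `summarize_tree` is a hypothetical helper for illustration, not something the repository is known to contain, and the sample text is copied from two objectives in the diff above.

```python
import re

def summarize_tree(text):
    """Count done/open checkbox items and capture the status word per objective."""
    summary = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("## Objective:"):
            current = line[len("## Objective:"):].strip()
            summary[current] = {"done": 0, "open": 0, "status": None}
        elif current is None:
            continue  # ignore anything before the first objective heading
        elif re.match(r"- \[x\] ", line):
            summary[current]["done"] += 1
        elif re.match(r"- \[ \] ", line):
            summary[current]["open"] += 1
        elif line.startswith("Status:"):
            # Keep only the leading status word, e.g. DONE or BLOCKED.
            summary[current]["status"] = line.split(":", 1)[1].split()[0]
    return summary

sample = """\
## Objective: Foundation — Knowledge graph for structural defect detection
- [x] networkx package installed in agents container (#220 — closed)
- [x] build-graph.py exists in lib/
- [x] Graph report generating successfully (226 nodes, 223 edges as of 2026-04-18)
Status: DONE
## Objective: Adoption — Example project demonstrating full lifecycle
- [ ] No example project exists (#697 — vision+priority)
- [ ] Requires verified bootstrap (blocked on #758/#820)
Status: BLOCKED — depends on bootstrap completion and ops repo migration
"""

summary = summarize_tree(sample)
for name, counts in summary.items():
    print(name, counts)
```

A check like this could flag objectives whose `Status:` line disagrees with their checkboxes, for example a DONE status with open items remaining.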


@@ -1,106 +0,0 @@
# Sprint: edge-subpath-chat
## Vision issues
- #623 — vision: subpath routing + Forgejo-OAuth-gated Claude chat inside the edge container
## What this enables
After this sprint, an operator running `disinto edge register` gets a single URL — `<project>.disinto.ai` — with Forgejo at `/forge/`, Woodpecker CI at `/ci/`, a staging preview at `/staging/`, and an OAuth-gated Claude Code chat at `/chat/`, all under one wildcard cert and one bootstrap password. The factory talks back to its operator through a chat window that sits next to the forge, CI, and live preview it is driving.
## What exists today
The majority of this vision is already implemented across issues #704–#711:
- **Subpath routing**: Caddyfile generator produces `/forge/*`, `/ci/*`, `/staging/*`, `/chat/*` handlers (`lib/generators.sh:780–822`). Forgejo `ROOT_URL` and Woodpecker `WOODPECKER_HOST` are set to subpath values when `EDGE_TUNNEL_FQDN` is present (`bin/disinto:842–847`).
- **Chat container**: Full OAuth flow via Forgejo, HttpOnly session cookies, forward_auth defense-in-depth with `FORWARD_AUTH_SECRET`, per-user rate limiting (hourly/daily/token caps), conversation history in NDJSON (`docker/chat/server.py`).
- **Sandbox hardening**: Read-only rootfs, `cap_drop: ALL`, `no-new-privileges`, `pids_limit: 128`, `mem_limit: 512m`, no Docker socket. Verification script at `tools/edge-control/verify-chat-sandbox.sh`.
- **Edge control plane**: Tunnel registration, port allocation, Caddy admin API routing, wildcard `*.disinto.ai` cert via DNS-01 (`tools/edge-control/`).
- **Dependencies #620/#621/#622**: Admin password prompt, edge control plane, and reverse tunnel — all implemented and merged.
- **Subdomain fallback plan**: Fully documented at `docs/edge-routing-fallback.md` with pivot criteria.
## Complexity
- ~6 files touched across 3 subsystems (Caddy routing, chat backend, compose generation)
- Estimated 4 sub-issues
- ~90% gluecode (wiring existing pieces), ~10% greenfield (WebSocket streaming, end-to-end smoke test)
## Risks
- **Forgejo/Woodpecker subpath breakage**: Neither service is battle-tested under subpaths in this stack. Redirect loops, OAuth callback mismatches, or asset 404s are plausible. Mitigation: the fallback plan (`docs/edge-routing-fallback.md`) is already documented and estimated at under one day to pivot.
- **Cookie/CSRF collision**: Forgejo and chat share the same origin — cookie names or CSRF tokens could collide. Mitigation: chat uses a namespaced cookie (`disinto_chat_session`) and a separate OAuth app.
- **Streaming latency**: One-shot `claude --print` blocks until completion. Long responses leave the operator staring at a spinner. Not a correctness risk, but a UX risk that WebSocket streaming would fix.
## Cost — new infra to maintain
- **No new services** — all containers already exist in the compose stack
- **No new scheduled tasks or formulas** — chat is a passive request handler
- **One new smoke test** (CI) — end-to-end subpath routing verification
- **Ongoing**: monitoring Forgejo/Woodpecker upstream for subpath regressions on upgrades
## Recommendation
Worth it. The vision is ~80% implemented. The remaining work is integration hardening (confirming subpath routing works end-to-end with real Forgejo/Woodpecker) and one UX improvement (WebSocket streaming). The risk is low because a documented fallback to per-service subdomains exists. Ship this sprint to close the loop on the edge experience.
## Sub-issues
<!-- filer:begin -->
- id: subpath-routing-smoke-test
  title: "vision(#623): end-to-end subpath routing smoke test for Forgejo + Woodpecker + chat"
  labels: [backlog]
  depends_on: []
  body: |
    ## Goal
    Verify that Forgejo, Woodpecker, and chat all function correctly when served
    under /forge/, /ci/, and /chat/ subpaths on a single domain. Catch redirect
    loops, OAuth callback failures, and asset 404s before they hit production.
    ## Acceptance criteria
    - [ ] Forgejo login at /forge/ completes without redirect loops
    - [ ] Forgejo OAuth callback for Woodpecker succeeds under subpath
    - [ ] Woodpecker dashboard loads all assets at /ci/ (no 404s on JS/CSS)
    - [ ] Chat OAuth login flow works at /chat/login
    - [ ] Forward_auth on /chat/* rejects unauthenticated requests with 401
    - [ ] Staging content loads at /staging/
    - [ ] Root / redirects to /forge/
    - [ ] CI pipeline added to .woodpecker/ to run this test on edge-related changes
- id: websocket-streaming-chat
  title: "vision(#623): WebSocket streaming for chat UI to replace one-shot claude --print"
  labels: [backlog]
  depends_on: [subpath-routing-smoke-test]
  body: |
    ## Goal
    Replace the blocking one-shot claude --print invocation in the chat backend with
    a WebSocket connection that streams tokens to the UI as they arrive.
    ## Acceptance criteria
    - [ ] /chat/ws endpoint accepts WebSocket upgrade with valid session cookie
    - [ ] /chat/ws rejects upgrade if session cookie is missing or expired
    - [ ] Chat backend streams claude output over WebSocket as text frames
    - [ ] UI renders tokens incrementally as they arrive
    - [ ] Rate limiting still enforced on WebSocket messages
    - [ ] Caddy proxies WebSocket upgrade correctly through /chat/ws with forward_auth
- id: chat-working-dir-scoping
  title: "vision(#623): scope Claude chat working directory to project staging checkout"
  labels: [backlog]
  depends_on: [subpath-routing-smoke-test]
  body: |
    ## Goal
    Give the chat container Claude session read-write access to the project working
    tree so the operator can inspect, explain, or modify code — scoped to that tree
    only, with no access to factory internals, secrets, or Docker socket.
    ## Acceptance criteria
    - [ ] Chat container bind-mounts the project working tree as a named volume
    - [ ] Claude invocation in server.py sets cwd to the workspace directory
    - [ ] Claude permission mode is acceptEdits (not bypassPermissions)
    - [ ] verify-chat-sandbox.sh updated to assert workspace mount exists
    - [ ] Compose generator adds the workspace volume conditionally
- id: subpath-fallback-automation
  title: "vision(#623): automate subdomain fallback pivot if subpath routing fails"
  labels: [backlog]
  depends_on: [subpath-routing-smoke-test]
  body: |
    ## Goal
    If the smoke test reveals unfixable subpath issues, automate the pivot to
    per-service subdomains so the switch is a single config change.
    ## Acceptance criteria
    - [ ] generators.sh _generate_caddyfile_impl accepts EDGE_ROUTING_MODE env var
    - [ ] In subdomain mode, Caddyfile emits four host blocks per edge-routing-fallback.md
    - [ ] register.sh registers additional subdomain routes when EDGE_ROUTING_MODE=subdomain
    - [ ] OAuth redirect URIs in ci-setup.sh respect routing mode
    - [ ] .env template documents EDGE_ROUTING_MODE with a comment referencing the fallback doc
<!-- filer:end -->
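
The `depends_on` fields above imply an order in which the sub-issues can be worked: the smoke test first, then the three issues that depend on it. A minimal sketch of deriving that order with Python's standard `graphlib` follows; how the actual filer consumes this block is not shown in the source, so `filing_order` is a hypothetical name used only for illustration.

```python
from graphlib import TopologicalSorter

# depends_on values copied from the sub-issue block above:
# each key maps to the ids it depends on (its predecessors).
depends_on = {
    "subpath-routing-smoke-test": [],
    "websocket-streaming-chat": ["subpath-routing-smoke-test"],
    "chat-working-dir-scoping": ["subpath-routing-smoke-test"],
    "subpath-fallback-automation": ["subpath-routing-smoke-test"],
}

def filing_order(deps):
    """Return the ids so that every id appears after everything it depends on."""
    # TopologicalSorter raises graphlib.CycleError if the dependencies loop.
    return list(TopologicalSorter(deps).static_order())

order = filing_order(depends_on)
print(order)
```

The three dependent issues have no ordering constraint among themselves, so only the position of the smoke test is guaranteed.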


@@ -0,0 +1,33 @@
# Request: Remove or relax ops repo branch protection for agent writes
## What
The ops repo (`disinto-ops`) has branch protection on `main` that blocks all agent writes. `planner-bot` has push access but cannot merge. `review-bot` can approve but cannot push. No admin token is available to agents.
## Why
Blocks #758 (ops repo branch protection), which blocks:
- #820 (re-seed ops repo directories) — vault lifecycle dirs, evidence dirs, portfolio.md, RESOURCES.md
- Vault-gated fold transitions — vault/pending/, approved/, fired/ directories needed
- All planner ops pushes — prerequisites.md frozen since run 2 (2026-04-08)
- Evidence collection for all agents
Waiting since 2026-04-08 (10 days). This is the single root blocker for the Adoption milestone.
## Human action
1. Go to Forgejo admin → `disinto-ops` repo → Settings → Branches → Branch protection for `main`
2. Either:
   a. Add `planner-bot` to the push/merge allowlist, OR
   b. Remove branch protection from `disinto-ops` `main` entirely (ops repo is internal, not public-facing)
3. Verify by pushing a test commit or confirming `planner-bot` can push
## Factory will then
- #758 auto-closes (dev-agent or planner detects push succeeds)
- #820 picks up immediately (re-seed ops directories)
- Planner pushes prerequisite tree, memory, and vault items to remote
- Vault lifecycle workflow becomes operational
- Evidence collection pipeline unblocks
## Unblocks
- #758 — ops repo branch protection blocks all agent writes
- #820 — re-seed ops repo directories
- #697 — example project (transitively, via bootstrap verification)
- Vault-gated fold transitions (Ship milestone)
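
The "transitively" in the list above can be made concrete with a breadth-first walk over the blocking relationships. This is a sketch only: the `blocks` edges are one reading of the request text and the prerequisite tree (#758 blocks #820, which in turn blocks #697 and the vault workflow), and `transitively_unblocked` is a hypothetical helper, not part of the planner.

```python
from collections import deque

# Assumed edges, read off the request above: key blocks each listed value.
blocks = {
    "#758": ["#820", "planner ops pushes", "evidence collection"],
    "#820": ["#697", "vault-gated fold transitions"],
}

def transitively_unblocked(root, edges):
    """Return everything that becomes unblocked, directly or indirectly, once root is resolved."""
    seen = []
    queue = deque(edges.get(root, []))
    while queue:
        item = queue.popleft()
        if item in seen:
            continue
        seen.append(item)
        queue.extend(edges.get(item, []))
    return seen

unblocked = transitively_unblocked("#758", blocks)
print(unblocked)
```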