Merge pull request 'chore: gardener housekeeping' (#715) from chore/gardener-20260411-2343 into main

2026-04-12 00:13:13 +00:00 · 545ccf9199
commit 545ccf9199
parent 13fe475cf8 0cd20e8eea
10 changed files with 53 additions and 23 deletions
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -1,4 +1,4 @@
-<!-- last-reviewed: 31f2cb7bfa38df3db8fbed28ec0899c412f06c49 -->
+<!-- last-reviewed: c51cc9dba649ed543b910b561231a5c8bd2130bc -->
 # Disinto — Agent Instructions

 ## What this repo is
--- a/architect/AGENTS.md
+++ b/architect/AGENTS.md
@@ -1,4 +1,4 @@
-<!-- last-reviewed: 31f2cb7bfa38df3db8fbed28ec0899c412f06c49 -->
+<!-- last-reviewed: c51cc9dba649ed543b910b561231a5c8bd2130bc -->
 # Architect — Agent Instructions

 ## What this agent is
--- a/dev/AGENTS.md
+++ b/dev/AGENTS.md
@@ -1,4 +1,4 @@
-<!-- last-reviewed: 31f2cb7bfa38df3db8fbed28ec0899c412f06c49 -->
+<!-- last-reviewed: c51cc9dba649ed543b910b561231a5c8bd2130bc -->
 # Dev Agent

 **Role**: Implement issues autonomously — write code, push branches, address
--- a/gardener/AGENTS.md
+++ b/gardener/AGENTS.md
@@ -1,4 +1,4 @@
-<!-- last-reviewed: 31f2cb7bfa38df3db8fbed28ec0899c412f06c49 -->
+<!-- last-reviewed: c51cc9dba649ed543b910b561231a5c8bd2130bc -->
 # Gardener Agent

 **Role**: Backlog grooming — detect duplicate issues, missing acceptance
--- a/gardener/pending-actions.json
+++ b/gardener/pending-actions.json
@@ -1,22 +1,52 @@
 [
  {
    "action": "edit_body",
-    "issue": 649,
-    "body": "Flagged by AI reviewer in PR #640.\n\n## Problem\n\n`register.sh:57` uses `awk '{$1=\"\"; print $0}' | tr -d ' '` to extract the base64 key from the pubkey string. When the pubkey has a comment (the default with `ssh-keygen -C`), the comment is concatenated directly to the key data after `tr -d ' '` removes spaces. For example:\n\n- Input pubkey: `ssh-ed25519 AAAA...== edge-tunnel@myproject`\n- Extracted key: `AAAA...==edge-tunnel@myproject` (invalid — comment appended)\n\nThis produces a malformed `full_pubkey` stored in the registry, which is then written verbatim into `disinto-tunnel`'s `authorized_keys`. OpenSSH rejects the malformed key, so reverse tunnels can never be established.\n\n## Fix\n\nChange line 57 from:\n```bash\nkey=$(echo \"$pubkey\" | awk '{$1=\"\"; print $0}' | tr -d ' ')\n```\nto:\n```bash\nkey=$(echo \"$pubkey\" | awk '{print $2}')\n```\n\nThis extracts only the second field (the base64 key data), ignoring the comment.\n\n---\n*Auto-created from AI review*\n\n## Affected files\n- `tools/edge-control/register.sh:57` — pubkey extraction logic\n\n## Acceptance criteria\n- [ ] `register.sh` line 57 uses `awk '{print $2}'` to extract only the base64 key field\n- [ ] Pubkeys with comments (e.g. `ssh-ed25519 AAAA...== edge-tunnel@myproject`) are stored correctly without the comment appended\n- [ ] The resulting `authorized_keys` entry is valid and accepted by OpenSSH\n"
-  },
-  {
-    "action": "add_label",
-    "issue": 649,
-    "label": "backlog"
+    "issue": 704,
+    "body": "## Goal\n\nExtend the existing edge `docker/Caddyfile` to route `/forge/*`, `/ci/*`, `/staging/*`, and `/chat/*` under a single host block for a registered project, and reconfigure Forgejo and Woodpecker so their own URL generation works under those subpaths.\n\n## Why\n\n- Foundation for the unified `<project>.disinto.ai` workspace vision (#623).\n- The current `docker/Caddyfile` already has subpath handlers for `/forgejo/*`, `/ci/*`, and default→staging — this issue brings it in line with the #623 plan (rename `/forgejo` → `/forge`, add explicit `/staging/*` vs default, add `/chat/*` placeholder returning 503 until the chat container lands).\n- Forgejo + Woodpecker only generate correct links when `ROOT_URL` / `WOODPECKER_HOST` match the public path.\n\n## Scope\n\n### Files to touch\n\n- `docker/Caddyfile` — rewrite the `:80` block:\n  - `handle /forge/*` → `reverse_proxy forgejo:3000` (decide during impl: `uri strip_prefix /forge` vs letting Forgejo `ROOT_URL` own the prefix — pick whichever keeps Forgejo asset links correct).\n  - `handle /ci/*` → `reverse_proxy woodpecker:8000` (Woodpecker v3 supports subpath natively via `WOODPECKER_HOST`).\n  - `handle /staging/*` → `reverse_proxy staging:80` (explicit, no default).\n  - `handle /chat/*` → `respond \"chat not yet deployed\" 503` (placeholder — replaced by #705).\n  - `handle /` → redirect to `/forge/` for now (revisit when chat lands).\n- `lib/generators.sh` `_generate_compose_impl()` — Forgejo service env: `ROOT_URL=https://${EDGE_TUNNEL_FQDN}/forge/` (only when `EDGE_TUNNEL_FQDN` is set; keep local-dev fallback untouched).\n- `lib/generators.sh` Woodpecker service env: `WOODPECKER_HOST=https://${EDGE_TUNNEL_FQDN}/ci` (conditional same way).\n- `bin/disinto setup_forge` — no code change expected; the new env vars flow through `.env` → compose.\n\n### Out of scope\n\n- TLS / cert wiring (handled by #621 wildcard cert on DO).\n- OAuth redirect URIs for the new paths — owned by #708.\n- 
`/chat/*` actual backend — owned by #705.\n\n## Acceptance\n\n- [ ] `docker compose up -d edge forgejo woodpecker` on a test project renders a Caddyfile that matches the four handlers above.\n- [ ] `curl -H \"Host: <project>.disinto.ai\" http://edge/forge/` returns the Forgejo dashboard HTML with all asset URLs under `/forge/...`.\n- [ ] `curl -H \"Host: <project>.disinto.ai\" http://edge/ci/` returns Woodpecker and the OAuth login link points at `/ci/authorize`.\n- [ ] `/chat/` returns a 503 until #705 replaces the placeholder.\n- [ ] Local dev without `EDGE_TUNNEL_FQDN` still brings up a working edge (fallback path unchanged).\n\n## Depends on\n\n- #622 — edge container must be the single entry point before subpath routing is load-bearing.\n\n## Notes\n\n- The current `docker/Caddyfile` is 19 lines — do not rewrite it from scratch, just change the handler blocks and add the placeholder.\n- Forgejo `ROOT_URL` suffix slash matters (`/forge/` not `/forge`) — Forgejo builds absolute links from it.\n- Woodpecker v3 is already in use here (see `lib/ci-setup.sh`), so `WOODPECKER_HOST` subpath support is available.\n\n## Boundaries for dev-agent\n\n- Do not invent a new reverse proxy. The existing `docker/Caddyfile` (19 lines) is the only file to edit for routing.\n- Do not refactor `lib/generators.sh` beyond adding two env var lines on existing services.\n- Do not touch `lib/ci-setup.sh` OAuth2 flow — that is #708.\n- No new language runtimes. Everything here is Caddyfile + bash + env vars.\n- Parent vision: #623.\n\n## Affected files\n\n- `docker/Caddyfile`\n- `lib/generators.sh`\n- `bin/disinto`\n\n"
  },
  {
    "action": "edit_body",
-    "issue": 680,
-    "body": "Flagged by AI reviewer in PR #679.\n\n## Problem\n\n`docs/CLAUDE-AUTH-CONCURRENCY.md:43` shows the Claude config directory layout with `credentials.json` (no leading dot), but Claude actually writes `.credentials.json` (hidden file). PR #679 fixed `docker/agents/entrypoint.sh` to match the real filename.\n\nSimilarly, `tests/smoke-init.sh` lines 336–357 create and migrate a file named `credentials.json`; if these tests are meant to exercise Claude's OAuth credential file, they should use `.credentials.json`.\n\n## Affected files\n- `docs/CLAUDE-AUTH-CONCURRENCY.md:43` — directory layout diagram\n- `tests/smoke-init.sh:336,338,339,341,357` — credential migration smoke test\n\n## Acceptance criteria\n- [ ] `docs/CLAUDE-AUTH-CONCURRENCY.md` references `.credentials.json` (with leading dot) in the directory layout diagram\n- [ ] `tests/smoke-init.sh` uses `.credentials.json` in the credential migration test (lines 336–357)\n- [ ] All references to the Claude credentials file in docs and tests use the correct hidden filename `.credentials.json`\n\n---\n*Auto-created from AI review*\n"
+    "issue": 705,
+    "body": "## Goal\n\nAdd a new `disinto-chat` docker service to the factory stack: a minimal HTTP+WebSocket backend that spawns `claude --print` (or attaches to a persistent tmux PTY) and a vanilla HTML+HTMX UI served at `/chat/*` behind the edge container. This issue delivers a no-auth local-dev version; sandboxing, identity isolation, and auth are separate chunks.\n\n## Why\n\n- Core of the #623 \"Claude chat inside the edge container\" vision.\n- A working no-auth scaffold lets #706 / #707 / #708 land as tight independent PRs instead of one giant change.\n\n## Scope\n\n### New files\n\n- `docker/chat/Dockerfile` — small Debian/Alpine base, installs the `claude` CLI via the same `CLAUDE_BIN_PLACEHOLDER` substitution pattern used for the agents at `lib/generators.sh:500-508`, plus the minimum runtime for the chosen backend (Python with `websockets`, or a small Go binary using `net/http` + `gorilla/websocket`). Pick whichever keeps the image smallest; justify in PR description.\n- `docker/chat/entrypoint-chat.sh` — start the backend, log to stdout, exec-replace.\n- `docker/chat/ui/index.html` + committed `static/htmx.min.js` — vanilla HTML + HTMX, single textarea, streaming message log, no SPA build.\n- `docker/chat/server.{py,go}` — routes:\n  - `GET /` → serves `index.html`.\n  - `GET /static/*` → serves assets.\n  - `POST /chat` (HTMX) → spawns `claude --print --output-format stream-json` with the user message, streams response back as HTMX swap fragments.\n  - `GET /ws` → reserved for future streaming upgrade; stub returning 501 for now.\n\n### Files to edit\n\n- `lib/generators.sh` `_generate_compose_impl()` — add a `chat` service block alongside existing `edge`, `forgejo`, `woodpecker`:\n  - Reuse the existing `CLAUDE_BIN_PLACEHOLDER` substitution logic at `lib/generators.sh:500-508` so the chat container gets the same `claude` binary as the agents.\n  - **Do NOT reuse `CLAUDE_SHARED_DIR`** — #707 will give chat its own config dir. 
For now, mount a throwaway named volume so it works locally.\n  - Expose on internal network only; `edge` reaches it at `chat:8080`.\n- `docker/Caddyfile` — replace the #704 `/chat/*` placeholder with `reverse_proxy chat:8080`.\n\n### Out of scope\n\n- Sandbox hardening (read-only rootfs, tmpfs, docker.sock removal) — #706.\n- Claude identity isolation from host `~/.claude` — #707.\n- Any auth — #708.\n- Conversation persistence — #710.\n\n## Acceptance\n\n- [ ] `docker compose up -d chat edge` brings the new container up.\n- [ ] `curl http://edge/chat/` returns the HTML page.\n- [ ] Typing a message in the UI and submitting sends it through HTMX → backend → `claude --print` → streams back into the page.\n- [ ] Killing the chat container does not bring down Caddy; `/chat/*` returns 502 only while down, back to 200 once restarted.\n- [ ] Image size under ~200MB (sanity ceiling; investigate if exceeded).\n\n## Depends on\n\n- #704 — needs the `/chat/*` handler from the subpath routing work.\n\n## Notes\n\n- Claude CLI invocation reference: the agents already do this — see `docker/agents/entrypoint.sh` for the `claude --print` pattern. Do not reinvent flag parsing.\n- HTMX was chosen in #623 deliberately: no npm build chain. Ship the `htmx.min.js` file committed to the repo, not a CDN import.\n- Streaming: `claude --output-format stream-json` emits NDJSON; pipe through a small translator into HTMX `hx-swap-oob` fragments. Existing NDJSON parsing lives in `lib/ci-log-reader.py` as a reference.\n\n## Boundaries for dev-agent\n\n- Do not pick React / Vue / Svelte. Vanilla HTML + HTMX, non-negotiable per #623.\n- Do not invent a Claude wrapper library. Spawn `claude --print` as a subprocess.\n- Do not copy the `~/.claude.json:ro` mount pattern from `lib/generators.sh:328` for this container — that is the exact anti-pattern #707 fixes. 
Use a throwaway volume for now.\n- Do not add auth, rate limits, sandboxing, or persistence here — they are separate chunks.\n- Parent vision: #623.\n\n## Affected files\n\n- `docker/chat/Dockerfile`\n- `docker/chat/entrypoint-chat.sh`\n- `docker/chat/ui/index.html`\n- `docker/chat/server.py`\n- `lib/generators.sh`\n- `docker/Caddyfile`\n\n"
  },
  {
-    "action": "add_label",
-    "issue": 680,
-    "label": "backlog"
+    "action": "edit_body",
+    "issue": 706,
+    "body": "## Goal\n\nLock down the `disinto-chat` container so a compromise of the Claude process cannot escape into the factory host or other containers. Apply the non-negotiable sandbox posture from #623: no docker.sock, read-only rootfs, tmpfs `/tmp`, memory + PID limits, minimal capabilities.\n\n## Why\n\n- Chat runs user-provided prompts against an LLM that is allowed to execute code inside its container. Anything the container can touch, a prompt-injected model can touch. This is a \"defense-in-depth against casual abuse\" line, not enterprise SaaS (#623 security posture).\n- Applied as a distinct chunk so the scaffold (#705) can land and be iterated on before hardening stabilises.\n\n## Scope\n\n### Files to touch\n\n- `lib/generators.sh` chat service block (added in #705):\n  - `read_only: true`\n  - `tmpfs: [/tmp]` with `size=64m`\n  - `security_opt: [no-new-privileges:true]`\n  - `cap_drop: [ALL]` (add back only what claude actually needs; start with nothing and iterate)\n  - `pids_limit: 128`\n  - `mem_limit: 512m`, `memswap_limit: 512m`\n  - **No** `/var/run/docker.sock` mount — verify with `grep docker.sock docker-compose.yml` after render (must return nothing for the chat service).\n  - **No** `${HOME}/.ssh` mount.\n  - **No** secrets bind mounts beyond what #707 defines for `~/.claude-chat/`.\n- `docker/chat/Dockerfile` — non-root user (`USER chat` with fixed uid `10001`), `HEALTHCHECK` that probes the HTTP port.\n- `docker/chat/entrypoint-chat.sh` — fail-fast sanity check on start: if `/var/run/docker.sock` exists or `$(id -u)` is `0`, exit 1 with a clear message.\n\n### Verification helper\n\n- `tools/edge-control/verify-chat-sandbox.sh` — one-shot script that runs against a live compose project and asserts:\n  - `docker inspect disinto-chat` shows `ReadonlyRootfs=true`, `CapAdd=null`, `Pids.Limit=128`.\n  - `docker exec disinto-chat touch /root/x` fails.\n  - `docker exec disinto-chat ls /var/run/docker.sock` fails.\n  - Exit 0 if all pass, 
non-zero otherwise.\n\n## Acceptance\n\n- [ ] `docker compose up -d chat` still yields a working chat container end-to-end.\n- [ ] `tools/edge-control/verify-chat-sandbox.sh` exits 0.\n- [ ] `docker exec disinto-chat cat /etc/shadow` fails (read-only rootfs + unprivileged user).\n- [ ] `docker stats disinto-chat` shows memory bounded at 512MB.\n- [ ] Intentionally leaking a `CAP_SYS_ADMIN` into the service block makes the verify script exit non-zero (negative test).\n\n## Depends on\n\n- #705.\n\n## Notes\n\n- tmpfs size matters: too small and `claude --print` stream buffering may OOM; 64MB is a starting guess, tune during testing.\n- If `claude --print` fails under `no-new-privileges`, the root cause is almost certainly a setuid binary inside the image — do not paper over it by dropping the flag.\n\n## Boundaries for dev-agent\n\n- Do not widen the sandbox to \"make it work\". If claude needs a capability, identify which one, justify in the PR, and add only that one.\n- Do not copy docker-socket mounts from the agents service. The agents service is the exact model to *avoid* for chat.\n- Do not write a new test framework. `verify-chat-sandbox.sh` is a plain bash script with `docker inspect` / `docker exec` + `grep`.\n- Parent vision: #623.\n\n## Affected files\n\n- `lib/generators.sh`\n- `docker/chat/Dockerfile`\n- `docker/chat/entrypoint-chat.sh`\n- `tools/edge-control/verify-chat-sandbox.sh`\n\n"
+  },
+  {
+    "action": "edit_body",
+    "issue": 707,
+    "body": "## Goal\n\nGive `disinto-chat` its own Claude identity mount so its OAuth refresh races cannot corrupt the factory agents' shared `~/.claude` credentials. Default to a separate `~/.claude-chat/` on the host; support `ANTHROPIC_API_KEY` as a fallback that skips OAuth entirely.\n\n## Why\n\n- #623 root-caused this: Claude Code's internal refresh lock in `~/.claude.lock` operates outside bind-mounted directories, so two containers sharing `~/.claude` can race during token refresh and invalidate each other. The factory has already had OAuth expiry incidents traced to multiple agents sharing credentials.\n- Scoping chat to its own identity dir means chat can be logged in as a different Anthropic account, or pinned to an API key, without touching agent credentials.\n\n## Scope\n\n### Files to touch\n\n- `lib/generators.sh` chat service block (from #705):\n  - Replace the throwaway named volume with `${CHAT_CLAUDE_DIR:-${HOME}/.claude-chat}:/home/chat/.claude-chat`.\n  - Env: `CLAUDE_CONFIG_DIR=/home/chat/.claude-chat/config`, `CLAUDE_CREDENTIALS_DIR=/home/chat/.claude-chat/config/credentials`.\n  - Conditional: if `ANTHROPIC_API_KEY` is set in `.env`, pass it through and **do not** mount `~/.claude-chat` at all (no credentials on disk in that mode).\n- `bin/disinto disinto_init()` — after #620's admin password prompt, add an optional prompt: `Use separate Anthropic identity for chat? (y/N)`. On yes, create `~/.claude-chat/` and invoke `claude login` in a subshell with `CLAUDE_CONFIG_DIR=~/.claude-chat/config`.\n- `lib/claude-config.sh` — factor out the existing `~/.claude` setup logic so a non-default `CLAUDE_CONFIG_DIR` is a first-class parameter. 
If it is already parameterised, just document it; if not, extract a helper `setup_claude_dir <dir>` and have the existing path call it with the default dir.\n- `docker/chat/Dockerfile` — declare `VOLUME /home/chat/.claude-chat`, set owner to the non-root chat user introduced in #706.\n\n### Out of scope\n\n- Cross-session lock coherence for multiple concurrent chat containers (single-chat-container assumption is fine for MVP).\n- Anthropic team / workspace support — single identity is enough.\n\n## Acceptance\n\n- [ ] Fresh `disinto init` with \"use separate chat identity\" answered yes creates `~/.claude-chat/` and logs in successfully.\n- [ ] With `ANTHROPIC_API_KEY=sk-ant-...` set in `.env`, chat starts without any `~/.claude-chat` mount (verified via `docker inspect disinto-chat`) and successfully completes a test prompt.\n- [ ] Running the factory agents AND chat simultaneously for 24h does not produce any OAuth refresh failures on either side (manual soak test — document result in PR).\n- [ ] `CLAUDE_CONFIG_DIR` and `CLAUDE_CREDENTIALS_DIR` inside the chat container resolve to `/home/chat/.claude-chat/config*`, not the shared factory path.\n\n## Depends on\n\n- #705 (chat scaffold).\n- #620 (admin password prompt — same init flow this adds a step to).\n\n## Notes\n\n- The factory's existing shared mount is `/var/lib/disinto/claude-shared` (see `lib/generators.sh:113,327,381,426`). Chat must NOT use this path.\n- `flock(\"${HOME}/.claude/session.lock\")` logic mentioned in #623 is load-bearing, not redundant — do not \"simplify\" it.\n- Prefer the API-key path for anyone running the factory on shared hardware; call this out in README updates.\n\n## Boundaries for dev-agent\n\n- Do not try to make chat share `~/.claude` with the agents \"just for convenience\". The whole point of this chunk is the opposite.\n- Do not add a third claude config dir. 
One for agents, one for chat, done.\n- Do not refactor `lib/claude-config.sh` beyond extracting a parameterised helper if needed.\n- Parent vision: #623.\n\n## Affected files\n\n- `lib/generators.sh`\n- `bin/disinto`\n- `lib/claude-config.sh`\n- `docker/chat/Dockerfile`\n\n"
+  },
+  {
+    "action": "edit_body",
+    "issue": 708,
+    "body": "## Goal\n\nGate `/chat/*` behind Forgejo OAuth. Register a second OAuth2 app on the internal Forgejo (alongside the existing Woodpecker one), implement the server-side authorization-code flow in the chat backend, and enforce that the logged-in user is `disinto-admin` (or a member of a configured allowlist).\n\n## Why\n\n- #623: the single `disinto-admin` password (bootstrap secret from #620) is the only auth credential; chat must reuse it via Forgejo OAuth, not invent a second password.\n- Keeps attack surface flat: exactly one identity provider for forge, CI, and chat.\n\n## Scope\n\n### Files to touch\n\n- `lib/ci-setup.sh` — generalise the Woodpecker OAuth-app creation helper `_create_woodpecker_oauth_impl()` (at `lib/ci-setup.sh:96`) into `_create_forgejo_oauth_app <name> <redirect_uri>` that both Woodpecker and chat can call. The existing Woodpecker callsite becomes a thin wrapper; no behaviour change for Woodpecker.\n- `bin/disinto disinto_init()` — after Woodpecker OAuth creation (around `bin/disinto:847`), add a call to create the chat OAuth app with redirect URI `https://${EDGE_TUNNEL_FQDN}/chat/oauth/callback`. 
Write `CHAT_OAUTH_CLIENT_ID` / `CHAT_OAUTH_CLIENT_SECRET` to `.env`.\n- `docker/chat/server.{py,go}` — new routes:\n  - `GET /chat/login` → 302 to Forgejo `/login/oauth/authorize?client_id=...&state=...`.\n  - `GET /chat/oauth/callback` → exchange code for token, fetch `/api/v1/user`, assert `login == \"disinto-admin\"` (or `DISINTO_CHAT_ALLOWED_USERS` CSV if set), set an `HttpOnly` session cookie, 302 to `/chat/`.\n  - Any other route with no valid session → 302 to `/chat/login`.\n  - Session store: in-memory map keyed by random token; TTL 24h; no persistence across container restarts (by design — forces re-auth after deploy).\n- `lib/generators.sh` chat service env: pass `FORGE_URL`, `CHAT_OAUTH_CLIENT_ID`, `CHAT_OAUTH_CLIENT_SECRET`, `EDGE_TUNNEL_FQDN`, `DISINTO_CHAT_ALLOWED_USERS`.\n\n### Out of scope\n\n- Defense-in-depth `Remote-User` header check — #709.\n- Team membership lookup via Forgejo API — start with a plain user allowlist, add teams later.\n- CSRF protection on `POST /chat` beyond the session cookie — add only if soak testing reveals a need.\n\n## Acceptance\n\n- [ ] `disinto init` on a fresh project creates TWO OAuth apps on Forgejo (woodpecker + chat), both visible in Forgejo admin UI.\n- [ ] `curl -c cookies -L http://edge/chat/` follows through the OAuth flow when seeded with `disinto-admin` credentials and lands on the chat UI.\n- [ ] Same flow as a non-admin Forgejo user lands on a \"not authorised\" page with a 403.\n- [ ] Expired / missing session cookie redirects to `/chat/login`.\n- [ ] `DISINTO_CHAT_ALLOWED_USERS=alice,bob` permits those users in addition to `disinto-admin`.\n\n## Depends on\n\n- #705 (chat scaffold).\n- #620 (admin password prompt — the password this auth leans on).\n\n## Notes\n\n- The existing Woodpecker OAuth exchange at `lib/ci-setup.sh:273-294` is the reference for the server-side code→token exchange. 
Read it before implementing; do not guess the Forgejo OAuth flow from docs.\n- Forgejo OAuth2 endpoints are `/login/oauth/authorize` and `/login/oauth/access_token`. Do not hit `/api/v1/...` for OAuth — that is the REST API, not the OAuth endpoints.\n- Session cookie must be `Secure; HttpOnly; SameSite=Lax` in prod; permit non-secure in local dev when `EDGE_TUNNEL_FQDN` is unset.\n\n## Boundaries for dev-agent\n\n- Do not add a new OAuth library dependency if the standard library of the chosen backend language has HTTP client + JSON — the flow is two HTTP calls.\n- Do not reuse the Woodpecker OAuth app. Create a second one with a distinct name and redirect URI. They are different principals.\n- Do not persist sessions to disk. In-memory is correct for MVP; persistence is a separate conversation.\n- Do not implement team membership this chunk. Static allowlist first; teams later.\n- Parent vision: #623.\n\n## Affected files\n\n- `lib/ci-setup.sh`\n- `bin/disinto`\n- `docker/chat/server.py`\n- `lib/generators.sh`\n\n"
+  },
+  {
+    "action": "edit_body",
+    "issue": 709,
+    "body": "## Goal\n\nAdd a second, independent auth check on top of #708: Caddy injects an `X-Forwarded-User` header from the validated Forgejo session, and the chat backend refuses any request whose session cookie disagrees with the header. This is the belt to #708's braces.\n\n## Why\n\n- #623 explicitly calls this out as defense-in-depth. If the chat backend session logic has a bug (forged cookie, state confusion), a correctly-configured Caddy `forward_auth` layer catches it — and vice versa.\n- Cheap to add on top of #704 and #708; expensive to bolt on after an incident.\n\n## Scope\n\n### Files to touch\n\n- `docker/Caddyfile` — the `/chat/*` block:\n  - Add `forward_auth chat:8080 { uri /chat/auth/verify; copy_headers X-Forwarded-User }`.\n  - Requests without a valid session are forwarded to `/chat/login` by chat itself; `forward_auth` just stamps the header when there is one.\n- `docker/chat/server.{py,go}`:\n  - New route `GET /chat/auth/verify` — reads the session cookie, returns 200 + `X-Forwarded-User: <login>` if valid, 401 otherwise.\n  - On `POST /chat` and other authenticated routes: read `X-Forwarded-User`, read the session cookie, assert both resolve to the same user. On mismatch: log a warning with the request ID and return 403.\n\n### Out of scope\n\n- Rewriting the session store. 
The verify endpoint reads the same in-memory map #708 introduced.\n\n## Acceptance\n\n- [ ] `curl http://edge/chat/` with a valid session cookie still works; chat backend logs show `X-Forwarded-User` matching the cookie user.\n- [ ] Editing the session cookie client-side to impersonate another user while keeping the forged cookie valid triggers a 403 with a clear log line (simulate by swapping cookies mid-session).\n- [ ] Removing the `forward_auth` block from Caddyfile and restarting causes the chat backend to fail-closed (all authenticated routes 403) — documented as the intended failure mode.\n- [ ] The verify endpoint does not accept arbitrary external requests from outside Caddy: the chat backend rejects calls to `/chat/auth/verify` that lack a shared-secret header (or whose origin IP is not the edge container).\n\n## Depends on\n\n- #704 (Caddy subpath routing).\n- #708 (chat OAuth gate — provides the session store this chunk reads).\n\n## Notes\n\n- Caddy `forward_auth` reference: https://caddyserver.com/docs/caddyfile/directives/forward_auth — stick to the documented directives, do not hand-roll header passing.\n- If network-level origin validation on `/chat/auth/verify` is fiddly, a shared-secret header between Caddy and chat is acceptable — but prefer network-level if possible.\n\n## Boundaries for dev-agent\n\n- Do not replace the #708 session store with something new. Read it, do not rewrite it.\n- Do not push the entire auth decision into Caddy. The chat backend is still the source of truth; Caddy adds a redundant check.\n- Parent vision: #623.\n\n## Affected files\n\n- `docker/Caddyfile`\n- `docker/chat/server.py`\n\n"
+  },
+  {
+    "action": "edit_body",
+    "issue": 710,
+    "body": "## Goal\n\nDecide and implement a conversation history persistence model for `disinto-chat`. MVP target: append-only per-user NDJSON files on a bind-mounted host volume, one file per conversation, with a simple history list endpoint and sidebar in the UI.\n\n## Why\n\n- Without history, every page refresh loses context. Claude is stateless per invocation; the chat UI is what makes it feel like a conversation.\n- A full database with search is overkill for a personal / small-team factory (#623 security posture). Flat files are enough and recoverable by `cat`.\n\n## Scope\n\n### Files to touch\n\n- `lib/generators.sh` chat service:\n  - Add a writable bind mount `${CHAT_HISTORY_DIR:-./state/chat-history}:/var/lib/chat/history` (one per-project host path; compose already pins the project root).\n  - Must coexist with #706's read-only rootfs (this is a separate mount, not part of rootfs — sanity-check the sandbox verify script still passes).\n- `docker/chat/server.{py,go}`:\n  - On each `POST /chat`, append one NDJSON line `{ts, user, role, content}` to `/var/lib/chat/history/<user>/<conversation_id>.ndjson`.\n  - `GET /chat/history` → returns the list of conversation ids and first-message previews for the logged-in user.\n  - `GET /chat/history/<id>` → returns the full conversation for the logged-in user; 404 if the file belongs to another user.\n  - New conversation: `POST /chat/new` → generates a fresh conversation_id (random 12-char hex) and returns it.\n  - UI: sidebar with conversation list, \"new chat\" button, load history into the log on click.\n- File naming: `<user>/<conversation_id>.ndjson` — user-scoped directory prevents cross-user leakage even if a bug leaks ids. 
`conversation_id` must match `^[0-9a-f]{12}$`, no slashes allowed.\n\n### Out of scope\n\n- Full-text search.\n- Database / SQLite.\n- History retention / rotation — unbounded for now.\n\n### In scope explicitly\n\n- Replaying prior turns back into the `claude --print` subprocess for follow-up turns: the backend must feed the prior NDJSON lines back into claude via whatever convention the agent code uses. Cross-check `docker/agents/entrypoint.sh` for how agents pass conversation state.\n\n## Acceptance\n\n- [ ] Sending 3 messages, refreshing the page, and clicking the conversation in the sidebar re-loads all 3 messages.\n- [ ] A new conversation starts with an empty context and does not see prior messages.\n- [ ] `ls state/chat-history/disinto-admin/` on the host shows one NDJSON file per conversation, each line is valid JSON.\n- [ ] A second user logging in via the #708 allowlist sees only their own conversations.\n- [ ] History endpoints are blocked for unauthenticated requests (inherits #708 / #709 auth).\n\n## Depends on\n\n- #705 (chat scaffold).\n\n## Notes\n\n- NDJSON, not JSON-array: append is O(1) and partial writes never corrupt prior lines. Mirrors the factory's CI log format at `lib/ci-log-reader.py`.\n- Per-user directory, not a single shared dir — path traversal via a crafted `conversation_id` is the main risk. The strict regex above is the mitigation.\n\n## Boundaries for dev-agent\n\n- Do not add SQLite, Postgres, or any database. Files.\n- Do not invent a conversation replay system. Whatever `claude --print` / the agents already do for context is the baseline — match it.\n- Do not store history inside the container's tmpfs — it has to survive container restarts.\n- Parent vision: #623.\n\n## Affected files\n\n- `lib/generators.sh`\n- `docker/chat/server.py`\n\n"
+  },
+  {
+    "action": "edit_body",
+    "issue": 711,
+    "body": "## Goal\n\nAdd per-user cost and request caps to `disinto-chat` so a compromised session (or a wedged browser tab firing requests in a loop) cannot run up an unbounded Anthropic bill or starve the agents' token budget.\n\n## Why\n\n- #623 \"Open questions\" explicitly calls this out. Chat is the only user-facing surface that spawns Claude on demand; no other factory surface does.\n- Cheap to enforce (counter + bash-style dict), expensive to forget.\n\n## Scope\n\n### Files to touch\n\n- `docker/chat/server.{py,go}`:\n  - Per-user sliding-window request counter: `CHAT_MAX_REQUESTS_PER_HOUR` (default `60`), `CHAT_MAX_REQUESTS_PER_DAY` (default `500`).\n  - Per-user token-cost counter: after each `claude --print`, parse the final `usage` event from `--output-format stream-json` if present; track cumulative tokens per day; reject if over `CHAT_MAX_TOKENS_PER_DAY` (default `1000000`).\n  - Counters stored in-memory; reset on container restart (acceptable for MVP; file-based persistence is a follow-up).\n  - Rejection response: 429 with `Retry-After` header and a friendly HTMX fragment explaining which cap was hit.\n- `lib/generators.sh` chat env: expose the three caps as overridable env vars with sane defaults baked in.\n\n### Out of scope\n\n- Billing dashboard.\n- Cross-container token budget coordination with the agents.\n- Cost tracking via Anthropic's billing API (not stable enough to depend on).\n\n## Acceptance\n\n- [ ] Sending 61 requests in an hour trips the hourly cap and returns 429 with `Retry-After: <seconds>`.\n- [ ] A single large completion that pushes daily tokens over the cap blocks the *next* request, not the current one (atomic check-then-consume is OK to skip for MVP).\n- [ ] Resetting the container clears counters (verified manually).\n- [ ] Caps are configurable via `.env` without rebuilding the image.\n\n## Depends on\n\n- #705 (chat scaffold).\n\n## Notes\n\n- Token accounting from `claude --print`: the stream-json mode emits a 
final `usage` event. If that event is absent or its format changes, fall back to a coarse request count only — do not block the user on parsing failures.\n- `Retry-After` must be an integer number of seconds, not an HTTP-date, for HTMX to handle it cleanly client-side.\n\n## Boundaries for dev-agent\n\n- Do not add a rate-limiting library. A dict + timestamp list is sufficient for three counters.\n- Do not persist counters to disk in this chunk. In-memory is the contract.\n- Do not block requests on Anthropic's own rate limiter. That is retried by `claude` itself; this layer is about *cost*, not throttling.\n- Parent vision: #623.\n\n## Affected files\n\n- `docker/chat/server.py`\n- `lib/generators.sh`\n\n"
+  },
+  {
+    "action": "edit_body",
+    "issue": 712,
+    "body": "## Goal\n\nLet `disinto-chat` perform scoped write actions against the factory — specifically: trigger a Woodpecker CI run, create a Forgejo issue, create a Forgejo PR — via explicit backend endpoints. The UI surfaces these as buttons the user clicks from a chat turn that proposes an action. The model never holds API tokens directly.\n\n## Why\n\n- #623 lists these escalations as the difference between \"chat that talks about the project\" and \"chat that moves the project forward\".\n- Routing through explicit backend endpoints (instead of giving the sandboxed claude process API tokens) keeps the trust model tight: the *user* authorises each action, not the model.\n\n## Scope\n\n### Files to touch\n\n- `docker/chat/server.{py,go}` — new authenticated endpoints (reuse #708 / #709 session check):\n  - `POST /chat/action/ci-run` — body `{repo, branch}` → calls Woodpecker API with `WOODPECKER_TOKEN` (already in `.env` from existing factory setup) to trigger a pipeline.\n  - `POST /chat/action/issue-create` — body `{title, body, labels}` → calls Forgejo API `/repos/<owner>/<repo>/issues` with `FORGE_TOKEN`.\n  - `POST /chat/action/pr-create` — body `{head, base, title, body}` → calls `/repos/<owner>/<repo>/pulls`.\n  - All actions record to #710's NDJSON history as `{role: \"action\", ...}` lines.\n- `docker/chat/ui/index.html` — small HTMX pattern: when claude's response contains a marker like `<action type=\"issue-create\">{...}</action>`, render a clickable button below the message; clicking POSTs to `/chat/action/<type>` with the payload.\n- `lib/generators.sh` chat env: pass `WOODPECKER_TOKEN`, `FORGE_TOKEN`, `FORGE_URL`, `FORGE_OWNER`, `FORGE_REPO`.\n\n### Out of scope\n\n- Destructive actions (branch delete, force push, secret rotation) — deliberately excluded.\n- Multi-step workflows / approval chains.\n- Arbitrary code execution in the chat container (that is what the agents exist for).\n\n## Acceptance\n\n- [ ] A chat turn that emits an `<action 
type=\"issue-create\">{...}</action>` block renders a button; clicking it creates an issue on Forgejo, visible via the API.\n- [ ] CI-trigger action creates a Woodpecker pipeline that can be seen in the CI UI.\n- [ ] PR-create action produces a Forgejo PR with the specified head / base.\n- [ ] All three actions are logged into the #710 history file with role `action` and the response from the API call.\n- [ ] Unauthenticated requests to `/chat/action/*` return 401 (inherits #708 gate).\n\n## Depends on\n\n- #708 (OAuth gate — actions are authorised by the logged-in user).\n- #710 (history — actions need to be logged alongside chat turns).\n\n## Notes\n\n- Forgejo API auth: the factory's `FORGE_TOKEN` is a long-lived admin token. For MVP, reuse it; a follow-up issue can scope it down to per-user Forgejo tokens derived from the OAuth flow.\n- Woodpecker API is at `http://woodpecker:8000/api/...`, reachable via the compose network — no need to go through the edge container.\n- The `<action>` marker is deliberately simple markup the model can emit in its response text. Do not implement a tool-calling protocol; do not spin up an MCP server.\n\n## Boundaries for dev-agent\n\n- Do not give the claude subprocess direct API tokens. The chat backend holds them; the model only emits action markers the user clicks.\n- Do not add destructive actions (delete, force-push). Additive only.\n- Do not invent a new markup format beyond `<action type=\"...\">{JSON}</action>`.\n- Parent vision: #623.\n\n## Affected files\n\n- `docker/chat/server.py`\n- `docker/chat/ui/index.html`\n- `lib/generators.sh`\n\n"
+  },
+  {
+    "action": "edit_body",
+    "issue": 713,
+    "body": "## Goal\n\nContingency track: if the subpath routing + Forgejo OAuth combination from #704 and #708 proves unworkable (redirect loops, Forgejo `ROOT_URL` quirks, etc.), provide a documented fallback using per-service subdomains (`forge.<project>.disinto.ai`, `ci.<project>.disinto.ai`, `chat.<project>.disinto.ai`) under the same wildcard cert.\n\n## Why\n\n- #623 Scope Highlights mentions this as the fallback if subpath OAuth fails.\n- Documenting the fallback up front means we can pivot without a days-long investigation when the subpath approach hits a wall.\n- The wildcard cert from #621 already covers `*.disinto.ai` at no extra cost.\n\n## Scope\n\nThis issue is a **plan + small toggle**, not a full implementation. Implementation only happens if #704 or #708 get stuck.\n\n### Files to touch\n\n- `docs/edge-routing-fallback.md` (new) — documents the fallback topology, diffing concretely against #704 / #708:\n  - Caddyfile: four separate host blocks (`<project>.disinto.ai`, `forge.<project>...`, `ci.<project>...`, `chat.<project>...`), each a single `reverse_proxy` to the container.\n  - Forgejo `ROOT_URL` becomes `https://forge.<project>.disinto.ai/` (root path, not subpath).\n  - Woodpecker `WOODPECKER_HOST` becomes `https://ci.<project>.disinto.ai`.\n  - OAuth redirect URIs (chat, woodpecker) become sub-subdomain paths.\n  - DNS: all handled by the existing wildcard; no new records.\n- `lib/generators.sh` — no code change until pivot; document the env vars that would need to change (`EDGE_TUNNEL_FQDN_FORGE`, etc.) 
in a comment near `generate_compose`.\n- `tools/edge-control/register.sh` (from #621) — leave a TODO comment noting the fallback shape would need an additional subdomain parameter per project.\n\n### Out of scope (unless pivot)\n\n- Actually implementing the fallback — gated on #704 / #708 failing.\n\n## Acceptance\n\n- [ ] `docs/edge-routing-fallback.md` exists and is concrete enough that a follow-up PR to pivot would take under a day.\n- [ ] The doc names exactly which files / lines each pivot would touch (Caddyfile, `lib/generators.sh`, `lib/ci-setup.sh` redirect URI).\n- [ ] A pivot decision criterion is written into the doc: \"pivot if <specific symptom>, not if <symptom with a known fix>\".\n\n## Depends on\n\n- None — can be written in parallel to #704 / #708.\n\n## Notes\n\n- Keep the doc short. This is a pressure-release valve, not a parallel architecture.\n- Whichever chunk is implementing subpaths first should update this doc if they hit a blocker so the pivot decision is informed.\n\n## Boundaries for dev-agent\n\n- This is a documentation chunk. Do not implement the fallback unless someone explicitly says to pivot.\n- Do not make the main chunks \"fallback-ready\" — that is over-engineering for a contingency.\n- Parent vision: #623.\n\n## Affected files\n\n- `docs/edge-routing-fallback.md`\n- `lib/generators.sh`\n- `tools/edge-control/register.sh`\n\n"
  }
 ]
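
The per-user request cap described in the issue 711 body above ("a dict + timestamp list") could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in `docker/chat/server.py`: the class name `SlidingWindowLimiter`, its `check` method, and the injectable `clock` parameter are all hypothetical. Only the default of 60 requests per hour and the integer `Retry-After` value come from the issue text.

```python
import math
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Hypothetical in-memory, per-user sliding-window request cap."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the window can be tested without sleeping
        self._hits = defaultdict(deque)  # user -> timestamps of accepted requests

    def check(self, user):
        """Return (allowed, retry_after_seconds) and record the hit if allowed."""
        now = self.clock()
        hits = self._hits[user]
        # Drop timestamps that have slid out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            # The oldest hit leaves the window at hits[0] + window_seconds.
            # Round up so Retry-After is a whole number of seconds.
            retry_after = max(1, math.ceil(hits[0] + self.window_seconds - now))
            return False, retry_after
        hits.append(now)
        return True, 0
```

A server would presumably build one limiter per cap (for example from `CHAT_MAX_REQUESTS_PER_HOUR` with a 3600-second window) and, on a rejected check, answer 429 with the returned value in the `Retry-After` header. Because the state lives in a plain dict, it is lost on container restart, which matches the in-memory contract stated in the issue.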
--- a/lib/AGENTS.md
+++ b/lib/AGENTS.md
@ -1,4 +1,4 @@
-<!-- last-reviewed: 31f2cb7bfa38df3db8fbed28ec0899c412f06c49 -->
+<!-- last-reviewed: c51cc9dba649ed543b910b561231a5c8bd2130bc -->
 # Shared Helpers (`lib/`)

 All agents source `lib/env.sh` as their first action. Additional helpers are
@ -12,7 +12,7 @@ sourced as needed.
 | `lib/ci-log-reader.py` | Python tool: reads CI logs from Woodpecker SQLite database. `<pipeline_number> [--step <name>]` — returns last 200 lines from failed steps (or specified step). Used by `ci_get_logs()` in ci-helpers.sh. Requires `WOODPECKER_DATA_DIR` (default: /woodpecker-data). | ci-helpers.sh |
 | `lib/load-project.sh` | Parses a `projects/*.toml` file into env vars (`PROJECT_NAME`, `FORGE_REPO`, `WOODPECKER_REPO_ID`, monitoring toggles, mirror config, etc.). Also exports `FORGE_REPO_OWNER` (the owner component of `FORGE_REPO`, e.g. `disinto-admin` from `disinto-admin/disinto`). Reads `repo_root` and `ops_repo_root` from the TOML for host-CLI callers. **Container path handling (#674)**: no longer derives `PROJECT_REPO_ROOT` or `OPS_REPO_ROOT` inside the script — container entrypoints export the correct paths before agent scripts source `env.sh`, and the `DISINTO_CONTAINER` guard (line 90) skips TOML overrides when those vars are already set. | env.sh (when `PROJECT_TOML` is set) |
 | `lib/parse-deps.sh` | Extracts dependency issue numbers from an issue body (stdin → stdout, one number per line). Matches `## Dependencies` / `## Depends on` / `## Blocked by` sections and inline `depends on #N` / `blocked by #N` patterns. Inline scan skips fenced code blocks to prevent false positives from code examples in issue bodies. Not sourced — executed via `bash lib/parse-deps.sh`. | dev-poll |
-| `lib/formula-session.sh` | `acquire_run_lock()`, `load_formula()`, `load_formula_or_profile()`, `build_context_block()`, `ensure_ops_repo()`, `ops_commit_and_push()`, `build_prompt_footer()`, `build_sdk_prompt_footer()`, `formula_worktree_setup()`, `formula_prepare_profile_context()`, `formula_lessons_block()`, `profile_write_journal()`, `profile_load_lessons()`, `ensure_profile_repo()`, `_profile_has_repo()`, `_count_undigested_journals()`, `_profile_digest_journals()`, `_profile_commit_and_push()`, `resolve_agent_identity()`, `build_graph_section()`, `build_scratch_instruction()`, `read_scratch_context()`, `cleanup_stale_crashed_worktrees()` — shared helpers for formula-driven polling-loop agents (lock, .profile repo management, prompt assembly, worktree setup). Memory guard is provided by `memory_guard()` in `lib/env.sh` (not duplicated here). `resolve_agent_identity()` — sets `FORGE_TOKEN`, `AGENT_IDENTITY`, `FORGE_REMOTE` from per-agent token env vars and FORGE_URL remote detection. `build_graph_section()` generates the structural-analysis section (runs `lib/build-graph.py`, formats JSON output) — previously duplicated in planner-run.sh and predictor-run.sh, now shared here. `cleanup_stale_crashed_worktrees()` — thin wrapper around `worktree_cleanup_stale()` from `lib/worktree.sh` (kept for backwards compatibility). | planner-run.sh, predictor-run.sh, gardener-run.sh, supervisor-run.sh, dev-agent.sh |
+| `lib/formula-session.sh` | `acquire_run_lock()`, `load_formula()`, `load_formula_or_profile()`, `build_context_block()`, `ensure_ops_repo()`, `ops_commit_and_push()`, `build_prompt_footer()`, `build_sdk_prompt_footer()`, `formula_worktree_setup()`, `formula_prepare_profile_context()`, `formula_lessons_block()`, `profile_write_journal()`, `profile_load_lessons()`, `ensure_profile_repo()`, `_profile_has_repo()`, `_count_undigested_journals()`, `_profile_digest_journals()`, `_profile_restore_lessons()`, `_profile_commit_and_push()`, `resolve_agent_identity()`, `build_graph_section()`, `build_scratch_instruction()`, `read_scratch_context()`, `cleanup_stale_crashed_worktrees()` — shared helpers for formula-driven polling-loop agents (lock, .profile repo management, prompt assembly, worktree setup). Memory guard is provided by `memory_guard()` in `lib/env.sh` (not duplicated here). `resolve_agent_identity()` — sets `FORGE_TOKEN`, `AGENT_IDENTITY`, `FORGE_REMOTE` from per-agent token env vars and FORGE_URL remote detection. `build_graph_section()` generates the structural-analysis section (runs `lib/build-graph.py`, formats JSON output) — previously duplicated in planner-run.sh and predictor-run.sh, now shared here. `cleanup_stale_crashed_worktrees()` — thin wrapper around `worktree_cleanup_stale()` from `lib/worktree.sh` (kept for backwards compatibility). **Journal digestion guards (#702)**: `_profile_digest_journals()` respects `PROFILE_DIGEST_TIMEOUT` (default 300s) and `PROFILE_DIGEST_MAX_BATCH` (default 5 journals per run); `_profile_restore_lessons()` restores the previous lessons-learned.md on digest failure. | planner-run.sh, predictor-run.sh, gardener-run.sh, supervisor-run.sh, dev-agent.sh |
 | `lib/guard.sh` | `check_active(agent_name)` — reads `$FACTORY_ROOT/state/.{agent_name}-active`; exits 0 (skip) if the file is absent. Factory is off by default — state files must be created to enable each agent. **Logs a message to stderr** when skipping (`[check_active] SKIP: state file not found`), so agent dropout is visible in loop logs. Sourced by dev-poll.sh, review-poll.sh, predictor-run.sh, supervisor-run.sh. | polling-loop entry points |
 | `lib/mirrors.sh` | `mirror_push()` — pushes `$PRIMARY_BRANCH` + tags to all configured mirror remotes (fire-and-forget background pushes). Reads `MIRROR_NAMES` and `MIRROR_*` vars exported by `load-project.sh` from the `[mirrors]` TOML section. Failures are logged but never block the pipeline. Sourced by dev-poll.sh — called after every successful merge. | dev-poll.sh |
 | `lib/build-graph.py` | Python tool: parses VISION.md, prerequisites.md (from ops repo), AGENTS.md, formulas/*.toml, evidence/ (from ops repo), and forge issues/labels into a NetworkX DiGraph. Runs structural analyses (orphaned objectives, stale prerequisites, thin evidence, circular deps) and outputs a JSON report. Used by `review-pr.sh` (per-PR changed-file analysis) and `predictor-run.sh` (full-project analysis) to provide structural context to Claude. | review-pr.sh, predictor-run.sh |
@ -30,6 +30,6 @@ sourced as needed.
 | `lib/git-creds.sh` | Shared git credential helper configuration. `configure_git_creds([HOME_DIR] [RUN_AS_CMD])` — writes a static credential helper script and configures git globally to use password-based HTTP auth (Forgejo 11.x rejects API tokens for `git push`, #361). `repair_baked_cred_urls([--as RUN_AS_CMD] DIR ...)` — rewrites any git remote URLs that have credentials baked in to use clean URLs instead; uses `safe.directory` bypass for root-owned repos (#671). Requires `FORGE_PASS`, `FORGE_URL`, `FORGE_TOKEN`. | entrypoints (agents, edge) |
 | `lib/ops-setup.sh` | `setup_ops_repo()` — creates ops repo on Forgejo if it doesn't exist, configures bot collaborators, clones/initializes ops repo locally, seeds directory structure (vault, knowledge, evidence, sprints). Evidence subdirectories seeded: engagement/, red-team/, holdout/, evolution/, user-test/. Also seeds sprints/ for architect output. Exports `_ACTUAL_OPS_SLUG`. `migrate_ops_repo(ops_root, [primary_branch])` — idempotent migration helper that seeds missing directories and .gitkeep files on existing ops repos (pre-#407 deployments). | bin/disinto (init) |
 | `lib/ci-setup.sh` | `_install_cron_impl()` — installs crontab entries for bare-metal deployments (compose mode uses polling loop instead). `_create_woodpecker_oauth_impl()` — creates OAuth2 app on Forgejo for Woodpecker. `_generate_woodpecker_token_impl()` — auto-generates WOODPECKER_TOKEN via OAuth2 flow. `_activate_woodpecker_repo_impl()` — activates repo in Woodpecker. All gated by `_load_ci_context()` which validates required env vars. | bin/disinto (init) |
-| `lib/generators.sh` | Template generation for `disinto init`: `generate_compose()` — docker-compose.yml (uses `codeberg.org/forgejo/forgejo:11.0` tag; adds `security_opt: [apparmor:unconfined]` to all services for rootless container compatibility), `generate_caddyfile()` — Caddyfile, `generate_staging_index()` — staging index, `generate_deploy_pipelines()` — Woodpecker deployment pipeline configs. Requires `FACTORY_ROOT`, `PROJECT_NAME`, `PRIMARY_BRANCH`. | bin/disinto (init) |
+| `lib/generators.sh` | Template generation for `disinto init`: `generate_compose()` — docker-compose.yml (uses `codeberg.org/forgejo/forgejo:11.0` tag; adds `security_opt: [apparmor:unconfined]` to all services for rootless container compatibility; Forgejo includes a healthcheck so dependent services use `condition: service_healthy` — fixes cold-start races, #665), `generate_caddyfile()` — Caddyfile, `generate_staging_index()` — staging index, `generate_deploy_pipelines()` — Woodpecker deployment pipeline configs. Requires `FACTORY_ROOT`, `PROJECT_NAME`, `PRIMARY_BRANCH`. | bin/disinto (init) |
 | `lib/hire-agent.sh` | `disinto_hire_an_agent()` — user creation, `.profile` repo setup, formula copying, branch protection, and state marker creation for hiring a new agent. Requires `FORGE_URL`, `FORGE_TOKEN`, `FACTORY_ROOT`, `PROJECT_NAME`. Extracted from `bin/disinto`. | bin/disinto (hire) |
 | `lib/release.sh` | `disinto_release()` — vault TOML creation, branch setup on ops repo, PR creation, and auto-merge request for a versioned release. `_assert_release_globals()` validates required env vars. Requires `FORGE_URL`, `FORGE_TOKEN`, `FORGE_OPS_REPO`, `FACTORY_ROOT`, `PRIMARY_BRANCH`. Extracted from `bin/disinto`. | bin/disinto (release) |
--- a/planner/AGENTS.md
+++ b/planner/AGENTS.md
@ -1,4 +1,4 @@
-<!-- last-reviewed: 31f2cb7bfa38df3db8fbed28ec0899c412f06c49 -->
+<!-- last-reviewed: c51cc9dba649ed543b910b561231a5c8bd2130bc -->
 # Planner Agent

 **Role**: Strategic planning using a Prerequisite Tree (Theory of Constraints),
--- a/predictor/AGENTS.md
+++ b/predictor/AGENTS.md
@ -1,4 +1,4 @@
-<!-- last-reviewed: 31f2cb7bfa38df3db8fbed28ec0899c412f06c49 -->
+<!-- last-reviewed: c51cc9dba649ed543b910b561231a5c8bd2130bc -->
 # Predictor Agent

 **Role**: Abstract adversary (the "goblin"). Runs a 2-step formula
--- a/review/AGENTS.md
+++ b/review/AGENTS.md
@ -1,4 +1,4 @@
-<!-- last-reviewed: 31f2cb7bfa38df3db8fbed28ec0899c412f06c49 -->
+<!-- last-reviewed: c51cc9dba649ed543b910b561231a5c8bd2130bc -->
 # Review Agent

 **Role**: AI-powered PR review — post structured findings and formal
--- a/supervisor/AGENTS.md
+++ b/supervisor/AGENTS.md
@ -1,4 +1,4 @@
-<!-- last-reviewed: 31f2cb7bfa38df3db8fbed28ec0899c412f06c49 -->
+<!-- last-reviewed: c51cc9dba649ed543b910b561231a5c8bd2130bc -->
 # Supervisor Agent

 **Role**: Health monitoring and auto-remediation, executed as a formula-driven