From c236350e00308b64416db758c924e7b4089a0be4 Mon Sep 17 00:00:00 2001
From: Claude
Date: Thu, 16 Apr 2026 02:15:38 +0000
Subject: [PATCH] chore: gardener housekeeping 2026-04-16

- Bump AGENTS.md watermarks to HEAD (c363ee0) across all 9 per-directory files
- supervisor/AGENTS.md: document dual-container trigger (agents + edge) and SUPERVISOR_INTERVAL env var added by P1/#801
- lib/AGENTS.md: document agents-llama-all compose service (all 7 roles) added to generators.sh by P1/#801
- pending-actions.json: comment #623 (all deps now closed, ready for planner decomposition), comment #758 (needs human Forgejo admin action to unblock ops repo writes)
---
 AGENTS.md                     |  2 +-
 architect/AGENTS.md           |  2 +-
 dev/AGENTS.md                 |  2 +-
 gardener/AGENTS.md            |  2 +-
 gardener/pending-actions.json | 60 +++--------------------------------
 lib/AGENTS.md                 |  4 +--
 planner/AGENTS.md             |  2 +-
 predictor/AGENTS.md           |  2 +-
 review/AGENTS.md              |  2 +-
 supervisor/AGENTS.md          | 15 ++++-----
 10 files changed, 21 insertions(+), 72 deletions(-)

diff --git a/AGENTS.md b/AGENTS.md
index 735879f..c893b09 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -1,4 +1,4 @@
-
+
 # Disinto — Agent Instructions
 
 ## What this repo is
diff --git a/architect/AGENTS.md b/architect/AGENTS.md
index 3c5c26c..deee9cf 100644
--- a/architect/AGENTS.md
+++ b/architect/AGENTS.md
@@ -1,4 +1,4 @@
-
+
 # Architect — Agent Instructions
 
 ## What this agent is
diff --git a/dev/AGENTS.md b/dev/AGENTS.md
index 7f60a8a..4148f46 100644
--- a/dev/AGENTS.md
+++ b/dev/AGENTS.md
@@ -1,4 +1,4 @@
-
+
 # Dev Agent
 
 **Role**: Implement issues autonomously — write code, push branches, address
diff --git a/gardener/AGENTS.md b/gardener/AGENTS.md
index b177774..1a2e08e 100644
--- a/gardener/AGENTS.md
+++ b/gardener/AGENTS.md
@@ -1,4 +1,4 @@
-
+
 # Gardener Agent
 
 **Role**: Backlog grooming — detect duplicate issues, missing acceptance
diff --git a/gardener/pending-actions.json b/gardener/pending-actions.json
index e619a80..2c4c30f 100644
--- a/gardener/pending-actions.json
+++ b/gardener/pending-actions.json
@@ -1,62 +1,12 @@
 [
   {
-    "action": "edit_body",
-    "issue": 784,
-    "body": "Flagged by AI reviewer in PR #783.\n\n## Problem\n\n`_regen_file()` (added in PR #783, `bin/disinto` ~line 1424) moves the existing target file to a temp stash before calling the generator:\n\n```bash\nmv \"$target\" \"$stashed\"\n\"$generator\" \"$@\"\n```\n\nThe script runs under `set -euo pipefail`. If the generator exits non-zero, bash exits immediately and the original file remains stranded at `${target}.stash.XXXXXX` (never restored). The target file no longer exists, and `docker compose up` is never reached. Recovery requires the operator to manually locate and rename the hidden stash file.\n\n## Fix\n\nAdd an ERR trap inside `_regen_file` to restore the stash on failure, e.g.:\n```bash\n\"$generator\" \"$@\" || { mv \"$stashed\" \"$target\"; return 1; }\n```\n\n---\n*Auto-created from AI review*\n\n## Acceptance criteria\n\n- [ ] If the generator exits non-zero, the original target file is restored from the stash (not stranded at the temp path)\n- [ ] `_regen_file` still removes the stash file after a successful generator run\n- [ ] `docker compose up` is reached when the generator succeeds\n- [ ] ShellCheck passes on `bin/disinto`\n\n## Affected files\n\n- `bin/disinto` — `_regen_file()` function (~line 1424)\n"
-  },
-  {
-    "action": "add_label",
-    "issue": 784,
-    "label": "backlog"
-  },
-  {
-    "action": "remove_label",
-    "issue": 773,
-    "label": "blocked"
-  },
-  {
-    "action": "add_label",
-    "issue": 773,
-    "label": "backlog"
+    "action": "comment",
+    "issue": 623,
+    "body": "**Dependency check:** All blocking dependencies are now closed:\n- #620 ✓ closed\n- #621 ✓ closed \n- #622 ✓ closed\n\nPer the issue description: *\"Once #620/#621/#622 are green, this issue should fork into at least three backlog children: subpath routing + Forgejo ROOT_URL / Woodpecker HOST, disinto-chat container scaffold with OAuth gate, and Claude Code sandbox envelope + working-dir scoping.\"*\n\nThis vision issue is ready for the planner to decompose into backlog children."
   },
   {
     "action": "comment",
-    "issue": 772,
-    "body": "All child issues have been resolved:\n- #768 (edge restart policy) — closed\n- #769 (agents-llama generator service) — closed\n- #770 (disinto up regenerate) — closed\n- #771 (deprecate docker/Caddyfile) — closed\n\nClosing tracker as all decomposed work is complete."
-  },
-  {
-    "action": "close",
-    "issue": 772,
-    "reason": "all child issues 768-771 closed"
-  },
-  {
-    "action": "edit_body",
-    "issue": 778,
-    "body": "## Problem\n\n`formulas/rent-a-human-caddy-ssh.toml` step 3 tells the operator:\n\n```\necho \"CADDY_SSH_KEY=$(base64 -w0 caddy-collect)\" >> .env.vault.enc\n```\n\n**You cannot append plaintext to a sops-encrypted file.** The append silently corrupts `.env.vault.enc` — subsequent `sops -d` fails, all vault secrets become unrecoverable. Any operator who followed the docs verbatim has broken their vault.\n\nSteps 4 (`CADDY_HOST`) and 5 (`CADDY_ACCESS_LOG`) have the same bug.\n\n## Proposed fix\n\nRewrite the `>>` steps to use the stdin-piped `disinto secrets add` (from issue A):\n\n```\ncat caddy-collect | disinto secrets add CADDY_SSH_KEY\necho '159.89.14.107' | disinto secrets add CADDY_SSH_HOST\necho 'debian' | disinto secrets add CADDY_SSH_USER\necho '/var/log/caddy/access.log' | disinto secrets add CADDY_ACCESS_LOG\n```\n\nAlso:\n- Remove the `base64 -w0` step — the new `secrets add` stores multi-line keys verbatim.\n- Remove the `shred -u caddy-collect` step from the happy path — let the operator keep the backup until they have verified the edge container picks it up.\n- Add a recovery note: operators with a corrupted vault from the old docs must `rm .env.vault.enc` (or `migrate-from-vault` if issue B landed) before re-running.\n\n## Context\n\n- Parent: sprint PR `disinto-admin/disinto-ops#10`.\n- Depends on: #776 (piped `secrets add`) — now closed.\n- Soft-depends on: #777 (if landed, drop all `.env.vault*` references entirely).\n\n## Acceptance criteria\n\n- [ ] Formula runs end-to-end without touching `.env.vault.enc` or `.env.vault` by hand\n- [ ] Re-running is idempotent (upsert via `disinto secrets add -f`)\n- [ ] Edge container starts cleanly with the imported secrets and the daily collect-engagement cron fires without `\"CADDY_SSH_KEY not set, skipping\"`\n\n## Affected files\n\n- `formulas/rent-a-human-caddy-ssh.toml` — replace `>> .env.vault.enc` steps with `disinto secrets add` calls\n"
-  },
-  {
-    "action": "remove_label",
-    "issue": 778,
-    "label": "blocked"
-  },
-  {
-    "action": "add_label",
-    "issue": 778,
-    "label": "backlog"
-  },
-  {
-    "action": "edit_body",
-    "issue": 777,
-    "body": "## Problem\n\nTwo parallel secret stores:\n\n1. `secrets/.enc` — per-key, age-encrypted. Populated by `disinto secrets add`. **No runtime consumer today.** Only `disinto secrets show` ever decrypts these.\n2. `.env.vault.enc` — monolithic, sops/dotenv-encrypted. The only store actually loaded into containers (via `docker/edge/dispatcher.sh` → `sops -d --output-type dotenv`).\n\nTwo mental models, redundant subcommands (`edit-vault`, `show-vault`, `migrate-vault`), and today's `disinto secrets add` silently deposits secrets into a dead-letter directory. Operator runs the command, edge container still logs `CADDY_SSH_KEY not set, skipping` (docker/edge/entrypoint-edge.sh:207).\n\n## Proposed solution\n\nConsolidate on `secrets/.enc` as THE store. One file per secret, granular, small surface.\n\n**1. Wire container dispatchers to load `secrets/*.enc` into env**\n\n- `docker/edge/dispatcher.sh` (and agent / ops dispatchers) decrypt declared secrets at startup and export them.\n- Granular per-secret — not a bulk dump.\n\n**2. Containers declare required secrets**\n\n- `secrets.required = [\"CADDY_SSH_KEY\", \"CADDY_SSH_HOST\", ...]` in the container's TOML, or equivalent in compose.\n- Missing required secret → **hard fail** with clear message. Replaces today's silent-skip branch at `entrypoint-edge.sh:207`.\n\n**3. Deprecate the monolithic vault**\n\n- Remove `.env.vault`, `.env.vault.enc`, and subcommands `edit-vault` / `show-vault` / `migrate-vault` from `bin/disinto`.\n- Remove sops round-trip from `docker/edge/dispatcher.sh` (lines 32-40 currently).\n\n**4. One-shot migration for existing operators**\n\n- `disinto secrets migrate-from-vault` splits an existing `.env.vault.enc` into `secrets/.enc` files, verifies each, then removes the old vault on success.\n- Idempotent: safe to run multiple times.\n\n## Context\n\n- Parent: sprint PR `disinto-admin/disinto-ops#10`.\n- Depends on: #776 (`secrets add` must accept piped stdin before we can deprecate `edit-vault`) — now closed.\n- Rationale (operator quote): *\"containers should have option to load single secrets, granular. no 2 mental models, only 1 thing that works well and has small surface.\"*\n\n## Acceptance criteria\n\n- [ ] Edge container declares `secrets.required = [\"CADDY_SSH_KEY\", \"CADDY_SSH_HOST\", \"CADDY_SSH_USER\", \"CADDY_ACCESS_LOG\"]`; dispatcher exports them; `collect-engagement.sh` runs without additional env wiring\n- [ ] Container refuses to start when a required secret is missing (fail loudly, not skip silently)\n- [ ] `.env.vault*` files and all vault-specific subcommands removed from `bin/disinto` and all formulas / docs\n- [ ] `migrate-from-vault` converts an existing monolithic vault correctly (verified by round-trip test)\n- [ ] `disinto secrets` help text shows one store, four verbs: `add`, `show`, `remove`, `list`\n\n## Affected files\n\n- `bin/disinto` — remove `edit-vault`, `show-vault`, `migrate-vault` subcommands; add `migrate-from-vault`\n- `docker/edge/dispatcher.sh` — replace sops round-trip with per-secret age decryption (lines 32-40)\n- `docker/edge/entrypoint-edge.sh` — replace silent-skip at line 207 with hard fail on missing required secrets\n- `lib/vault.sh` — update or remove vault-env.sh wiring now that `.env.vault.enc` is deprecated\n"
-  },
-  {
-    "action": "remove_label",
-    "issue": 777,
-    "label": "blocked"
-  },
-  {
-    "action": "add_label",
-    "issue": 777,
-    "label": "backlog"
+    "issue": 758,
+    "body": "**Gardener flag:** This issue requires human admin action on Forgejo to resolve — changing branch protection settings on the ops repo. No automated formula can fix Forgejo admin settings.\n\nProposed options (from issue body):\n1. Add `planner-bot` to the merge whitelist in ops repo branch protection\n2. Remove branch protection from the ops repo (agents are primary writers)\n3. Create an admin-level service token for agents\n\nThis is blocking all ops repo writes (planner knowledge, sprint artifacts, vault items)."
   }
 ]
diff --git a/lib/AGENTS.md b/lib/AGENTS.md
index 428ab8f..86fd67a 100644
--- a/lib/AGENTS.md
+++ b/lib/AGENTS.md
@@ -1,4 +1,4 @@
-
+
 # Shared Helpers (`lib/`)
 
 All agents source `lib/env.sh` as their first action. Additional helpers are
@@ -30,7 +30,7 @@ sourced as needed.
 | `lib/git-creds.sh` | Shared git credential helper configuration. `configure_git_creds([HOME_DIR] [RUN_AS_CMD])` — writes a static credential helper script and configures git globally to use password-based HTTP auth (Forgejo 11.x rejects API tokens for `git push`, #361). **Retry on cold boot (#741)**: resolves bot username from `FORGE_TOKEN` with 5 retries (exponential backoff 1-5s); fails loudly and returns 1 if Forgejo is unreachable — never falls back to a wrong hardcoded default (exports `BOT_USER` on success). `repair_baked_cred_urls([--as RUN_AS_CMD] DIR ...)` — rewrites any git remote URLs that have credentials baked in to use clean URLs instead; uses `safe.directory` bypass for root-owned repos (#671). Requires `FORGE_PASS`, `FORGE_URL`, `FORGE_TOKEN`. | entrypoints (agents, edge) |
 | `lib/ops-setup.sh` | `setup_ops_repo()` — creates ops repo on Forgejo if it doesn't exist, configures bot collaborators, clones/initializes ops repo locally, seeds directory structure (vault, knowledge, evidence, sprints). Evidence subdirectories seeded: engagement/, red-team/, holdout/, evolution/, user-test/. Also seeds sprints/ for architect output. Exports `_ACTUAL_OPS_SLUG`. `migrate_ops_repo(ops_root, [primary_branch])` — idempotent migration helper that seeds missing directories and .gitkeep files on existing ops repos (pre-#407 deployments). | bin/disinto (init) |
 | `lib/ci-setup.sh` | `_install_cron_impl()` — installs crontab entries for bare-metal deployments (compose mode uses polling loop instead). `_create_forgejo_oauth_app()` — generic helper to create an OAuth2 app on Forgejo (shared by Woodpecker and chat). `_create_woodpecker_oauth_impl()` — creates Woodpecker OAuth2 app (thin wrapper). `_create_chat_oauth_impl()` — creates disinto-chat OAuth2 app, writes `CHAT_OAUTH_CLIENT_ID`/`CHAT_OAUTH_CLIENT_SECRET` to `.env` (#708). `_generate_woodpecker_token_impl()` — auto-generates WOODPECKER_TOKEN via OAuth2 flow. `_activate_woodpecker_repo_impl()` — activates repo in Woodpecker. All gated by `_load_ci_context()` which validates required env vars. | bin/disinto (init) |
-| `lib/generators.sh` | Template generation for `disinto init`: `generate_compose()` — docker-compose.yml (uses `codeberg.org/forgejo/forgejo:11.0` tag; adds `security_opt: [apparmor:unconfined]` to all services for rootless container compatibility; Forgejo includes a healthcheck so dependent services use `condition: service_healthy` — fixes cold-start races, #665; adds `chat` service block with isolated `chat-config` named volume and `CHAT_HISTORY_DIR` bind-mount for per-user NDJSON history persistence (#710); injects `FORWARD_AUTH_SECRET` for Caddy↔chat defense-in-depth auth (#709); cost-cap env vars `CHAT_MAX_REQUESTS_PER_HOUR`, `CHAT_MAX_REQUESTS_PER_DAY`, `CHAT_MAX_TOKENS_PER_DAY` (#711); subdomain fallback comment for `EDGE_TUNNEL_FQDN_*` vars (#713); all `depends_on` now use `condition: service_healthy/started` instead of bare service names; all services now include `restart: unless-stopped` including the edge service — #768; agents service now uses `image: ghcr.io/disinto/agents:${DISINTO_IMAGE_TAG:-latest}` instead of `build:` (#429); `WOODPECKER_PLUGINS_PRIVILEGED` env var added to woodpecker service (#779); agents-llama conditional block gated on `ENABLE_LLAMA_AGENT=1` (#769); agents service gains volume mounts for `./projects`, `./.env`, `./state`), `generate_caddyfile()` — Caddyfile (routes: `/forge/*` → forgejo:3000, `/woodpecker/*` → woodpecker:8000, `/staging/*` → staging:80; `/chat/login` and `/chat/oauth/callback` bypass `forward_auth` so unauthenticated users can reach the OAuth flow; `/chat/*` gated by `forward_auth` on `chat:8080/chat/auth/verify` which stamps `X-Forwarded-User` (#709); root `/` redirects to `/forge/`), `generate_staging_index()` — staging index, `generate_deploy_pipelines()` — Woodpecker deployment pipeline configs. Requires `FACTORY_ROOT`, `PROJECT_NAME`, `PRIMARY_BRANCH`. | bin/disinto (init) |
+| `lib/generators.sh` | Template generation for `disinto init`: `generate_compose()` — docker-compose.yml (uses `codeberg.org/forgejo/forgejo:11.0` tag; adds `security_opt: [apparmor:unconfined]` to all services for rootless container compatibility; Forgejo includes a healthcheck so dependent services use `condition: service_healthy` — fixes cold-start races, #665; adds `chat` service block with isolated `chat-config` named volume and `CHAT_HISTORY_DIR` bind-mount for per-user NDJSON history persistence (#710); injects `FORWARD_AUTH_SECRET` for Caddy↔chat defense-in-depth auth (#709); cost-cap env vars `CHAT_MAX_REQUESTS_PER_HOUR`, `CHAT_MAX_REQUESTS_PER_DAY`, `CHAT_MAX_TOKENS_PER_DAY` (#711); subdomain fallback comment for `EDGE_TUNNEL_FQDN_*` vars (#713); all `depends_on` now use `condition: service_healthy/started` instead of bare service names; all services now include `restart: unless-stopped` including the edge service — #768; agents service now uses `image: ghcr.io/disinto/agents:${DISINTO_IMAGE_TAG:-latest}` instead of `build:` (#429); `WOODPECKER_PLUGINS_PRIVILEGED` env var added to woodpecker service (#779); agents-llama conditional block gated on `ENABLE_LLAMA_AGENT=1` (#769); `agents-llama-all` compose service (profile `agents-llama-all`, all 7 roles: review,dev,gardener,architect,planner,predictor,supervisor) added by #801; agents service gains volume mounts for `./projects`, `./.env`, `./state`), `generate_caddyfile()` — Caddyfile (routes: `/forge/*` → forgejo:3000, `/woodpecker/*` → woodpecker:8000, `/staging/*` → staging:80; `/chat/login` and `/chat/oauth/callback` bypass `forward_auth` so unauthenticated users can reach the OAuth flow; `/chat/*` gated by `forward_auth` on `chat:8080/chat/auth/verify` which stamps `X-Forwarded-User` (#709); root `/` redirects to `/forge/`), `generate_staging_index()` — staging index, `generate_deploy_pipelines()` — Woodpecker deployment pipeline configs. Requires `FACTORY_ROOT`, `PROJECT_NAME`, `PRIMARY_BRANCH`. | bin/disinto (init) |
 | `lib/sprint-filer.sh` | Post-merge sub-issue filer for sprint PRs. Invoked by the `.woodpecker/ops-filer.yml` pipeline after a sprint PR merges to ops repo `main`. Parses ` ... ` blocks from sprint PR bodies to extract sub-issue definitions, creates them on the project repo using `FORGE_FILER_TOKEN` (narrow-scope `filer-bot` identity with `issues:write` only), adds `in-progress` label to the parent vision issue, and handles vision lifecycle closure when all sub-issues are closed. Uses `filer_api_all()` for paginated fetches. Idempotent: uses `` markers to skip already-filed issues. Requires `FORGE_FILER_TOKEN`, `FORGE_API`, `FORGE_API_BASE`, `FORGE_OPS_REPO`. | `.woodpecker/ops-filer.yml` (CI pipeline on ops repo) |
 | `lib/hire-agent.sh` | `disinto_hire_an_agent()` — user creation, `.profile` repo setup, formula copying, branch protection, and state marker creation for hiring a new agent. Requires `FORGE_URL`, `FORGE_TOKEN`, `FACTORY_ROOT`, `PROJECT_NAME`. Extracted from `bin/disinto`. | bin/disinto (hire) |
 | `lib/release.sh` | `disinto_release()` — vault TOML creation, branch setup on ops repo, PR creation, and auto-merge request for a versioned release. `_assert_release_globals()` validates required env vars. Requires `FORGE_URL`, `FORGE_TOKEN`, `FORGE_OPS_REPO`, `FACTORY_ROOT`, `PRIMARY_BRANCH`. Extracted from `bin/disinto`. | bin/disinto (release) |
diff --git a/planner/AGENTS.md b/planner/AGENTS.md
index 59f54bf..aa784f4 100644
--- a/planner/AGENTS.md
+++ b/planner/AGENTS.md
@@ -1,4 +1,4 @@
-
+
 # Planner Agent
 
 **Role**: Strategic planning using a Prerequisite Tree (Theory of Constraints),
diff --git a/predictor/AGENTS.md b/predictor/AGENTS.md
index 98dc8cd..c10e1f8 100644
--- a/predictor/AGENTS.md
+++ b/predictor/AGENTS.md
@@ -1,4 +1,4 @@
-
+
 # Predictor Agent
 
 **Role**: Abstract adversary (the "goblin"). Runs a 2-step formula
diff --git a/review/AGENTS.md b/review/AGENTS.md
index f757e22..5137302 100644
--- a/review/AGENTS.md
+++ b/review/AGENTS.md
@@ -1,4 +1,4 @@
-
+
 # Review Agent
 
 **Role**: AI-powered PR review — post structured findings and formal
diff --git a/supervisor/AGENTS.md b/supervisor/AGENTS.md
index e96bd53..ef36ccb 100644
--- a/supervisor/AGENTS.md
+++ b/supervisor/AGENTS.md
@@ -1,4 +1,4 @@
-
+
 # Supervisor Agent
 
 **Role**: Health monitoring and auto-remediation, executed as a formula-driven
@@ -7,13 +7,11 @@ then runs an interactive Claude session (sonnet) that assesses health, auto-fixe
 issues, and writes a daily journal. When blocked on external resources or human decisions,
 files vault items instead of escalating directly.
 
-**Trigger**: `supervisor-run.sh` is invoked by the polling loop in `docker/edge/entrypoint-edge.sh`
-every 20 minutes (line 50-53). Sources `lib/guard.sh` and calls `check_active supervisor` first
-— skips if `$FACTORY_ROOT/state/.supervisor-active` is absent. Then runs `claude -p` via
-`agent-sdk.sh`, injects `formulas/run-supervisor.toml` with pre-collected metrics as context,
-and cleans up on completion or timeout (20 min max session). Note: the supervisor runs in the
-**edge container** (`entrypoint-edge.sh`), not the agent container — this distinction matters
-for operators debugging the factory.
+**Trigger**: `supervisor-run.sh` is invoked by two polling loops:
+- **Agents container** (`docker/agents/entrypoint.sh`): every `SUPERVISOR_INTERVAL` seconds (default 1200 = 20 min). Controlled by the `supervisor` role in `AGENT_ROLES` (included in the default seven-role set since P1/#801). Logs to `supervisor.log` in the agents container.
+- **Edge container** (`docker/edge/entrypoint-edge.sh`): separate loop in the edge container (line 169-172). Runs independently of the agents container's polling schedule.
+
+Both invoke the same `supervisor-run.sh`. Sources `lib/guard.sh` and calls `check_active supervisor` first — skips if `$FACTORY_ROOT/state/.supervisor-active` is absent. Then runs `claude -p` via `agent-sdk.sh`, injects `formulas/run-supervisor.toml` with pre-collected metrics as context, and cleans up on completion or timeout.
 
 **Key files**:
 - `supervisor/supervisor-run.sh` — Polling loop participant + orchestrator: lock, memory guard,
@@ -39,6 +37,7 @@ P3 (degraded PRs, circular deps, stale deps), P4 (housekeeping).
 **Environment variables consumed**:
 - `FORGE_TOKEN`, `FORGE_SUPERVISOR_TOKEN` (falls back to FORGE_TOKEN), `FORGE_REPO`, `FORGE_API`, `PROJECT_NAME`, `PROJECT_REPO_ROOT`, `OPS_REPO_ROOT`
 - `PRIMARY_BRANCH`, `CLAUDE_MODEL` (set to sonnet by supervisor-run.sh)
+- `SUPERVISOR_INTERVAL` — polling interval in seconds for agents container (default 1200 = 20 min)
 - `WOODPECKER_TOKEN`, `WOODPECKER_SERVER`, `WOODPECKER_DB_PASSWORD`, `WOODPECKER_DB_USER`, `WOODPECKER_DB_HOST`, `WOODPECKER_DB_NAME` — CI database queries
 
 **Degraded mode (Issue #544)**: When `OPS_REPO_ROOT` is not set or the directory doesn't exist, the supervisor runs in degraded mode:
-- 
2.49.1