Merge pull request 'chore: gardener housekeeping 2026-03-26' (#768) from chore/gardener-20260326-1814 into main

Reviewed-on: https://codeberg.org/johba/disinto/pulls/768
johba 2026-03-26 19:19:32 +01:00
commit a899fd0733
11 changed files with 45 additions and 130 deletions

@ -1,4 +1,4 @@
<!-- last-reviewed: cebcb8c13ab7948fc794f49c379ed34570e45652 --> <!-- last-reviewed: f32707ba659de278a3af434e3549fb8a8dce9d3a -->
# Disinto — Agent Instructions # Disinto — Agent Instructions
## What this repo is ## What this repo is

@ -1,4 +1,4 @@
<!-- last-reviewed: 043bf0f0217aef3f319b844f1a1277acd6327a1c --> <!-- last-reviewed: f32707ba659de278a3af434e3549fb8a8dce9d3a -->
# Action Agent # Action Agent
**Role**: Execute operational tasks described by action formulas — run scripts, **Role**: Execute operational tasks described by action formulas — run scripts,
@ -24,8 +24,10 @@ session, and spawns `action-agent.sh <issue-number>`.
6. **Path B (no git output):** Claude posts results as comment, closes issue → `PHASE:done` → handler cleans up (kill session, docker compose down, remove temp files). 6. **Path B (no git output):** Claude posts results as comment, closes issue → `PHASE:done` → handler cleans up (kill session, docker compose down, remove temp files).
7. For human input: Claude writes `PHASE:escalate`; human responds via vault/forge. 7. For human input: Claude writes `PHASE:escalate`; human responds via vault/forge.
**Crash recovery**: on `PHASE:crashed` or non-zero exit, the worktree is **preserved** (not destroyed) for debugging. Location logged. Supervisor housekeeping removes stale crashed worktrees older than 24h.
**Environment variables consumed**: **Environment variables consumed**:
- `FORGE_TOKEN`, `FORGE_REPO`, `FORGE_API`, `FORGE_URL`, `PROJECT_NAME`, `FORGE_WEB` - `FORGE_TOKEN`, `FORGE_ACTION_TOKEN` (falls back to FORGE_TOKEN), `FORGE_REPO`, `FORGE_API`, `FORGE_URL`, `PROJECT_NAME`, `FORGE_WEB`
- `ACTION_IDLE_TIMEOUT` — Max seconds before killing idle session (default 14400 = 4h) - `ACTION_IDLE_TIMEOUT` — Max seconds before killing idle session (default 14400 = 4h)
- `ACTION_MAX_LIFETIME` — Max total session wall-clock seconds (default 28800 = 8h); caps session independently of idle timeout - `ACTION_MAX_LIFETIME` — Max total session wall-clock seconds (default 28800 = 8h); caps session independently of idle timeout
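The `FORGE_ACTION_TOKEN` fallback noted in the variable list can be expressed with ordinary shell parameter expansion. This is a minimal sketch, not the actual `lib/env.sh` code; the helper name `resolve_agent_token` and the placeholder token value are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the per-agent token if it is set and non-empty,
# otherwise fall back to the shared token.
resolve_agent_token() {
  local agent_token="$1"    # e.g. "$FORGE_ACTION_TOKEN" (may be empty)
  local shared_token="$2"   # e.g. "$FORGE_TOKEN"
  printf '%s\n' "${agent_token:-$shared_token}"
}

# Inline form of the same fallback.
FORGE_TOKEN="${FORGE_TOKEN:-shared-example-token}"   # placeholder value for this sketch only
FORGE_ACTION_TOKEN="${FORGE_ACTION_TOKEN:-$FORGE_TOKEN}"
export FORGE_ACTION_TOKEN
```

The same pattern would apply to the other per-agent tokens (`FORGE_GARDENER_TOKEN`, `FORGE_PLANNER_TOKEN`, and so on), each defaulting to `FORGE_TOKEN` when unset.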

@ -1,4 +1,4 @@
<!-- last-reviewed: 043bf0f0217aef3f319b844f1a1277acd6327a1c --> <!-- last-reviewed: f32707ba659de278a3af434e3549fb8a8dce9d3a -->
# Dev Agent # Dev Agent
**Role**: Implement issues autonomously — write code, push branches, address **Role**: Implement issues autonomously — write code, push branches, address
@ -29,6 +29,10 @@ check so approved PRs get merged even while a dev-agent session is active.
**FORGE_REMOTE**: `dev-agent.sh` auto-detects which git remote corresponds to `FORGE_URL` by matching the remote's push URL hostname. This is exported as `FORGE_REMOTE` and used for all git push/pull/worktree operations. Defaults to `origin` if no match found. This ensures correct behaviour when the forge is local Forgejo (remote typically named `forgejo`) rather than Codeberg (`origin`). **FORGE_REMOTE**: `dev-agent.sh` auto-detects which git remote corresponds to `FORGE_URL` by matching the remote's push URL hostname. This is exported as `FORGE_REMOTE` and used for all git push/pull/worktree operations. Defaults to `origin` if no match found. This ensures correct behaviour when the forge is local Forgejo (remote typically named `forgejo`) rather than Codeberg (`origin`).
**Session lock**: fd-based flock — released during idle phases (`awaiting_review`, `awaiting_ci`) so other agents can proceed; re-acquired before injecting the next prompt. This prevents the lock from blocking the whole factory while the dev session waits.
**Crash recovery**: on `PHASE:crashed` or non-zero exit, the worktree is **preserved** (not destroyed) for debugging. Location logged. Supervisor housekeeping removes stale crashed worktrees older than 24h.
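The session-lock behaviour described above (hold the lock while active, release it during idle phases, re-acquire before the next prompt) can be illustrated with `flock(1)` from util-linux on a dedicated file descriptor. This is a sketch under assumptions: the lock path, descriptor number 9, and function names are hypothetical and not taken from `dev-agent.sh`.

```shell
#!/usr/bin/env bash
# Hypothetical lock path; the real script may use a different location.
LOCK_FILE="${TMPDIR:-/tmp}/dev-session-sketch.lock"

# Open the lock file on fd 9 for the lifetime of this shell.
exec 9>"$LOCK_FILE"

# Take the exclusive lock without blocking; non-zero if another process holds it.
acquire_session_lock() { flock -n 9; }

# Drop the lock but keep fd 9 open so it can be re-acquired later.
release_session_lock() { flock -u 9; }

acquire_session_lock   # active phase: hold the lock while prompting
release_session_lock   # idle phase (awaiting_review / awaiting_ci): let other agents proceed
acquire_session_lock   # re-acquire before injecting the next prompt
```

Keeping the descriptor open across phases means release and re-acquire are cheap and no lock file needs to be deleted or recreated between phases.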
**Lifecycle**: dev-poll.sh (`check_active dev`) → dev-agent.sh → tmux `dev-{project}-{issue}` → phase file **Lifecycle**: dev-poll.sh (`check_active dev`) → dev-agent.sh → tmux `dev-{project}-{issue}` → phase file
drives CI/review loop → merge + `mirror_push()` → close issue. On respawn after drives CI/review loop → merge + `mirror_push()` → close issue. On respawn after
`PHASE:escalate`, the stale phase file is cleared first so the session starts `PHASE:escalate`, the stale phase file is cleared first so the session starts

@ -1,4 +1,4 @@
<!-- last-reviewed: f2064ba67c3b6819f5e252300927c01e2825dd7c --> <!-- last-reviewed: f32707ba659de278a3af434e3549fb8a8dce9d3a -->
# Gardener Agent # Gardener Agent
**Role**: Backlog grooming — detect duplicate issues, missing acceptance **Role**: Backlog grooming — detect duplicate issues, missing acceptance
@ -28,7 +28,7 @@ directly from cron like the planner, predictor, and supervisor.
PR, reviewed alongside AGENTS.md changes, executed by gardener-run.sh after merge. PR, reviewed alongside AGENTS.md changes, executed by gardener-run.sh after merge.
**Environment variables consumed**: **Environment variables consumed**:
- `FORGE_TOKEN`, `FORGE_REPO`, `FORGE_API`, `PROJECT_NAME`, `PROJECT_REPO_ROOT` - `FORGE_TOKEN`, `FORGE_GARDENER_TOKEN` (falls back to FORGE_TOKEN), `FORGE_REPO`, `FORGE_API`, `PROJECT_NAME`, `PROJECT_REPO_ROOT`
- `PRIMARY_BRANCH`, `CLAUDE_MODEL` (set to sonnet by gardener-run.sh) - `PRIMARY_BRANCH`, `CLAUDE_MODEL` (set to sonnet by gardener-run.sh)
**Lifecycle**: gardener-run.sh (cron 0,6,12,18) → `check_active gardener` → lock + memory guard → **Lifecycle**: gardener-run.sh (cron 0,6,12,18) → `check_active gardener` → lock + memory guard →

@ -1,126 +1,32 @@
[ [
{ {
"action": "comment", "action": "edit_body",
"issue": 710, "issue": 765,
"body": "Closing as duplicate of #714, which covers the same task (creating an OpenClaw/ClawHub skill package) with complete acceptance criteria and affected files. All work should proceed under #714." "body": "Depends on: none\n\n## Goal\n\nThe disinto website becomes a versioned artifact: built by CI, published to Codeberg's generic package registry, deployed to staging automatically. Version visible in footer.\n\n## Files to add/change\n\n### `site/VERSION`\n```\n0.1.0\n```\n\n### `site/build.sh`\n```bash\n#!/bin/bash\nVERSION=$(cat VERSION)\nmkdir -p dist\ncp *.html *.jpg *.webp *.png *.ico *.xml robots.txt dist/\nsed -i \"s|Built from scrap, powered by a single battery.|v${VERSION} · Built from scrap, powered by a single battery.|\" dist/index.html\necho \"$VERSION\" > dist/VERSION\n```\n\n### `site/index.html`\nNo template placeholder needed — `build.sh` does the sed replacement on the existing footer text.\n\n### `.woodpecker/site.yml`\n```yaml\nwhen:\n path: \"site/**\"\n event: push\n branch: main\n\nsteps:\n - name: build\n image: alpine\n commands:\n - cd site && sh build.sh\n - VERSION=$(cat site/VERSION)\n - tar czf site-${VERSION}.tar.gz -C site/dist .\n\n - name: publish\n image: alpine\n commands:\n - apk add curl\n - VERSION=$(cat site/VERSION)\n - >-\n curl -sf --user \"johba:$$FORGE_TOKEN\"\n --upload-file site-${VERSION}.tar.gz\n \"https://codeberg.org/api/packages/johba/generic/disinto-site/${VERSION}/site-${VERSION}.tar.gz\"\n environment:\n FORGE_TOKEN:\n from_secret: forge_token\n\n - name: deploy-staging\n image: alpine\n commands:\n - apk add curl\n - VERSION=$(cat site/VERSION)\n - >-\n curl -sf --user \"johba:$$FORGE_TOKEN\"\n \"https://codeberg.org/api/packages/johba/generic/disinto-site/${VERSION}/site-${VERSION}.tar.gz\"\n -o site.tar.gz\n - rm -rf /srv/staging/*\n - tar xzf site.tar.gz -C /srv/staging/\n environment:\n FORGE_TOKEN:\n from_secret: forge_token\n volumes:\n - /home/debian/staging-site:/srv/staging\n```\n\n## Infra setup 
(manual, before first run)\n- `mkdir -p /home/debian/staging-site`\n- Add to Caddyfile: `staging.disinto.ai { root * /home/debian/staging-site; file_server }`\n- DNS: `staging.disinto.ai` A record → same IP as `disinto.ai`\n- Reload Caddy: `sudo systemctl reload caddy`\n- Add `forge_token` as Woodpecker repo secret for johba/disinto (if not already set)\n- Add `/home/debian/staging-site` to `WOODPECKER_BACKEND_DOCKER_VOLUMES`\n\n## Verification\n- [ ] Merge PR that touches `site/` → CI runs site pipeline\n- [ ] Package appears at `codeberg.org/johba/-/packages/generic/disinto-site/0.1.0`\n- [ ] `staging.disinto.ai` serves the site with `v0.1.0` in footer\n- [ ] `disinto.ai` (production) unchanged\n\n## Related\n- #764 — docker stack edge proxy + staging (future: this moves inside the stack)\n- #755 — vault-gated production promotion (production deploy comes later)\n\n## Affected files\n- `site/VERSION` — new, holds current version string\n- `site/build.sh` — new, builds dist/ with version injected into footer\n- `.woodpecker/site.yml` — new, CI pipeline for build/publish/deploy-staging"
},
{
"action": "close",
"issue": 710,
"reason": "duplicate of #714"
},
{
"action": "comment",
"issue": 711,
"body": "Closing as duplicate of #715, which covers the same task (publishing to ClawHub) with complete acceptance criteria and affected files. All work should proceed under #715."
},
{
"action": "close",
"issue": 711,
"reason": "duplicate of #715"
}, },
{ {
"action": "edit_body", "action": "edit_body",
"issue": 712, "issue": 764,
"body": "## Context\n\nAfter ClawHub publishing (#715), expand reach by listing on secondary registries and discovery channels.\n\n## Dependencies\n- #715 (ClawHub listing must be live first)\n\n## Acceptance criteria\n- [ ] disinto skill appears in SkillsMP search (auto-indexed from GitHub or submitted manually)\n- [ ] PR submitted to awesome-agent-skills repo listing disinto under DevOps/Automation\n- [ ] SkillHub listing submitted (or confirmed live)\n- [ ] GitHub repo topics updated: `agent-skill`, `openclaw`, `clawhub`, `code-factory`, `automation`\n\n## Affected files\n- `README.md` (add secondary registry badges/links)\n- `.github/` or repo settings (GitHub topics — manual step)\n\n## Action items\n\n### SkillsMP (skillsmp.com)\n- [ ] SkillsMP auto-indexes from GitHub — ensure the skill directory is in the public repo\n- [ ] Verify disinto appears in SkillsMP search after a few days\n- [ ] If not auto-indexed, submit manually\n\n### awesome-agent-skills\n- [ ] Submit PR to github.com/skillmatic-ai/awesome-agent-skills\n- [ ] Add disinto under appropriate category (DevOps / Automation)\n\n### SkillHub (skillhub.club)\n- [ ] Submit skill for AI evaluation\n- [ ] Verify listing\n\n### LobeHub (lobehub.com/skills)\n- [ ] Submit skill to curated directory\n\n### GitHub discoverability\n- [ ] Add topics to repo: `agent-skill`, `openclaw`, `clawhub`, `code-factory`, `automation`\n- [ ] Ensure SKILL.md is discoverable at repo root or skill/ directory\n\n## References\n\n- Research report: #709\n- Skill package: #714\n- ClawHub listing: #715\n" "body": "Depends on: none (builds on existing docker-compose generation in `bin/disinto`)\n\n## Design\n\n`disinto init` + `disinto up` starts two additional containers as base factory infrastructure:\n\n### Edge proxy (Caddy)\n- Reverse proxies to Forgejo and Woodpecker\n- Serves staging site\n- Runs on ports 80/443\n- At bootstrap: IP-only, self-signed TLS or HTTP\n- Domain + Let's Encrypt added later via vault resource 
request\n\n### Staging container (Caddy)\n- Static file server for the project's staging artifacts\n- Starts with a default \"Nothing shipped yet\" page\n- CI pipelines write to a shared volume to update staging content\n- No vault approval needed — staging is the factory's sandbox\n\n### docker-compose addition\n```yaml\nservices:\n edge:\n image: caddy:alpine\n ports:\n - \"80:80\"\n - \"443:443\"\n volumes:\n - ./Caddyfile:/etc/caddy/Caddyfile\n - caddy_data:/data\n depends_on:\n - forgejo\n - woodpecker-server\n - staging\n\n staging:\n image: caddy:alpine\n volumes:\n - staging-site:/srv/site\n # Not exposed directly — edge proxies to it\n\nvolumes:\n caddy_data:\n staging-site:\n```\n\n### Caddyfile (generated by `disinto init`)\n```\n# IP-only at bootstrap, domain added later\n:80 {\n handle /forgejo/* {\n reverse_proxy forgejo:3000\n }\n handle /ci/* {\n reverse_proxy woodpecker-server:8000\n }\n handle {\n reverse_proxy staging:80\n }\n}\n```\n\n### Staging update flow\n1. CI builds artifact (site tarball, etc.)\n2. CI step writes to `staging-site` volume\n3. Staging container serves updated content immediately\n4. 
No restart needed — Caddy serves files directly\n\n### Domain lifecycle\n- Bootstrap: no domain, edge serves on IP\n- Later: factory files vault resource request for domain\n- Human buys domain, sets DNS\n- Caddyfile updated with domain, Let's Encrypt auto-provisions TLS\n\n## Affected files\n- `bin/disinto` — `generate_compose()` adds edge + staging services\n- New: default staging page (\"Nothing shipped yet\")\n- New: Caddyfile template in `docker/`\n\n## Related\n- #755 — vault-gated deployment promotion (production comes later)\n- #757 — ops repo (domain is a resource requested through vault)\n\n## Acceptance criteria\n- [ ] `disinto init` generates a `docker-compose.yml` that includes `edge` (Caddy) and `staging` containers\n- [ ] Edge proxy routes `/forgejo/*` → Forgejo, `/ci/*` → Woodpecker, default → staging container\n- [ ] Staging container serves a default \"Nothing shipped yet\" page on first boot\n- [ ] `docker/` directory contains a Caddyfile template generated by `disinto init`\n- [ ] `disinto up` starts all containers including edge and staging without manual steps"
},
{
"action": "edit_body",
"issue": 761,
"body": "Depends on: #747\n\n## Design\n\nEach agent account on the bundled Forgejo gets a `.profile` repo. This repo holds the agent's formula (copied from disinto at creation time) and its journal.\n\n### Structure\n```\n{agent-bot}/.profile/\n├── formula.toml # snapshot of the formula at agent creation time\n├── journal/ # daily logs of what the agent did\n│ ├── 2026-03-26.md\n│ └── ...\n└── knowledge/ # learned patterns, best-practices (optional, agent can evolve)\n```\n\n### Lifecycle\n1. **Create agent** — `disinto init` or `disinto spawn-agent` creates Forgejo account + `.profile` repo\n2. **Copy formula** — current `formulas/{role}.toml` from disinto repo is copied to `.profile/formula.toml`\n3. **Agent reads its own formula** — at session start, agent reads from its `.profile`, not from the disinto repo\n4. **Agent writes journal** — daily entries pushed to `.profile/journal/`\n5. **Agent can evolve knowledge** — best-practices, heuristics, patterns written to `.profile/knowledge/`\n\n### What this enables\n\n**A/B testing formulas:** Create two agents from different formula versions, run both against the same backlog, compare results (cycle time, CI pass rate, review rejection rate).\n\n**Rollback:** New formula worse? Kill agent, spawn from older formula version.\n\n**Audit:** What formula was this agent running when it produced that PR? Check its `.profile` at that git commit.\n\n**Drift tracking:** Diff what an agent learned (`.profile/knowledge/`) vs what it started with. 
Measure formula evolution over time.\n\n**Portability:** Move agent to different box — `git clone` its `.profile`.\n\n### Disinto repo becomes the template\n\n```\ndisinto repo:\n formulas/dev-agent.toml ← canonical template, evolves\n formulas/review-agent.toml\n formulas/planner.toml\n ...\n\nRunning agents:\n dev-bot-v2/.profile/formula.toml ← snapshot from formulas/dev-agent.toml@v2\n dev-bot-v3/.profile/formula.toml ← snapshot from formulas/dev-agent.toml@v3\n review-bot/.profile/formula.toml ← snapshot from formulas/review-agent.toml\n```\n\nThe formula in the disinto repo is the template. The `.profile` copy is the instance. They can diverge — that's a feature, not a bug.\n\n## Affected files\n- `bin/disinto` — agent creation copies formula to .profile\n- Agent session scripts — read formula from .profile instead of local formulas/ dir\n- Planner/supervisor — can read other agents' journals from their .profile repos\n\n## Related\n- #747 — per-agent Forgejo accounts (prerequisite)\n- #757 — ops repo (shared concerns stay there: vault, portfolio, resources)\n\n## Acceptance criteria\n- [ ] `disinto spawn-agent` (or `disinto init`) creates a Forgejo account + `.profile` repo for each agent bot\n- [ ] Current `formulas/{role}.toml` is copied to `.profile/formula.toml` at agent creation time\n- [ ] Agent session script reads its formula from `.profile/formula.toml`, not from the repo's `formulas/` directory\n- [ ] Agent writes daily journal entries to `.profile/journal/YYYY-MM-DD.md`"
},
{
"action": "edit_body",
"issue": 742,
"body": "## Problem\n\n`gardener/recipes/*.toml` (4 files: cascade-rebase, chicken-egg-ci, flaky-test, shellcheck-violations) are an older pattern predating `formulas/*.toml`. Two systems for the same thing.\n\n## Fix\n\nMigrate any unique content from recipes to the gardener formula or to new formulas. Delete the recipes directory.\n\n## Affected files\n- `gardener/recipes/*.toml` — delete after migration\n- `formulas/run-gardener.toml` — absorb relevant content\n- Gardener scripts that reference recipes/\n\n## Acceptance criteria\n- [ ] Contents of `gardener/recipes/*.toml` are diff'd against `formulas/run-gardener.toml` — any unique content is migrated\n- [ ] `gardener/recipes/` directory is deleted\n- [ ] No scripts in `gardener/` reference the `recipes/` path after migration\n- [ ] ShellCheck passes on all modified scripts"
}, },
{ {
"action": "add_label", "action": "add_label",
"issue": 712, "issue": 742,
"label": "backlog" "label": "backlog"
}, },
{ {
"action": "add_label", "action": "add_label",
"issue": 714, "issue": 741,
"label": "backlog" "label": "backlog"
},
{
"action": "add_label",
"issue": 715,
"label": "backlog"
},
{
"action": "create_issue",
"title": "fix: add weekly Docker prune cron to prevent recurring disk P1 threshold breach",
"body": "## Problem\n\nDisk has twice crossed the 80% P1 threshold in two days (peaked at 82% on 2026-03-24). The supervisor performs reactive Docker prune when P1 is hit, but a proactive scheduled cleanup would prevent the threshold from being crossed in the first place.\n\nSupersedes prediction in #644.\n\n## Action\n\nAdd a scheduled weekly Docker prune to the supervisor or cron config so Docker image/container buildup is cleared before it reaches crisis levels. The cleanup should run at a time offset from the 06:00 formula burst (e.g. Sunday 04:00 UTC).\n\n## Acceptance criteria\n- [ ] A Docker prune cron entry (weekly, off-peak) is added to the factory cron config or supervisor schedule\n- [ ] The cron runs `docker system prune -f && docker image prune -f --filter \"until=168h\"` (keeping images used in last 7 days)\n- [ ] Cron time does not overlap with the 06:00 formula burst (gardener, predictor, supervisor)\n- [ ] After the change, disk stays below 75% between reactive supervisor prune events\n\n## Affected files\n- `supervisor/supervisor-run.sh` (if adding to supervisor schedule)\n- or cron config file (if adding standalone cron entry)\n- `projects/disinto.toml.example` (if scheduling parameters belong there)\n\n## Related\n- #644 (prediction that triggered this)\n",
"labels": [
"backlog"
]
},
{
"action": "comment",
"issue": 644,
"body": "Actioned: created a backlog issue for adding a proactive weekly Docker prune cron to prevent recurring P1 disk threshold breaches. This supplements the reactive supervisor cleanup with a scheduled preventive pass."
},
{
"action": "add_label",
"issue": 644,
"label": "prediction/actioned"
},
{
"action": "remove_label",
"issue": 644,
"label": "prediction/backlog"
},
{
"action": "close",
"issue": 644,
"reason": "prediction actioned — recurring Docker prune cron backlog issue created"
},
{
"action": "create_issue",
"title": "fix: upgrade Caddy to v2.11.1 on harb-staging to patch CVE-2026-27590 RCE and medium CVEs",
"body": "## Problem\n\nThree CVEs affecting Caddy are fixed in v2.11.1:\n- CVE-2026-27590 (HIGH — RCE via FastCGI path-splitting bug)\n- Two medium CVEs (see #580 for details)\n\nharb-staging is currently running an older Caddy version and needs an upgrade.\n\nSupersedes prediction in #580.\n\n## Action\n\nUpgrade Caddy to v2.11.1 on harb-staging. Verify the service restarts cleanly.\n\n## Acceptance criteria\n- [ ] `caddy version` on harb-staging shows v2.11.1 or later\n- [ ] Caddy service is running and serving requests after upgrade\n- [ ] No CVE-2026-27590, CVE-2026-27589 in installed version\n\n## Affected files\n- harb-staging host: `/usr/local/bin/caddy` (upgraded in-place or via package manager)\n\n## References\n- Prediction: #580\n- CVE advisory: CVE-2026-27590 (FastCGI RCE), CVE-2026-27589 (medium)\n",
"labels": [
"action"
]
},
{
"action": "comment",
"issue": 580,
"body": "Actioned: created an action issue for upgrading Caddy to v2.11.1 on harb-staging to remediate CVE-2026-27590 (HIGH/RCE) and two medium CVEs. Priority: high."
},
{
"action": "add_label",
"issue": 580,
"label": "prediction/actioned"
},
{
"action": "remove_label",
"issue": 580,
"label": "prediction/backlog"
},
{
"action": "close",
"issue": 580,
"reason": "prediction actioned — Caddy upgrade action issue created"
},
{
"action": "create_issue",
"title": "fix: stagger formula agent cron start times to reduce simultaneous 06:00 RAM burst",
"body": "## Problem\n\nThree formula agents (gardener, predictor, supervisor) all start at 06:00 UTC simultaneously, competing for RAM and driving swap usage to 57% (2335MB / 4095MB). Swap spikes under the burst introduce latency and risk OOM events if usage grows further.\n\nSupersedes prediction in #529.\n\n## Action\n\nStagger the cron start times for formula agents by 12 minutes each so RAM pressure is distributed across time rather than concentrated at 06:00.\n\nSuggested schedule:\n- Supervisor: 06:00 (runs first, gathers health snapshot)\n- Predictor: 06:02\n- Gardener: 06:04\n- Planner: 06:06 (if on same schedule)\n\n## Acceptance criteria\n- [ ] Formula agent cron entries are offset by at least 1 minute from each other\n- [ ] No two formula agents start within the same minute\n- [ ] Swap usage at the 06:0006:10 window stays below 50% after the change\n\n## Affected files\n- `projects/disinto.toml` (cron schedule fields, if stored there)\n- or host cron config file (e.g. `/etc/cron.d/disinto-*`)\n- `BOOTSTRAP.md` (update documented cron schedule if shown there)\n\n## Related\n- #529 (prediction that triggered this)\n",
"labels": [
"backlog"
]
},
{
"action": "comment",
"issue": 529,
"body": "Actioned: created a backlog issue for staggering formula agent cron start times (supervisor 06:00, predictor 06:02, gardener 06:04) to distribute the RAM burst across 46 minutes instead of hitting simultaneously."
},
{
"action": "add_label",
"issue": 529,
"label": "prediction/actioned"
},
{
"action": "remove_label",
"issue": 529,
"label": "prediction/backlog"
},
{
"action": "close",
"issue": 529,
"reason": "prediction actioned — cron stagger backlog issue created"
} }
] ]

@ -1,4 +1,4 @@
<!-- last-reviewed: 043bf0f0217aef3f319b844f1a1277acd6327a1c --> <!-- last-reviewed: f32707ba659de278a3af434e3549fb8a8dce9d3a -->
# Shared Helpers (`lib/`) # Shared Helpers (`lib/`)
All agents source `lib/env.sh` as their first action. Additional helpers are All agents source `lib/env.sh` as their first action. Additional helpers are
@ -6,8 +6,8 @@ sourced as needed.
| File | What it provides | Sourced by | | File | What it provides | Sourced by |
|---|---|---| |---|---|---|
| `lib/env.sh` | Loads `.env`, sets `FACTORY_ROOT`, exports project config (`FORGE_REPO`, `PROJECT_NAME`, etc.), defines `log()`, `forge_api()`, `forge_api_all()` (accepts optional second TOKEN parameter, defaults to `$FORGE_TOKEN`), `woodpecker_api()`, `wpdb()`. Auto-loads project TOML if `PROJECT_TOML` is set. **Container note**: when `DISINTO_CONTAINER=1`, `.env` is NOT re-sourced — compose already injects env vars (including `FORGE_URL=http://forgejo:3000`) and re-sourcing would clobber them. | Every agent | | `lib/env.sh` | Loads `.env`, sets `FACTORY_ROOT`, exports project config (`FORGE_REPO`, `PROJECT_NAME`, etc.), defines `log()`, `forge_api()`, `forge_api_all()` (accepts optional second TOKEN parameter, defaults to `$FORGE_TOKEN`), `woodpecker_api()`, `wpdb()`, `memory_guard()` (skips agent if RAM < threshold). Auto-loads project TOML if `PROJECT_TOML` is set. Exports per-agent tokens (`FORGE_PLANNER_TOKEN`, `FORGE_GARDENER_TOKEN`, `FORGE_VAULT_TOKEN`, `FORGE_SUPERVISOR_TOKEN`, `FORGE_PREDICTOR_TOKEN`, `FORGE_ACTION_TOKEN`); each falls back to `$FORGE_TOKEN` if not set. **Vault-only token guard (AD-006)**: `unset GITHUB_TOKEN CLAWHUB_TOKEN` so agents never hold external-action tokens; only the vault-runner container receives them. **Container note**: when `DISINTO_CONTAINER=1`, `.env` is NOT re-sourced — compose already injects env vars (including `FORGE_URL=http://forgejo:3000`) and re-sourcing would clobber them. | Every agent |
| `lib/ci-helpers.sh` | `ci_passed()` — returns 0 if CI state is "success" (or no CI configured). `ci_required_for_pr()` — returns 0 if PR has code files (CI required), 1 if non-code only (CI not required). `is_infra_step()` — returns 0 if a single CI step failure matches infra heuristics (clone/git exit 128, any exit 137, log timeout patterns). `classify_pipeline_failure()` — returns "infra \<reason>" if any failed Woodpecker step matches infra heuristics via `is_infra_step()`, else "code". `ensure_priority_label()` — looks up (or creates) the `priority` label and returns its ID; caches in `_PRIORITY_LABEL_ID`. `ci_commit_status <sha>` — queries Woodpecker directly for CI state, falls back to forge commit status API. `ci_pipeline_number <sha>` — returns the Woodpecker pipeline number for a commit, falls back to parsing forge status `target_url`. | dev-poll, review-poll, review-pr, supervisor-poll | | `lib/ci-helpers.sh` | `ci_passed()` — returns 0 if CI state is "success" (or no CI configured). `ci_required_for_pr()` — returns 0 if PR has code files (CI required), 1 if non-code only (CI not required). `is_infra_step()` — returns 0 if a single CI step failure matches infra heuristics (clone/git exit 128, any exit 137, log timeout patterns). `classify_pipeline_failure()` — returns "infra \<reason>" if any failed Woodpecker step matches infra heuristics via `is_infra_step()`, else "code". `ensure_priority_label()` — looks up (or creates) the `priority` label and returns its ID; caches in `_PRIORITY_LABEL_ID`. `ci_commit_status <sha>` — queries Woodpecker directly for CI state, falls back to forge commit status API. `ci_pipeline_number <sha>` — returns the Woodpecker pipeline number for a commit, falls back to parsing forge status `target_url`. `ci_promote <repo_id> <pipeline_num> <environment>` — promotes a pipeline to a named Woodpecker environment (vault-gated deployment: vault approves, vault-fire calls this). | dev-poll, review-poll, review-pr, supervisor-poll |
| `lib/ci-debug.sh` | CLI tool for Woodpecker CI: `list`, `status`, `logs`, `failures` subcommands. Not sourced — run directly. | Humans / dev-agent (tool access) | | `lib/ci-debug.sh` | CLI tool for Woodpecker CI: `list`, `status`, `logs`, `failures` subcommands. Not sourced — run directly. | Humans / dev-agent (tool access) |
| `lib/load-project.sh` | Parses a `projects/*.toml` file into env vars (`PROJECT_NAME`, `FORGE_REPO`, `WOODPECKER_REPO_ID`, monitoring toggles, mirror config, etc.). | env.sh (when `PROJECT_TOML` is set), supervisor-poll (per-project iteration) | | `lib/load-project.sh` | Parses a `projects/*.toml` file into env vars (`PROJECT_NAME`, `FORGE_REPO`, `WOODPECKER_REPO_ID`, monitoring toggles, mirror config, etc.). | env.sh (when `PROJECT_TOML` is set), supervisor-poll (per-project iteration) |
| `lib/parse-deps.sh` | Extracts dependency issue numbers from an issue body (stdin → stdout, one number per line). Matches `## Dependencies` / `## Depends on` / `## Blocked by` sections and inline `depends on #N` / `blocked by #N` patterns. Inline scan skips fenced code blocks to prevent false positives from code examples in issue bodies. Not sourced — executed via `bash lib/parse-deps.sh`. | dev-poll, supervisor-poll | | `lib/parse-deps.sh` | Extracts dependency issue numbers from an issue body (stdin → stdout, one number per line). Matches `## Dependencies` / `## Depends on` / `## Blocked by` sections and inline `depends on #N` / `blocked by #N` patterns. Inline scan skips fenced code blocks to prevent false positives from code examples in issue bodies. Not sourced — executed via `bash lib/parse-deps.sh`. | dev-poll, supervisor-poll |
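As a rough illustration of the `memory_guard()` idea from the `lib/env.sh` row (skip an agent run when available RAM is below a threshold), the check might look like the following. The function name, the use of `MemAvailable`, the 2000 MB default, and the optional meminfo-path argument are all assumptions of this sketch; the real signature in `lib/env.sh` is not shown in this diff.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: return 0 if enough RAM is available, 1 otherwise.
#   $1 = threshold in MB (default 2000, an assumed value)
#   $2 = meminfo file (default /proc/meminfo; overridable for testing)
memory_guard_sketch() {
  local threshold_mb="${1:-2000}"
  local meminfo="${2:-/proc/meminfo}"
  local avail_kb avail_mb

  # MemAvailable is reported in kB on Linux.
  avail_kb=$(awk '/^MemAvailable:/ {print $2; exit}' "$meminfo")
  [ -n "$avail_kb" ] || return 1   # cannot determine RAM: be conservative and skip

  avail_mb=$(( avail_kb / 1024 ))
  if [ "$avail_mb" -lt "$threshold_mb" ]; then
    echo "memory_guard: ${avail_mb} MB available < ${threshold_mb} MB, skipping" >&2
    return 1
  fi
  return 0
}
```

A cron wrapper could then call something like `memory_guard_sketch 2000 || exit 0` before starting a session, so a low-memory host skips the run instead of failing it.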

@ -1,4 +1,4 @@
<!-- last-reviewed: 043bf0f0217aef3f319b844f1a1277acd6327a1c --> <!-- last-reviewed: f32707ba659de278a3af434e3549fb8a8dce9d3a -->
# Planner Agent # Planner Agent
**Role**: Strategic planning using a Prerequisite Tree (Theory of Constraints), **Role**: Strategic planning using a Prerequisite Tree (Theory of Constraints),
@ -74,5 +74,5 @@ prerequisite tree but NOT as issues. This prevents the "spray issues across
all milestones" pattern that produced premature work in planner v1/v2. all milestones" pattern that produced premature work in planner v1/v2.
**Environment variables consumed**: **Environment variables consumed**:
- `FORGE_TOKEN`, `FORGE_REPO`, `FORGE_API`, `PROJECT_NAME`, `PROJECT_REPO_ROOT` - `FORGE_TOKEN`, `FORGE_PLANNER_TOKEN` (falls back to FORGE_TOKEN), `FORGE_REPO`, `FORGE_API`, `PROJECT_NAME`, `PROJECT_REPO_ROOT`
- `PRIMARY_BRANCH`, `CLAUDE_MODEL` (set to opus by planner-run.sh) - `PRIMARY_BRANCH`, `CLAUDE_MODEL` (set to opus by planner-run.sh)

@ -1,4 +1,4 @@
<!-- last-reviewed: 043bf0f0217aef3f319b844f1a1277acd6327a1c --> <!-- last-reviewed: f32707ba659de278a3af434e3549fb8a8dce9d3a -->
# Predictor Agent # Predictor Agent
**Role**: Abstract adversary (the "goblin"). Runs a 2-step formula **Role**: Abstract adversary (the "goblin"). Runs a 2-step formula
@ -41,7 +41,7 @@ RAM < 2000 MB).
interactive session interactive session
**Environment variables consumed**: **Environment variables consumed**:
- `FORGE_TOKEN`, `FORGE_REPO`, `FORGE_API`, `PROJECT_NAME`, `PROJECT_REPO_ROOT` - `FORGE_TOKEN`, `FORGE_PREDICTOR_TOKEN` (falls back to FORGE_TOKEN), `FORGE_REPO`, `FORGE_API`, `PROJECT_NAME`, `PROJECT_REPO_ROOT`
- `PRIMARY_BRANCH`, `CLAUDE_MODEL` (set to sonnet by predictor-run.sh) - `PRIMARY_BRANCH`, `CLAUDE_MODEL` (set to sonnet by predictor-run.sh)
**Lifecycle**: predictor-run.sh (daily 06:00 cron) → lock + memory guard → **Lifecycle**: predictor-run.sh (daily 06:00 cron) → lock + memory guard →

@ -1,4 +1,4 @@
<!-- last-reviewed: 043bf0f0217aef3f319b844f1a1277acd6327a1c --> <!-- last-reviewed: f32707ba659de278a3af434e3549fb8a8dce9d3a -->
# Review Agent # Review Agent
**Role**: AI-powered PR review — post structured findings and formal **Role**: AI-powered PR review — post structured findings and formal

@ -1,4 +1,4 @@
<!-- last-reviewed: f2064ba67c3b6819f5e252300927c01e2825dd7c --> <!-- last-reviewed: f32707ba659de278a3af434e3549fb8a8dce9d3a -->
# Supervisor Agent # Supervisor Agent
**Role**: Health monitoring and auto-remediation, executed as a formula-driven **Role**: Health monitoring and auto-remediation, executed as a formula-driven
@@ -25,7 +25,9 @@ runs directly from cron like the planner and predictor.
 tails, CI pipeline status, open PRs, issue counts, stale worktrees, blocked
 issues. Also performs **stale phase cleanup**: scans `/tmp/*-session-*.phase`
 files for `PHASE:escalate` entries and auto-removes any whose linked issue
-is confirmed closed (24h grace period after closure to avoid races)
+is confirmed closed (24h grace period after closure to avoid races). Reports
+**stale crashed worktrees** (worktrees preserved after crash) — supervisor
+housekeeping removes them after 24h
 - `formulas/run-supervisor.toml` — Execution spec: five steps (preflight review,
 health-assessment, decide-actions, report, journal) with `needs` dependencies.
 Claude evaluates all metrics and takes actions in a single interactive session
@@ -41,7 +43,7 @@ runs directly from cron like the planner and predictor.
 P3 (degraded PRs, circular deps, stale deps), P4 (housekeeping).
 **Environment variables consumed**:
-- `FORGE_TOKEN`, `FORGE_REPO`, `FORGE_API`, `PROJECT_NAME`, `PROJECT_REPO_ROOT`
+- `FORGE_TOKEN`, `FORGE_SUPERVISOR_TOKEN` (falls back to FORGE_TOKEN), `FORGE_REPO`, `FORGE_API`, `PROJECT_NAME`, `PROJECT_REPO_ROOT`
 - `PRIMARY_BRANCH`, `CLAUDE_MODEL` (set to sonnet by supervisor-run.sh)
 - `WOODPECKER_TOKEN`, `WOODPECKER_SERVER`, `WOODPECKER_DB_PASSWORD`, `WOODPECKER_DB_USER`, `WOODPECKER_DB_HOST`, `WOODPECKER_DB_NAME` — CI database queries
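Reviewer note: the stale phase cleanup described in this file's diff (scan `*-session-*.phase` files for `PHASE:escalate`, remove those whose linked issue has been closed for at least 24h) might look roughly like the sketch below. All function names are hypothetical, and `lookup_closed_at` is a placeholder: how a phase file maps to an issue and how the closure time is fetched from the forge are not shown in the diff.

```shell
#!/usr/bin/env bash
# Sketch only: every function name here is hypothetical, not taken from the repo.
GRACE_SECONDS=$((24 * 60 * 60))   # 24h grace period after issue closure

# Returns 0 when an issue closed at $1 (epoch seconds) has been closed
# for at least the grace period as of $2 (epoch seconds).
past_grace_period() {
  local closed_at="$1" now="$2"
  [ $((now - closed_at)) -ge "$GRACE_SECONDS" ]
}

# Placeholder: a real version would ask the forge API when the issue linked
# to phase file $1 was closed, printing epoch seconds, or nothing if still open.
lookup_closed_at() { :; }

# Removes escalated phase files in directory $1 whose linked issue has been
# closed for longer than the grace period, evaluated at time $2 (epoch seconds).
cleanup_stale_escalations() {
  local dir="$1" now="$2" f closed_at
  for f in "$dir"/*-session-*.phase; do
    [ -e "$f" ] || continue                      # glob matched nothing
    grep -q 'PHASE:escalate' "$f" || continue    # only escalated sessions
    closed_at="$(lookup_closed_at "$f")"
    [ -n "$closed_at" ] || continue              # issue still open: keep the file
    if past_grace_period "$closed_at" "$now"; then
      rm -f -- "$f"
    fi
  done
}

# Example call (assumed): cleanup_stale_escalations /tmp "$(date +%s)"
```

Passing the current time in as an argument, rather than calling `date` inside the loop, keeps the grace-period decision deterministic and easy to exercise with fixed timestamps.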

@@ -1,4 +1,4 @@
-<!-- last-reviewed: 043bf0f0217aef3f319b844f1a1277acd6327a1c -->
+<!-- last-reviewed: f32707ba659de278a3af434e3549fb8a8dce9d3a -->
 # Vault Agent
 **Role**: Three-pipeline gate — action safety classification, resource procurement, and human-action drafting.
@@ -28,8 +28,9 @@ needed — the human reviews and publishes directly.
 **Key files**:
 - `vault/vault-poll.sh` — Processes pending items: retry approved, auto-reject after 48h timeout, invoke vault-agent for JSON actions, notify human for procurement requests
 - `vault/vault-agent.sh` — Classifies and routes pending JSON actions via `claude -p`: auto-approve, auto-reject, or escalate to human
+- `vault/vault-env.sh` — Shared env setup for vault sub-scripts: sources `lib/env.sh`, overrides `FORGE_TOKEN` with `FORGE_VAULT_TOKEN`, sets `VAULT_TOKEN` for vault-runner container
 - `vault/PROMPT.md` — System prompt for the vault agent's Claude invocation
-- `vault/vault-fire.sh` — Executes an approved action (JSON) or writes RESOURCES.md entry (procurement MD)
+- `vault/vault-fire.sh` — Executes an approved action (JSON) in an **ephemeral Docker container** with vault-only secrets injected (GITHUB_TOKEN, CLAWHUB_TOKEN — never exposed to agents). For deployment actions, calls `lib/ci-helpers.sh:ci_promote()` to gate production promotes via Woodpecker environments. Writes RESOURCES.md entry for procurement MD approvals.
 - `vault/vault-reject.sh` — Marks a JSON action as rejected
 - `formulas/run-rent-a-human.toml` — Formula for human-action drafts: Claude researches target platform norms, drafts copy-paste content, writes to `vault/outreach/{platform}/drafts/`, notifies human via vault/forge