From f480cbe5d080c44a9501b32c3f31dbc91edc26cd Mon Sep 17 00:00:00 2001 From: openhands Date: Sat, 21 Mar 2026 12:44:23 +0000 Subject: [PATCH] chore: gardener housekeeping 2026-03-21 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Progressive disclosure split of AGENTS.md (487→152 lines): - Extracted per-directory AGENTS.md files for all 8 agents + lib/ - Root AGENTS.md now serves as a table of contents with summary table - All watermarks updated to 16e430e Grooming results: - Promoted #469 (WATCH flow missing curl) and #436 (idle_pane_count bug) to backlog - 12 dust items classified, no groups ripe for bundling yet - No blocked issues, no AD violations --- AGENTS.md | 383 +++---------------------------------------- action/AGENTS.md | 31 ++++ dev/AGENTS.md | 27 +++ gardener/AGENTS.md | 23 +++ lib/AGENTS.md | 18 ++ planner/AGENTS.md | 43 +++++ predictor/AGENTS.md | 42 +++++ review/AGENTS.md | 20 +++ supervisor/AGENTS.md | 53 ++++++ vault/AGENTS.md | 20 +++ 10 files changed, 301 insertions(+), 359 deletions(-) create mode 100644 action/AGENTS.md create mode 100644 dev/AGENTS.md create mode 100644 gardener/AGENTS.md create mode 100644 lib/AGENTS.md create mode 100644 planner/AGENTS.md create mode 100644 predictor/AGENTS.md create mode 100644 review/AGENTS.md create mode 100644 supervisor/AGENTS.md create mode 100644 vault/AGENTS.md diff --git a/AGENTS.md b/AGENTS.md index 54a457e..8dbc47e 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -1,4 +1,4 @@ - + # Disinto — Agent Instructions ## What this repo is @@ -71,304 +71,24 @@ bash dev/phase-test.sh ## Agents -### Dev (`dev/`) +| Agent | Directory | Role | Guide | +|-------|-----------|------|-------| +| Dev | `dev/` | Issue implementation | [dev/AGENTS.md](dev/AGENTS.md) | +| Review | `review/` | PR review | [review/AGENTS.md](review/AGENTS.md) | +| Gardener | `gardener/` | Backlog grooming | [gardener/AGENTS.md](gardener/AGENTS.md) | +| Supervisor | `supervisor/` | Health monitoring | [supervisor/AGENTS.md](supervisor/AGENTS.md) | +| Planner | `planner/` | Strategic planning | [planner/AGENTS.md](planner/AGENTS.md) | +| Predictor | `predictor/` | Infrastructure pattern detection | [predictor/AGENTS.md](predictor/AGENTS.md) | +| Action | `action/` | Operational task execution | [action/AGENTS.md](action/AGENTS.md) | +| Vault | `vault/` | Safety gate for dangerous actions | [vault/AGENTS.md](vault/AGENTS.md) | -**Role**: Implement issues autonomously — write code, push branches, address -CI failures and review feedback. - -**Trigger**: `dev-poll.sh` runs every 10 min via cron. It scans for ready -backlog issues (all deps closed) or orphaned in-progress issues and spawns -`dev-agent.sh `. - -**Key files**: -- `dev/dev-poll.sh` — Cron scheduler: finds next ready issue, handles merge/rebase of approved PRs, tracks CI fix attempts -- `dev/dev-agent.sh` — Orchestrator: claims issue, creates worktree + tmux session with interactive `claude`, monitors phase file, injects CI results and review feedback, merges on approval -- `dev/phase-test.sh` — Integration test for the phase protocol - -**Environment variables consumed** (via `lib/env.sh` + project TOML): -- `CODEBERG_TOKEN` — Dev-agent token (push, PR creation, merge) — use the dedicated bot account -- `CODEBERG_REPO`, `CODEBERG_API` — Target repository -- `PROJECT_NAME`, `PROJECT_REPO_ROOT` — Local checkout path -- `PRIMARY_BRANCH` — Branch to merge into (e.g. `main`, `master`) -- `WOODPECKER_REPO_ID` — CI pipeline lookups -- `CLAUDE_TIMEOUT` — Max seconds for a Claude session (default 7200) -- `MATRIX_TOKEN`, `MATRIX_ROOM_ID`, `MATRIX_HOMESERVER` — Notifications (optional) - -**Lifecycle**: dev-poll.sh → dev-agent.sh → create Matrix thread + export -`MATRIX_THREAD_ID` (streams Claude output to thread via Stop hook) → tmux -`dev-{project}-{issue}` → phase file drives CI/review loop → merge → close issue. - -### Review (`review/`) - -**Role**: AI-powered PR review — post structured findings and formal -approve/request-changes verdicts to Codeberg. - -**Trigger**: `review-poll.sh` runs every 10 min via cron. It scans open PRs -whose CI has passed and that lack a review for the current HEAD SHA, then -spawns `review-pr.sh `. - -**Key files**: -- `review/review-poll.sh` — Cron scheduler: finds unreviewed PRs with passing CI -- `review/review-pr.sh` — Creates/reuses a tmux session (`review-{project}-{pr}`), injects PR diff, waits for Claude to write structured JSON output, posts markdown review + formal Codeberg review, auto-creates follow-up issues for pre-existing tech debt - -**Environment variables consumed**: -- `CODEBERG_TOKEN` — Dev-agent token (must not be the same account as REVIEW_BOT_TOKEN) -- `REVIEW_BOT_TOKEN` — Review-agent token for approvals (use human/admin account; branch protection: in approvals whitelist) -- `CODEBERG_REPO`, `CODEBERG_API`, `PROJECT_NAME`, `PROJECT_REPO_ROOT` -- `PRIMARY_BRANCH`, `WOODPECKER_REPO_ID` -- `MATRIX_TOKEN`, `MATRIX_ROOM_ID`, `MATRIX_HOMESERVER` - -### Gardener (`gardener/`) - -**Role**: Backlog grooming — detect duplicate issues, missing acceptance -criteria, oversized issues, stale issues, and circular dependencies. Invoke -Claude to fix or escalate to a human via Matrix. - -**Trigger**: `gardener-run.sh` runs 2x/day via cron. It files an `action` -issue referencing `formulas/run-gardener.toml`; the [action-agent](#action-action) -picks it up and executes the gardener steps in an interactive Claude tmux session. -Accepts an optional project TOML argument (configures which project the action -issue is filed against). - -**Key files**: -- `gardener/gardener-run.sh` — Cron wrapper: lock, memory guard, dedup check, files action issue -- `gardener/gardener-poll.sh` — Escalation-reply injection for dev sessions, invokes gardener-agent.sh for grooming -- `gardener/gardener-agent.sh` — Orchestrator: bash pre-analysis, creates tmux session (`gardener-{project}`) with interactive `claude`, monitors phase file, parses result file (ACTION:/DUST:/ESCALATE) -- `formulas/run-gardener.toml` — Execution spec: preflight, grooming, dust-bundling, blocked-review, agents-update, commit-and-pr - -**Environment variables consumed**: -- `CODEBERG_TOKEN`, `CODEBERG_REPO`, `CODEBERG_API`, `PROJECT_NAME`, `PROJECT_REPO_ROOT` -- `CLAUDE_TIMEOUT` -- `MATRIX_TOKEN`, `MATRIX_ROOM_ID`, `MATRIX_HOMESERVER` - -### Supervisor (`supervisor/`) - -**Role**: Health monitoring and auto-remediation, executed as a formula-driven -Claude agent. Collects system and project metrics via a bash pre-flight script, -then runs an interactive Claude session (sonnet) that assesses health, auto-fixes -issues, escalates via Matrix, and writes a daily journal. - -**Trigger**: `supervisor-run.sh` runs every 20 min via cron. It creates a tmux -session with `claude --model sonnet`, injects `formulas/run-supervisor.toml` -with pre-collected metrics as context, monitors the phase file, and cleans up -on completion or timeout (20 min max session). No action issues — the supervisor -runs directly from cron like the planner and predictor. - -**Key files**: -- `supervisor/supervisor-run.sh` — Cron wrapper + orchestrator: lock, memory guard, - runs preflight.sh, sources disinto project config, creates tmux session, injects - formula prompt with metrics, monitors phase file, handles crash recovery via - `run_formula_and_monitor` -- `supervisor/preflight.sh` — Data collection: system resources (RAM, disk, swap, - load), Docker status, active tmux sessions + phase files, lock files, agent log - tails, CI pipeline status, open PRs, issue counts, stale worktrees, blocked - issues, Matrix escalation replies -- `formulas/run-supervisor.toml` — Execution spec: five steps (preflight review, - health-assessment, decide-actions, report, journal) with `needs` dependencies. - Claude evaluates all metrics and takes actions in a single interactive session -- `supervisor/journal/*.md` — Daily health logs from each supervisor run (local, - committed periodically) -- `supervisor/PROMPT.md` — Best-practices reference for remediation actions -- `supervisor/best-practices/*.md` — Domain-specific remediation guides (memory, - disk, CI, git, dev-agent, review-agent, codeberg) -- `supervisor/supervisor-poll.sh` — Legacy bash orchestrator (superseded by - supervisor-run.sh + formula) - -**Alert priorities**: P0 (memory crisis), P1 (disk), P2 (factory stopped/stalled), -P3 (degraded PRs, circular deps, stale deps), P4 (housekeeping). - -**Matrix integration**: The supervisor has its own Matrix thread. Posts health -summaries when there are changes, escalates P0-P2 issues, and processes replies -from humans ("ignore disk warning", "kill that agent", "what's stuck?"). The -Matrix listener routes thread replies to `/tmp/supervisor-escalation-reply`, -which `supervisor-run.sh` consumes atomically on each run. - -**Environment variables consumed**: -- `CODEBERG_TOKEN`, `CODEBERG_REPO`, `CODEBERG_API`, `PROJECT_NAME`, `PROJECT_REPO_ROOT` -- `PRIMARY_BRANCH`, `CLAUDE_MODEL` (set to sonnet by supervisor-run.sh) -- `WOODPECKER_TOKEN`, `WOODPECKER_SERVER`, `WOODPECKER_DB_PASSWORD`, `WOODPECKER_DB_USER`, `WOODPECKER_DB_HOST`, `WOODPECKER_DB_NAME` — CI database queries -- `MATRIX_TOKEN`, `MATRIX_ROOM_ID`, `MATRIX_HOMESERVER` — Matrix notifications + human input - -**Lifecycle**: supervisor-run.sh (cron */20) → lock + memory guard → run -preflight.sh (collect metrics) → consume escalation replies → load formula + -context → create tmux session → Claude assesses health, auto-fixes, posts -Matrix summary, writes journal → `PHASE:done`. - -### Planner (`planner/`) - -**Role**: Strategic planning, executed directly from cron via tmux + Claude. -Phase 0 (preflight): pull latest code, load persistent memory from -`planner/MEMORY.md`. Phase 1 (prediction-triage): triage -`prediction/unreviewed` issues filed by the [Predictor](#predictor-planner) — -for each prediction: promote to action, promote to backlog, watch (relabel to -prediction/backlog), or dismiss with reasoning. Promoted predictions compete -with vision gaps for the per-cycle issue limit. Phase 2 (strategic-planning): -resource+leverage gap analysis — reasons about VISION.md, RESOURCES.md, -formula catalog, and project state to create up to 5 total issues (including -promotions) prioritized by leverage. Phase 3 (journal-and-memory): write -daily journal entry (committed to git) and update `planner/MEMORY.md` -(committed to git). Phase 4 (commit-and-pr): one commit with all file -changes, push, create PR. AGENTS.md maintenance is handled by the -[Gardener](#gardener-gardener). - -**Trigger**: `planner-run.sh` runs weekly via cron. It creates a tmux session -with `claude --model opus`, injects `formulas/run-planner.toml` as context, -monitors the phase file, and cleans up on completion or timeout. No action -issues — the planner is a nervous system component, not work. - -**Key files**: -- `planner/planner-run.sh` — Cron wrapper + orchestrator: lock, memory guard, - sources disinto project config, creates tmux session, injects formula prompt, - monitors phase file, handles crash recovery, cleans up -- `formulas/run-planner.toml` — Execution spec: five steps (preflight, - prediction-triage, strategic-planning, journal-and-memory, commit-and-pr) - with `needs` dependencies. Claude executes all steps in a single interactive - session with tool access -- `planner/MEMORY.md` — Persistent memory across runs (committed to git) -- `planner/journal/*.md` — Daily raw logs from each planner run (committed to git) - -**Future direction**: The [Predictor](#predictor-predictor) files prediction issues daily for the planner to triage. The next step is evidence-gated deployment (see `docs/EVIDENCE-ARCHITECTURE.md`): replacing human "ship it" decisions with automated gates across dimensions (holdout, red-team, user-test, evolution fitness, protocol metrics, funnel). Not yet implemented. - -**Environment variables consumed**: -- `CODEBERG_TOKEN`, `CODEBERG_REPO`, `CODEBERG_API`, `PROJECT_NAME`, `PROJECT_REPO_ROOT` -- `PRIMARY_BRANCH`, `CLAUDE_MODEL` (set to opus by planner-run.sh) -- `MATRIX_TOKEN`, `MATRIX_ROOM_ID`, `MATRIX_HOMESERVER` - -### Predictor (`predictor/`) - -**Role**: Infrastructure pattern detection (the "goblin"). Runs a 3-step -formula (preflight → collect-signals → analyze-and-predict) via interactive -tmux Claude session (sonnet). Collects disinto-specific signals: CI pipeline -trends (Woodpecker), stale issues, agent health (tmux sessions + logs), and -resource patterns (RAM, disk, load, containers). Files up to 5 -`prediction/unreviewed` issues for the [Planner](#planner-planner) to triage. -The predictor MUST NOT emit feature work — only observations about CI health, -issue staleness, agent status, and system conditions. - -**Trigger**: `predictor-run.sh` runs daily at 06:00 UTC via cron (1h before -the planner at 07:00). Guarded by PID lock (`/tmp/predictor-run.lock`) and -memory check (skips if available RAM < 2000 MB). - -**Key files**: -- `predictor/predictor-run.sh` — Cron wrapper + orchestrator: lock, memory guard, - sources disinto project config, builds prompt with formula + Codeberg API - reference, creates tmux session (sonnet), monitors phase file, handles crash - recovery via `run_formula_and_monitor` -- `formulas/run-predictor.toml` — Execution spec: three steps (preflight, - collect-signals, analyze-and-predict) with `needs` dependencies. Claude - collects signals and files prediction issues in a single interactive session - -**Supersedes**: The legacy predictor (`planner/prediction-poll.sh` + -`planner/prediction-agent.sh`) used `claude -p` one-shot, read `evidence/` -JSON, and ran hourly. This formula-based predictor replaces it with direct -CI/issues/logs signal collection and interactive Claude sessions, matching the -planner's tmux+formula pattern. - -**Environment variables consumed**: -- `CODEBERG_TOKEN`, `CODEBERG_REPO`, `CODEBERG_API`, `PROJECT_NAME`, `PROJECT_REPO_ROOT` -- `PRIMARY_BRANCH`, `CLAUDE_MODEL` (set to sonnet by predictor-run.sh) -- `WOODPECKER_TOKEN`, `WOODPECKER_SERVER` — CI pipeline trend queries (optional; skipped if unset) -- `MATRIX_TOKEN`, `MATRIX_ROOM_ID`, `MATRIX_HOMESERVER` — Notifications (optional) - -**Lifecycle**: predictor-run.sh (daily 06:00 cron) → lock + memory guard → -load formula + context → create tmux session → Claude collects signals -(CI trends, stale issues, agent health, resources) → dedup against existing -open predictions → file `prediction/unreviewed` issues → `PHASE:done`. -The planner's Phase 1 later triages these predictions. - -### Action (`action/`) - -**Role**: Execute operational tasks described by action formulas — run scripts, -call APIs, send messages, collect human approval. Shares the same phase handler -as the dev-agent: if an action produces code changes, the orchestrator creates a -PR and drives the CI/review loop; otherwise Claude closes the issue directly. - -**Trigger**: `action-poll.sh` runs every 10 min via cron. It scans for open -issues labeled `action` that have no active tmux session, then spawns -`action-agent.sh `. - -**Key files**: -- `action/action-poll.sh` — Cron scheduler: finds open action issues with no active tmux session, spawns action-agent.sh -- `action/action-agent.sh` — Orchestrator: fetches issue body + prior comments, creates tmux session (`action-{issue_num}`) with interactive `claude`, injects formula prompt with phase protocol, enters `monitor_phase_loop` (shared via `dev/phase-handler.sh`) for CI/review lifecycle or direct completion - -**Session lifecycle**: -1. `action-poll.sh` finds open `action` issues with no active tmux session. -2. Spawns `action-agent.sh `. -3. Agent creates Matrix thread, exports `MATRIX_THREAD_ID` so Claude's output streams to the thread via a Stop hook (`on-stop-matrix.sh`). -4. Agent creates tmux session `action-{issue_num}`, injects prompt (formula + prior comments + phase protocol). -5. Agent enters `monitor_phase_loop` (shared with dev-agent via `dev/phase-handler.sh`). -6. **Path A (git output):** Claude pushes branch → `PHASE:awaiting_ci` → handler creates PR, polls CI → injects failures → Claude fixes → push → re-poll → CI passes → `PHASE:awaiting_review` → handler polls reviews → injects REQUEST_CHANGES → Claude fixes → approved → merge → cleanup. -7. **Path B (no git output):** Claude posts results as comment, closes issue → `PHASE:done` → handler cleans up (kill session, docker compose down, remove temp files). -8. For human input: Claude sends a Matrix message and waits; the reply is injected into the session by `matrix_listener.sh`. - -**Environment variables consumed**: -- `CODEBERG_TOKEN`, `CODEBERG_REPO`, `CODEBERG_API`, `PROJECT_NAME`, `CODEBERG_WEB` -- `MATRIX_TOKEN`, `MATRIX_ROOM_ID`, `MATRIX_HOMESERVER` — Matrix notifications + human input -- `ACTION_IDLE_TIMEOUT` — Max seconds before killing idle session (default 14400 = 4h) -- `ACTION_MAX_LIFETIME` — Max total session wall-clock seconds (default 28800 = 8h); caps session independently of idle timeout - ---- - -### Vault (`vault/`) - -**Role**: Safety gate for dangerous or irreversible actions. Actions enter a -pending queue and are classified by Claude via `vault-agent.sh`, which can -auto-approve (call `vault-fire.sh` directly), auto-reject (call -`vault-reject.sh`), or escalate to a human via Matrix for APPROVE/REJECT. - -**Trigger**: `vault-poll.sh` runs every 30 min via cron. - -**Key files**: -- `vault/vault-poll.sh` — Processes pending actions: retry approved, auto-reject after 48h timeout, invoke vault-agent for new items -- `vault/vault-agent.sh` — Classifies and routes pending actions via `claude -p`: auto-approve, auto-reject, or escalate to human -- `vault/PROMPT.md` — System prompt for the vault agent's Claude invocation -- `vault/vault-fire.sh` — Executes an approved action -- `vault/vault-reject.sh` — Marks an action as rejected - -**Environment variables consumed**: -- All from `lib/env.sh` -- `MATRIX_TOKEN`, `MATRIX_ROOM_ID`, `MATRIX_HOMESERVER` — Escalation channel - ---- - -## Shared helpers (`lib/`) - -All agents source `lib/env.sh` as their first action. Additional helpers are -sourced as needed. - -| File | What it provides | Sourced by | -|---|---|---| -| `lib/env.sh` | Loads `.env`, sets `FACTORY_ROOT`, exports project config (`CODEBERG_REPO`, `PROJECT_NAME`, etc.), defines `log()`, `codeberg_api()`, `codeberg_api_all()` (accepts optional second TOKEN parameter, defaults to `$CODEBERG_TOKEN`), `woodpecker_api()`, `wpdb()`, `matrix_send()`, `matrix_send_ctx()`. Auto-loads project TOML if `PROJECT_TOML` is set. | Every agent | -| `lib/ci-helpers.sh` | `ci_passed()` — returns 0 if CI state is "success" (or no CI configured). `is_infra_step()` — returns 0 if a single CI step failure matches infra heuristics (clone/git exit 128, any exit 137, log timeout patterns). `classify_pipeline_failure()` — returns "infra \" if any failed Woodpecker step matches infra heuristics via `is_infra_step()`, else "code". | dev-poll, review-poll, review-pr, supervisor-poll | -| `lib/ci-debug.sh` | CLI tool for Woodpecker CI: `list`, `status`, `logs`, `failures` subcommands. Not sourced — run directly. | Humans / dev-agent (tool access) | -| `lib/load-project.sh` | Parses a `projects/*.toml` file into env vars (`PROJECT_NAME`, `CODEBERG_REPO`, `WOODPECKER_REPO_ID`, monitoring toggles, Matrix config, etc.). | env.sh (when `PROJECT_TOML` is set), supervisor-poll (per-project iteration) | -| `lib/parse-deps.sh` | Extracts dependency issue numbers from an issue body (stdin → stdout, one number per line). Matches `## Dependencies` / `## Depends on` / `## Blocked by` sections and inline `depends on #N` patterns. Not sourced — executed via `bash lib/parse-deps.sh`. | dev-poll, supervisor-poll | -| `lib/matrix_listener.sh` | Long-poll Matrix sync daemon. Dispatches thread replies to the correct agent via well-known files (`/tmp/{agent}-escalation-reply`). Handles supervisor, gardener, dev, review, vault, and action reply routing. Run as systemd service. | Standalone daemon | -| `lib/formula-session.sh` | `acquire_cron_lock()`, `check_memory()`, `load_formula()`, `build_context_block()`, `start_formula_session()`, `formula_phase_callback()`, `build_prompt_footer()`, `run_formula_and_monitor()` — shared helpers for formula-driven cron agents (lock, memory guard, formula loading, prompt assembly, tmux session, monitor loop, crash recovery). | planner-run.sh, predictor-run.sh | -| `lib/secret-scan.sh` | `scan_for_secrets()` — detects potential secrets (API keys, bearer tokens, private keys, URLs with embedded credentials) in text; returns 1 if secrets found. `redact_secrets()` — replaces detected secret patterns with `[REDACTED]`. | file-action-issue.sh, phase-handler.sh | -| `lib/file-action-issue.sh` | `file_action_issue()` — dedup check, secret scan, label lookup, and issue creation for formula-driven cron wrappers. Sets `FILED_ISSUE_NUM` on success. Returns 4 if secrets detected in body. | gardener-run.sh | -| `lib/agent-session.sh` | Shared tmux + Claude session helpers: `create_agent_session()`, `inject_formula()`, `agent_wait_for_claude_ready()`, `agent_inject_into_session()`, `agent_kill_session()`, `monitor_phase_loop()`, `read_phase()`, `write_compact_context()`. `create_agent_session(session, workdir, [phase_file])` optionally installs a PostToolUse hook (matcher `Bash\|Write`) that detects phase file writes in real-time — when Claude writes to the phase file, the hook writes a marker so `monitor_phase_loop` reacts on the next poll instead of waiting for mtime changes. Also installs a StopFailure hook (matcher `rate_limit\|server_error\|authentication_failed\|billing_error`) that writes `PHASE:failed` with an `api_error` reason to the phase file and touches the phase-changed marker, so the orchestrator discovers API errors within one poll cycle instead of waiting for idle timeout. Also installs a SessionStart hook (matcher `compact`) that re-injects phase protocol instructions after context compaction — callers write the context file via `write_compact_context(phase_file, content)`, and the hook (`on-compact-reinject.sh`) outputs the file content to stdout so Claude retains critical instructions. When `MATRIX_THREAD_ID` is exported, also installs a Stop hook (`on-stop-matrix.sh`) that streams each Claude response to the Matrix thread. `monitor_phase_loop` sets `_MONITOR_LOOP_EXIT` to one of: `done`, `idle_timeout`, `idle_prompt` (Claude returned to `❯` for 3 consecutive polls without writing any phase — callback invoked with `PHASE:failed`, session already dead), `crashed`, or a `PHASE:*` string. **Callers must handle `idle_prompt`** in both their callback and their post-loop exit handler — see [`docs/PHASE-PROTOCOL.md` § idle_prompt](docs/PHASE-PROTOCOL.md#idle_prompt-exit-reason) for the full contract. | dev-agent.sh, gardener-agent.sh, action-agent.sh | +See [lib/AGENTS.md](lib/AGENTS.md) for the full shared helper reference. --- ## Issue lifecycle and label conventions -Issues flow through these states: - -``` - [created] - │ - ▼ - backlog ← Ready for the dev-agent to pick up - │ - ▼ - in-progress ← Dev-agent has claimed the issue (backlog label removed) - │ - ├── PR created → CI runs → review → merge - │ - ▼ - closed ← PR merged, issue closed automatically by dev-poll -``` +Issues flow: `backlog` → `in-progress` → PR → CI → review → merge → `closed`. ### Labels @@ -388,17 +108,9 @@ Issues flow through these states: ### Dependency conventions Issues declare dependencies in their body using a `## Dependencies` or -`## Depends on` section listing `#N` references: - -```markdown -## Dependencies -- #42 -- #55 -``` - -The dev-poll scheduler uses `lib/parse-deps.sh` to extract these and only -picks issues whose dependencies are all closed. The supervisor detects -circular dependency chains and stale dependencies (open > 30 days). +`## Depends on` section listing `#N` references. The dev-poll scheduler uses +`lib/parse-deps.sh` to extract these and only picks issues whose dependencies +are all closed. ### Single-threaded pipeline @@ -427,61 +139,14 @@ Humans write these. Agents read and enforce them. --- -## Phase-Signaling Protocol (for persistent tmux sessions) +## Phase-Signaling Protocol -When running as a **persistent tmux session** (issue #80+), Claude must signal -the orchestrator at each phase boundary by writing to a well-known file. +When running as a persistent tmux session, Claude must signal the orchestrator +at each phase boundary by writing to a phase file (e.g. +`/tmp/dev-session-{project}-{issue}.phase`). -### Phase file path +Key phases: `PHASE:awaiting_ci` → `PHASE:awaiting_review` → `PHASE:done`. +Also: `PHASE:needs_human`, `PHASE:failed`. -``` -/tmp/dev-session-{project}-{issue}.phase -``` - -### Required phase sentinels - -Write exactly one of these lines (with `>`, not `>>`) when a phase ends: - -```bash -PHASE_FILE="/tmp/dev-session-${PROJECT_NAME:-project}-${ISSUE:-0}.phase" - -# After pushing a PR branch — waiting for CI -echo "PHASE:awaiting_ci" > "$PHASE_FILE" - -# After CI passes — waiting for review -echo "PHASE:awaiting_review" > "$PHASE_FILE" - -# Blocked on human decision (ambiguous spec, architectural question) -echo "PHASE:needs_human" > "$PHASE_FILE" - -# PR is merged and issue is done -echo "PHASE:done" > "$PHASE_FILE" - -# Unrecoverable failure -printf 'PHASE:failed\nReason: %s\n' "describe what failed" > "$PHASE_FILE" -``` - -### When to write each phase - -1. **After `git push origin $BRANCH`** → write `PHASE:awaiting_ci` -2. **After receiving "CI passed" injection** → write `PHASE:awaiting_review` -3. **After receiving review feedback** → address it, push, write `PHASE:awaiting_review` -4. **After receiving "Approved" injection** → merge (or wait for orchestrator to merge), write `PHASE:done` -5. **When stuck on human-only decision** → write `PHASE:needs_human`, then wait for input -6. **When a step fails unrecoverably** → write `PHASE:failed` - -### Crash recovery - -If this session was restarted after a crash, the orchestrator will inject: -- The issue body -- `git diff` of work completed before the crash -- The last known phase -- Any CI results or review comments - -Read that context, then resume from where you left off. The git worktree is -the checkpoint — your code changes survived the crash. - -### Full protocol reference - -See `docs/PHASE-PROTOCOL.md` for the complete spec including the orchestrator -reaction matrix and sequence diagram. +See [docs/PHASE-PROTOCOL.md](docs/PHASE-PROTOCOL.md) for the complete spec +including the orchestrator reaction matrix, sequence diagram, and crash recovery. diff --git a/action/AGENTS.md b/action/AGENTS.md new file mode 100644 index 0000000..07da47a --- /dev/null +++ b/action/AGENTS.md @@ -0,0 +1,31 @@ + +# Action Agent + +**Role**: Execute operational tasks described by action formulas — run scripts, +call APIs, send messages, collect human approval. Shares the same phase handler +as the dev-agent: if an action produces code changes, the orchestrator creates a +PR and drives the CI/review loop; otherwise Claude closes the issue directly. + +**Trigger**: `action-poll.sh` runs every 10 min via cron. It scans for open +issues labeled `action` that have no active tmux session, then spawns +`action-agent.sh `. + +**Key files**: +- `action/action-poll.sh` — Cron scheduler: finds open action issues with no active tmux session, spawns action-agent.sh +- `action/action-agent.sh` — Orchestrator: fetches issue body + prior comments, creates tmux session (`action-{issue_num}`) with interactive `claude`, injects formula prompt with phase protocol, enters `monitor_phase_loop` (shared via `dev/phase-handler.sh`) for CI/review lifecycle or direct completion + +**Session lifecycle**: +1. `action-poll.sh` finds open `action` issues with no active tmux session. +2. Spawns `action-agent.sh `. +3. Agent creates Matrix thread, exports `MATRIX_THREAD_ID` so Claude's output streams to the thread via a Stop hook (`on-stop-matrix.sh`). +4. Agent creates tmux session `action-{issue_num}`, injects prompt (formula + prior comments + phase protocol). +5. Agent enters `monitor_phase_loop` (shared with dev-agent via `dev/phase-handler.sh`). +6. **Path A (git output):** Claude pushes branch → `PHASE:awaiting_ci` → handler creates PR, polls CI → injects failures → Claude fixes → push → re-poll → CI passes → `PHASE:awaiting_review` → handler polls reviews → injects REQUEST_CHANGES → Claude fixes → approved → merge → cleanup. +7. **Path B (no git output):** Claude posts results as comment, closes issue → `PHASE:done` → handler cleans up (kill session, docker compose down, remove temp files). +8. For human input: Claude sends a Matrix message and waits; the reply is injected into the session by `matrix_listener.sh`. + +**Environment variables consumed**: +- `CODEBERG_TOKEN`, `CODEBERG_REPO`, `CODEBERG_API`, `PROJECT_NAME`, `CODEBERG_WEB` +- `MATRIX_TOKEN`, `MATRIX_ROOM_ID`, `MATRIX_HOMESERVER` — Matrix notifications + human input +- `ACTION_IDLE_TIMEOUT` — Max seconds before killing idle session (default 14400 = 4h) +- `ACTION_MAX_LIFETIME` — Max total session wall-clock seconds (default 28800 = 8h); caps session independently of idle timeout diff --git a/dev/AGENTS.md b/dev/AGENTS.md new file mode 100644 index 0000000..990cebb --- /dev/null +++ b/dev/AGENTS.md @@ -0,0 +1,27 @@ + +# Dev Agent + +**Role**: Implement issues autonomously — write code, push branches, address +CI failures and review feedback. + +**Trigger**: `dev-poll.sh` runs every 10 min via cron. It scans for ready +backlog issues (all deps closed) or orphaned in-progress issues and spawns +`dev-agent.sh `. + +**Key files**: +- `dev/dev-poll.sh` — Cron scheduler: finds next ready issue, handles merge/rebase of approved PRs, tracks CI fix attempts +- `dev/dev-agent.sh` — Orchestrator: claims issue, creates worktree + tmux session with interactive `claude`, monitors phase file, injects CI results and review feedback, merges on approval +- `dev/phase-test.sh` — Integration test for the phase protocol + +**Environment variables consumed** (via `lib/env.sh` + project TOML): +- `CODEBERG_TOKEN` — Dev-agent token (push, PR creation, merge) — use the dedicated bot account +- `CODEBERG_REPO`, `CODEBERG_API` — Target repository +- `PROJECT_NAME`, `PROJECT_REPO_ROOT` — Local checkout path +- `PRIMARY_BRANCH` — Branch to merge into (e.g. `main`, `master`) +- `WOODPECKER_REPO_ID` — CI pipeline lookups +- `CLAUDE_TIMEOUT` — Max seconds for a Claude session (default 7200) +- `MATRIX_TOKEN`, `MATRIX_ROOM_ID`, `MATRIX_HOMESERVER` — Notifications (optional) + +**Lifecycle**: dev-poll.sh → dev-agent.sh → create Matrix thread + export +`MATRIX_THREAD_ID` (streams Claude output to thread via Stop hook) → tmux +`dev-{project}-{issue}` → phase file drives CI/review loop → merge → close issue. diff --git a/gardener/AGENTS.md b/gardener/AGENTS.md new file mode 100644 index 0000000..b704c57 --- /dev/null +++ b/gardener/AGENTS.md @@ -0,0 +1,23 @@ + +# Gardener Agent + +**Role**: Backlog grooming — detect duplicate issues, missing acceptance +criteria, oversized issues, stale issues, and circular dependencies. Invoke +Claude to fix or escalate to a human via Matrix. + +**Trigger**: `gardener-run.sh` runs 2x/day via cron. It files an `action` +issue referencing `formulas/run-gardener.toml`; the action-agent picks it up +and executes the gardener steps in an interactive Claude tmux session. Accepts +an optional project TOML argument (configures which project the action issue is +filed against). + +**Key files**: +- `gardener/gardener-run.sh` — Cron wrapper: lock, memory guard, dedup check, files action issue +- `gardener/gardener-poll.sh` — Escalation-reply injection for dev sessions, invokes gardener-agent.sh for grooming +- `gardener/gardener-agent.sh` — Orchestrator: bash pre-analysis, creates tmux session (`gardener-{project}`) with interactive `claude`, monitors phase file, parses result file (ACTION:/DUST:/ESCALATE) +- `formulas/run-gardener.toml` — Execution spec: preflight, grooming, dust-bundling, blocked-review, agents-update, commit-and-pr + +**Environment variables consumed**: +- `CODEBERG_TOKEN`, `CODEBERG_REPO`, `CODEBERG_API`, `PROJECT_NAME`, `PROJECT_REPO_ROOT` +- `CLAUDE_TIMEOUT` +- `MATRIX_TOKEN`, `MATRIX_ROOM_ID`, `MATRIX_HOMESERVER` diff --git a/lib/AGENTS.md b/lib/AGENTS.md new file mode 100644 index 0000000..6586590 --- /dev/null +++ b/lib/AGENTS.md @@ -0,0 +1,18 @@ + +# Shared Helpers (`lib/`) + +All agents source `lib/env.sh` as their first action. Additional helpers are +sourced as needed. + +| File | What it provides | Sourced by | +|---|---|---| +| `lib/env.sh` | Loads `.env`, sets `FACTORY_ROOT`, exports project config (`CODEBERG_REPO`, `PROJECT_NAME`, etc.), defines `log()`, `codeberg_api()`, `codeberg_api_all()` (accepts optional second TOKEN parameter, defaults to `$CODEBERG_TOKEN`), `woodpecker_api()`, `wpdb()`, `matrix_send()`, `matrix_send_ctx()`. Auto-loads project TOML if `PROJECT_TOML` is set. | Every agent | +| `lib/ci-helpers.sh` | `ci_passed()` — returns 0 if CI state is "success" (or no CI configured). `is_infra_step()` — returns 0 if a single CI step failure matches infra heuristics (clone/git exit 128, any exit 137, log timeout patterns). `classify_pipeline_failure()` — returns "infra \" if any failed Woodpecker step matches infra heuristics via `is_infra_step()`, else "code". | dev-poll, review-poll, review-pr, supervisor-poll | +| `lib/ci-debug.sh` | CLI tool for Woodpecker CI: `list`, `status`, `logs`, `failures` subcommands. Not sourced — run directly. | Humans / dev-agent (tool access) | +| `lib/load-project.sh` | Parses a `projects/*.toml` file into env vars (`PROJECT_NAME`, `CODEBERG_REPO`, `WOODPECKER_REPO_ID`, monitoring toggles, Matrix config, etc.). | env.sh (when `PROJECT_TOML` is set), supervisor-poll (per-project iteration) | +| `lib/parse-deps.sh` | Extracts dependency issue numbers from an issue body (stdin → stdout, one number per line). Matches `## Dependencies` / `## Depends on` / `## Blocked by` sections and inline `depends on #N` patterns. Not sourced — executed via `bash lib/parse-deps.sh`. | dev-poll, supervisor-poll | +| `lib/matrix_listener.sh` | Long-poll Matrix sync daemon. Dispatches thread replies to the correct agent via well-known files (`/tmp/{agent}-escalation-reply`). Handles supervisor, gardener, dev, review, vault, and action reply routing. Run as systemd service. | Standalone daemon | +| `lib/formula-session.sh` | `acquire_cron_lock()`, `check_memory()`, `load_formula()`, `build_context_block()`, `start_formula_session()`, `formula_phase_callback()`, `build_prompt_footer()`, `run_formula_and_monitor()` — shared helpers for formula-driven cron agents (lock, memory guard, formula loading, prompt assembly, tmux session, monitor loop, crash recovery). | planner-run.sh, predictor-run.sh | +| `lib/secret-scan.sh` | `scan_for_secrets()` — detects potential secrets (API keys, bearer tokens, private keys, URLs with embedded credentials) in text; returns 1 if secrets found. `redact_secrets()` — replaces detected secret patterns with `[REDACTED]`. | file-action-issue.sh, phase-handler.sh | +| `lib/file-action-issue.sh` | `file_action_issue()` — dedup check, secret scan, label lookup, and issue creation for formula-driven cron wrappers. Sets `FILED_ISSUE_NUM` on success. Returns 4 if secrets detected in body. | gardener-run.sh | +| `lib/agent-session.sh` | Shared tmux + Claude session helpers: `create_agent_session()`, `inject_formula()`, `agent_wait_for_claude_ready()`, `agent_inject_into_session()`, `agent_kill_session()`, `monitor_phase_loop()`, `read_phase()`, `write_compact_context()`. `create_agent_session(session, workdir, [phase_file])` optionally installs a PostToolUse hook (matcher `Bash\|Write`) that detects phase file writes in real-time — when Claude writes to the phase file, the hook writes a marker so `monitor_phase_loop` reacts on the next poll instead of waiting for mtime changes. Also installs a StopFailure hook (matcher `rate_limit\|server_error\|authentication_failed\|billing_error`) that writes `PHASE:failed` with an `api_error` reason to the phase file and touches the phase-changed marker, so the orchestrator discovers API errors within one poll cycle instead of waiting for idle timeout. Also installs a SessionStart hook (matcher `compact`) that re-injects phase protocol instructions after context compaction — callers write the context file via `write_compact_context(phase_file, content)`, and the hook (`on-compact-reinject.sh`) outputs the file content to stdout so Claude retains critical instructions. When `MATRIX_THREAD_ID` is exported, also installs a Stop hook (`on-stop-matrix.sh`) that streams each Claude response to the Matrix thread. `monitor_phase_loop` sets `_MONITOR_LOOP_EXIT` to one of: `done`, `idle_timeout`, `idle_prompt` (Claude returned to `>` for 3 consecutive polls without writing any phase — callback invoked with `PHASE:failed`, session already dead), `crashed`, or a `PHASE:*` string. **Callers must handle `idle_prompt`** in both their callback and their post-loop exit handler — see [`docs/PHASE-PROTOCOL.md` idle_prompt](docs/PHASE-PROTOCOL.md#idle_prompt-exit-reason) for the full contract. | dev-agent.sh, gardener-agent.sh, action-agent.sh | diff --git a/planner/AGENTS.md b/planner/AGENTS.md new file mode 100644 index 0000000..0594eef --- /dev/null +++ b/planner/AGENTS.md @@ -0,0 +1,43 @@ + +# Planner Agent + +**Role**: Strategic planning, executed directly from cron via tmux + Claude. +Phase 0 (preflight): pull latest code, load persistent memory from +`planner/MEMORY.md`. Phase 1 (prediction-triage): triage +`prediction/unreviewed` issues filed by the Predictor — for each prediction: +promote to action, promote to backlog, watch (relabel to prediction/backlog), +or dismiss with reasoning. Promoted predictions compete with vision gaps for +the per-cycle issue limit. Phase 2 (strategic-planning): resource+leverage gap +analysis — reasons about VISION.md, RESOURCES.md, formula catalog, and project +state to create up to 5 total issues (including promotions) prioritized by +leverage. Phase 3 (journal-and-memory): write daily journal entry (committed to +git) and update `planner/MEMORY.md` (committed to git). Phase 4 (commit-and-pr): +one commit with all file changes, push, create PR. AGENTS.md maintenance is +handled by the Gardener. + +**Trigger**: `planner-run.sh` runs weekly via cron. It creates a tmux session +with `claude --model opus`, injects `formulas/run-planner.toml` as context, +monitors the phase file, and cleans up on completion or timeout. No action +issues — the planner is a nervous system component, not work. + +**Key files**: +- `planner/planner-run.sh` — Cron wrapper + orchestrator: lock, memory guard, + sources disinto project config, creates tmux session, injects formula prompt, + monitors phase file, handles crash recovery, cleans up +- `formulas/run-planner.toml` — Execution spec: five steps (preflight, + prediction-triage, strategic-planning, journal-and-memory, commit-and-pr) + with `needs` dependencies. Claude executes all steps in a single interactive + session with tool access +- `planner/MEMORY.md` — Persistent memory across runs (committed to git) +- `planner/journal/*.md` — Daily raw logs from each planner run (committed to git) + +**Future direction**: The Predictor files prediction issues daily for the planner +to triage. The next step is evidence-gated deployment (see +`docs/EVIDENCE-ARCHITECTURE.md`): replacing human "ship it" decisions with +automated gates across dimensions (holdout, red-team, user-test, evolution +fitness, protocol metrics, funnel). Not yet implemented. + +**Environment variables consumed**: +- `CODEBERG_TOKEN`, `CODEBERG_REPO`, `CODEBERG_API`, `PROJECT_NAME`, `PROJECT_REPO_ROOT` +- `PRIMARY_BRANCH`, `CLAUDE_MODEL` (set to opus by planner-run.sh) +- `MATRIX_TOKEN`, `MATRIX_ROOM_ID`, `MATRIX_HOMESERVER` diff --git a/predictor/AGENTS.md b/predictor/AGENTS.md new file mode 100644 index 0000000..115e3b7 --- /dev/null +++ b/predictor/AGENTS.md @@ -0,0 +1,42 @@ + +# Predictor Agent + +**Role**: Infrastructure pattern detection (the "goblin"). Runs a 3-step +formula (preflight → collect-signals → analyze-and-predict) via interactive +tmux Claude session (sonnet). Collects disinto-specific signals: CI pipeline +trends (Woodpecker), stale issues, agent health (tmux sessions + logs), and +resource patterns (RAM, disk, load, containers). Files up to 5 +`prediction/unreviewed` issues for the Planner to triage. The predictor MUST +NOT emit feature work — only observations about CI health, issue staleness, +agent status, and system conditions. + +**Trigger**: `predictor-run.sh` runs daily at 06:00 UTC via cron (1h before +the planner at 07:00). Guarded by PID lock (`/tmp/predictor-run.lock`) and +memory check (skips if available RAM < 2000 MB). + +**Key files**: +- `predictor/predictor-run.sh` — Cron wrapper + orchestrator: lock, memory guard, + sources disinto project config, builds prompt with formula + Codeberg API + reference, creates tmux session (sonnet), monitors phase file, handles crash + recovery via `run_formula_and_monitor` +- `formulas/run-predictor.toml` — Execution spec: three steps (preflight, + collect-signals, analyze-and-predict) with `needs` dependencies. Claude + collects signals and files prediction issues in a single interactive session + +**Supersedes**: The legacy predictor (`planner/prediction-poll.sh` + +`planner/prediction-agent.sh`) used `claude -p` one-shot, read `evidence/` +JSON, and ran hourly. This formula-based predictor replaces it with direct +CI/issues/logs signal collection and interactive Claude sessions, matching the +planner's tmux+formula pattern. + +**Environment variables consumed**: +- `CODEBERG_TOKEN`, `CODEBERG_REPO`, `CODEBERG_API`, `PROJECT_NAME`, `PROJECT_REPO_ROOT` +- `PRIMARY_BRANCH`, `CLAUDE_MODEL` (set to sonnet by predictor-run.sh) +- `WOODPECKER_TOKEN`, `WOODPECKER_SERVER` — CI pipeline trend queries (optional; skipped if unset) +- `MATRIX_TOKEN`, `MATRIX_ROOM_ID`, `MATRIX_HOMESERVER` — Notifications (optional) + +**Lifecycle**: predictor-run.sh (daily 06:00 cron) → lock + memory guard → +load formula + context → create tmux session → Claude collects signals +(CI trends, stale issues, agent health, resources) → dedup against existing +open predictions → file `prediction/unreviewed` issues → `PHASE:done`. +The planner's Phase 1 later triages these predictions. diff --git a/review/AGENTS.md b/review/AGENTS.md new file mode 100644 index 0000000..15962cb --- /dev/null +++ b/review/AGENTS.md @@ -0,0 +1,20 @@ + +# Review Agent + +**Role**: AI-powered PR review — post structured findings and formal +approve/request-changes verdicts to Codeberg. + +**Trigger**: `review-poll.sh` runs every 10 min via cron. It scans open PRs +whose CI has passed and that lack a review for the current HEAD SHA, then +spawns `review-pr.sh `. + +**Key files**: +- `review/review-poll.sh` — Cron scheduler: finds unreviewed PRs with passing CI +- `review/review-pr.sh` — Creates/reuses a tmux session (`review-{project}-{pr}`), injects PR diff, waits for Claude to write structured JSON output, posts markdown review + formal Codeberg review, auto-creates follow-up issues for pre-existing tech debt + +**Environment variables consumed**: +- `CODEBERG_TOKEN` — Dev-agent token (must not be the same account as REVIEW_BOT_TOKEN) +- `REVIEW_BOT_TOKEN` — Review-agent token for approvals (use human/admin account; branch protection: in approvals whitelist) +- `CODEBERG_REPO`, `CODEBERG_API`, `PROJECT_NAME`, `PROJECT_REPO_ROOT` +- `PRIMARY_BRANCH`, `WOODPECKER_REPO_ID` +- `MATRIX_TOKEN`, `MATRIX_ROOM_ID`, `MATRIX_HOMESERVER` diff --git a/supervisor/AGENTS.md b/supervisor/AGENTS.md new file mode 100644 index 0000000..06deed7 --- /dev/null +++ b/supervisor/AGENTS.md @@ -0,0 +1,53 @@ + +# Supervisor Agent + +**Role**: Health monitoring and auto-remediation, executed as a formula-driven +Claude agent. Collects system and project metrics via a bash pre-flight script, +then runs an interactive Claude session (sonnet) that assesses health, auto-fixes +issues, escalates via Matrix, and writes a daily journal. + +**Trigger**: `supervisor-run.sh` runs every 20 min via cron. It creates a tmux +session with `claude --model sonnet`, injects `formulas/run-supervisor.toml` +with pre-collected metrics as context, monitors the phase file, and cleans up +on completion or timeout (20 min max session). No action issues — the supervisor +runs directly from cron like the planner and predictor. + +**Key files**: +- `supervisor/supervisor-run.sh` — Cron wrapper + orchestrator: lock, memory guard, + runs preflight.sh, sources disinto project config, creates tmux session, injects + formula prompt with metrics, monitors phase file, handles crash recovery via + `run_formula_and_monitor` +- `supervisor/preflight.sh` — Data collection: system resources (RAM, disk, swap, + load), Docker status, active tmux sessions + phase files, lock files, agent log + tails, CI pipeline status, open PRs, issue counts, stale worktrees, blocked + issues, Matrix escalation replies +- `formulas/run-supervisor.toml` — Execution spec: five steps (preflight review, + health-assessment, decide-actions, report, journal) with `needs` dependencies. + Claude evaluates all metrics and takes actions in a single interactive session +- `supervisor/journal/*.md` — Daily health logs from each supervisor run (local, + committed periodically) +- `supervisor/PROMPT.md` — Best-practices reference for remediation actions +- `supervisor/best-practices/*.md` — Domain-specific remediation guides (memory, + disk, CI, git, dev-agent, review-agent, codeberg) +- `supervisor/supervisor-poll.sh` — Legacy bash orchestrator (superseded by + supervisor-run.sh + formula) + +**Alert priorities**: P0 (memory crisis), P1 (disk), P2 (factory stopped/stalled), +P3 (degraded PRs, circular deps, stale deps), P4 (housekeeping). + +**Matrix integration**: The supervisor has its own Matrix thread. Posts health +summaries when there are changes, escalates P0-P2 issues, and processes replies +from humans ("ignore disk warning", "kill that agent", "what's stuck?"). The +Matrix listener routes thread replies to `/tmp/supervisor-escalation-reply`, +which `supervisor-run.sh` consumes atomically on each run. + +**Environment variables consumed**: +- `CODEBERG_TOKEN`, `CODEBERG_REPO`, `CODEBERG_API`, `PROJECT_NAME`, `PROJECT_REPO_ROOT` +- `PRIMARY_BRANCH`, `CLAUDE_MODEL` (set to sonnet by supervisor-run.sh) +- `WOODPECKER_TOKEN`, `WOODPECKER_SERVER`, `WOODPECKER_DB_PASSWORD`, `WOODPECKER_DB_USER`, `WOODPECKER_DB_HOST`, `WOODPECKER_DB_NAME` — CI database queries +- `MATRIX_TOKEN`, `MATRIX_ROOM_ID`, `MATRIX_HOMESERVER` — Matrix notifications + human input + +**Lifecycle**: supervisor-run.sh (cron */20) → lock + memory guard → run +preflight.sh (collect metrics) → consume escalation replies → load formula + +context → create tmux session → Claude assesses health, auto-fixes, posts +Matrix summary, writes journal → `PHASE:done`. diff --git a/vault/AGENTS.md b/vault/AGENTS.md new file mode 100644 index 0000000..992f5e9 --- /dev/null +++ b/vault/AGENTS.md @@ -0,0 +1,20 @@ + +# Vault Agent + +**Role**: Safety gate for dangerous or irreversible actions. Actions enter a +pending queue and are classified by Claude via `vault-agent.sh`, which can +auto-approve (call `vault-fire.sh` directly), auto-reject (call +`vault-reject.sh`), or escalate to a human via Matrix for APPROVE/REJECT. + +**Trigger**: `vault-poll.sh` runs every 30 min via cron. + +**Key files**: +- `vault/vault-poll.sh` — Processes pending actions: retry approved, auto-reject after 48h timeout, invoke vault-agent for new items +- `vault/vault-agent.sh` — Classifies and routes pending actions via `claude -p`: auto-approve, auto-reject, or escalate to human +- `vault/PROMPT.md` — System prompt for the vault agent's Claude invocation +- `vault/vault-fire.sh` — Executes an approved action +- `vault/vault-reject.sh` — Marks an action as rejected + +**Environment variables consumed**: +- All from `lib/env.sh` +- `MATRIX_TOKEN`, `MATRIX_ROOM_ID`, `MATRIX_HOMESERVER` — Escalation channel