From b8dc01b06f7d36c94d1933b8a895b2da3ccdcd3e Mon Sep 17 00:00:00 2001 From: openhands Date: Wed, 25 Mar 2026 00:07:52 +0000 Subject: [PATCH] chore: gardener housekeeping 2026-03-25 --- AGENTS.md | 4 ++-- action/AGENTS.md | 9 +++++---- dev/AGENTS.md | 29 +++++++++++++++++------------ gardener/AGENTS.md | 16 +++++++++------- gardener/pending-actions.json | 10 ++++++++++ lib/AGENTS.md | 5 ++++- planner/AGENTS.md | 12 +++++++----- predictor/AGENTS.md | 18 +++++++++++------- review/AGENTS.md | 6 +++--- supervisor/AGENTS.md | 12 +++++++----- vault/AGENTS.md | 2 +- 11 files changed, 76 insertions(+), 47 deletions(-) diff --git a/AGENTS.md b/AGENTS.md index 51d34b6..db61b8e 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -1,4 +1,4 @@ - + # Disinto — Agent Instructions ## What this repo is @@ -26,7 +26,7 @@ disinto/ │ supervisor-poll.sh — legacy bash orchestrator (superseded) ├── vault/ vault-poll.sh, vault-agent.sh, vault-fire.sh — action gating + procurement ├── action/ action-poll.sh, action-agent.sh — operational task execution -├── lib/ env.sh, agent-session.sh, ci-helpers.sh, ci-debug.sh, load-project.sh, parse-deps.sh, matrix_listener.sh +├── lib/ env.sh, agent-session.sh, ci-helpers.sh, ci-debug.sh, load-project.sh, parse-deps.sh, matrix_listener.sh, guard.sh, mirrors.sh, build-graph.py ├── projects/ *.toml.example — templates; *.toml — local per-box config (gitignored) ├── formulas/ Issue templates (TOML specs for multi-step agent tasks) └── docs/ Protocol docs (PHASE-PROTOCOL.md, EVIDENCE-ARCHITECTURE.md) diff --git a/action/AGENTS.md b/action/AGENTS.md index 464497e..b0b3629 100644 --- a/action/AGENTS.md +++ b/action/AGENTS.md @@ -1,4 +1,4 @@ - + # Action Agent **Role**: Execute operational tasks described by action formulas — run scripts, @@ -6,9 +6,10 @@ call APIs, send messages, collect human approval. 
Shares the same phase handler as the dev-agent: if an action produces code changes, the orchestrator creates a PR and drives the CI/review loop; otherwise Claude closes the issue directly. -**Trigger**: `action-poll.sh` runs every 10 min via cron. It scans for open -issues labeled `action` that have no active tmux session, then spawns -`action-agent.sh `. +**Trigger**: `action-poll.sh` runs every 10 min via cron. Sources `lib/guard.sh` +and calls `check_active action` first — skips if `$FACTORY_ROOT/state/.action-active` +is absent. Then scans for open issues labeled `action` that have no active tmux +session, and spawns `action-agent.sh `. **Key files**: - `action/action-poll.sh` — Cron scheduler: finds open action issues with no active tmux session, spawns action-agent.sh diff --git a/dev/AGENTS.md b/dev/AGENTS.md index 4e29440..cf37fd8 100644 --- a/dev/AGENTS.md +++ b/dev/AGENTS.md @@ -1,21 +1,22 @@ - + # Dev Agent **Role**: Implement issues autonomously — write code, push branches, address CI failures and review feedback. -**Trigger**: `dev-poll.sh` runs every 10 min via cron. It performs a direct-merge -scan first (approved + CI green PRs — including chore/gardener PRs without issue -numbers), then checks the agent lock and scans for ready issues using a two-tier -priority queue: (1) `priority`+`backlog` issues first (FIFO within tier), then -(2) plain `backlog` issues (FIFO). Orphaned in-progress issues are also picked up. -The direct-merge scan runs before the lock check so approved PRs get merged even -while a dev-agent session is active on another issue. +**Trigger**: `dev-poll.sh` runs every 10 min via cron. Sources `lib/guard.sh` and +calls `check_active dev` first — skips if `$FACTORY_ROOT/state/.dev-active` is +absent. 
Then performs a direct-merge scan (approved + CI green PRs — including +chore/gardener PRs without issue numbers), then checks the agent lock and scans +for ready issues using a two-tier priority queue: (1) `priority`+`backlog` issues +first (FIFO within tier), then (2) plain `backlog` issues (FIFO). Orphaned +in-progress issues are also picked up. The direct-merge scan runs before the lock +check so approved PRs get merged even while a dev-agent session is active. **Key files**: - `dev/dev-poll.sh` — Cron scheduler: finds next ready issue, handles merge/rebase of approved PRs, tracks CI fix attempts - `dev/dev-agent.sh` — Orchestrator: claims issue, creates worktree + tmux session with interactive `claude`, monitors phase file, injects CI results and review feedback, merges on approval -- `dev/phase-handler.sh` — Phase callback functions: `post_refusal_comment()`, `_on_phase_change()`, `build_phase_protocol_prompt()`. `do_merge()` detects already-merged PRs on HTTP 405 (race with dev-poll's pre-lock scan) and returns success instead of escalating +- `dev/phase-handler.sh` — Phase callback functions: `post_refusal_comment()`, `_on_phase_change()`, `build_phase_protocol_prompt()`. `do_merge()` detects already-merged PRs on HTTP 405 (race with dev-poll's pre-lock scan) and returns success instead of escalating. Sources `lib/mirrors.sh` and calls `mirror_push()` after every successful merge. Matrix escalation notifications include `MATRIX_MENTION_USER` HTML mention when set. - `dev/phase-test.sh` — Integration test for the phase protocol **Environment variables consumed** (via `lib/env.sh` + project TOML): @@ -27,6 +28,10 @@ while a dev-agent session is active on another issue. 
- `CLAUDE_TIMEOUT` — Max seconds for a Claude session (default 7200) - `MATRIX_TOKEN`, `MATRIX_ROOM_ID`, `MATRIX_HOMESERVER` — Notifications (optional) -**Lifecycle**: dev-poll.sh → dev-agent.sh → create Matrix thread + export -`MATRIX_THREAD_ID` (streams Claude output to thread via Stop hook) → tmux -`dev-{project}-{issue}` → phase file drives CI/review loop → merge → close issue. +**Lifecycle**: dev-poll.sh (`check_active dev`) → dev-agent.sh → create Matrix +thread + export `MATRIX_THREAD_ID` → tmux `dev-{project}-{issue}` → phase file +drives CI/review loop → merge + `mirror_push()` → close issue. On respawn after +`PHASE:escalate`, the stale phase file is cleared first so the session starts +clean; the reinject prompt tells Claude not to re-escalate for the same reason. +On respawn for any active PR, the prompt explicitly tells Claude the PR already +exists and not to create a new one via API. diff --git a/gardener/AGENTS.md b/gardener/AGENTS.md index 11c3df2..f179c1a 100644 --- a/gardener/AGENTS.md +++ b/gardener/AGENTS.md @@ -1,4 +1,4 @@ - + # Gardener Agent **Role**: Backlog grooming — detect duplicate issues, missing acceptance @@ -7,11 +7,13 @@ the quality gate: strips the `backlog` label from issues that lack acceptance criteria checkboxes (`- [ ]`) or an `## Affected files` section. Invokes Claude to fix or escalate to a human via Matrix. -**Trigger**: `gardener-run.sh` runs 4x/day via cron. It creates a tmux -session with `claude --model sonnet`, injects `formulas/run-gardener.toml` -with escalation replies as context, monitors the phase file, and cleans up -on completion or timeout (2h max session). No action issues — the gardener -runs directly from cron like the planner, predictor, and supervisor. +**Trigger**: `gardener-run.sh` runs 4x/day via cron. Sources `lib/guard.sh` and +calls `check_active gardener` first — skips if `$FACTORY_ROOT/state/.gardener-active` +is absent. 
Then creates a tmux session with `claude --model sonnet`, injects +`formulas/run-gardener.toml` with escalation replies as context, monitors the +phase file, and cleans up on completion or timeout (2h max session). No action +issues — the gardener runs directly from cron like the planner, predictor, and +supervisor. **Key files**: - `gardener/gardener-run.sh` — Cron wrapper + orchestrator: lock, memory guard, @@ -32,7 +34,7 @@ runs directly from cron like the planner, predictor, and supervisor. - `PRIMARY_BRANCH`, `CLAUDE_MODEL` (set to sonnet by gardener-run.sh) - `MATRIX_TOKEN`, `MATRIX_ROOM_ID`, `MATRIX_HOMESERVER` -**Lifecycle**: gardener-run.sh (cron 0,6,12,18) → lock + memory guard → +**Lifecycle**: gardener-run.sh (cron 0,6,12,18) → `check_active gardener` → lock + memory guard → consume escalation replies → load formula + context → create tmux session → Claude grooms backlog (writes proposed actions to manifest), bundles dust, reviews blocked issues, updates AGENTS.md, commits manifest + docs to PR → diff --git a/gardener/pending-actions.json b/gardener/pending-actions.json index 99f3ea6..87ec531 100644 --- a/gardener/pending-actions.json +++ b/gardener/pending-actions.json @@ -1,4 +1,14 @@ [ + { + "action": "edit_body", + "issue": 619, + "body": "Depends on: *Containerize full stack with docker-compose*\r\n\r\n## Problem\r\n\r\nDendrite currently runs on the bare host as a systemd service (`dendrite.service`), manually installed and configured. The `matrix_listener.sh` daemon also runs on the host via its own systemd unit (`matrix_listener.service`), hardcoded to `/home/admin/disinto`. 
This is the last piece of the stack that isn't containerized — Forgejo, Woodpecker, and the agents are inside compose, but the Matrix homeserver sits outside.\r\n\r\nThe result: `disinto init` can't provision Matrix automatically, the listener systemd unit has a hardcoded user and path, and if someone sets up disinto fresh on a new VPS they need to install Dendrite manually, create users, create rooms, and wire everything together before notifications and human-in-the-loop escalation work.\r\n\r\n## Solution\r\n\r\nAdd Dendrite as a fourth service in `docker-compose.yml`. Provision the bot user, coordination room, and access token during `disinto init`. Move the `matrix_listener.sh` daemon into the agent container's entrypoint alongside cron.\r\n\r\n## Scope\r\n\r\n### 1. Dendrite service in docker-compose.yml\r\n\r\n```yaml\r\n dendrite:\r\n image: matrixdotorg/dendrite-monolith:latest\r\n restart: unless-stopped\r\n volumes:\r\n - dendrite-data:/etc/dendrite\r\n environment:\r\n DENDRITE_DOMAIN: disinto.local\r\n networks:\r\n - disinto-net\r\n```\r\n\r\nNo host ports exposed — the agents talk to Dendrite over the internal Docker network at `http://dendrite:8008`. There's no need for federation or external Matrix clients unless the user explicitly wants to connect their own Matrix client (e.g. Element), in which case they can add a port mapping themselves.\r\n\r\n### 2. 
Provisioning in `disinto init`\r\n\r\nAfter Dendrite is healthy, `disinto init` creates:\r\n\r\n- A server signing key (Dendrite generates this on first start if missing)\r\n- A bot user via Dendrite's admin API (`POST /_dendrite/admin/createOrModifyAccount` or `create-account` CLI tool via `docker compose exec dendrite`)\r\n- A coordination room via the Matrix client-server API (`POST /_matrix/client/v3/createRoom`)\r\n- An access token for the bot (via login: `POST /_matrix/client/v3/login`)\r\n\r\nStore the resulting `MATRIX_TOKEN`, `MATRIX_ROOM_ID`, and `MATRIX_BOT_USER` in `.env` (or `.env.enc` if SOPS is available). Set `MATRIX_HOMESERVER=http://dendrite:8008` — this URL only needs to resolve inside the Docker network.\r\n\r\nFor the interactive case: after creating the room, print the room alias or ID so the user can join from their own Matrix client (Element, etc.) to receive notifications and reply to escalations. If they don't have a Matrix client, the factory still works — escalations just go unanswered until they check manually.\r\n\r\n### 3. Move `matrix_listener.sh` into agent container\r\n\r\nThe listener is currently a systemd service on the host. In the compose setup it runs inside the agent container as a background process alongside cron. Update `docker/agents/entrypoint.sh`:\r\n\r\n```bash\r\n#!/bin/bash\r\n# Start matrix listener in background (if configured)\r\nif [ -n \"${MATRIX_TOKEN:-}\" ] && [ -n \"${MATRIX_ROOM_ID:-}\" ]; then\r\n /home/agent/disinto/lib/matrix_listener.sh &\r\nfi\r\n\r\n# Start cron in foreground\r\nexec cron -f\r\n```\r\n\r\nRemove the pidfile guard in `matrix_listener.sh` (line 24–31) or make it work with the container lifecycle — inside a container the PID file from a previous run doesn't exist. The trap on EXIT already cleans up.\r\n\r\n### 4. Remove `matrix_listener.service`\r\n\r\nThe systemd unit file at `lib/matrix_listener.service` becomes dead code once the listener runs inside the agent container. 
Keep it for bare-metal deployments (`disinto init --bare`) but document it as the legacy path.\r\n\r\n### 5. Update `MATRIX_HOMESERVER` default\r\n\r\nIn `.env.example`, change the default from `http://localhost:8008` to `http://dendrite:8008`. In `lib/env.sh`, the default should detect the environment:\r\n\r\n- Inside a container (compose): `MATRIX_HOMESERVER` defaults to `http://dendrite:8008`\r\n- On bare metal: defaults to `http://localhost:8008`\r\n\r\nThis can use the same container detection from the compose issue (e.g. checking for `/.dockerenv` or a `DISINTO_COMPOSE=1` env var set in the compose file).\r\n\r\n### 6. Per-project Matrix rooms (optional enhancement)\r\n\r\nThe current setup uses one room for all projects, with project-specific thread maps for routing. This works fine inside compose — no change needed. But `disinto init` for a new project could optionally create a per-project room and store it in the project TOML under `[matrix] room_id`. The listener already dispatches by project name via the thread map, so per-project rooms would just reduce noise.\r\n\r\nThis is optional — single-room works, document multi-room as a possible configuration.\r\n\r\n## Affected files\r\n\r\n- `docker-compose.yml` (generated) — add `dendrite` service and `dendrite-data` volume\r\n- `docker/agents/entrypoint.sh` — start `matrix_listener.sh` as background process\r\n- `bin/disinto` — Dendrite provisioning (bot user, room, token) during init\r\n- `.env.example` — update `MATRIX_HOMESERVER` default, document compose vs bare-metal\r\n- `lib/matrix_listener.sh` — make pidfile guard container-friendly (no stale PID from previous container)\r\n- `lib/matrix_listener.service` — keep for `--bare` mode, document as legacy\r\n\r\n## Not in scope\r\n\r\n- Matrix federation with external homeservers\r\n- End-to-end encryption for the coordination room (Dendrite supports it, but agent bots don't need it for an internal channel)\r\n- Element or other Matrix client setup 
(user's responsibility)\r\n- Replacing Matrix with a different notification system\r\n- Migrating existing Matrix room history into the containerized Dendrite\r\n\r\n## Acceptance criteria\r\n\r\n- `disinto init` provisions Dendrite, creates a bot user and coordination room, and stores credentials in `.env`\r\n- Agent notifications (`matrix_send`) work via `http://dendrite:8008` inside the Docker network\r\n- `matrix_listener.sh` runs inside the agent container and dispatches escalation replies to agent sessions\r\n- Dendrite is only reachable from within `disinto-net` — no host ports exposed by default\r\n- Users can join the coordination room from an external Matrix client by adding a port mapping to compose and joining via room alias\r\n- The factory works end-to-end without Matrix configured (all `matrix_send` calls already guard on `[ -z \"${MATRIX_TOKEN:-}\" ]`)\r\n" + }, + { + "action": "add_label", + "issue": 619, + "label": "backlog" + }, { "action": "edit_body", "issue": 614, diff --git a/lib/AGENTS.md b/lib/AGENTS.md index d3cc65d..6eb9367 100644 --- a/lib/AGENTS.md +++ b/lib/AGENTS.md @@ -1,4 +1,4 @@ - + # Shared Helpers (`lib/`) All agents source `lib/env.sh` as their first action. Additional helpers are @@ -13,6 +13,9 @@ sourced as needed. | `lib/parse-deps.sh` | Extracts dependency issue numbers from an issue body (stdin → stdout, one number per line). Matches `## Dependencies` / `## Depends on` / `## Blocked by` sections and inline `depends on #N` / `blocked by #N` patterns. Inline scan skips fenced code blocks to prevent false positives from code examples in issue bodies. Not sourced — executed via `bash lib/parse-deps.sh`. | dev-poll, supervisor-poll | | `lib/matrix_listener.sh` | Long-poll Matrix sync daemon. Dispatches thread replies to the correct agent via tmux session injection (dev, action, vault, review) or well-known files (`/tmp/{agent}-escalation-reply` for supervisor/gardener). Handles all agent reply routing. Run as systemd service. 
| Standalone daemon | | `lib/formula-session.sh` | `acquire_cron_lock()`, `check_memory()`, `load_formula()`, `build_context_block()`, `consume_escalation_reply()`, `start_formula_session()`, `formula_phase_callback()`, `build_prompt_footer()`, `run_formula_and_monitor(AGENT [TIMEOUT] [CALLBACK])` — shared helpers for formula-driven cron agents (lock, memory guard, formula loading, prompt assembly, tmux session, monitor loop, crash recovery). `formula_phase_callback()` handles `PHASE:escalate` (unified escalation path — kills the session; callers may follow up via Matrix). `run_formula_and_monitor` accepts an optional CALLBACK (default: `formula_phase_callback`) so callers can install custom merge-through or escalation handlers. | planner-run.sh, predictor-run.sh, gardener-run.sh, supervisor-run.sh, dev-agent.sh, action-agent.sh | +| `lib/guard.sh` | `check_active(agent_name)` — reads `$FACTORY_ROOT/state/.{agent_name}-active`; exits 0 (skip) if the file is absent. Factory is off by default — state files must be created to enable each agent. Sourced by dev-poll.sh, review-poll.sh, action-poll.sh, gardener-run.sh, planner-run.sh, predictor-run.sh, supervisor-run.sh. | cron entry points | +| `lib/mirrors.sh` | `mirror_push()` — pushes `$PRIMARY_BRANCH` + tags to all configured mirror remotes (fire-and-forget background pushes). Reads `MIRROR_NAMES` and `MIRROR_*` vars exported by `load-project.sh` from the `[mirrors]` TOML section. Failures are logged but never block the pipeline. Sourced by dev-poll.sh and dev/phase-handler.sh — called after every successful merge. | dev-poll.sh, phase-handler.sh | +| `lib/build-graph.py` | Python tool: parses VISION.md, prerequisite-tree.md, AGENTS.md, formulas/*.toml, evidence/, and forge issues/labels into a NetworkX DiGraph. Runs structural analyses (orphaned objectives, stale prerequisites, thin evidence, circular deps) and outputs a JSON report.
Used by `review-pr.sh` (per-PR changed-file analysis) and `predictor-run.sh` (full-project analysis) to provide structural context to Claude. | review-pr.sh, predictor-run.sh | | `lib/secret-scan.sh` | `scan_for_secrets()` — detects potential secrets (API keys, bearer tokens, private keys, URLs with embedded credentials) in text; returns 1 if secrets found. `redact_secrets()` — replaces detected secret patterns with `[REDACTED]`. | file-action-issue.sh, phase-handler.sh | | `lib/file-action-issue.sh` | `file_action_issue()` — dedup check, secret scan, label lookup, and issue creation for formula-driven cron wrappers. Sets `FILED_ISSUE_NUM` on success. Returns 4 if secrets detected in body. | (available for future use) | | `lib/agent-session.sh` | Shared tmux + Claude session helpers: `create_agent_session()`, `inject_formula()`, `agent_wait_for_claude_ready()`, `agent_inject_into_session()`, `agent_kill_session()`, `monitor_phase_loop()`, `read_phase()`, `write_compact_context()`. `create_agent_session(session, workdir, [phase_file])` optionally installs a PostToolUse hook (matcher `Bash\|Write`) that detects phase file writes in real-time — when Claude writes to the phase file, the hook writes a marker so `monitor_phase_loop` reacts on the next poll instead of waiting for mtime changes. Also installs a StopFailure hook (matcher `rate_limit\|server_error\|authentication_failed\|billing_error`) that writes `PHASE:failed` with an `api_error` reason to the phase file and touches the phase-changed marker, so the orchestrator discovers API errors within one poll cycle instead of waiting for idle timeout. Also installs a SessionStart hook (matcher `compact`) that re-injects phase protocol instructions after context compaction — callers write the context file via `write_compact_context(phase_file, content)`, and the hook (`on-compact-reinject.sh`) outputs the file content to stdout so Claude retains critical instructions. 
When `MATRIX_THREAD_ID` is exported, also installs a Stop hook (`on-stop-matrix.sh`) that streams each Claude response to the Matrix thread. When `phase_file` is set, passes it to the idle stop hook (`on-idle-stop.sh`) so the hook can **nudge Claude** (up to 2 times) if Claude returns to the prompt without writing to the phase file — the hook injects a tmux reminder asking Claude to signal PHASE:done or PHASE:awaiting_ci. The PreToolUse guard hook (`on-pretooluse-guard.sh`) receives the session name as a third argument — formula agents (`gardener-*`, `planner-*`, `predictor-*`, `supervisor-*`) are identified this way and allowed to access `FACTORY_ROOT` from worktrees (they need env.sh, AGENTS.md, formulas/, lib/). `monitor_phase_loop` sets `_MONITOR_LOOP_EXIT` to one of: `done`, `idle_timeout`, `idle_prompt` (Claude returned to `>` for 3 consecutive polls without writing any phase — callback invoked with `PHASE:failed`, session already dead), `crashed`, or `PHASE:escalate` / other `PHASE:*` string. **Unified escalation**: `PHASE:escalate` is the signal that a session needs human input (renamed from `PHASE:needs_human`). **Callers must handle `idle_prompt`** in both their callback and their post-loop exit handler — see [`docs/PHASE-PROTOCOL.md` idle_prompt](docs/PHASE-PROTOCOL.md#idle_prompt-exit-reason) for the full contract. | dev-agent.sh, action-agent.sh | diff --git a/planner/AGENTS.md b/planner/AGENTS.md index 7c3a9f0..abd1654 100644 --- a/planner/AGENTS.md +++ b/planner/AGENTS.md @@ -1,4 +1,4 @@ - + # Planner Agent **Role**: Strategic planning using a Prerequisite Tree (Theory of Constraints), @@ -31,10 +31,12 @@ and `$PROJECT_REPO_ROOT/vault/`, not `$FACTORY_ROOT`. Each project manages its own planner state independently. **Trigger**: `planner-run.sh` runs daily via cron (accepts an optional project -TOML argument, defaults to `projects/disinto.toml`). 
It creates a tmux session -with `claude --model opus`, injects `formulas/run-planner.toml` as context, -monitors the phase file, and cleans up on completion or timeout. No action -issues — the planner is a nervous system component, not work. +TOML argument, defaults to `projects/disinto.toml`). Sources `lib/guard.sh` and +calls `check_active planner` first — skips if `$FACTORY_ROOT/state/.planner-active` +is absent. Then creates a tmux session with `claude --model opus`, injects +`formulas/run-planner.toml` as context, monitors the phase file, and cleans up +on completion or timeout. No action issues — the planner is a nervous system +component, not work. **Key files**: - `planner/planner-run.sh` — Cron wrapper + orchestrator: lock, memory guard, diff --git a/predictor/AGENTS.md b/predictor/AGENTS.md index 5e79206..2ba726f 100644 --- a/predictor/AGENTS.md +++ b/predictor/AGENTS.md @@ -1,4 +1,4 @@ - + # Predictor Agent **Role**: Abstract adversary (the "goblin"). Runs a 2-step formula @@ -23,14 +23,18 @@ emit feature work — only observations challenging claims, exposing gaps, and surfacing risks. **Trigger**: `predictor-run.sh` runs daily at 06:00 UTC via cron (1h before -the planner at 07:00). Guarded by PID lock (`/tmp/predictor-run.lock`) and -memory check (skips if available RAM < 2000 MB). +the planner at 07:00). Sources `lib/guard.sh` and calls `check_active predictor` +first — skips if `$FACTORY_ROOT/state/.predictor-active` is absent. Also guarded +by PID lock (`/tmp/predictor-run.lock`) and memory check (skips if available +RAM < 2000 MB). 
**Key files**: -- `predictor/predictor-run.sh` — Cron wrapper + orchestrator: lock, memory guard, - sources disinto project config, builds prompt with formula + forge API - reference, creates tmux session (sonnet), monitors phase file, handles crash - recovery via `run_formula_and_monitor` +- `predictor/predictor-run.sh` — Cron wrapper + orchestrator: active-state guard, + lock, memory guard, sources disinto project config, builds structural analysis + graph via `lib/build-graph.py` (full-project scan — results included in prompt + as `## Structural analysis`; failures non-fatal), builds prompt with formula + + forge API reference, creates tmux session (sonnet), monitors phase file, handles + crash recovery via `run_formula_and_monitor` - `formulas/run-predictor.toml` — Execution spec: two steps (preflight, find-weakness-and-act) with `needs` dependencies. Claude reviews prediction history, explores/exploits weaknesses, and files issues in a single diff --git a/review/AGENTS.md b/review/AGENTS.md index 0156c32..ba0a91c 100644 --- a/review/AGENTS.md +++ b/review/AGENTS.md @@ -1,4 +1,4 @@ - + # Review Agent **Role**: AI-powered PR review — post structured findings and formal @@ -9,8 +9,8 @@ whose CI has passed and that lack a review for the current HEAD SHA, then spawns `review-pr.sh `. **Key files**: -- `review/review-poll.sh` — Cron scheduler: finds unreviewed PRs with passing CI -- `review/review-pr.sh` — Creates/reuses a tmux session (`review-{project}-{pr}`), injects PR diff, waits for Claude to write structured JSON output, posts markdown review + formal forge review, auto-creates follow-up issues for pre-existing tech debt +- `review/review-poll.sh` — Cron scheduler: finds unreviewed PRs with passing CI. Sources `lib/guard.sh` and calls `check_active reviewer` — skips if `$FACTORY_ROOT/state/.reviewer-active` is absent. 
+- `review/review-pr.sh` — Creates/reuses a tmux session (`review-{project}-{pr}`), injects PR diff, waits for Claude to write structured JSON output, posts markdown review + formal forge review, auto-creates follow-up issues for pre-existing tech debt. Before starting the session, runs `lib/build-graph.py --changed-files ` and appends the JSON structural analysis (affected objectives, orphaned prerequisites, thin evidence) to the review prompt. Graph failures are non-fatal — review proceeds without it. **Environment variables consumed**: - `FORGE_TOKEN` — Dev-agent token (must not be the same account as FORGE_REVIEW_TOKEN) diff --git a/supervisor/AGENTS.md b/supervisor/AGENTS.md index 12a53bc..46d1198 100644 --- a/supervisor/AGENTS.md +++ b/supervisor/AGENTS.md @@ -1,4 +1,4 @@ - + # Supervisor Agent **Role**: Health monitoring and auto-remediation, executed as a formula-driven @@ -6,10 +6,12 @@ Claude agent. Collects system and project metrics via a bash pre-flight script, then runs an interactive Claude session (sonnet) that assesses health, auto-fixes issues, escalates via Matrix, and writes a daily journal. -**Trigger**: `supervisor-run.sh` runs every 20 min via cron. It creates a tmux -session with `claude --model sonnet`, injects `formulas/run-supervisor.toml` -with pre-collected metrics as context, monitors the phase file, and cleans up -on completion or timeout (20 min max session). No action issues — the supervisor +**Trigger**: `supervisor-run.sh` runs every 20 min via cron. Sources `lib/guard.sh` +and calls `check_active supervisor` first — skips if +`$FACTORY_ROOT/state/.supervisor-active` is absent. Then creates a tmux session +with `claude --model sonnet`, injects `formulas/run-supervisor.toml` with +pre-collected metrics as context, monitors the phase file, and cleans up on +completion or timeout (20 min max session). No action issues — the supervisor runs directly from cron like the planner and predictor. 
**Key files**: diff --git a/vault/AGENTS.md b/vault/AGENTS.md index 92c0634..13b2edc 100644 --- a/vault/AGENTS.md +++ b/vault/AGENTS.md @@ -1,4 +1,4 @@ - + # Vault Agent **Role**: Dual-purpose gate — action safety classification and resource procurement.
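
Reviewer note: the `check_active` guard that this patch documents for `lib/guard.sh` could look roughly like the sketch below. It is a hypothetical illustration written only from the description in the `lib/AGENTS.md` hunk (state file at `$FACTORY_ROOT/state/.{agent_name}-active`, exit 0 when absent); the log message and the `FACTORY_ROOT` sanity check are assumptions, not the repository's actual code.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of lib/guard.sh, reconstructed from the AGENTS.md description.

# check_active AGENT_NAME
# Ends the calling script with status 0 (a clean skip) when the agent's state
# file is absent; returns 0 so the caller carries on when the file exists.
check_active() {
  local agent_name="$1"
  # Assumption: FACTORY_ROOT is exported (e.g. by lib/env.sh) before this runs.
  local state_file="${FACTORY_ROOT:?FACTORY_ROOT must be set}/state/.${agent_name}-active"

  if [ ! -f "$state_file" ]; then
    # Factory is off by default: no state file means this agent is disabled.
    echo "check_active: ${agent_name} disabled (${state_file} missing), skipping" >&2
    exit 0
  fi
  return 0
}
```

Under these assumptions, an agent would be enabled with `mkdir -p "$FACTORY_ROOT/state" && touch "$FACTORY_ROOT/state/.dev-active"` and disabled again by removing that file. Exiting with status 0 rather than a failure code keeps a disabled agent from showing up as a cron error.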