chore: gardener housekeeping 2026-03-25

2026-03-25 00:07:52 +00:00 · 2026-03-25 00:07:52 +00:00 · b8dc01b06f
commit b8dc01b06f
parent 6afc7f183f
11 changed files with 76 additions and 47 deletions
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -1,4 +1,4 @@
-<!-- last-reviewed: a2016db5c35ee3429ebaa212192983a03c4e4cb8 -->
+<!-- last-reviewed: 6afc7f183ffd831edae1a6c3f9d92e2094f2b998 -->
 # Disinto — Agent Instructions

 ## What this repo is
@@ -26,7 +26,7 @@ disinto/
 │                  supervisor-poll.sh — legacy bash orchestrator (superseded)
 ├── vault/         vault-poll.sh, vault-agent.sh, vault-fire.sh — action gating + procurement
 ├── action/        action-poll.sh, action-agent.sh — operational task execution
-├── lib/           env.sh, agent-session.sh, ci-helpers.sh, ci-debug.sh, load-project.sh, parse-deps.sh, matrix_listener.sh
+├── lib/           env.sh, agent-session.sh, ci-helpers.sh, ci-debug.sh, load-project.sh, parse-deps.sh, matrix_listener.sh, guard.sh, mirrors.sh, build-graph.py
 ├── projects/      *.toml.example — templates; *.toml — local per-box config (gitignored)
 ├── formulas/      Issue templates (TOML specs for multi-step agent tasks)
 └── docs/          Protocol docs (PHASE-PROTOCOL.md, EVIDENCE-ARCHITECTURE.md)
--- a/action/AGENTS.md
+++ b/action/AGENTS.md
@@ -1,4 +1,4 @@
-<!-- last-reviewed: a2016db5c35ee3429ebaa212192983a03c4e4cb8 -->
+<!-- last-reviewed: 6afc7f183ffd831edae1a6c3f9d92e2094f2b998 -->
 # Action Agent

 **Role**: Execute operational tasks described by action formulas — run scripts,
@@ -6,9 +6,10 @@ call APIs, send messages, collect human approval. Shares the same phase handler
 as the dev-agent: if an action produces code changes, the orchestrator creates a
 PR and drives the CI/review loop; otherwise Claude closes the issue directly.

-**Trigger**: `action-poll.sh` runs every 10 min via cron. It scans for open
-issues labeled `action` that have no active tmux session, then spawns
-`action-agent.sh <issue-number>`.
+**Trigger**: `action-poll.sh` runs every 10 min via cron. Sources `lib/guard.sh`
+and calls `check_active action` first — skips if `$FACTORY_ROOT/state/.action-active`
+is absent. Then scans for open issues labeled `action` that have no active tmux
+session, and spawns `action-agent.sh <issue-number>`.

 **Key files**:
 - `action/action-poll.sh` — Cron scheduler: finds open action issues with no active tmux session, spawns action-agent.sh
--- a/dev/AGENTS.md
+++ b/dev/AGENTS.md
@@ -1,21 +1,22 @@
-<!-- last-reviewed: a2016db5c35ee3429ebaa212192983a03c4e4cb8 -->
+<!-- last-reviewed: 6afc7f183ffd831edae1a6c3f9d92e2094f2b998 -->
 # Dev Agent

 **Role**: Implement issues autonomously — write code, push branches, address
 CI failures and review feedback.

-**Trigger**: `dev-poll.sh` runs every 10 min via cron. It performs a direct-merge
-scan first (approved + CI green PRs — including chore/gardener PRs without issue
-numbers), then checks the agent lock and scans for ready issues using a two-tier
-priority queue: (1) `priority`+`backlog` issues first (FIFO within tier), then
-(2) plain `backlog` issues (FIFO). Orphaned in-progress issues are also picked up.
-The direct-merge scan runs before the lock check so approved PRs get merged even
-while a dev-agent session is active on another issue.
+**Trigger**: `dev-poll.sh` runs every 10 min via cron. Sources `lib/guard.sh` and
+calls `check_active dev` first — skips if `$FACTORY_ROOT/state/.dev-active` is
+absent. Then performs a direct-merge scan (approved + CI green PRs — including
+chore/gardener PRs without issue numbers), then checks the agent lock and scans
+for ready issues using a two-tier priority queue: (1) `priority`+`backlog` issues
+first (FIFO within tier), then (2) plain `backlog` issues (FIFO). Orphaned
+in-progress issues are also picked up. The direct-merge scan runs before the lock
+check so approved PRs get merged even while a dev-agent session is active.

 **Key files**:
 - `dev/dev-poll.sh` — Cron scheduler: finds next ready issue, handles merge/rebase of approved PRs, tracks CI fix attempts
 - `dev/dev-agent.sh` — Orchestrator: claims issue, creates worktree + tmux session with interactive `claude`, monitors phase file, injects CI results and review feedback, merges on approval
- `dev/phase-handler.sh` — Phase callback functions: `post_refusal_comment()`, `_on_phase_change()`, `build_phase_protocol_prompt()`. `do_merge()` detects already-merged PRs on HTTP 405 (race with dev-poll's pre-lock scan) and returns success instead of escalating
+- `dev/phase-handler.sh` — Phase callback functions: `post_refusal_comment()`, `_on_phase_change()`, `build_phase_protocol_prompt()`. `do_merge()` detects already-merged PRs on HTTP 405 (race with dev-poll's pre-lock scan) and returns success instead of escalating. Sources `lib/mirrors.sh` and calls `mirror_push()` after every successful merge. Matrix escalation notifications include `MATRIX_MENTION_USER` HTML mention when set.
 - `dev/phase-test.sh` — Integration test for the phase protocol

 **Environment variables consumed** (via `lib/env.sh` + project TOML):
@@ -27,6 +28,10 @@ while a dev-agent session is active on another issue.
 - `CLAUDE_TIMEOUT` — Max seconds for a Claude session (default 7200)
 - `MATRIX_TOKEN`, `MATRIX_ROOM_ID`, `MATRIX_HOMESERVER` — Notifications (optional)

-**Lifecycle**: dev-poll.sh → dev-agent.sh → create Matrix thread + export
-`MATRIX_THREAD_ID` (streams Claude output to thread via Stop hook) → tmux
-`dev-{project}-{issue}` → phase file drives CI/review loop → merge → close issue.
+**Lifecycle**: dev-poll.sh (`check_active dev`) → dev-agent.sh → create Matrix
+thread + export `MATRIX_THREAD_ID` → tmux `dev-{project}-{issue}` → phase file
+drives CI/review loop → merge + `mirror_push()` → close issue. On respawn after
+`PHASE:escalate`, the stale phase file is cleared first so the session starts
+clean; the reinject prompt tells Claude not to re-escalate for the same reason.
+On respawn for any active PR, the prompt explicitly tells Claude the PR already
+exists and not to create a new one via API.
--- a/gardener/AGENTS.md
+++ b/gardener/AGENTS.md
@@ -1,4 +1,4 @@
-<!-- last-reviewed: a2016db5c35ee3429ebaa212192983a03c4e4cb8 -->
+<!-- last-reviewed: 6afc7f183ffd831edae1a6c3f9d92e2094f2b998 -->
 # Gardener Agent

 **Role**: Backlog grooming — detect duplicate issues, missing acceptance
@@ -7,11 +7,13 @@ the quality gate: strips the `backlog` label from issues that lack acceptance
 criteria checkboxes (`- [ ]`) or an `## Affected files` section. Invokes
 Claude to fix or escalate to a human via Matrix.

-**Trigger**: `gardener-run.sh` runs 4x/day via cron. It creates a tmux
-session with `claude --model sonnet`, injects `formulas/run-gardener.toml`
-with escalation replies as context, monitors the phase file, and cleans up
-on completion or timeout (2h max session). No action issues — the gardener
-runs directly from cron like the planner, predictor, and supervisor.
+**Trigger**: `gardener-run.sh` runs 4x/day via cron. Sources `lib/guard.sh` and
+calls `check_active gardener` first — skips if `$FACTORY_ROOT/state/.gardener-active`
+is absent. Then creates a tmux session with `claude --model sonnet`, injects
+`formulas/run-gardener.toml` with escalation replies as context, monitors the
+phase file, and cleans up on completion or timeout (2h max session). No action
+issues — the gardener runs directly from cron like the planner, predictor, and
+supervisor.

 **Key files**:
 - `gardener/gardener-run.sh` — Cron wrapper + orchestrator: lock, memory guard,
@@ -32,7 +34,7 @@ runs directly from cron like the planner, predictor, and supervisor.
 - `PRIMARY_BRANCH`, `CLAUDE_MODEL` (set to sonnet by gardener-run.sh)
 - `MATRIX_TOKEN`, `MATRIX_ROOM_ID`, `MATRIX_HOMESERVER`

-**Lifecycle**: gardener-run.sh (cron 0,6,12,18) → lock + memory guard →
+**Lifecycle**: gardener-run.sh (cron 0,6,12,18) → `check_active gardener` → lock + memory guard →
 consume escalation replies → load formula + context → create tmux session →
 Claude grooms backlog (writes proposed actions to manifest), bundles dust,
 reviews blocked issues, updates AGENTS.md, commits manifest + docs to PR →
--- a/gardener/pending-actions.json
+++ b/gardener/pending-actions.json
--- a/lib/AGENTS.md
+++ b/lib/AGENTS.md
@@ -1,4 +1,4 @@
-<!-- last-reviewed: a2016db5c35ee3429ebaa212192983a03c4e4cb8 -->
+<!-- last-reviewed: 6afc7f183ffd831edae1a6c3f9d92e2094f2b998 -->
 # Shared Helpers (`lib/`)

 All agents source `lib/env.sh` as their first action. Additional helpers are
@@ -13,6 +13,9 @@ sourced as needed.
 | `lib/parse-deps.sh` | Extracts dependency issue numbers from an issue body (stdin → stdout, one number per line). Matches `## Dependencies` / `## Depends on` / `## Blocked by` sections and inline `depends on #N` / `blocked by #N` patterns. Inline scan skips fenced code blocks to prevent false positives from code examples in issue bodies. Not sourced — executed via `bash lib/parse-deps.sh`. | dev-poll, supervisor-poll |
 | `lib/matrix_listener.sh` | Long-poll Matrix sync daemon. Dispatches thread replies to the correct agent via tmux session injection (dev, action, vault, review) or well-known files (`/tmp/{agent}-escalation-reply` for supervisor/gardener). Handles all agent reply routing. Run as systemd service. | Standalone daemon |
 | `lib/formula-session.sh` | `acquire_cron_lock()`, `check_memory()`, `load_formula()`, `build_context_block()`, `consume_escalation_reply()`, `start_formula_session()`, `formula_phase_callback()`, `build_prompt_footer()`, `run_formula_and_monitor(AGENT [TIMEOUT] [CALLBACK])` — shared helpers for formula-driven cron agents (lock, memory guard, formula loading, prompt assembly, tmux session, monitor loop, crash recovery). `formula_phase_callback()` handles `PHASE:escalate` (unified escalation path — kills the session; callers may follow up via Matrix). `run_formula_and_monitor` accepts an optional CALLBACK (default: `formula_phase_callback`) so callers can install custom merge-through or escalation handlers. | planner-run.sh, predictor-run.sh, gardener-run.sh, supervisor-run.sh, dev-agent.sh, action-agent.sh |
+| `lib/guard.sh` | `check_active(agent_name)` — checks for `$FACTORY_ROOT/state/.{agent_name}-active`; if the file is absent, the calling script exits 0 (skips the run). Factory is off by default — state files must be created to enable each agent. Sourced by dev-poll.sh, review-poll.sh, action-poll.sh, gardener-run.sh, planner-run.sh, predictor-run.sh, supervisor-run.sh. | cron entry points |
+| `lib/mirrors.sh` | `mirror_push()` — pushes `$PRIMARY_BRANCH` + tags to all configured mirror remotes (fire-and-forget background pushes). Reads `MIRROR_NAMES` and `MIRROR_*` vars exported by `load-project.sh` from the `[mirrors]` TOML section. Failures are logged but never block the pipeline. Sourced by dev-poll.sh and dev/phase-handler.sh — called after every successful merge. | dev-poll.sh, phase-handler.sh |
+| `lib/build-graph.py` | Python tool: parses VISION.md, prerequisite-tree.md, AGENTS.md, formulas/*.toml, evidence/, and forge issues/labels into a NetworkX DiGraph. Runs structural analyses (orphaned objectives, stale prerequisites, thin evidence, circular deps) and outputs a JSON report. Used by `review-pr.sh` (per-PR changed-file analysis) and `predictor-run.sh` (full-project analysis) to provide structural context to Claude. | review-pr.sh, predictor-run.sh |
 | `lib/secret-scan.sh` | `scan_for_secrets()` — detects potential secrets (API keys, bearer tokens, private keys, URLs with embedded credentials) in text; returns 1 if secrets found. `redact_secrets()` — replaces detected secret patterns with `[REDACTED]`. | file-action-issue.sh, phase-handler.sh |
 | `lib/file-action-issue.sh` | `file_action_issue()` — dedup check, secret scan, label lookup, and issue creation for formula-driven cron wrappers. Sets `FILED_ISSUE_NUM` on success. Returns 4 if secrets detected in body. | (available for future use) |
 | `lib/agent-session.sh` | Shared tmux + Claude session helpers: `create_agent_session()`, `inject_formula()`, `agent_wait_for_claude_ready()`, `agent_inject_into_session()`, `agent_kill_session()`, `monitor_phase_loop()`, `read_phase()`, `write_compact_context()`. `create_agent_session(session, workdir, [phase_file])` optionally installs a PostToolUse hook (matcher `Bash\|Write`) that detects phase file writes in real-time — when Claude writes to the phase file, the hook writes a marker so `monitor_phase_loop` reacts on the next poll instead of waiting for mtime changes. Also installs a StopFailure hook (matcher `rate_limit\|server_error\|authentication_failed\|billing_error`) that writes `PHASE:failed` with an `api_error` reason to the phase file and touches the phase-changed marker, so the orchestrator discovers API errors within one poll cycle instead of waiting for idle timeout. Also installs a SessionStart hook (matcher `compact`) that re-injects phase protocol instructions after context compaction — callers write the context file via `write_compact_context(phase_file, content)`, and the hook (`on-compact-reinject.sh`) outputs the file content to stdout so Claude retains critical instructions. When `MATRIX_THREAD_ID` is exported, also installs a Stop hook (`on-stop-matrix.sh`) that streams each Claude response to the Matrix thread. When `phase_file` is set, passes it to the idle stop hook (`on-idle-stop.sh`) so the hook can **nudge Claude** (up to 2 times) if Claude returns to the prompt without writing to the phase file — the hook injects a tmux reminder asking Claude to signal PHASE:done or PHASE:awaiting_ci. The PreToolUse guard hook (`on-pretooluse-guard.sh`) receives the session name as a third argument — formula agents (`gardener-*`, `planner-*`, `predictor-*`, `supervisor-*`) are identified this way and allowed to access `FACTORY_ROOT` from worktrees (they need env.sh, AGENTS.md, formulas/, lib/). 
`monitor_phase_loop` sets `_MONITOR_LOOP_EXIT` to one of: `done`, `idle_timeout`, `idle_prompt` (Claude returned to `>` for 3 consecutive polls without writing any phase — callback invoked with `PHASE:failed`, session already dead), `crashed`, or `PHASE:escalate` / other `PHASE:*` string. **Unified escalation**: `PHASE:escalate` is the signal that a session needs human input (renamed from `PHASE:needs_human`). **Callers must handle `idle_prompt`** in both their callback and their post-loop exit handler — see [`docs/PHASE-PROTOCOL.md` idle_prompt](docs/PHASE-PROTOCOL.md#idle_prompt-exit-reason) for the full contract. | dev-agent.sh, action-agent.sh |
--- a/planner/AGENTS.md
+++ b/planner/AGENTS.md
@@ -1,4 +1,4 @@
-<!-- last-reviewed: a2016db5c35ee3429ebaa212192983a03c4e4cb8 -->
+<!-- last-reviewed: 6afc7f183ffd831edae1a6c3f9d92e2094f2b998 -->
 # Planner Agent

 **Role**: Strategic planning using a Prerequisite Tree (Theory of Constraints),
@@ -31,10 +31,12 @@ and `$PROJECT_REPO_ROOT/vault/`, not `$FACTORY_ROOT`. Each project manages its
 own planner state independently.

 **Trigger**: `planner-run.sh` runs daily via cron (accepts an optional project
-TOML argument, defaults to `projects/disinto.toml`). It creates a tmux session
-with `claude --model opus`, injects `formulas/run-planner.toml` as context,
-monitors the phase file, and cleans up on completion or timeout. No action
-issues — the planner is a nervous system component, not work.
+TOML argument, defaults to `projects/disinto.toml`). Sources `lib/guard.sh` and
+calls `check_active planner` first — skips if `$FACTORY_ROOT/state/.planner-active`
+is absent. Then creates a tmux session with `claude --model opus`, injects
+`formulas/run-planner.toml` as context, monitors the phase file, and cleans up
+on completion or timeout. No action issues — the planner is a nervous system
+component, not work.

 **Key files**:
 - `planner/planner-run.sh` — Cron wrapper + orchestrator: lock, memory guard,
--- a/predictor/AGENTS.md
+++ b/predictor/AGENTS.md
@@ -1,4 +1,4 @@
-<!-- last-reviewed: a2016db5c35ee3429ebaa212192983a03c4e4cb8 -->
+<!-- last-reviewed: 6afc7f183ffd831edae1a6c3f9d92e2094f2b998 -->
 # Predictor Agent

 **Role**: Abstract adversary (the "goblin"). Runs a 2-step formula
@@ -23,14 +23,18 @@ emit feature work — only observations challenging claims, exposing gaps,
 and surfacing risks.

 **Trigger**: `predictor-run.sh` runs daily at 06:00 UTC via cron (1h before
-the planner at 07:00). Guarded by PID lock (`/tmp/predictor-run.lock`) and
-memory check (skips if available RAM < 2000 MB).
+the planner at 07:00). Sources `lib/guard.sh` and calls `check_active predictor`
+first — skips if `$FACTORY_ROOT/state/.predictor-active` is absent. Also guarded
+by PID lock (`/tmp/predictor-run.lock`) and memory check (skips if available
+RAM < 2000 MB).

 **Key files**:
- `predictor/predictor-run.sh` — Cron wrapper + orchestrator: lock, memory guard,
-  sources disinto project config, builds prompt with formula + forge API
-  reference, creates tmux session (sonnet), monitors phase file, handles crash
-  recovery via `run_formula_and_monitor`
+- `predictor/predictor-run.sh` — Cron wrapper + orchestrator: active-state guard,
+  lock, memory guard, sources disinto project config, builds structural analysis
+  graph via `lib/build-graph.py` (full-project scan — results included in prompt
+  as `## Structural analysis`; failures non-fatal), builds prompt with formula +
+  forge API reference, creates tmux session (sonnet), monitors phase file, handles
+  crash recovery via `run_formula_and_monitor`
 - `formulas/run-predictor.toml` — Execution spec: two steps (preflight,
  find-weakness-and-act) with `needs` dependencies. Claude reviews prediction
  history, explores/exploits weaknesses, and files issues in a single
--- a/review/AGENTS.md
+++ b/review/AGENTS.md
@@ -1,4 +1,4 @@
-<!-- last-reviewed: a2016db5c35ee3429ebaa212192983a03c4e4cb8 -->
+<!-- last-reviewed: 6afc7f183ffd831edae1a6c3f9d92e2094f2b998 -->
 # Review Agent

 **Role**: AI-powered PR review — post structured findings and formal
@@ -9,8 +9,8 @@ whose CI has passed and that lack a review for the current HEAD SHA, then
 spawns `review-pr.sh <pr-number>`.

 **Key files**:
- `review/review-poll.sh` — Cron scheduler: finds unreviewed PRs with passing CI
- `review/review-pr.sh` — Creates/reuses a tmux session (`review-{project}-{pr}`), injects PR diff, waits for Claude to write structured JSON output, posts markdown review + formal forge review, auto-creates follow-up issues for pre-existing tech debt
+- `review/review-poll.sh` — Cron scheduler: finds unreviewed PRs with passing CI. Sources `lib/guard.sh` and calls `check_active reviewer` — skips if `$FACTORY_ROOT/state/.reviewer-active` is absent.
+- `review/review-pr.sh` — Creates/reuses a tmux session (`review-{project}-{pr}`), injects PR diff, waits for Claude to write structured JSON output, posts markdown review + formal forge review, auto-creates follow-up issues for pre-existing tech debt. Before starting the session, runs `lib/build-graph.py --changed-files <PR files>` and appends the JSON structural analysis (affected objectives, orphaned prerequisites, thin evidence) to the review prompt. Graph failures are non-fatal — review proceeds without it.

 **Environment variables consumed**:
 - `FORGE_TOKEN` — Dev-agent token (must not be the same account as FORGE_REVIEW_TOKEN)
--- a/supervisor/AGENTS.md
+++ b/supervisor/AGENTS.md
@@ -1,4 +1,4 @@
-<!-- last-reviewed: a2016db5c35ee3429ebaa212192983a03c4e4cb8 -->
+<!-- last-reviewed: 6afc7f183ffd831edae1a6c3f9d92e2094f2b998 -->
 # Supervisor Agent

 **Role**: Health monitoring and auto-remediation, executed as a formula-driven
@@ -6,10 +6,12 @@ Claude agent. Collects system and project metrics via a bash pre-flight script,
 then runs an interactive Claude session (sonnet) that assesses health, auto-fixes
 issues, escalates via Matrix, and writes a daily journal.

-**Trigger**: `supervisor-run.sh` runs every 20 min via cron. It creates a tmux
-session with `claude --model sonnet`, injects `formulas/run-supervisor.toml`
-with pre-collected metrics as context, monitors the phase file, and cleans up
-on completion or timeout (20 min max session). No action issues — the supervisor
+**Trigger**: `supervisor-run.sh` runs every 20 min via cron. Sources `lib/guard.sh`
+and calls `check_active supervisor` first — skips if
+`$FACTORY_ROOT/state/.supervisor-active` is absent. Then creates a tmux session
+with `claude --model sonnet`, injects `formulas/run-supervisor.toml` with
+pre-collected metrics as context, monitors the phase file, and cleans up on
+completion or timeout (20 min max session). No action issues — the supervisor
 runs directly from cron like the planner and predictor.

 **Key files**:
--- a/vault/AGENTS.md
+++ b/vault/AGENTS.md
@@ -1,4 +1,4 @@
-<!-- last-reviewed: a2016db5c35ee3429ebaa212192983a03c4e4cb8 -->
+<!-- last-reviewed: 6afc7f183ffd831edae1a6c3f9d92e2094f2b998 -->
 # Vault Agent

 **Role**: Dual-purpose gate — action safety classification and resource procurement.
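
Note on the `check_active` guard documented in the hunks above: the commit does not include the body of `lib/guard.sh`, so the following is only a minimal sketch of the described behaviour (the calling script exits 0 when `$FACTORY_ROOT/state/.{agent_name}-active` is absent). The function body, the log message, and the `FACTORY_ROOT` sanity check are assumptions for illustration, not the repository's actual code.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of lib/guard.sh; the real file may differ.

# check_active AGENT_NAME
# Ends the calling script with status 0 (a clean skip) when the agent's
# state file is missing. Returns normally when the file exists.
check_active() {
  local agent_name="$1"
  # Assumption: refuse to run at all if FACTORY_ROOT is unset or empty.
  local state_file="${FACTORY_ROOT:?FACTORY_ROOT must be set}/state/.${agent_name}-active"

  if [ ! -f "$state_file" ]; then
    # Illustrative log line; the real message format is not shown in the diff.
    echo "[guard] ${agent_name} inactive (missing ${state_file}), skipping" >&2
    exit 0
  fi
}
```

Under this sketch, a cron entry point would source the file and call `check_active dev` before doing any other work, and an operator would enable the agent with `touch "$FACTORY_ROOT/state/.dev-active"`. Because the guard calls `exit` rather than `return`, it must run in the script's own shell, not in a command substitution.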