Merge pull request 'chore: gardener housekeeping 2026-03-26' (#716) from chore/gardener-20260326-0005 into main

johba 2026-03-26 05:14:02 +01:00
commit 043bf0f021
11 changed files with 39 additions and 28 deletions

View file

@ -1,4 +1,4 @@
<!-- last-reviewed: d13f1a6997a3f5a2c9a51fea3fb18ab75f161d7b -->
<!-- last-reviewed: cebcb8c13ab7948fc794f49c379ed34570e45652 -->
# Disinto — Agent Instructions
## What this repo is

View file

@ -1,4 +1,4 @@
<!-- last-reviewed: d13f1a6997a3f5a2c9a51fea3fb18ab75f161d7b -->
<!-- last-reviewed: cebcb8c13ab7948fc794f49c379ed34570e45652 -->
# Action Agent
**Role**: Execute operational tasks described by action formulas — run scripts,
@ -13,7 +13,7 @@ session, and spawns `action-agent.sh <issue-number>`.
**Key files**:
- `action/action-poll.sh` — Cron scheduler: finds open action issues with no active tmux session, spawns action-agent.sh
- `action/action-agent.sh` — Orchestrator: fetches issue body + prior comments, creates tmux session (`action-{project}-{issue_num}`) with interactive `claude`, injects formula prompt with phase protocol, enters `monitor_phase_loop` (shared via `dev/phase-handler.sh`) for CI/review lifecycle or direct completion
- `action/action-agent.sh` — Orchestrator: fetches issue body + prior comments, **checks all dependencies via `lib/parse-deps.sh` before spawning** (skips silently if any dep is still open), creates tmux session (`action-{project}-{issue_num}`) with interactive `claude`, injects formula prompt with phase protocol, enters `monitor_phase_loop` (shared via `dev/phase-handler.sh`) for CI/review lifecycle or direct completion
**Session lifecycle**:
1. `action-poll.sh` finds open `action` issues with no active tmux session.
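The dependency gate described for `action/action-agent.sh` could be sketched roughly as follows. This is an illustration under stated assumptions, not the script's actual code: `all_deps_closed` and `issue_state` are hypothetical names, and `issue_state` is stubbed with a fixed table where a real implementation would presumably query the forge. The only detail taken from the documentation is the input format (one issue number per line, as `lib/parse-deps.sh` emits).

```bash
# Hypothetical lookup: prints "open" or "closed" for an issue number.
# Stubbed with a fixed table here; a real version would ask the forge.
issue_state() {
  case "$1" in
    12|15) echo "closed" ;;
    *)     echo "open" ;;
  esac
}

# Reads dependency issue numbers on stdin, one per line.
# Returns 0 only if every dependency is closed; returns 1 as soon as
# one dependency is still open. Empty input counts as "no blockers".
all_deps_closed() {
  local dep
  while IFS= read -r dep || [ -n "$dep" ]; do
    [ -z "$dep" ] && continue
    if [ "$(issue_state "$dep")" != "closed" ]; then
      return 1
    fi
  done
  return 0
}
```

A caller might gate the spawn with `printf '%s\n' "$issue_body" | bash lib/parse-deps.sh | all_deps_closed || exit 0`, where `issue_body` is likewise a placeholder variable. Exiting 0 without output matches the "skips silently" behavior described above.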

View file

@ -1,4 +1,4 @@
<!-- last-reviewed: d13f1a6997a3f5a2c9a51fea3fb18ab75f161d7b -->
<!-- last-reviewed: cebcb8c13ab7948fc794f49c379ed34570e45652 -->
# Dev Agent
**Role**: Implement issues autonomously — write code, push branches, address
@ -14,7 +14,7 @@ in-progress issues are also picked up. The direct-merge scan runs before the loc
check so approved PRs get merged even while a dev-agent session is active.
**Key files**:
- `dev/dev-poll.sh` — Cron scheduler: finds next ready issue, handles merge/rebase of approved PRs, tracks CI fix attempts
- `dev/dev-poll.sh` — Cron scheduler: finds next ready issue, handles merge/rebase of approved PRs, tracks CI fix attempts. Formula guard skips issues labeled `formula`, `action`, `prediction/dismissed`, or `prediction/unreviewed` (replaced `prediction/backlog` — that label no longer exists)
- `dev/dev-agent.sh` — Orchestrator: claims issue, creates worktree + tmux session with interactive `claude`, monitors phase file, injects CI results and review feedback, merges on approval
- `dev/phase-handler.sh` — Phase callback functions: `post_refusal_comment()`, `_on_phase_change()`, `build_phase_protocol_prompt()`. `do_merge()` detects already-merged PRs on HTTP 405 (race with dev-poll's pre-lock scan) and returns success instead of escalating. Sources `lib/mirrors.sh` and calls `mirror_push()` after every successful merge. Matrix escalation notifications include `MATRIX_MENTION_USER` HTML mention when set.
- `dev/phase-test.sh` — Integration test for the phase protocol
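The HTTP 405 handling attributed to `do_merge()` can be pictured with a small decision helper. This is only a sketch: `classify_merge_result` is a hypothetical name, and the "already merged" flag is assumed to come from a separate lookup of the PR state that is not shown here.

```bash
# Hypothetical helper: decide how to treat a merge API response.
#   $1 = HTTP status code returned by the merge request
#   $2 = "true" if a follow-up lookup shows the PR is already merged
# Prints "merged" or "escalate"; returns 0 for merged, 1 otherwise.
classify_merge_result() {
  local http_code="$1" already_merged="$2"
  case "$http_code" in
    2??)
      echo "merged"
      return 0
      ;;
    405)
      # A 405 may mean another process (for example the pre-lock scan
      # in dev-poll) merged the PR first; treat that as success.
      if [ "$already_merged" = "true" ]; then
        echo "merged"
        return 0
      fi
      ;;
  esac
  echo "escalate"
  return 1
}
```

For example, `classify_merge_result 405 true` prints `merged`, while `classify_merge_result 405 false` prints `escalate` and returns a non-zero status.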

View file

@ -1,4 +1,4 @@
<!-- last-reviewed: new -->
<!-- last-reviewed: cebcb8c13ab7948fc794f49c379ed34570e45652 -->
# Executive Assistant Agent
**Role**: Interactive personal assistant for the executive (project founder).

View file

@ -1,4 +1,4 @@
<!-- last-reviewed: d13f1a6997a3f5a2c9a51fea3fb18ab75f161d7b -->
<!-- last-reviewed: cebcb8c13ab7948fc794f49c379ed34570e45652 -->
# Gardener Agent
**Role**: Backlog grooming — detect duplicate issues, missing acceptance

View file

@ -1,4 +1,4 @@
<!-- last-reviewed: d13f1a6997a3f5a2c9a51fea3fb18ab75f161d7b -->
<!-- last-reviewed: cebcb8c13ab7948fc794f49c379ed34570e45652 -->
# Shared Helpers (`lib/`)
All agents source `lib/env.sh` as their first action. Additional helpers are
@ -11,12 +11,12 @@ sourced as needed.
| `lib/ci-debug.sh` | CLI tool for Woodpecker CI: `list`, `status`, `logs`, `failures` subcommands. Not sourced — run directly. | Humans / dev-agent (tool access) |
| `lib/load-project.sh` | Parses a `projects/*.toml` file into env vars (`PROJECT_NAME`, `FORGE_REPO`, `WOODPECKER_REPO_ID`, monitoring toggles, Matrix config, etc.). | env.sh (when `PROJECT_TOML` is set), supervisor-poll (per-project iteration) |
| `lib/parse-deps.sh` | Extracts dependency issue numbers from an issue body (stdin → stdout, one number per line). Matches `## Dependencies` / `## Depends on` / `## Blocked by` sections and inline `depends on #N` / `blocked by #N` patterns. Inline scan skips fenced code blocks to prevent false positives from code examples in issue bodies. Not sourced — executed via `bash lib/parse-deps.sh`. | dev-poll, supervisor-poll |
| `lib/matrix_listener.sh` | Long-poll Matrix sync daemon. Dispatches thread replies to the correct agent via tmux session injection (dev, action, vault, review) or well-known files (`/tmp/{agent}-escalation-reply` for supervisor/gardener). Handles all agent reply routing. In compose mode, started as a background process by `docker/agents/entrypoint.sh`; on bare metal, run as systemd service (see `matrix_listener.service`). | Standalone daemon |
| `lib/formula-session.sh` | `acquire_cron_lock()`, `check_memory()`, `load_formula()`, `build_context_block()`, `consume_escalation_reply()`, `start_formula_session()`, `formula_phase_callback()`, `build_prompt_footer()`, `run_formula_and_monitor(AGENT [TIMEOUT] [CALLBACK])` — shared helpers for formula-driven cron agents (lock, memory guard, formula loading, prompt assembly, tmux session, monitor loop, crash recovery). `formula_phase_callback()` handles `PHASE:escalate` (unified escalation path — kills the session; callers may follow up via Matrix). `run_formula_and_monitor` accepts an optional CALLBACK (default: `formula_phase_callback`) so callers can install custom merge-through or escalation handlers. | planner-run.sh, predictor-run.sh, gardener-run.sh, supervisor-run.sh, dev-agent.sh, action-agent.sh |
| `lib/matrix_listener.sh` | Long-poll Matrix sync daemon. Dispatches thread replies to the correct agent via tmux session injection (dev, action, vault, review) or well-known files (`/tmp/{agent}-escalation-reply` for supervisor/gardener). Handles all agent reply routing. Uses `nohup` for robustness and validates TOML path before passing to exec-inject.sh. In compose mode, started as a background process by `docker/agents/entrypoint.sh`; on bare metal, run as systemd service (see `matrix_listener.service`). | Standalone daemon |
| `lib/formula-session.sh` | `acquire_cron_lock()`, `check_memory()`, `load_formula()`, `build_context_block()`, `consume_escalation_reply()`, `start_formula_session()`, `formula_phase_callback()`, `build_prompt_footer()`, `build_graph_section()`, `run_formula_and_monitor(AGENT [TIMEOUT] [CALLBACK])` — shared helpers for formula-driven cron agents (lock, memory guard, formula loading, prompt assembly, tmux session, monitor loop, crash recovery). `build_graph_section()` generates the structural-analysis section (runs `lib/build-graph.py`, formats JSON output) — previously duplicated in planner-run.sh and predictor-run.sh, now shared here. `formula_phase_callback()` handles `PHASE:escalate` (unified escalation path — kills the session; callers may follow up via Matrix). `run_formula_and_monitor` accepts an optional CALLBACK (default: `formula_phase_callback`) so callers can install custom merge-through or escalation handlers. | planner-run.sh, predictor-run.sh, gardener-run.sh, supervisor-run.sh, dev-agent.sh, action-agent.sh |
| `lib/guard.sh` | `check_active(agent_name)` — reads `$FACTORY_ROOT/state/.{agent_name}-active`; exits 0 (skip) if the file is absent. Factory is off by default — state files must be created to enable each agent. **Logs a message to stderr** when skipping (`[check_active] SKIP: state file not found`), so agent dropout is visible in cron logs. Sourced by dev-poll.sh, review-poll.sh, action-poll.sh, predictor-run.sh, supervisor-run.sh. | cron entry points |
| `lib/mirrors.sh` | `mirror_push()` — pushes `$PRIMARY_BRANCH` + tags to all configured mirror remotes (fire-and-forget background pushes). Reads `MIRROR_NAMES` and `MIRROR_*` vars exported by `load-project.sh` from the `[mirrors]` TOML section. Failures are logged but never block the pipeline. Sourced by dev-poll.sh and dev/phase-handler.sh — called after every successful merge. | dev-poll.sh, phase-handler.sh |
| `lib/build-graph.py` | Python tool: parses VISION.md, prerequisite-tree.md, AGENTS.md, formulas/*.toml, evidence/, and forge issues/labels into a NetworkX DiGraph. Runs structural analyses (orphaned objectives, stale prerequisites, thin evidence, circular deps) and outputs a JSON report. Used by `review-pr.sh` (per-PR changed-file analysis) and `predictor-run.sh` (full-project analysis) to provide structural context to Claude. | review-pr.sh, predictor-run.sh |
| `lib/secret-scan.sh` | `scan_for_secrets()` — detects potential secrets (API keys, bearer tokens, private keys, URLs with embedded credentials) in text; returns 1 if secrets found. `redact_secrets()` — replaces detected secret patterns with `[REDACTED]`. | file-action-issue.sh, phase-handler.sh |
| `lib/file-action-issue.sh` | `file_action_issue()` — dedup check, secret scan, label lookup, and issue creation for formula-driven cron wrappers. Sets `FILED_ISSUE_NUM` on success. Returns 4 if secrets detected in body. | (available for future use) |
| `lib/tea-helpers.sh` | `tea_file_issue(title, body, labels...)` — create issue via tea CLI with secret scanning; sets `FILED_ISSUE_NUM`. `tea_relabel(issue_num, labels...)` — replace labels. `tea_comment(issue_num, body)` — add comment with secret scanning. `tea_close(issue_num)` — close issue. All use `TEA_LOGIN` and `FORGE_REPO` from env.sh. Labels by name (no ID lookup). Sourced by env.sh when `tea` binary is available. | env.sh (conditional) |
| `lib/agent-session.sh` | Shared tmux + Claude session helpers: `create_agent_session()`, `inject_formula()`, `agent_wait_for_claude_ready()`, `agent_inject_into_session()`, `agent_kill_session()`, `monitor_phase_loop()`, `read_phase()`, `write_compact_context()`. `create_agent_session(session, workdir, [phase_file])` optionally installs a PostToolUse hook (matcher `Bash\|Write`) that detects phase file writes in real-time — when Claude writes to the phase file, the hook writes a marker so `monitor_phase_loop` reacts on the next poll instead of waiting for mtime changes. Also installs a StopFailure hook (matcher `rate_limit\|server_error\|authentication_failed\|billing_error`) that writes `PHASE:failed` with an `api_error` reason to the phase file and touches the phase-changed marker, so the orchestrator discovers API errors within one poll cycle instead of waiting for idle timeout. Also installs a SessionStart hook (matcher `compact`) that re-injects phase protocol instructions after context compaction — callers write the context file via `write_compact_context(phase_file, content)`, and the hook (`on-compact-reinject.sh`) outputs the file content to stdout so Claude retains critical instructions. When `MATRIX_THREAD_ID` is exported, also installs a Stop hook (`on-stop-matrix.sh`) that streams each Claude response to the Matrix thread. When `phase_file` is set, passes it to the idle stop hook (`on-idle-stop.sh`) so the hook can **nudge Claude** (up to 2 times) if Claude returns to the prompt without writing to the phase file — the hook injects a tmux reminder asking Claude to signal PHASE:done or PHASE:awaiting_ci. The PreToolUse guard hook (`on-pretooluse-guard.sh`) receives the session name as a third argument — formula agents (`gardener-*`, `planner-*`, `predictor-*`, `supervisor-*`) are identified this way and allowed to access `FACTORY_ROOT` from worktrees (they need env.sh, AGENTS.md, formulas/, lib/). `monitor_phase_loop` sets `_MONITOR_LOOP_EXIT` to one of: `done`, `idle_timeout`, `idle_prompt` (Claude returned to `>` for 3 consecutive polls without writing any phase — callback invoked with `PHASE:failed`, session already dead), `crashed`, or `PHASE:escalate` / other `PHASE:*` string. **Unified escalation**: `PHASE:escalate` is the signal that a session needs human input (renamed from `PHASE:needs_human`). **Callers must handle `idle_prompt`** in both their callback and their post-loop exit handler — see [`docs/PHASE-PROTOCOL.md` idle_prompt](docs/PHASE-PROTOCOL.md#idle_prompt-exit-reason) for the full contract. | dev-agent.sh, action-agent.sh |
| `lib/tea-helpers.sh` | `tea_file_issue(title, body, labels...)` — create issue via tea CLI with secret scanning; sets `FILED_ISSUE_NUM`. `tea_relabel(issue_num, labels...)` — replace labels using tea's `edit` subcommand (not `label`). `tea_comment(issue_num, body)` — add comment with secret scanning. `tea_close(issue_num)` — close issue. All use `TEA_LOGIN` and `FORGE_REPO` from env.sh. Labels by name (no ID lookup). Tea binary download verified via sha256 checksum. Sourced by env.sh when `tea` binary is available. | env.sh (conditional) |
| `lib/agent-session.sh` | Shared tmux + Claude session helpers: `create_agent_session()`, `inject_formula()`, `agent_wait_for_claude_ready()`, `agent_inject_into_session()`, `agent_kill_session()`, `monitor_phase_loop()`, `read_phase()`, `write_compact_context()`. `create_agent_session(session, workdir, [phase_file])` optionally installs a PostToolUse hook (matcher `Bash\|Write`) that detects phase file writes in real-time — when Claude writes to the phase file, the hook writes a marker so `monitor_phase_loop` reacts on the next poll instead of waiting for mtime changes. Also installs a StopFailure hook (matcher `rate_limit\|server_error\|authentication_failed\|billing_error`) that writes `PHASE:failed` with an `api_error` reason to the phase file and touches the phase-changed marker, so the orchestrator discovers API errors within one poll cycle instead of waiting for idle timeout. Also installs a SessionStart hook (matcher `compact`) that re-injects phase protocol instructions after context compaction — callers write the context file via `write_compact_context(phase_file, content)`, and the hook (`on-compact-reinject.sh`) outputs the file content to stdout so Claude retains critical instructions. When `MATRIX_THREAD_ID` is exported, also installs a Stop hook (`on-stop-matrix.sh`) that streams each Claude response to the Matrix thread. When `phase_file` is set, passes it to the idle stop hook (`on-idle-stop.sh`) so the hook can **nudge Claude** (up to 2 times) if Claude returns to the prompt without writing to the phase file — the hook injects a tmux reminder asking Claude to signal PHASE:done or PHASE:awaiting_ci. The PreToolUse guard hook (`on-pretooluse-guard.sh`) receives the session name as a third argument — formula agents (`gardener-*`, `planner-*`, `predictor-*`, `supervisor-*`) are identified this way and allowed to access `FACTORY_ROOT` from worktrees (they need env.sh, AGENTS.md, formulas/, lib/). **OAuth flock**: when `DISINTO_CONTAINER=1`, Claude CLI is wrapped in `flock -w 300 ~/.claude/session.lock` to queue concurrent token refresh attempts and prevent rotation races across agents sharing the same credentials. `monitor_phase_loop` sets `_MONITOR_LOOP_EXIT` to one of: `done`, `idle_timeout`, `idle_prompt` (Claude returned to `>` for 3 consecutive polls without writing any phase — callback invoked with `PHASE:failed`, session already dead), `crashed`, or `PHASE:escalate` / other `PHASE:*` string. **Unified escalation**: `PHASE:escalate` is the signal that a session needs human input (renamed from `PHASE:needs_human`). **Callers must handle `idle_prompt`** in both their callback and their post-loop exit handler — see [`docs/PHASE-PROTOCOL.md` idle_prompt](docs/PHASE-PROTOCOL.md#idle_prompt-exit-reason) for the full contract. | dev-agent.sh, action-agent.sh |
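The extraction rules listed for `lib/parse-deps.sh` in the table above can be approximated with a short awk filter. This is a sketch written from the table description only, not the real script: the function name `extract_deps` is hypothetical, the sorted and de-duplicated output is an assumption, and only the inline scan skips fenced code blocks (as the table states).

```bash
# Hypothetical re-implementation: reads an issue body on stdin and prints
# the referenced dependency issue numbers, one per line, sorted and unique.
extract_deps() {
  awk '
    # Print every "#<digits>" reference found in string s.
    function emit_refs(s) {
      while (match(s, /#[0-9]+/)) {
        print substr(s, RSTART + 1, RLENGTH - 1)
        s = substr(s, RSTART + RLENGTH)
      }
    }
    # Build the three-backtick fence marker from character code 96.
    BEGIN { fence = sprintf("%c%c%c", 96, 96, 96) }
    # Toggle fenced-code state on lines that start with the marker.
    index($0, fence) == 1 { in_fence = !in_fence; next }
    # Any heading outside a fence starts or ends a dependency section.
    !in_fence && /^#+[ \t]/ {
      in_section = (tolower($0) ~ /^##[ \t]+(dependencies|depends on|blocked by)/)
      next
    }
    # Inside a dependency section every issue reference counts.
    in_section { emit_refs($0); next }
    # Elsewhere only inline phrases count, and never inside a fence.
    !in_fence {
      line = tolower($0)
      while (match(line, /(depends on|blocked by)[ \t]+#[0-9]+/)) {
        emit_refs(substr(line, RSTART, RLENGTH))
        line = substr(line, RSTART + RLENGTH)
      }
    }
  ' | sort -un
}
```

Fed an issue body on stdin, the function prints matching issue numbers in ascending order. A bare reference such as `see #99` outside a dependency section is ignored because it lacks the `depends on` or `blocked by` phrase.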

View file

@ -1,4 +1,4 @@
<!-- last-reviewed: d13f1a6997a3f5a2c9a51fea3fb18ab75f161d7b -->
<!-- last-reviewed: cebcb8c13ab7948fc794f49c379ed34570e45652 -->
# Planner Agent
**Role**: Strategic planning using a Prerequisite Tree (Theory of Constraints),
@ -8,10 +8,13 @@ tree from `planner/MEMORY.md` and `planner/prerequisite-tree.md`. Also reads
all available formulas: factory formulas (`$FACTORY_ROOT/formulas/*.toml`) and
project-specific formulas (`$PROJECT_REPO_ROOT/formulas/*.toml`). Phase 1
(prediction-triage): triage `prediction/unreviewed` issues filed by the
Predictor — for each prediction: action (create issue, relabel to
prediction/actioned, close) or dismiss (comment reason, relabel to
prediction/dismissed, close). No fence-sitting — dismissed predictions get
re-filed by the predictor with stronger evidence if still valid. Phase 2
Predictor — for each prediction, the planner **must** act or dismiss with a
stated reason (no fence-sitting, no `prediction/backlog` label). Actions:
promote to a real issue (relabel to `prediction/actioned`, close) or
dismiss (comment reason, relabel to `prediction/dismissed`, close).
The planner has a per-run action budget — it cannot defer indefinitely.
Dismissed predictions get re-filed by the predictor with stronger evidence
if still valid. Phase 2
(update-prerequisite-tree): scan repo state + open/closed issues, mark resolved
prerequisites, discover new ones, update the tree. **Also scans comments on
referenced issues for bounce/stuck signals** (BOUNCED, ESCALATED, LABEL_CHURN)
@ -42,8 +45,8 @@ component, not work.
**Key files**:
- `planner/planner-run.sh` — Cron wrapper + orchestrator: lock, memory guard,
sources disinto project config, creates tmux session, injects formula prompt,
monitors phase file, handles crash recovery, cleans up
sources disinto project config, builds structural analysis via `lib/formula-session.sh:build_graph_section()`,
creates tmux session, injects formula prompt, monitors phase file, handles crash recovery, cleans up
- `formulas/run-planner.toml` — Execution spec: six steps (preflight,
prediction-triage, update-prerequisite-tree, file-at-constraints,
journal-and-memory, commit-and-pr) with `needs` dependencies. Claude

View file

@ -1,4 +1,4 @@
<!-- last-reviewed: d13f1a6997a3f5a2c9a51fea3fb18ab75f161d7b -->
<!-- last-reviewed: cebcb8c13ab7948fc794f49c379ed34570e45652 -->
# Predictor Agent
**Role**: Abstract adversary (the "goblin"). Runs a 2-step formula
@ -31,10 +31,10 @@ RAM < 2000 MB).
**Key files**:
- `predictor/predictor-run.sh` — Cron wrapper + orchestrator: active-state guard,
lock, memory guard, sources disinto project config, builds structural analysis
graph via `lib/build-graph.py` (full-project scan — results included in prompt
as `## Structural analysis`; failures non-fatal), builds prompt with formula +
forge API reference, creates tmux session (sonnet), monitors phase file, handles
crash recovery via `run_formula_and_monitor`
via `lib/formula-session.sh:build_graph_section()` (full-project scan — results
included in prompt as `## Structural analysis`; failures non-fatal), builds
prompt with formula + forge API reference, creates tmux session (sonnet),
monitors phase file, handles crash recovery via `run_formula_and_monitor`
- `formulas/run-predictor.toml` — Execution spec: two steps (preflight,
find-weakness-and-act) with `needs` dependencies. Claude reviews prediction
history, explores/exploits weaknesses, and files issues in a single

View file

@ -1,4 +1,4 @@
<!-- last-reviewed: d13f1a6997a3f5a2c9a51fea3fb18ab75f161d7b -->
<!-- last-reviewed: cebcb8c13ab7948fc794f49c379ed34570e45652 -->
# Review Agent
**Role**: AI-powered PR review — post structured findings and formal

View file

@ -1,4 +1,4 @@
<!-- last-reviewed: d13f1a6997a3f5a2c9a51fea3fb18ab75f161d7b -->
<!-- last-reviewed: cebcb8c13ab7948fc794f49c379ed34570e45652 -->
# Supervisor Agent
**Role**: Health monitoring and auto-remediation, executed as a formula-driven

View file

@ -1,7 +1,7 @@
<!-- last-reviewed: d13f1a6997a3f5a2c9a51fea3fb18ab75f161d7b -->
<!-- last-reviewed: cebcb8c13ab7948fc794f49c379ed34570e45652 -->
# Vault Agent
**Role**: Dual-purpose gate — action safety classification and resource procurement.
**Role**: Three-pipeline gate — action safety classification, resource procurement, and human-action drafting.
**Pipeline A — Action Gating (*.json)**: Actions enter a pending queue and are
classified by Claude via `vault-agent.sh`, which can auto-approve (call
@ -16,6 +16,13 @@ adds secrets to `.env`) and moves the file to `vault/approved/`.
`vault-fire.sh` then extracts the proposed entry and appends it to
`RESOURCES.md`.
**Pipeline C — Rent-a-Human (outreach drafts)**: Any agent can dispatch the
`run-rent-a-human` formula (via an `action` issue) when a task requires a human
touch — posting on Reddit, commenting on HN, signing up for a service, etc.
Claude drafts copy-paste-ready content to `vault/outreach/{platform}/drafts/`
and notifies the human via Matrix for one-click execution. No vault approval
needed — the human reviews and publishes directly.
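As a rough illustration of the Pipeline C drafting step, the helper below stores a draft under the documented `vault/outreach/{platform}/drafts/` layout. Everything apart from that directory layout is assumed: the function name `write_outreach_draft`, the `.md` extension, and the slug argument are hypothetical, and the Matrix notification is left out.

```bash
# Hypothetical helper: store a draft under vault/outreach/{platform}/drafts/.
#   $1 = repository root, $2 = platform name, $3 = draft slug
# Draft text is read from stdin; the path written is printed on stdout.
write_outreach_draft() {
  local root="$1" platform="$2" slug="$3"
  local dir="$root/vault/outreach/$platform/drafts"
  mkdir -p "$dir" || return 1
  cat > "$dir/$slug.md" || return 1
  printf '%s\n' "$dir/$slug.md"
}
```

For instance, `printf 'Draft text\n' | write_outreach_draft "$PWD" reddit launch-post` would create `vault/outreach/reddit/drafts/launch-post.md` under the current directory and print its path.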
**Trigger**: `vault-poll.sh` runs every 30 min via cron.
**Key files**:
@ -24,6 +31,7 @@ adds secrets to `.env`) and moves the file to `vault/approved/`.
- `vault/PROMPT.md` — System prompt for the vault agent's Claude invocation
- `vault/vault-fire.sh` — Executes an approved action (JSON) or writes RESOURCES.md entry (procurement MD)
- `vault/vault-reject.sh` — Marks a JSON action as rejected
- `formulas/run-rent-a-human.toml` — Formula for human-action drafts: Claude researches target platform norms, drafts copy-paste content, writes to `vault/outreach/{platform}/drafts/`, notifies human via Matrix
**Procurement flow**:
1. Planner drops `vault/pending/<name>.md` with what/why/proposed RESOURCES.md entry