chore: gardener housekeeping 2026-03-23

- dev/AGENTS.md: document two-tier priority queue (priority+backlog first,
  then plain backlog); note do_merge() HTTP 405 already-merged detection
- gardener/AGENTS.md: document merge-through protocol (stay alive through
  CI/review/merge); note session kill on PHASE:escalate
- lib/AGENTS.md: add ensure_priority_label() to ci-helpers.sh entry;
  document optional CALLBACK param in run_formula_and_monitor()
- predictor/AGENTS.md: update watermark (content already current from v2 PR)
- Update watermarks for action, planner, review, supervisor, vault, root

Grooming actions:
- #574: added ## Affected files section (lib/parse-deps.sh) to meet quality gate
- #568: escalated — needs human decision on guard/merge architecture
- #466: escalated — dep #393 closed; needs decision on external vs in-repo example

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
openhands 2026-03-23 00:11:06 +00:00
parent a2438cb580
commit 149211c78d
10 changed files with 26 additions and 19 deletions

View file

@ -1,4 +1,4 @@
<!-- last-reviewed: 251d160e213b19a4fcc0cd8f8e3be9ea3283887f --> <!-- last-reviewed: c9bf9fe5281c4037fd3f2219fde093dcbb053e00 -->
# Disinto — Agent Instructions # Disinto — Agent Instructions
## What this repo is ## What this repo is

View file

@ -1,4 +1,4 @@
<!-- last-reviewed: 251d160e213b19a4fcc0cd8f8e3be9ea3283887f --> <!-- last-reviewed: c9bf9fe5281c4037fd3f2219fde093dcbb053e00 -->
# Action Agent # Action Agent
**Role**: Execute operational tasks described by action formulas — run scripts, **Role**: Execute operational tasks described by action formulas — run scripts,

View file

@ -1,4 +1,4 @@
<!-- last-reviewed: 251d160e213b19a4fcc0cd8f8e3be9ea3283887f --> <!-- last-reviewed: c9bf9fe5281c4037fd3f2219fde093dcbb053e00 -->
# Dev Agent # Dev Agent
**Role**: Implement issues autonomously — write code, push branches, address **Role**: Implement issues autonomously — write code, push branches, address
@ -6,15 +6,16 @@ CI failures and review feedback.
**Trigger**: `dev-poll.sh` runs every 10 min via cron. It performs a direct-merge **Trigger**: `dev-poll.sh` runs every 10 min via cron. It performs a direct-merge
scan first (approved + CI green PRs — including chore/gardener PRs without issue scan first (approved + CI green PRs — including chore/gardener PRs without issue
numbers), then checks the agent lock and scans for ready backlog issues (all deps numbers), then checks the agent lock and scans for ready issues using a two-tier
closed) or orphaned in-progress issues to spawn `dev-agent.sh <issue-number>`. priority queue: (1) `priority`+`backlog` issues first (FIFO within tier), then
(2) plain `backlog` issues (FIFO). Orphaned in-progress issues are also picked up.
The direct-merge scan runs before the lock check so approved PRs get merged even The direct-merge scan runs before the lock check so approved PRs get merged even
while a dev-agent session is active on another issue. while a dev-agent session is active on another issue.
**Key files**: **Key files**:
- `dev/dev-poll.sh` — Cron scheduler: finds next ready issue, handles merge/rebase of approved PRs, tracks CI fix attempts - `dev/dev-poll.sh` — Cron scheduler: finds next ready issue, handles merge/rebase of approved PRs, tracks CI fix attempts
- `dev/dev-agent.sh` — Orchestrator: claims issue, creates worktree + tmux session with interactive `claude`, monitors phase file, injects CI results and review feedback, merges on approval - `dev/dev-agent.sh` — Orchestrator: claims issue, creates worktree + tmux session with interactive `claude`, monitors phase file, injects CI results and review feedback, merges on approval
- `dev/phase-handler.sh` — Phase callback functions: `post_refusal_comment()`, `_on_phase_change()`, `build_phase_protocol_prompt()` - `dev/phase-handler.sh` — Phase callback functions: `post_refusal_comment()`, `_on_phase_change()`, `build_phase_protocol_prompt()`. `do_merge()` detects already-merged PRs on HTTP 405 (race with dev-poll's pre-lock scan) and returns success instead of escalating
- `dev/phase-test.sh` — Integration test for the phase protocol - `dev/phase-test.sh` — Integration test for the phase protocol
**Environment variables consumed** (via `lib/env.sh` + project TOML): **Environment variables consumed** (via `lib/env.sh` + project TOML):

View file

@ -1,4 +1,4 @@
<!-- last-reviewed: 251d160e213b19a4fcc0cd8f8e3be9ea3283887f --> <!-- last-reviewed: a2438cb580d5a30bc910ad3ed9b2b57163471b3d -->
# Gardener Agent # Gardener Agent
**Role**: Backlog grooming — detect duplicate issues, missing acceptance **Role**: Backlog grooming — detect duplicate issues, missing acceptance
@ -16,8 +16,12 @@ runs directly from cron like the planner, predictor, and supervisor.
**Key files**: **Key files**:
- `gardener/gardener-run.sh` — Cron wrapper + orchestrator: lock, memory guard, - `gardener/gardener-run.sh` — Cron wrapper + orchestrator: lock, memory guard,
consumes escalation replies, sources disinto project config, creates tmux session, consumes escalation replies, sources disinto project config, creates tmux session,
injects formula prompt, monitors phase file, handles crash recovery via injects formula prompt, monitors phase file via custom `_gardener_on_phase_change`
`run_formula_and_monitor`, executes pending-actions manifest after PR merge callback (passed to `run_formula_and_monitor`). Kills session on `PHASE:escalate`
to prevent zombies. Stays alive through CI/review/merge cycle after `PHASE:awaiting_ci`
— injects CI results and review feedback, re-signals `PHASE:awaiting_ci` after
fixes, signals `PHASE:awaiting_review` on CI pass. Executes pending-actions
manifest after PR merge.
- `formulas/run-gardener.toml` — Execution spec: preflight, grooming, dust-bundling, blocked-review, agents-update, commit-and-pr - `formulas/run-gardener.toml` — Execution spec: preflight, grooming, dust-bundling, blocked-review, agents-update, commit-and-pr
- `gardener/pending-actions.json` — Manifest of deferred repo actions (label changes, - `gardener/pending-actions.json` — Manifest of deferred repo actions (label changes,
closures, comments, issue creation). Written during grooming steps, committed to the closures, comments, issue creation). Written during grooming steps, committed to the
@ -32,5 +36,7 @@ runs directly from cron like the planner, predictor, and supervisor.
consume escalation replies → load formula + context → create tmux session → consume escalation replies → load formula + context → create tmux session →
Claude grooms backlog (writes proposed actions to manifest), bundles dust, Claude grooms backlog (writes proposed actions to manifest), bundles dust,
reviews blocked issues, updates AGENTS.md, commits manifest + docs to PR → reviews blocked issues, updates AGENTS.md, commits manifest + docs to PR →
review-agent reviews all proposed actions → after merge, gardener-run.sh `PHASE:awaiting_ci` (stays alive) → CI pass → `PHASE:awaiting_review`
executes manifest actions via API → `PHASE:done`. review feedback → address + re-signal → merge → gardener-run.sh executes
manifest actions via API → `PHASE:done`. On `PHASE:escalate`: session killed
immediately.

View file

@ -1,4 +1,4 @@
<!-- last-reviewed: 251d160e213b19a4fcc0cd8f8e3be9ea3283887f --> <!-- last-reviewed: c9bf9fe5281c4037fd3f2219fde093dcbb053e00 -->
# Shared Helpers (`lib/`) # Shared Helpers (`lib/`)
All agents source `lib/env.sh` as their first action. Additional helpers are All agents source `lib/env.sh` as their first action. Additional helpers are
@ -7,12 +7,12 @@ sourced as needed.
| File | What it provides | Sourced by | | File | What it provides | Sourced by |
|---|---|---| |---|---|---|
| `lib/env.sh` | Loads `.env`, sets `FACTORY_ROOT`, exports project config (`CODEBERG_REPO`, `PROJECT_NAME`, etc.), defines `log()`, `codeberg_api()`, `codeberg_api_all()` (accepts optional second TOKEN parameter, defaults to `$CODEBERG_TOKEN`), `woodpecker_api()`, `wpdb()`, `matrix_send()`, `matrix_send_ctx()`. Auto-loads project TOML if `PROJECT_TOML` is set. | Every agent | | `lib/env.sh` | Loads `.env`, sets `FACTORY_ROOT`, exports project config (`CODEBERG_REPO`, `PROJECT_NAME`, etc.), defines `log()`, `codeberg_api()`, `codeberg_api_all()` (accepts optional second TOKEN parameter, defaults to `$CODEBERG_TOKEN`), `woodpecker_api()`, `wpdb()`, `matrix_send()`, `matrix_send_ctx()`. Auto-loads project TOML if `PROJECT_TOML` is set. | Every agent |
| `lib/ci-helpers.sh` | `ci_passed()` — returns 0 if CI state is "success" (or no CI configured). `ci_required_for_pr()` — returns 0 if PR has code files (CI required), 1 if non-code only (CI not required). `is_infra_step()` — returns 0 if a single CI step failure matches infra heuristics (clone/git exit 128, any exit 137, log timeout patterns). `classify_pipeline_failure()` — returns "infra \<reason>" if any failed Woodpecker step matches infra heuristics via `is_infra_step()`, else "code". | dev-poll, review-poll, review-pr, supervisor-poll | | `lib/ci-helpers.sh` | `ci_passed()` — returns 0 if CI state is "success" (or no CI configured). `ci_required_for_pr()` — returns 0 if PR has code files (CI required), 1 if non-code only (CI not required). `is_infra_step()` — returns 0 if a single CI step failure matches infra heuristics (clone/git exit 128, any exit 137, log timeout patterns). `classify_pipeline_failure()` — returns "infra \<reason>" if any failed Woodpecker step matches infra heuristics via `is_infra_step()`, else "code". `ensure_priority_label()` — looks up (or creates) the `priority` label and returns its ID; caches in `_PRIORITY_LABEL_ID`. | dev-poll, review-poll, review-pr, supervisor-poll |
| `lib/ci-debug.sh` | CLI tool for Woodpecker CI: `list`, `status`, `logs`, `failures` subcommands. Not sourced — run directly. | Humans / dev-agent (tool access) | | `lib/ci-debug.sh` | CLI tool for Woodpecker CI: `list`, `status`, `logs`, `failures` subcommands. Not sourced — run directly. | Humans / dev-agent (tool access) |
| `lib/load-project.sh` | Parses a `projects/*.toml` file into env vars (`PROJECT_NAME`, `CODEBERG_REPO`, `WOODPECKER_REPO_ID`, monitoring toggles, Matrix config, etc.). | env.sh (when `PROJECT_TOML` is set), supervisor-poll (per-project iteration) | | `lib/load-project.sh` | Parses a `projects/*.toml` file into env vars (`PROJECT_NAME`, `CODEBERG_REPO`, `WOODPECKER_REPO_ID`, monitoring toggles, Matrix config, etc.). | env.sh (when `PROJECT_TOML` is set), supervisor-poll (per-project iteration) |
| `lib/parse-deps.sh` | Extracts dependency issue numbers from an issue body (stdin → stdout, one number per line). Matches `## Dependencies` / `## Depends on` / `## Blocked by` sections and inline `depends on #N` patterns. Not sourced — executed via `bash lib/parse-deps.sh`. | dev-poll, supervisor-poll | | `lib/parse-deps.sh` | Extracts dependency issue numbers from an issue body (stdin → stdout, one number per line). Matches `## Dependencies` / `## Depends on` / `## Blocked by` sections and inline `depends on #N` patterns. Not sourced — executed via `bash lib/parse-deps.sh`. | dev-poll, supervisor-poll |
| `lib/matrix_listener.sh` | Long-poll Matrix sync daemon. Dispatches thread replies to the correct agent via tmux session injection (dev, action, vault, review) or well-known files (`/tmp/{agent}-escalation-reply` for supervisor/gardener). Handles all agent reply routing. Run as systemd service. | Standalone daemon | | `lib/matrix_listener.sh` | Long-poll Matrix sync daemon. Dispatches thread replies to the correct agent via tmux session injection (dev, action, vault, review) or well-known files (`/tmp/{agent}-escalation-reply` for supervisor/gardener). Handles all agent reply routing. Run as systemd service. | Standalone daemon |
| `lib/formula-session.sh` | `acquire_cron_lock()`, `check_memory()`, `load_formula()`, `build_context_block()`, `consume_escalation_reply()`, `start_formula_session()`, `formula_phase_callback()`, `build_prompt_footer()`, `run_formula_and_monitor()` — shared helpers for formula-driven cron agents (lock, memory guard, formula loading, prompt assembly, tmux session, monitor loop, crash recovery). `formula_phase_callback()` handles `PHASE:escalate` (unified escalation path — kills the session; callers may follow up via Matrix). | planner-run.sh, predictor-run.sh, gardener-run.sh, supervisor-run.sh, dev-agent.sh, action-agent.sh | | `lib/formula-session.sh` | `acquire_cron_lock()`, `check_memory()`, `load_formula()`, `build_context_block()`, `consume_escalation_reply()`, `start_formula_session()`, `formula_phase_callback()`, `build_prompt_footer()`, `run_formula_and_monitor(AGENT [TIMEOUT] [CALLBACK])` — shared helpers for formula-driven cron agents (lock, memory guard, formula loading, prompt assembly, tmux session, monitor loop, crash recovery). `formula_phase_callback()` handles `PHASE:escalate` (unified escalation path — kills the session; callers may follow up via Matrix). `run_formula_and_monitor` accepts an optional CALLBACK (default: `formula_phase_callback`) so callers can install custom merge-through or escalation handlers. | planner-run.sh, predictor-run.sh, gardener-run.sh, supervisor-run.sh, dev-agent.sh, action-agent.sh |
| `lib/secret-scan.sh` | `scan_for_secrets()` — detects potential secrets (API keys, bearer tokens, private keys, URLs with embedded credentials) in text; returns 1 if secrets found. `redact_secrets()` — replaces detected secret patterns with `[REDACTED]`. | file-action-issue.sh, phase-handler.sh | | `lib/secret-scan.sh` | `scan_for_secrets()` — detects potential secrets (API keys, bearer tokens, private keys, URLs with embedded credentials) in text; returns 1 if secrets found. `redact_secrets()` — replaces detected secret patterns with `[REDACTED]`. | file-action-issue.sh, phase-handler.sh |
| `lib/file-action-issue.sh` | `file_action_issue()` — dedup check, secret scan, label lookup, and issue creation for formula-driven cron wrappers. Sets `FILED_ISSUE_NUM` on success. Returns 4 if secrets detected in body. | (available for future use) | | `lib/file-action-issue.sh` | `file_action_issue()` — dedup check, secret scan, label lookup, and issue creation for formula-driven cron wrappers. Sets `FILED_ISSUE_NUM` on success. Returns 4 if secrets detected in body. | (available for future use) |
| `lib/agent-session.sh` | Shared tmux + Claude session helpers: `create_agent_session()`, `inject_formula()`, `agent_wait_for_claude_ready()`, `agent_inject_into_session()`, `agent_kill_session()`, `monitor_phase_loop()`, `read_phase()`, `write_compact_context()`. `create_agent_session(session, workdir, [phase_file])` optionally installs a PostToolUse hook (matcher `Bash\|Write`) that detects phase file writes in real-time — when Claude writes to the phase file, the hook writes a marker so `monitor_phase_loop` reacts on the next poll instead of waiting for mtime changes. Also installs a StopFailure hook (matcher `rate_limit\|server_error\|authentication_failed\|billing_error`) that writes `PHASE:failed` with an `api_error` reason to the phase file and touches the phase-changed marker, so the orchestrator discovers API errors within one poll cycle instead of waiting for idle timeout. Also installs a SessionStart hook (matcher `compact`) that re-injects phase protocol instructions after context compaction — callers write the context file via `write_compact_context(phase_file, content)`, and the hook (`on-compact-reinject.sh`) outputs the file content to stdout so Claude retains critical instructions. When `MATRIX_THREAD_ID` is exported, also installs a Stop hook (`on-stop-matrix.sh`) that streams each Claude response to the Matrix thread. The PreToolUse guard hook (`on-pretooluse-guard.sh`) receives the session name as a third argument — formula agents (`gardener-*`, `planner-*`, `predictor-*`, `supervisor-*`) are identified this way and allowed to access `FACTORY_ROOT` from worktrees (they need env.sh, AGENTS.md, formulas/, lib/). `monitor_phase_loop` sets `_MONITOR_LOOP_EXIT` to one of: `done`, `idle_timeout`, `idle_prompt` (Claude returned to `>` for 3 consecutive polls without writing any phase — callback invoked with `PHASE:failed`, session already dead), `crashed`, or `PHASE:escalate` / other `PHASE:*` string. **Unified escalation**: `PHASE:escalate` is the signal that a session needs human input (renamed from `PHASE:needs_human`). **Callers must handle `idle_prompt`** in both their callback and their post-loop exit handler — see [`docs/PHASE-PROTOCOL.md` idle_prompt](docs/PHASE-PROTOCOL.md#idle_prompt-exit-reason) for the full contract. | dev-agent.sh, action-agent.sh | | `lib/agent-session.sh` | Shared tmux + Claude session helpers: `create_agent_session()`, `inject_formula()`, `agent_wait_for_claude_ready()`, `agent_inject_into_session()`, `agent_kill_session()`, `monitor_phase_loop()`, `read_phase()`, `write_compact_context()`. `create_agent_session(session, workdir, [phase_file])` optionally installs a PostToolUse hook (matcher `Bash\|Write`) that detects phase file writes in real-time — when Claude writes to the phase file, the hook writes a marker so `monitor_phase_loop` reacts on the next poll instead of waiting for mtime changes. Also installs a StopFailure hook (matcher `rate_limit\|server_error\|authentication_failed\|billing_error`) that writes `PHASE:failed` with an `api_error` reason to the phase file and touches the phase-changed marker, so the orchestrator discovers API errors within one poll cycle instead of waiting for idle timeout. Also installs a SessionStart hook (matcher `compact`) that re-injects phase protocol instructions after context compaction — callers write the context file via `write_compact_context(phase_file, content)`, and the hook (`on-compact-reinject.sh`) outputs the file content to stdout so Claude retains critical instructions. When `MATRIX_THREAD_ID` is exported, also installs a Stop hook (`on-stop-matrix.sh`) that streams each Claude response to the Matrix thread. The PreToolUse guard hook (`on-pretooluse-guard.sh`) receives the session name as a third argument — formula agents (`gardener-*`, `planner-*`, `predictor-*`, `supervisor-*`) are identified this way and allowed to access `FACTORY_ROOT` from worktrees (they need env.sh, AGENTS.md, formulas/, lib/). `monitor_phase_loop` sets `_MONITOR_LOOP_EXIT` to one of: `done`, `idle_timeout`, `idle_prompt` (Claude returned to `>` for 3 consecutive polls without writing any phase — callback invoked with `PHASE:failed`, session already dead), `crashed`, or `PHASE:escalate` / other `PHASE:*` string. **Unified escalation**: `PHASE:escalate` is the signal that a session needs human input (renamed from `PHASE:needs_human`). **Callers must handle `idle_prompt`** in both their callback and their post-loop exit handler — see [`docs/PHASE-PROTOCOL.md` idle_prompt](docs/PHASE-PROTOCOL.md#idle_prompt-exit-reason) for the full contract. | dev-agent.sh, action-agent.sh |

View file

@ -1,4 +1,4 @@
<!-- last-reviewed: 251d160e213b19a4fcc0cd8f8e3be9ea3283887f --> <!-- last-reviewed: c9bf9fe5281c4037fd3f2219fde093dcbb053e00 -->
# Planner Agent # Planner Agent
**Role**: Strategic planning using a Prerequisite Tree (Theory of Constraints), **Role**: Strategic planning using a Prerequisite Tree (Theory of Constraints),

View file

@ -1,4 +1,4 @@
<!-- last-reviewed: 251d160e213b19a4fcc0cd8f8e3be9ea3283887f --> <!-- last-reviewed: c9bf9fe5281c4037fd3f2219fde093dcbb053e00 -->
# Predictor Agent # Predictor Agent
**Role**: Risk oracle and opportunity spotter (the "goblin"). Runs a 3-step **Role**: Risk oracle and opportunity spotter (the "goblin"). Runs a 3-step

View file

@ -1,4 +1,4 @@
<!-- last-reviewed: 251d160e213b19a4fcc0cd8f8e3be9ea3283887f --> <!-- last-reviewed: c9bf9fe5281c4037fd3f2219fde093dcbb053e00 -->
# Review Agent # Review Agent
**Role**: AI-powered PR review — post structured findings and formal **Role**: AI-powered PR review — post structured findings and formal

View file

@ -1,4 +1,4 @@
<!-- last-reviewed: 251d160e213b19a4fcc0cd8f8e3be9ea3283887f --> <!-- last-reviewed: c9bf9fe5281c4037fd3f2219fde093dcbb053e00 -->
# Supervisor Agent # Supervisor Agent
**Role**: Health monitoring and auto-remediation, executed as a formula-driven **Role**: Health monitoring and auto-remediation, executed as a formula-driven

View file

@ -1,4 +1,4 @@
<!-- last-reviewed: 251d160e213b19a4fcc0cd8f8e3be9ea3283887f --> <!-- last-reviewed: c9bf9fe5281c4037fd3f2219fde093dcbb053e00 -->
# Vault Agent # Vault Agent
**Role**: Dual-purpose gate — action safety classification and resource procurement. **Role**: Dual-purpose gate — action safety classification and resource procurement.