Merge pull request 'fix: fix: gardener runs as cron-driven formula — runtime wrapper (#246)' (#381) from fix/issue-246 into main

This commit is contained in:
johba 2026-03-20 14:29:03 +01:00
commit 3f93554430
7 changed files with 171 additions and 46 deletions

View file

@ -91,7 +91,7 @@ echo "=== 2/2 Function resolution ==="
# Functions provided by shared lib files (available to all agent scripts via source)
LIB_FUNS=$(
for f in lib/agent-session.sh lib/env.sh lib/ci-helpers.sh lib/load-project.sh; do
for f in lib/agent-session.sh lib/env.sh lib/ci-helpers.sh lib/load-project.sh lib/file-action-issue.sh; do
if [ -f "$f" ]; then get_fns "$f"; fi
done | sort -u
)
@ -159,6 +159,7 @@ check_script dev/dev-poll.sh
check_script dev/phase-test.sh
check_script gardener/gardener-agent.sh lib/agent-session.sh
check_script gardener/gardener-poll.sh
check_script gardener/gardener-run.sh
check_script review/review-pr.sh
check_script review/review-poll.sh
check_script planner/planner-poll.sh

View file

@ -16,7 +16,8 @@ See `README.md` for the full architecture and `BOOTSTRAP.md` for setup.
disinto/
├── dev/ dev-poll.sh, dev-agent.sh, phase-handler.sh — issue implementation
├── review/ review-poll.sh, review-pr.sh — PR review
├── gardener/ gardener-poll.sh, gardener-agent.sh — backlog grooming
├── gardener/ gardener-run.sh — files action issue for run-gardener formula
│ gardener-poll.sh, gardener-agent.sh — recipe engine + grooming
├── planner/ planner-poll.sh — files action issue for run-planner formula
│ prediction-poll.sh, prediction-agent.sh — evidence-based predictions
├── supervisor/ supervisor-poll.sh — health monitoring
@ -111,12 +112,17 @@ spawns `review-pr.sh <pr-number>`.
criteria, oversized issues, stale issues, and circular dependencies. Invoke
Claude to fix or escalate to a human via Matrix.
**Trigger**: `gardener-poll.sh` runs daily (or 2x/day) via cron. Accepts an
optional project TOML argument.
**Trigger**: `gardener-run.sh` runs 2x/day via cron. It files an `action`
issue referencing `formulas/run-gardener.toml`; the [action-agent](#action-action)
picks it up and executes the gardener steps in an interactive Claude tmux session.
Accepts an optional project TOML argument (configures which project the action
issue is filed against).
**Key files**:
- `gardener/gardener-poll.sh` — Cron wrapper: lock, escalation-reply injection for dev sessions, calls `gardener-agent.sh`, then processes dev-agent CI escalations via recipe engine
- `gardener/gardener-run.sh` — Cron wrapper: lock, memory guard, dedup check, files action issue
- `gardener/gardener-poll.sh` — Recipe engine: escalation-reply injection for dev sessions, processes dev-agent CI escalations via recipe engine (invoked by formula step ci-escalation-recipes)
- `gardener/gardener-agent.sh` — Orchestrator: bash pre-analysis, creates tmux session (`gardener-{project}`) with interactive `claude`, monitors phase file, parses result file (ACTION:/DUST:/ESCALATE), handles dust bundling
- `formulas/run-gardener.toml` — Execution spec: preflight, grooming, blocked-review, CI escalation recipes, agents-update, commit-and-pr
**Environment variables consumed**:
- `CODEBERG_TOKEN`, `CODEBERG_REPO`, `CODEBERG_API`, `PROJECT_NAME`, `PROJECT_REPO_ROOT`
@ -278,6 +284,7 @@ sourced as needed.
| `lib/load-project.sh` | Parses a `projects/*.toml` file into env vars (`PROJECT_NAME`, `CODEBERG_REPO`, `WOODPECKER_REPO_ID`, monitoring toggles, Matrix config, etc.). | env.sh (when `PROJECT_TOML` is set), supervisor-poll (per-project iteration) |
| `lib/parse-deps.sh` | Extracts dependency issue numbers from an issue body (stdin → stdout, one number per line). Matches `## Dependencies` / `## Depends on` / `## Blocked by` sections and inline `depends on #N` patterns. Not sourced — executed via `bash lib/parse-deps.sh`. | dev-poll, supervisor-poll |
| `lib/matrix_listener.sh` | Long-poll Matrix sync daemon. Dispatches thread replies to the correct agent via well-known files (`/tmp/{agent}-escalation-reply`). Handles supervisor, gardener, dev, review, vault, and action reply routing. Run as systemd service. | Standalone daemon |
| `lib/file-action-issue.sh` | `file_action_issue()` — dedup check, label lookup, and issue creation for formula-driven cron wrappers. Sets `FILED_ISSUE_NUM` on success. | gardener-run.sh, planner-poll.sh |
| `lib/agent-session.sh` | Shared tmux + Claude session helpers: `create_agent_session()`, `inject_formula()`, `agent_wait_for_claude_ready()`, `agent_inject_into_session()`, `agent_kill_session()`, `monitor_phase_loop()`, `read_phase()`. `create_agent_session(session, workdir, [phase_file])` optionally installs a PostToolUse hook (matcher `Bash\|Write`) that detects phase file writes in real-time — when Claude writes to the phase file, the hook writes a marker so `monitor_phase_loop` reacts on the next poll instead of waiting for mtime changes. Also installs a StopFailure hook (matcher `rate_limit\|server_error\|authentication_failed\|billing_error`) that writes `PHASE:failed` with an `api_error` reason to the phase file and touches the phase-changed marker, so the orchestrator discovers API errors within one poll cycle instead of waiting for idle timeout. When `MATRIX_THREAD_ID` is exported, also installs a Stop hook (`on-stop-matrix.sh`) that streams each Claude response to the Matrix thread. `monitor_phase_loop` sets `_MONITOR_LOOP_EXIT` to one of: `done`, `idle_timeout`, `idle_prompt` (Claude returned to `` for 3 consecutive polls without writing any phase — callback invoked with `PHASE:failed`, session already dead), `crashed`, or a `PHASE:*` string. Agents must handle `idle_prompt` in both their callback and their post-loop exit handler. | dev-agent.sh, gardener-agent.sh, action-agent.sh |
---

View file

@ -191,7 +191,7 @@ needs = ["grooming"]
id = "ci-escalation-recipes"
title = "CI escalation recipes (bash — gardener-poll.sh)"
executor = "bash"
script = "gardener/gardener-poll.sh"
script = "gardener/gardener-poll.sh --recipes-only"
description = """
NOT a Claude step executed by gardener-poll.sh before/after the Claude session.
Documented here so the formula covers the full gardener run.

View file

@ -25,8 +25,16 @@ set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
FACTORY_ROOT="$(dirname "$SCRIPT_DIR")"
# --recipes-only: skip grooming (used by formula ci-escalation-recipes step
# to avoid double-running grooming which the formula handles as its own step)
RECIPES_ONLY=0
if [ "${1:-}" = "--recipes-only" ]; then
RECIPES_ONLY=1
shift
fi
# Load shared environment (with optional project TOML override)
# Usage: gardener-poll.sh [projects/harb.toml]
# Usage: gardener-poll.sh [--recipes-only] [projects/harb.toml]
export PROJECT_TOML="${1:-}"
# shellcheck source=../lib/env.sh
source "$FACTORY_ROOT/lib/env.sh"
@ -108,8 +116,13 @@ Instructions:
done
# ── Backlog grooming (delegated to gardener-agent.sh) ────────────────────
# Skipped with --recipes-only (formula's grooming step handles this)
if [ "$RECIPES_ONLY" -eq 0 ]; then
log "Invoking gardener-agent.sh for backlog grooming"
bash "$SCRIPT_DIR/gardener-agent.sh" "${1:-}" || log "WARNING: gardener-agent.sh exited with error"
else
log "Skipping grooming (--recipes-only mode)"
fi
# ── Recipe matching engine ────────────────────────────────────────────────

75
gardener/gardener-run.sh Executable file
View file

@ -0,0 +1,75 @@
#!/usr/bin/env bash
# =============================================================================
# gardener-run.sh — Cron wrapper: files action issue for run-gardener formula
#
# Runs 2x/day (or on-demand). Guards against concurrent runs and low memory.
# Files an action issue referencing formulas/run-gardener.toml; the action-agent
# picks it up and executes the gardener steps in an interactive Claude session.
# =============================================================================
set -euo pipefail
FACTORY_ROOT="$(cd "$(dirname "$0")/.." && pwd)"
# Load shared environment (with optional project TOML override)
# Usage: gardener-run.sh [projects/harb.toml]
export PROJECT_TOML="${1:-}"
# shellcheck source=../lib/env.sh
source "$FACTORY_ROOT/lib/env.sh"
# shellcheck source=../lib/file-action-issue.sh
source "$FACTORY_ROOT/lib/file-action-issue.sh"
LOG_FILE="$FACTORY_ROOT/gardener/gardener.log"
LOCK_FILE="/tmp/gardener-run.lock"
log() { echo "[$(date -u +%Y-%m-%dT%H:%M:%S)Z] $*" >> "$LOG_FILE"; }
# ── Lock ──────────────────────────────────────────────────────────────────
if [ -f "$LOCK_FILE" ]; then
LOCK_PID=$(cat "$LOCK_FILE" 2>/dev/null || true)
if [ -n "$LOCK_PID" ] && kill -0 "$LOCK_PID" 2>/dev/null; then
log "run: gardener-run running (PID $LOCK_PID)"
exit 0
fi
rm -f "$LOCK_FILE"
fi
echo $$ > "$LOCK_FILE"
trap 'rm -f "$LOCK_FILE"' EXIT
# ── Memory guard ──────────────────────────────────────────────────────────
AVAIL_MB=$(free -m | awk '/Mem:/{print $7}')
if [ "${AVAIL_MB:-0}" -lt 2000 ]; then
log "run: skipping — only ${AVAIL_MB}MB available (need 2000)"
exit 0
fi
log "--- Gardener run start ---"
# ── File action issue for run-gardener formula ────────────────────────────
ISSUE_BODY="---
formula: run-gardener
model: opus
---
Periodic gardener housekeeping run. The action-agent reads \`formulas/run-gardener.toml\`
and executes the steps: preflight, grooming, blocked-review, CI escalation recipes,
AGENTS.md update, and commit-and-pr.
Filed automatically by \`gardener-run.sh\`."
_rc=0
file_action_issue "run-gardener" "action: run-gardener — periodic housekeeping" "$ISSUE_BODY" || _rc=$?
case "$_rc" in
0) ;;
1) log "run: open run-gardener action issue already exists — skipping"
log "--- Gardener run done ---"
exit 0 ;;
2) log "ERROR: 'action' label not found — cannot file gardener issue"
exit 1 ;;
*) log "ERROR: failed to create action issue for run-gardener"
exit 1 ;;
esac
log "Filed action issue #${FILED_ISSUE_NUM} for run-gardener formula"
matrix_send "gardener" "Filed action #${FILED_ISSUE_NUM}: run-gardener — periodic housekeeping" 2>/dev/null || true
log "--- Gardener run done ---"

49
lib/file-action-issue.sh Normal file
View file

@ -0,0 +1,49 @@
#!/usr/bin/env bash
# file-action-issue.sh — File an action issue for a formula run
#
# Usage: source this file, then call file_action_issue.
# Requires: codeberg_api() from lib/env.sh, jq
#
# file_action_issue <formula_name> <title> <body>
# Sets FILED_ISSUE_NUM on success.
# Returns: 0=created, 1=duplicate exists, 2=label not found, 3=API error
file_action_issue() {
local formula_name="$1" title="$2" body="$3"
FILED_ISSUE_NUM=""
# Dedup: skip if an open action issue for this formula already exists
local open_actions
open_actions=$(codeberg_api GET "/issues?state=open&type=issues&labels=action&limit=50" 2>/dev/null || true)
if [ -n "$open_actions" ] && [ "$open_actions" != "null" ]; then
local existing
existing=$(printf '%s' "$open_actions" | \
jq --arg f "$formula_name" '[.[] | select(.title | test($f))] | length' 2>/dev/null || echo 0)
if [ "${existing:-0}" -gt 0 ]; then
return 1
fi
fi
# Fetch 'action' label ID
local action_label_id
action_label_id=$(codeberg_api GET "/labels" 2>/dev/null | \
jq -r '.[] | select(.name == "action") | .id' 2>/dev/null || true)
if [ -z "$action_label_id" ]; then
return 2
fi
# Create the issue
local payload result
payload=$(jq -nc \
--arg title "$title" \
--arg body "$body" \
--argjson labels "[$action_label_id]" \
'{title: $title, body: $body, labels: $labels}')
result=$(codeberg_api POST "/issues" -d "$payload" 2>/dev/null || true)
FILED_ISSUE_NUM=$(printf '%s' "$result" | jq -r '.number // empty' 2>/dev/null || true)
if [ -z "$FILED_ISSUE_NUM" ]; then
return 3
fi
}

View file

@ -13,6 +13,8 @@ FACTORY_ROOT="$(dirname "$SCRIPT_DIR")"
# shellcheck source=../lib/env.sh
source "$FACTORY_ROOT/lib/env.sh"
# shellcheck source=../lib/file-action-issue.sh
source "$FACTORY_ROOT/lib/file-action-issue.sh"
LOG_FILE="$SCRIPT_DIR/planner.log"
LOCK_FILE="/tmp/planner-poll.lock"
@ -40,28 +42,7 @@ fi
log "--- Planner poll start ---"
# ── Dedup: skip if an open run-planner action issue already exists ────────
OPEN_ACTIONS=$(codeberg_api GET "/issues?state=open&type=issues&labels=action&limit=50" 2>/dev/null || true)
if [ -n "$OPEN_ACTIONS" ] && [ "$OPEN_ACTIONS" != "null" ]; then
EXISTING=$(printf '%s' "$OPEN_ACTIONS" | \
jq '[.[] | select(.title | test("run-planner"))] | length' 2>/dev/null || echo 0)
if [ "${EXISTING:-0}" -gt 0 ]; then
log "poll: open run-planner action issue already exists — skipping"
log "--- Planner poll done ---"
exit 0
fi
fi
# ── Fetch 'action' label ID ──────────────────────────────────────────────
ACTION_LABEL_ID=$(codeberg_api GET "/labels" 2>/dev/null | \
jq -r '.[] | select(.name == "action") | .id' 2>/dev/null || true)
if [ -z "$ACTION_LABEL_ID" ]; then
log "ERROR: 'action' label not found — cannot file planner issue"
exit 1
fi
# ── File action issue ─────────────────────────────────────────────────────
# ── File action issue for run-planner formula ─────────────────────────────
ISSUE_BODY="---
formula: run-planner
model: opus
@ -73,21 +54,20 @@ strategic planning (resource+leverage gap analysis), and memory update.
Filed automatically by \`planner-poll.sh\`."
PAYLOAD=$(jq -nc \
--arg title "action: run-planner — periodic strategic planning" \
--arg body "$ISSUE_BODY" \
--argjson labels "[$ACTION_LABEL_ID]" \
'{title: $title, body: $body, labels: $labels}')
_rc=0
file_action_issue "run-planner" "action: run-planner — periodic strategic planning" "$ISSUE_BODY" || _rc=$?
case "$_rc" in
0) ;;
1) log "poll: open run-planner action issue already exists — skipping"
log "--- Planner poll done ---"
exit 0 ;;
2) log "ERROR: 'action' label not found — cannot file planner issue"
exit 1 ;;
*) log "ERROR: failed to create action issue for run-planner"
exit 1 ;;
esac
RESULT=$(codeberg_api POST "/issues" -d "$PAYLOAD" 2>/dev/null || true)
ISSUE_NUM=$(printf '%s' "$RESULT" | jq -r '.number // empty' 2>/dev/null || true)
if [ -z "$ISSUE_NUM" ]; then
log "ERROR: failed to create action issue for run-planner"
exit 1
fi
log "Filed action issue #${ISSUE_NUM} for run-planner formula"
matrix_send "planner" "Filed action #${ISSUE_NUM}: run-planner — periodic strategic planning" 2>/dev/null || true
log "Filed action issue #${FILED_ISSUE_NUM} for run-planner formula"
matrix_send "planner" "Filed action #${FILED_ISSUE_NUM}: run-planner — periodic strategic planning" 2>/dev/null || true
log "--- Planner poll done ---"