From a15623b747cb29012bb060ba7b98e12412a5a4db Mon Sep 17 00:00:00 2001 From: johba Date: Fri, 20 Mar 2026 17:39:44 +0100 Subject: [PATCH] =?UTF-8?q?fix:=20fix:=20action-agent=20shares=20phase=20h?= =?UTF-8?q?andler=20with=20dev-agent=20=E2=80=94=20review=20lifecycle=20+?= =?UTF-8?q?=20cleanup=20(#388)=20(#403)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Fixes #388 ## Changes Action-agent now sources dev/phase-handler.sh and enters monitor_phase_loop after prompt injection. Two paths: (A) git output triggers the same PR/CI/review lifecycle as dev-agent, (B) no-git output writes PHASE:done for cleanup. Adds docker compose down on terminal phases, escalation to supervisor on idle timeout, and proper temp file cleanup. Co-authored-by: openhands Reviewed-on: https://codeberg.org/johba/disinto/pulls/403 Reviewed-by: Disinto_bot --- .woodpecker/agent-smoke.sh | 2 +- AGENTS.md | 18 ++--- action/action-agent.sh | 137 ++++++++++++++++++++++++++----------- dev/phase-handler.sh | 106 ++++++++++++++++++++++++---- 4 files changed, 198 insertions(+), 65 deletions(-) diff --git a/.woodpecker/agent-smoke.sh b/.woodpecker/agent-smoke.sh index a6a1bda..87e148d 100644 --- a/.woodpecker/agent-smoke.sh +++ b/.woodpecker/agent-smoke.sh @@ -170,7 +170,7 @@ check_script vault/vault-fire.sh check_script vault/vault-poll.sh check_script vault/vault-reject.sh check_script action/action-poll.sh -check_script action/action-agent.sh +check_script action/action-agent.sh dev/phase-handler.sh echo "function resolution check done" diff --git a/AGENTS.md b/AGENTS.md index 48a9b0e..10ab265 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -227,9 +227,9 @@ later triages these predictions into action/backlog issues or dismisses them. ### Action (`action/`) **Role**: Execute operational tasks described by action formulas — run scripts, -call APIs, send messages, collect human approval. Unlike the dev-agent, the -action-agent produces no PRs: Claude closes the issue directly after executing -all formula steps. +call APIs, send messages, collect human approval. Shares the same phase handler +as the dev-agent: if an action produces code changes, the orchestrator creates a +PR and drives the CI/review loop; otherwise Claude closes the issue directly. **Trigger**: `action-poll.sh` runs every 10 min via cron. It scans for open issues labeled `action` that have no active tmux session, then spawns @@ -237,17 +237,17 @@ issues labeled `action` that have no active tmux session, then spawns **Key files**: - `action/action-poll.sh` — Cron scheduler: finds open action issues with no active tmux session, spawns action-agent.sh -- `action/action-agent.sh` — Orchestrator: fetches issue body + prior comments, creates tmux session (`action-{issue_num}`) with interactive `claude`, injects formula prompt, monitors session until Claude exits or 4h idle timeout +- `action/action-agent.sh` — Orchestrator: fetches issue body + prior comments, creates tmux session (`action-{issue_num}`) with interactive `claude`, injects formula prompt with phase protocol, enters `monitor_phase_loop` (shared via `dev/phase-handler.sh`) for CI/review lifecycle or direct completion **Session lifecycle**: 1. `action-poll.sh` finds open `action` issues with no active tmux session. 2. Spawns `action-agent.sh `. 3. Agent creates Matrix thread, exports `MATRIX_THREAD_ID` so Claude's output streams to the thread via a Stop hook (`on-stop-matrix.sh`). -4. Agent creates tmux session `action-{issue_num}`, injects prompt (formula + prior comments). -5. Claude executes formula steps using Bash and other tools, posts progress as issue comments. Each Claude turn is streamed to the Matrix thread for real-time human visibility. -6. For human input: Claude sends a Matrix message and waits; the reply is injected into the session by `matrix_listener.sh`. -7. When complete: Claude closes the issue with a summary comment. Session exits. -8. Poll detects no active session on next run — nothing further to do. +4. Agent creates tmux session `action-{issue_num}`, injects prompt (formula + prior comments + phase protocol). +5. Agent enters `monitor_phase_loop` (shared with dev-agent via `dev/phase-handler.sh`). +6. **Path A (git output):** Claude pushes branch → `PHASE:awaiting_ci` → handler creates PR, polls CI → injects failures → Claude fixes → push → re-poll → CI passes → `PHASE:awaiting_review` → handler polls reviews → injects REQUEST_CHANGES → Claude fixes → approved → merge → cleanup. +7. **Path B (no git output):** Claude posts results as comment, closes issue → `PHASE:done` → handler cleans up (kill session, docker compose down, remove temp files). +8. For human input: Claude sends a Matrix message and waits; the reply is injected into the session by `matrix_listener.sh`. **Environment variables consumed**: - `CODEBERG_TOKEN`, `CODEBERG_REPO`, `CODEBERG_API`, `PROJECT_NAME`, `CODEBERG_WEB` diff --git a/action/action-agent.sh b/action/action-agent.sh index cca560b..80fe835 100644 --- a/action/action-agent.sh +++ b/action/action-agent.sh @@ -6,10 +6,13 @@ # Lifecycle: # 1. Fetch issue body (action formula) + existing comments # 2. Create tmux session: action-{issue_num} with interactive claude -# 3. Inject initial prompt: formula + comments + instructions -# 4. Claude executes formula steps, posts progress comments, closes issue +# 3. Inject initial prompt: formula + comments + phase protocol instructions +# 4. Monitor phase file via monitor_phase_loop (shared with dev-agent) +# Path A (git output): Claude pushes → handler creates PR → CI poll → review +# injection → merge → cleanup (same loop as dev-agent via phase-handler.sh) +# Path B (no git output): Claude posts results → PHASE:done → cleanup # 5. For human input: Claude asks via Matrix; reply injected via matrix_listener -# 6. Monitor session until Claude exits or idle timeout reached +# 6. Cleanup on terminal phase: kill tmux, docker compose down, remove temp files # # Session: action-{issue_num} (tmux) # Log: action/action-poll-{project}.log @@ -21,16 +24,52 @@ export PROJECT_TOML="${2:-${PROJECT_TOML:-}}" source "$(dirname "$0")/../lib/env.sh" source "$(dirname "$0")/../lib/agent-session.sh" +# shellcheck source=../dev/phase-handler.sh +source "$(dirname "$0")/../dev/phase-handler.sh" SESSION_NAME="action-${ISSUE}" LOCKFILE="/tmp/action-agent-${ISSUE}.lock" LOGFILE="${FACTORY_ROOT}/action/action-poll-${PROJECT_NAME:-harb}.log" THREAD_FILE="/tmp/action-thread-${ISSUE}" IDLE_TIMEOUT="${ACTION_IDLE_TIMEOUT:-14400}" # 4h default +# --- Phase handler globals (agent-specific; defaults in phase-handler.sh) --- +# shellcheck disable=SC2034 # used by phase-handler.sh +API="${CODEBERG_API}" +BRANCH="action/issue-${ISSUE}" +# shellcheck disable=SC2034 # used by phase-handler.sh +WORKTREE="${PROJECT_REPO_ROOT}" +PHASE_FILE="/tmp/action-session-${PROJECT_NAME:-harb}-${ISSUE}.phase" +IMPL_SUMMARY_FILE="/tmp/action-impl-summary-${PROJECT_NAME:-harb}-${ISSUE}.txt" +PREFLIGHT_RESULT="/tmp/action-preflight-${ISSUE}.json" + log() { - printf '[%s] #%s %s\n' "$(date -u '+%Y-%m-%d %H:%M:%S UTC')" "$ISSUE" "$*" >> "$LOGFILE" + printf '[%s] action#%s %s\n' "$(date -u '+%Y-%m-%d %H:%M:%S UTC')" "$ISSUE" "$*" >> "$LOGFILE" } +notify() { + local thread_id="" + [ -f "${THREAD_FILE:-}" ] && thread_id=$(cat "$THREAD_FILE" 2>/dev/null || true) + matrix_send "action" "⚡ #${ISSUE}: $*" "${thread_id}" 2>/dev/null || true +} + +notify_ctx() { + local plain="$1" html="$2" thread_id="" + [ -f "${THREAD_FILE:-}" ] && thread_id=$(cat "$THREAD_FILE" 2>/dev/null || true) + if [ -n "$thread_id" ]; then + matrix_send_ctx "action" "⚡ #${ISSUE}: ${plain}" "⚡ #${ISSUE}: ${html}" "${thread_id}" 2>/dev/null || true + else + matrix_send "action" "⚡ #${ISSUE}: ${plain}" "" "${ISSUE}" 2>/dev/null || true + fi +} + +status() { + log "$*" +} + +# --- Action-specific stubs for phase-handler.sh --- +cleanup_worktree() { :; } # action agent uses PROJECT_REPO_ROOT directly — no separate git worktree to remove +cleanup_labels() { :; } # action agent doesn't use in-progress labels + # --- Concurrency lock (per issue) --- if [ -f "$LOCKFILE" ]; then LOCK_PID=$(cat "$LOCKFILE" 2>/dev/null || echo "") @@ -45,6 +84,9 @@ echo $$ > "$LOCKFILE" cleanup() { rm -f "$LOCKFILE" agent_kill_session "$SESSION_NAME" + # Best-effort docker cleanup for containers started during this action + (cd "${PROJECT_REPO_ROOT}" 2>/dev/null && docker compose down 2>/dev/null) || true + rm -f "$PHASE_FILE" "$IMPL_SUMMARY_FILE" "$PREFLIGHT_RESULT" } trap cleanup EXIT @@ -129,8 +171,11 @@ if [ -n "$THREAD_ID" ]; then are routed back to this session." fi +# Build phase protocol from shared function (Path B covered in Instructions section above) +PHASE_PROTOCOL_INSTRUCTIONS="$(build_phase_protocol_prompt "$PHASE_FILE" "$IMPL_SUMMARY_FILE" "$BRANCH")" + INITIAL_PROMPT="You are an action agent. Your job is to execute the action formula -in the issue below and then close the issue. +in the issue below. ## Issue #${ISSUE}: ${ISSUE_TITLE} @@ -153,24 +198,35 @@ ${PRIOR_SECTION}## Instructions what you need, then wait. A human will reply and the reply will be injected into this session automatically.${THREAD_HINT} -5. When all steps are complete, close issue #${ISSUE} with a summary: - curl -sf -X PATCH \\ - -H \"Authorization: token \${CODEBERG_TOKEN}\" \\ - -H 'Content-Type: application/json' \\ - \"${CODEBERG_API}/issues/${ISSUE}\" \\ - -d '{\"state\": \"closed\"}' +### Path A: If this action produces code changes (e.g. config updates, baselines): + - Work in the project repo: cd ${PROJECT_REPO_ROOT} + - Create and switch to branch: git checkout -b ${BRANCH} + - Make your changes, commit, and push: git push origin ${BRANCH} + - Follow the phase protocol below — the orchestrator handles PR creation, + CI monitoring, and review injection. -6. Environment variables available in your bash sessions: +### Path B: If this action produces no code changes (investigation, report): + - Post results as a comment on issue #${ISSUE}. + - Close the issue: + curl -sf -X PATCH \\ + -H \"Authorization: token \${CODEBERG_TOKEN}\" \\ + -H 'Content-Type: application/json' \\ + \"${CODEBERG_API}/issues/${ISSUE}\" \\ + -d '{\"state\": \"closed\"}' + - Signal completion: echo \"PHASE:done\" > \"${PHASE_FILE}\" + +5. Environment variables available in your bash sessions: CODEBERG_TOKEN, CODEBERG_API, CODEBERG_REPO, CODEBERG_WEB, PROJECT_NAME (all sourced from ${FACTORY_ROOT}/.env) -**Important**: You do NOT need to create PRs or write a phase file. Just execute -the formula steps, post comments, and close the issue when done. If the prior -comments above show work already completed, resume from where it left off." +If the prior comments above show work already completed, resume from where it +left off. + +${PHASE_PROTOCOL_INSTRUCTIONS}" # --- Create tmux session --- log "creating tmux session: ${SESSION_NAME}" -if ! create_agent_session "${SESSION_NAME}" "${FACTORY_ROOT}"; then +if ! create_agent_session "${SESSION_NAME}" "${FACTORY_ROOT}" "${PHASE_FILE}"; then log "ERROR: failed to create tmux session" exit 1 fi @@ -182,31 +238,30 @@ log "initial prompt injected into session" matrix_send "action" "⚡ #${ISSUE}: session started — ${ISSUE_TITLE}" \ "${THREAD_ID}" 2>/dev/null || true -# --- Monitor session until Claude exits or idle timeout --- -log "monitoring session: ${SESSION_NAME} (idle_timeout=${IDLE_TIMEOUT}s)" -IDLE_ELAPSED=0 -POLL_INTERVAL=30 -IDLE_MARKER="/tmp/claude-idle-${SESSION_NAME}.ts" +# --- Monitor phase loop (shared with dev-agent) --- +status "monitoring phase: ${PHASE_FILE} (action agent)" +monitor_phase_loop "$PHASE_FILE" "$IDLE_TIMEOUT" _on_phase_change "$SESSION_NAME" -while tmux has-session -t "${SESSION_NAME}" 2>/dev/null; do - sleep "$POLL_INTERVAL" - - # Use the Stop hook idle marker to distinguish active vs idle: - # marker exists → Claude finished responding and is at the prompt (idle) - # marker absent → Claude is mid-turn or hasn't started yet (active) - if [ -f "$IDLE_MARKER" ]; then - IDLE_ELAPSED=$((IDLE_ELAPSED + POLL_INTERVAL)) - else - IDLE_ELAPSED=0 - fi - - if [ "$IDLE_ELAPSED" -ge "$IDLE_TIMEOUT" ]; then - log "idle timeout (${IDLE_TIMEOUT}s) — killing session for issue #${ISSUE}" - matrix_send "action" "⚠️ #${ISSUE}: session idle for $((IDLE_TIMEOUT / 3600))h — killed" \ - "${THREAD_ID}" 2>/dev/null || true - agent_kill_session "${SESSION_NAME}" - break - fi -done +# Handle exit reason from monitor_phase_loop +case "${_MONITOR_LOOP_EXIT:-}" in + idle_timeout) + notify_ctx \ + "session idle for $((IDLE_TIMEOUT / 3600))h — killed" \ + "session idle for $((IDLE_TIMEOUT / 3600))h — killed" + # Escalate to supervisor (idle_prompt already escalated via _on_phase_change callback) + echo "{\"issue\":${ISSUE},\"pr\":${PR_NUMBER:-0},\"reason\":\"idle_timeout\",\"ts\":\"$(date -u +%Y-%m-%dT%H:%M:%SZ)\"}" \ + >> "${FACTORY_ROOT}/supervisor/escalations-${PROJECT_NAME}.jsonl" + rm -f "$PHASE_FILE" "$IMPL_SUMMARY_FILE" "$THREAD_FILE" + ;; + idle_prompt) + # Notification + escalation already handled by _on_phase_change(PHASE:failed) callback + rm -f "$PHASE_FILE" "$IMPL_SUMMARY_FILE" "$THREAD_FILE" + ;; + done) + # Belt-and-suspenders: callback handles primary cleanup, + # but ensure sentinel files are removed if callback was interrupted + rm -f "$PHASE_FILE" "$IMPL_SUMMARY_FILE" "$THREAD_FILE" + ;; +esac log "action-agent finished for issue #${ISSUE}" diff --git a/dev/phase-handler.sh b/dev/phase-handler.sh index 45f5f6e..ae47e97 100644 --- a/dev/phase-handler.sh +++ b/dev/phase-handler.sh @@ -1,24 +1,97 @@ #!/usr/bin/env bash # dev/phase-handler.sh — Phase callback functions for dev-agent.sh # -# Source this file from dev-agent.sh after lib/agent-session.sh is loaded. -# Defines: post_refusal_comment(), _on_phase_change() +# Source this file from agent orchestrators after lib/agent-session.sh is loaded. +# Defines: post_refusal_comment(), _on_phase_change(), build_phase_protocol_prompt() # -# Required globals from dev-agent.sh: +# Required globals (set by calling agent before or after sourcing): # ISSUE, CODEBERG_TOKEN, API, CODEBERG_WEB, PROJECT_NAME, FACTORY_ROOT -# PR_NUMBER, BRANCH, PHASE_FILE, WORKTREE, IMPL_SUMMARY_FILE, THREAD_FILE +# BRANCH, PHASE_FILE, WORKTREE, IMPL_SUMMARY_FILE, THREAD_FILE # PRIMARY_BRANCH, SESSION_NAME, LOGFILE, ISSUE_TITLE -# CI_POLL_TIMEOUT, MAX_CI_FIXES, MAX_REVIEW_ROUNDS, REVIEW_POLL_TIMEOUT -# CI_RETRY_COUNT, CI_FIX_COUNT, REVIEW_ROUND, CLAIMED # WOODPECKER_REPO_ID, WOODPECKER_TOKEN, WOODPECKER_SERVER # -# Calls back to dev-agent.sh-defined helpers: -# cleanup_worktree(), cleanup_labels() +# Globals with defaults (agents can override after sourcing): +# PR_NUMBER, CI_POLL_TIMEOUT, MAX_CI_FIXES, MAX_REVIEW_ROUNDS, +# REVIEW_POLL_TIMEOUT, CI_RETRY_COUNT, CI_FIX_COUNT, REVIEW_ROUND, +# CLAIMED, PHASE_POLL_INTERVAL +# +# Calls back to agent-defined helpers: +# cleanup_worktree(), cleanup_labels(), notify(), notify_ctx(), status(), log() # # shellcheck shell=bash # shellcheck disable=SC2154 # globals are set in dev-agent.sh before calling # shellcheck disable=SC2034 # CLAIMED is read by cleanup() in dev-agent.sh +# --- Default globals (agents can override after sourcing) --- +: "${CI_POLL_TIMEOUT:=1800}" +: "${REVIEW_POLL_TIMEOUT:=10800}" +: "${MAX_CI_FIXES:=3}" +: "${MAX_REVIEW_ROUNDS:=5}" +: "${CI_RETRY_COUNT:=0}" +: "${CI_FIX_COUNT:=0}" +: "${REVIEW_ROUND:=0}" +: "${PR_NUMBER:=}" +: "${CLAIMED:=false}" +: "${PHASE_POLL_INTERVAL:=30}" + +# --- Build phase protocol prompt (shared across agents) --- +# Generates the phase-signaling instructions for Claude prompts. +# Args: phase_file summary_file branch +# Output: The protocol text (stdout) +build_phase_protocol_prompt() { + local _pf="$1" _sf="$2" _br="$3" + cat <<_PHASE_PROTOCOL_EOF_ +## Phase-Signaling Protocol (REQUIRED) + +You are running in a persistent tmux session managed by an orchestrator. +Communicate progress by writing to the phase file. The orchestrator watches +this file and injects events (CI results, review feedback) back into this session. + +### Key files +\`\`\` +PHASE_FILE="${_pf}" +SUMMARY_FILE="${_sf}" +\`\`\` + +### Phase transitions — write these exactly: + +**After committing and pushing your branch:** +\`\`\`bash +git push origin ${_br} +# Write a short summary of what you implemented: +printf '%s' "" > "\${SUMMARY_FILE}" +# Signal the orchestrator to create the PR and watch for CI: +echo "PHASE:awaiting_ci" > "${_pf}" +\`\`\` +Then STOP and wait. The orchestrator will inject CI results. + +**When you receive a "CI passed" injection:** +\`\`\`bash +echo "PHASE:awaiting_review" > "${_pf}" +\`\`\` +Then STOP and wait. The orchestrator will inject review feedback. + +**When you receive a "CI failed:" injection:** +Fix the CI issue, commit, push, then: +\`\`\`bash +echo "PHASE:awaiting_ci" > "${_pf}" +\`\`\` +Then STOP and wait. + +**When you receive a "Review: REQUEST_CHANGES" injection:** +Address ALL review feedback, commit, push, then: +\`\`\`bash +echo "PHASE:awaiting_ci" > "${_pf}" +\`\`\` +(CI runs again after each push — always write awaiting_ci, not awaiting_review) + +**On unrecoverable failure:** +\`\`\`bash +printf 'PHASE:failed\nReason: %s\n' "describe what failed" > "${_pf}" +\`\`\` +_PHASE_PROTOCOL_EOF_ +} + # --- Merge helper --- # do_merge — attempt to merge PR via Codeberg API. # Args: pr_num @@ -489,12 +562,17 @@ Instructions: # ── PHASE: done ───────────────────────────────────────────────────────────── # PR merged and issue closed (by orchestrator or Claude). Just clean up local state. elif [ "$phase" = "PHASE:done" ]; then - status "phase done — PR #${PR_NUMBER:-?} merged, cleaning up" - - # Notify Matrix (issue already closed and labels removed via API) - notify_ctx \ - "✅ PR #${PR_NUMBER:-?} merged! Issue #${ISSUE} done." \ - "✅ PR #${PR_NUMBER:-?} merged! Issue #${ISSUE} done." + if [ -n "${PR_NUMBER:-}" ]; then + status "phase done — PR #${PR_NUMBER} merged, cleaning up" + notify_ctx \ + "✅ PR #${PR_NUMBER} merged! Issue #${ISSUE} done." \ + "✅ PR #${PR_NUMBER} merged! Issue #${ISSUE} done." + else + status "phase done — issue #${ISSUE} complete, cleaning up" + notify_ctx \ + "✅ Issue #${ISSUE} done." \ + "✅ Issue #${ISSUE} done." + fi # Belt-and-suspenders: ensure in-progress label removed (idempotent) cleanup_labels