fix: fix: action-agent shares phase handler with dev-agent — review lifecycle + cleanup (#388) (#403)

Fixes #388 ## Changes Action-agent now sources dev/phase-handler.sh and enters monitor_phase_loop after prompt injection. Two paths: (A) git output triggers the same PR/CI/review lifecycle as dev-agent, (B) no-git output writes PHASE:done for cleanup. Adds docker compose down on terminal phases, escalation to supervisor on idle timeout, and proper temp file cleanup. Co-authored-by: openhands <openhands@all-hands.dev> Reviewed-on: https://codeberg.org/johba/disinto/pulls/403 Reviewed-by: Disinto_bot <disinto_bot@noreply.codeberg.org>
2026-03-20 17:39:44 +01:00 · 2026-03-20 17:39:44 +01:00 · a15623b747
commit a15623b747
parent 1fb6f0a032
4 changed files with 198 additions and 65 deletions
--- a/.woodpecker/agent-smoke.sh
+++ b/.woodpecker/agent-smoke.sh
@ -170,7 +170,7 @@ check_script vault/vault-fire.sh
 check_script vault/vault-poll.sh
 check_script vault/vault-reject.sh
 check_script action/action-poll.sh
-check_script action/action-agent.sh
+check_script action/action-agent.sh    dev/phase-handler.sh
 echo "function resolution check done"
--- a/AGENTS.md
+++ b/AGENTS.md
@ -227,9 +227,9 @@ later triages these predictions into action/backlog issues or dismisses them.
 ### Action (`action/`)
 **Role**: Execute operational tasks described by action formulas — run scripts,
-call APIs, send messages, collect human approval. Unlike the dev-agent, the
+call APIs, send messages, collect human approval. Shares the same phase handler
-action-agent produces no PRs: Claude closes the issue directly after executing
+as the dev-agent: if an action produces code changes, the orchestrator creates a
-all formula steps.
+PR and drives the CI/review loop; otherwise Claude closes the issue directly.
 **Trigger**: `action-poll.sh` runs every 10 min via cron. It scans for open
 issues labeled `action` that have no active tmux session, then spawns
@ -237,17 +237,17 @@ issues labeled `action` that have no active tmux session, then spawns
 **Key files**:
 - `action/action-poll.sh` — Cron scheduler: finds open action issues with no active tmux session, spawns action-agent.sh
- `action/action-agent.sh` — Orchestrator: fetches issue body + prior comments, creates tmux session (`action-{issue_num}`) with interactive `claude`, injects formula prompt, monitors session until Claude exits or 4h idle timeout
+- `action/action-agent.sh` — Orchestrator: fetches issue body + prior comments, creates tmux session (`action-{issue_num}`) with interactive `claude`, injects formula prompt with phase protocol, enters `monitor_phase_loop` (shared via `dev/phase-handler.sh`) for CI/review lifecycle or direct completion
 **Session lifecycle**:
 1. `action-poll.sh` finds open `action` issues with no active tmux session.
 2. Spawns `action-agent.sh <issue_num>`.
 3. Agent creates Matrix thread, exports `MATRIX_THREAD_ID` so Claude's output streams to the thread via a Stop hook (`on-stop-matrix.sh`).
-4. Agent creates tmux session `action-{issue_num}`, injects prompt (formula + prior comments).
+4. Agent creates tmux session `action-{issue_num}`, injects prompt (formula + prior comments + phase protocol).
-5. Claude executes formula steps using Bash and other tools, posts progress as issue comments. Each Claude turn is streamed to the Matrix thread for real-time human visibility.
+5. Agent enters `monitor_phase_loop` (shared with dev-agent via `dev/phase-handler.sh`).
-6. For human input: Claude sends a Matrix message and waits; the reply is injected into the session by `matrix_listener.sh`.
+6. **Path A (git output):** Claude pushes branch → `PHASE:awaiting_ci` → handler creates PR, polls CI → injects failures → Claude fixes → push → re-poll → CI passes → `PHASE:awaiting_review` → handler polls reviews → injects REQUEST_CHANGES → Claude fixes → approved → merge → cleanup.
-7. When complete: Claude closes the issue with a summary comment. Session exits.
+7. **Path B (no git output):** Claude posts results as comment, closes issue → `PHASE:done` → handler cleans up (kill session, docker compose down, remove temp files).
-8. Poll detects no active session on next run — nothing further to do.
+8. For human input: Claude sends a Matrix message and waits; the reply is injected into the session by `matrix_listener.sh`.
 **Environment variables consumed**:
 - `CODEBERG_TOKEN`, `CODEBERG_REPO`, `CODEBERG_API`, `PROJECT_NAME`, `CODEBERG_WEB`
--- a/action/action-agent.sh
+++ b/action/action-agent.sh
@ -6,10 +6,13 @@
 # Lifecycle:
 #   1. Fetch issue body (action formula) + existing comments
 #   2. Create tmux session: action-{issue_num} with interactive claude
-#   3. Inject initial prompt: formula + comments + instructions
+#   3. Inject initial prompt: formula + comments + phase protocol instructions
-#   4. Claude executes formula steps, posts progress comments, closes issue
+#   4. Monitor phase file via monitor_phase_loop (shared with dev-agent)
 #   Path A (git output): Claude pushes → handler creates PR → CI poll → review
 #     injection → merge → cleanup (same loop as dev-agent via phase-handler.sh)
 #   Path B (no git output): Claude posts results → PHASE:done → cleanup
 #   5. For human input: Claude asks via Matrix; reply injected via matrix_listener
-#   6. Monitor session until Claude exits or idle timeout reached
+#   6. Cleanup on terminal phase: kill tmux, docker compose down, remove temp files
 #
 # Session:  action-{issue_num} (tmux)
 # Log:      action/action-poll-{project}.log
@ -21,16 +24,52 @@ export PROJECT_TOML="${2:-${PROJECT_TOML:-}}"
 source "$(dirname "$0")/../lib/env.sh"
 source "$(dirname "$0")/../lib/agent-session.sh"
 # shellcheck source=../dev/phase-handler.sh
 source "$(dirname "$0")/../dev/phase-handler.sh"
 SESSION_NAME="action-${ISSUE}"
 LOCKFILE="/tmp/action-agent-${ISSUE}.lock"
 LOGFILE="${FACTORY_ROOT}/action/action-poll-${PROJECT_NAME:-harb}.log"
 THREAD_FILE="/tmp/action-thread-${ISSUE}"
 IDLE_TIMEOUT="${ACTION_IDLE_TIMEOUT:-14400}"  # 4h default
 # --- Phase handler globals (agent-specific; defaults in phase-handler.sh) ---
 # shellcheck disable=SC2034  # used by phase-handler.sh
 API="${CODEBERG_API}"
 BRANCH="action/issue-${ISSUE}"
 # shellcheck disable=SC2034  # used by phase-handler.sh
 WORKTREE="${PROJECT_REPO_ROOT}"
 PHASE_FILE="/tmp/action-session-${PROJECT_NAME:-harb}-${ISSUE}.phase"
 IMPL_SUMMARY_FILE="/tmp/action-impl-summary-${PROJECT_NAME:-harb}-${ISSUE}.txt"
 PREFLIGHT_RESULT="/tmp/action-preflight-${ISSUE}.json"
 log() {
-  printf '[%s] #%s %s\n' "$(date -u '+%Y-%m-%d %H:%M:%S UTC')" "$ISSUE" "$*" >> "$LOGFILE"
+  printf '[%s] action#%s %s\n' "$(date -u '+%Y-%m-%d %H:%M:%S UTC')" "$ISSUE" "$*" >> "$LOGFILE"
 }
 notify() {
  local thread_id=""
  [ -f "${THREAD_FILE:-}" ] && thread_id=$(cat "$THREAD_FILE" 2>/dev/null || true)
  matrix_send "action" "⚡ #${ISSUE}: $*" "${thread_id}" 2>/dev/null || true
 }
 notify_ctx() {
  local plain="$1" html="$2" thread_id=""
  [ -f "${THREAD_FILE:-}" ] && thread_id=$(cat "$THREAD_FILE" 2>/dev/null || true)
  if [ -n "$thread_id" ]; then
    matrix_send_ctx "action" "⚡ #${ISSUE}: ${plain}" "⚡ #${ISSUE}: ${html}" "${thread_id}" 2>/dev/null || true
  else
    matrix_send "action" "⚡ #${ISSUE}: ${plain}" "" "${ISSUE}" 2>/dev/null || true
  fi
 }
 status() {
  log "$*"
 }
 # --- Action-specific stubs for phase-handler.sh ---
 cleanup_worktree() { :; }  # action agent uses PROJECT_REPO_ROOT directly — no separate git worktree to remove
 cleanup_labels() { :; }    # action agent doesn't use in-progress labels
 # --- Concurrency lock (per issue) ---
 if [ -f "$LOCKFILE" ]; then
  LOCK_PID=$(cat "$LOCKFILE" 2>/dev/null || echo "")
@ -45,6 +84,9 @@ echo $$ > "$LOCKFILE"
 cleanup() {
  rm -f "$LOCKFILE"
  agent_kill_session "$SESSION_NAME"
  # Best-effort docker cleanup for containers started during this action
  (cd "${PROJECT_REPO_ROOT}" 2>/dev/null && docker compose down 2>/dev/null) || true
  rm -f "$PHASE_FILE" "$IMPL_SUMMARY_FILE" "$PREFLIGHT_RESULT"
 }
 trap cleanup EXIT
@ -129,8 +171,11 @@ if [ -n "$THREAD_ID" ]; then
   are routed back to this session."
 fi
 # Build phase protocol from shared function (Path B covered in Instructions section above)
 PHASE_PROTOCOL_INSTRUCTIONS="$(build_phase_protocol_prompt "$PHASE_FILE" "$IMPL_SUMMARY_FILE" "$BRANCH")"
 INITIAL_PROMPT="You are an action agent. Your job is to execute the action formula
-in the issue below and then close the issue.
+in the issue below.
 ## Issue #${ISSUE}: ${ISSUE_TITLE}
@ -153,24 +198,35 @@ ${PRIOR_SECTION}## Instructions
   what you need, then wait. A human will reply and the reply will be injected
   into this session automatically.${THREAD_HINT}
-5. When all steps are complete, close issue #${ISSUE} with a summary:
+### Path A: If this action produces code changes (e.g. config updates, baselines):
-   curl -sf -X PATCH \\
+   - Work in the project repo: cd ${PROJECT_REPO_ROOT}
-     -H \"Authorization: token \${CODEBERG_TOKEN}\" \\
+   - Create and switch to branch: git checkout -b ${BRANCH}
-     -H 'Content-Type: application/json' \\
+   - Make your changes, commit, and push: git push origin ${BRANCH}
-     \"${CODEBERG_API}/issues/${ISSUE}\" \\
+   - Follow the phase protocol below — the orchestrator handles PR creation,
-     -d '{\"state\": \"closed\"}'
+     CI monitoring, and review injection.
-6. Environment variables available in your bash sessions:
+### Path B: If this action produces no code changes (investigation, report):
   - Post results as a comment on issue #${ISSUE}.
   - Close the issue:
     curl -sf -X PATCH \\
       -H \"Authorization: token \${CODEBERG_TOKEN}\" \\
       -H 'Content-Type: application/json' \\
       \"${CODEBERG_API}/issues/${ISSUE}\" \\
       -d '{\"state\": \"closed\"}'
   - Signal completion: echo \"PHASE:done\" > \"${PHASE_FILE}\"
 5. Environment variables available in your bash sessions:
   CODEBERG_TOKEN, CODEBERG_API, CODEBERG_REPO, CODEBERG_WEB, PROJECT_NAME
   (all sourced from ${FACTORY_ROOT}/.env)
-**Important**: You do NOT need to create PRs or write a phase file. Just execute
+If the prior comments above show work already completed, resume from where it
-the formula steps, post comments, and close the issue when done. If the prior
+left off.
-comments above show work already completed, resume from where it left off."
+
 ${PHASE_PROTOCOL_INSTRUCTIONS}"
 # --- Create tmux session ---
 log "creating tmux session: ${SESSION_NAME}"
-if ! create_agent_session "${SESSION_NAME}" "${FACTORY_ROOT}"; then
+if ! create_agent_session "${SESSION_NAME}" "${FACTORY_ROOT}" "${PHASE_FILE}"; then
  log "ERROR: failed to create tmux session"
  exit 1
 fi
@ -182,31 +238,30 @@ log "initial prompt injected into session"
 matrix_send "action" "⚡ #${ISSUE}: session started — ${ISSUE_TITLE}" \
  "${THREAD_ID}" 2>/dev/null || true
-# --- Monitor session until Claude exits or idle timeout ---
+# --- Monitor phase loop (shared with dev-agent) ---
-log "monitoring session: ${SESSION_NAME} (idle_timeout=${IDLE_TIMEOUT}s)"
+status "monitoring phase: ${PHASE_FILE} (action agent)"
-IDLE_ELAPSED=0
+monitor_phase_loop "$PHASE_FILE" "$IDLE_TIMEOUT" _on_phase_change "$SESSION_NAME"
 POLL_INTERVAL=30
 IDLE_MARKER="/tmp/claude-idle-${SESSION_NAME}.ts"
-while tmux has-session -t "${SESSION_NAME}" 2>/dev/null; do
+# Handle exit reason from monitor_phase_loop
-  sleep "$POLL_INTERVAL"
+case "${_MONITOR_LOOP_EXIT:-}" in
-
+  idle_timeout)
-  # Use the Stop hook idle marker to distinguish active vs idle:
+    notify_ctx \
-  # marker exists → Claude finished responding and is at the prompt (idle)
+      "session idle for $((IDLE_TIMEOUT / 3600))h — killed" \
-  # marker absent → Claude is mid-turn or hasn't started yet (active)
+      "session idle for $((IDLE_TIMEOUT / 3600))h — killed"
-  if [ -f "$IDLE_MARKER" ]; then
+    # Escalate to supervisor (idle_prompt already escalated via _on_phase_change callback)
-    IDLE_ELAPSED=$((IDLE_ELAPSED + POLL_INTERVAL))
+    echo "{\"issue\":${ISSUE},\"pr\":${PR_NUMBER:-0},\"reason\":\"idle_timeout\",\"ts\":\"$(date -u +%Y-%m-%dT%H:%M:%SZ)\"}" \
-  else
+      >> "${FACTORY_ROOT}/supervisor/escalations-${PROJECT_NAME}.jsonl"
-    IDLE_ELAPSED=0
+    rm -f "$PHASE_FILE" "$IMPL_SUMMARY_FILE" "$THREAD_FILE"
-  fi
+    ;;
-
+  idle_prompt)
-  if [ "$IDLE_ELAPSED" -ge "$IDLE_TIMEOUT" ]; then
+    # Notification + escalation already handled by _on_phase_change(PHASE:failed) callback
-    log "idle timeout (${IDLE_TIMEOUT}s) — killing session for issue #${ISSUE}"
+    rm -f "$PHASE_FILE" "$IMPL_SUMMARY_FILE" "$THREAD_FILE"
-    matrix_send "action" "⚠️ #${ISSUE}: session idle for $((IDLE_TIMEOUT / 3600))h — killed" \
+    ;;
-      "${THREAD_ID}" 2>/dev/null || true
+  done)
-    agent_kill_session "${SESSION_NAME}"
+    # Belt-and-suspenders: callback handles primary cleanup,
-    break
+    # but ensure sentinel files are removed if callback was interrupted
-  fi
+    rm -f "$PHASE_FILE" "$IMPL_SUMMARY_FILE" "$THREAD_FILE"
-done
+    ;;
 esac
 log "action-agent finished for issue #${ISSUE}"
--- a/dev/phase-handler.sh
+++ b/dev/phase-handler.sh
@ -1,24 +1,97 @@
 #!/usr/bin/env bash
 # dev/phase-handler.sh — Phase callback functions for dev-agent.sh
 #
-# Source this file from dev-agent.sh after lib/agent-session.sh is loaded.
+# Source this file from agent orchestrators after lib/agent-session.sh is loaded.
-# Defines: post_refusal_comment(), _on_phase_change()
+# Defines: post_refusal_comment(), _on_phase_change(), build_phase_protocol_prompt()
 #
-# Required globals from dev-agent.sh:
+# Required globals (set by calling agent before or after sourcing):
 #   ISSUE, CODEBERG_TOKEN, API, CODEBERG_WEB, PROJECT_NAME, FACTORY_ROOT
-#   PR_NUMBER, BRANCH, PHASE_FILE, WORKTREE, IMPL_SUMMARY_FILE, THREAD_FILE
+#   BRANCH, PHASE_FILE, WORKTREE, IMPL_SUMMARY_FILE, THREAD_FILE
 #   PRIMARY_BRANCH, SESSION_NAME, LOGFILE, ISSUE_TITLE
 #   CI_POLL_TIMEOUT, MAX_CI_FIXES, MAX_REVIEW_ROUNDS, REVIEW_POLL_TIMEOUT
 #   CI_RETRY_COUNT, CI_FIX_COUNT, REVIEW_ROUND, CLAIMED
 #   WOODPECKER_REPO_ID, WOODPECKER_TOKEN, WOODPECKER_SERVER
 #
-# Calls back to dev-agent.sh-defined helpers:
+# Globals with defaults (agents can override after sourcing):
-#   cleanup_worktree(), cleanup_labels()
+#   PR_NUMBER, CI_POLL_TIMEOUT, MAX_CI_FIXES, MAX_REVIEW_ROUNDS,
 #   REVIEW_POLL_TIMEOUT, CI_RETRY_COUNT, CI_FIX_COUNT, REVIEW_ROUND,
 #   CLAIMED, PHASE_POLL_INTERVAL
 #
 # Calls back to agent-defined helpers:
 #   cleanup_worktree(), cleanup_labels(), notify(), notify_ctx(), status(), log()
 #
 # shellcheck shell=bash
 # shellcheck disable=SC2154  # globals are set in dev-agent.sh before calling
 # shellcheck disable=SC2034  # CLAIMED is read by cleanup() in dev-agent.sh
 # --- Default globals (agents can override after sourcing) ---
 : "${CI_POLL_TIMEOUT:=1800}"
 : "${REVIEW_POLL_TIMEOUT:=10800}"
 : "${MAX_CI_FIXES:=3}"
 : "${MAX_REVIEW_ROUNDS:=5}"
 : "${CI_RETRY_COUNT:=0}"
 : "${CI_FIX_COUNT:=0}"
 : "${REVIEW_ROUND:=0}"
 : "${PR_NUMBER:=}"
 : "${CLAIMED:=false}"
 : "${PHASE_POLL_INTERVAL:=30}"
 # --- Build phase protocol prompt (shared across agents) ---
 # Generates the phase-signaling instructions for Claude prompts.
 # Args: phase_file summary_file branch
 # Output: The protocol text (stdout)
 build_phase_protocol_prompt() {
  local _pf="$1" _sf="$2" _br="$3"
  cat <<_PHASE_PROTOCOL_EOF_
 ## Phase-Signaling Protocol (REQUIRED)
 You are running in a persistent tmux session managed by an orchestrator.
 Communicate progress by writing to the phase file. The orchestrator watches
 this file and injects events (CI results, review feedback) back into this session.
 ### Key files
 \`\`\`
 PHASE_FILE="${_pf}"
 SUMMARY_FILE="${_sf}"
 \`\`\`
 ### Phase transitions — write these exactly:
 **After committing and pushing your branch:**
 \`\`\`bash
 git push origin ${_br}
 # Write a short summary of what you implemented:
 printf '%s' "<your summary>" > "\${SUMMARY_FILE}"
 # Signal the orchestrator to create the PR and watch for CI:
 echo "PHASE:awaiting_ci" > "${_pf}"
 \`\`\`
 Then STOP and wait. The orchestrator will inject CI results.
 **When you receive a "CI passed" injection:**
 \`\`\`bash
 echo "PHASE:awaiting_review" > "${_pf}"
 \`\`\`
 Then STOP and wait. The orchestrator will inject review feedback.
 **When you receive a "CI failed:" injection:**
 Fix the CI issue, commit, push, then:
 \`\`\`bash
 echo "PHASE:awaiting_ci" > "${_pf}"
 \`\`\`
 Then STOP and wait.
 **When you receive a "Review: REQUEST_CHANGES" injection:**
 Address ALL review feedback, commit, push, then:
 \`\`\`bash
 echo "PHASE:awaiting_ci" > "${_pf}"
 \`\`\`
 (CI runs again after each push — always write awaiting_ci, not awaiting_review)
 **On unrecoverable failure:**
 \`\`\`bash
 printf 'PHASE:failed\nReason: %s\n' "describe what failed" > "${_pf}"
 \`\`\`
 _PHASE_PROTOCOL_EOF_
 }
 # --- Merge helper ---
 # do_merge — attempt to merge PR via Codeberg API.
 # Args: pr_num
@ -489,12 +562,17 @@ Instructions:
  # ── PHASE: done ─────────────────────────────────────────────────────────────
  # PR merged and issue closed (by orchestrator or Claude). Just clean up local state.
  elif [ "$phase" = "PHASE:done" ]; then
-    status "phase done — PR #${PR_NUMBER:-?} merged, cleaning up"
+    if [ -n "${PR_NUMBER:-}" ]; then
-
+      status "phase done — PR #${PR_NUMBER} merged, cleaning up"
-    # Notify Matrix (issue already closed and labels removed via API)
+      notify_ctx \
-    notify_ctx \
+        "✅ PR #${PR_NUMBER} merged! Issue #${ISSUE} done." \
-      "✅ PR #${PR_NUMBER:-?} merged! Issue #${ISSUE} done." \
+        "✅ PR <a href='${CODEBERG_WEB}/pulls/${PR_NUMBER}'>#${PR_NUMBER}</a> merged! <a href='${CODEBERG_WEB}/issues/${ISSUE}'>Issue #${ISSUE}</a> done."
-      "✅ PR <a href='${CODEBERG_WEB}/pulls/${PR_NUMBER:-?}'>#${PR_NUMBER:-?}</a> merged! <a href='${CODEBERG_WEB}/issues/${ISSUE}'>Issue #${ISSUE}</a> done."
+    else
      status "phase done — issue #${ISSUE} complete, cleaning up"
      notify_ctx \
        "✅ Issue #${ISSUE} done." \
        "✅ <a href='${CODEBERG_WEB}/issues/${ISSUE}'>Issue #${ISSUE}</a> done."
    fi
    # Belt-and-suspenders: ensure in-progress label removed (idempotent)
    cleanup_labels