fix: fix: action-agent shares phase handler with dev-agent — review lifecycle + cleanup (#388) (#403)

Fixes #388

## Changes
Action-agent now sources dev/phase-handler.sh and enters monitor_phase_loop after prompt injection. Two paths: (A) git output triggers the same PR/CI/review lifecycle as dev-agent, (B) no-git output writes PHASE:done for cleanup. Adds docker compose down on terminal phases, escalation to supervisor on idle timeout, and proper temp file cleanup.

Co-authored-by: openhands <openhands@all-hands.dev>
Reviewed-on: https://codeberg.org/johba/disinto/pulls/403
Reviewed-by: Disinto_bot <disinto_bot@noreply.codeberg.org>
This commit is contained in:
johba 2026-03-20 17:39:44 +01:00
parent 1fb6f0a032
commit a15623b747
4 changed files with 198 additions and 65 deletions

View file

@ -170,7 +170,7 @@ check_script vault/vault-fire.sh
check_script vault/vault-poll.sh check_script vault/vault-poll.sh
check_script vault/vault-reject.sh check_script vault/vault-reject.sh
check_script action/action-poll.sh check_script action/action-poll.sh
check_script action/action-agent.sh check_script action/action-agent.sh dev/phase-handler.sh
echo "function resolution check done" echo "function resolution check done"

View file

@ -227,9 +227,9 @@ later triages these predictions into action/backlog issues or dismisses them.
### Action (`action/`) ### Action (`action/`)
**Role**: Execute operational tasks described by action formulas — run scripts, **Role**: Execute operational tasks described by action formulas — run scripts,
call APIs, send messages, collect human approval. Unlike the dev-agent, the call APIs, send messages, collect human approval. Shares the same phase handler
action-agent produces no PRs: Claude closes the issue directly after executing as the dev-agent: if an action produces code changes, the orchestrator creates a
all formula steps. PR and drives the CI/review loop; otherwise Claude closes the issue directly.
**Trigger**: `action-poll.sh` runs every 10 min via cron. It scans for open **Trigger**: `action-poll.sh` runs every 10 min via cron. It scans for open
issues labeled `action` that have no active tmux session, then spawns issues labeled `action` that have no active tmux session, then spawns
@ -237,17 +237,17 @@ issues labeled `action` that have no active tmux session, then spawns
**Key files**: **Key files**:
- `action/action-poll.sh` — Cron scheduler: finds open action issues with no active tmux session, spawns action-agent.sh - `action/action-poll.sh` — Cron scheduler: finds open action issues with no active tmux session, spawns action-agent.sh
- `action/action-agent.sh` — Orchestrator: fetches issue body + prior comments, creates tmux session (`action-{issue_num}`) with interactive `claude`, injects formula prompt, monitors session until Claude exits or 4h idle timeout - `action/action-agent.sh` — Orchestrator: fetches issue body + prior comments, creates tmux session (`action-{issue_num}`) with interactive `claude`, injects formula prompt with phase protocol, enters `monitor_phase_loop` (shared via `dev/phase-handler.sh`) for CI/review lifecycle or direct completion
**Session lifecycle**: **Session lifecycle**:
1. `action-poll.sh` finds open `action` issues with no active tmux session. 1. `action-poll.sh` finds open `action` issues with no active tmux session.
2. Spawns `action-agent.sh <issue_num>`. 2. Spawns `action-agent.sh <issue_num>`.
3. Agent creates Matrix thread, exports `MATRIX_THREAD_ID` so Claude's output streams to the thread via a Stop hook (`on-stop-matrix.sh`). 3. Agent creates Matrix thread, exports `MATRIX_THREAD_ID` so Claude's output streams to the thread via a Stop hook (`on-stop-matrix.sh`).
4. Agent creates tmux session `action-{issue_num}`, injects prompt (formula + prior comments). 4. Agent creates tmux session `action-{issue_num}`, injects prompt (formula + prior comments + phase protocol).
5. Claude executes formula steps using Bash and other tools, posts progress as issue comments. Each Claude turn is streamed to the Matrix thread for real-time human visibility. 5. Agent enters `monitor_phase_loop` (shared with dev-agent via `dev/phase-handler.sh`).
6. For human input: Claude sends a Matrix message and waits; the reply is injected into the session by `matrix_listener.sh`. 6. **Path A (git output):** Claude pushes branch → `PHASE:awaiting_ci` → handler creates PR, polls CI → injects failures → Claude fixes → push → re-poll → CI passes → `PHASE:awaiting_review` → handler polls reviews → injects REQUEST_CHANGES → Claude fixes → approved → merge → cleanup.
7. When complete: Claude closes the issue with a summary comment. Session exits. 7. **Path B (no git output):** Claude posts results as comment, closes issue → `PHASE:done` → handler cleans up (kill session, docker compose down, remove temp files).
8. Poll detects no active session on next run — nothing further to do. 8. For human input: Claude sends a Matrix message and waits; the reply is injected into the session by `matrix_listener.sh`.
**Environment variables consumed**: **Environment variables consumed**:
- `CODEBERG_TOKEN`, `CODEBERG_REPO`, `CODEBERG_API`, `PROJECT_NAME`, `CODEBERG_WEB` - `CODEBERG_TOKEN`, `CODEBERG_REPO`, `CODEBERG_API`, `PROJECT_NAME`, `CODEBERG_WEB`

View file

@ -6,10 +6,13 @@
# Lifecycle: # Lifecycle:
# 1. Fetch issue body (action formula) + existing comments # 1. Fetch issue body (action formula) + existing comments
# 2. Create tmux session: action-{issue_num} with interactive claude # 2. Create tmux session: action-{issue_num} with interactive claude
# 3. Inject initial prompt: formula + comments + instructions # 3. Inject initial prompt: formula + comments + phase protocol instructions
# 4. Claude executes formula steps, posts progress comments, closes issue # 4. Monitor phase file via monitor_phase_loop (shared with dev-agent)
# Path A (git output): Claude pushes → handler creates PR → CI poll → review
# injection → merge → cleanup (same loop as dev-agent via phase-handler.sh)
# Path B (no git output): Claude posts results → PHASE:done → cleanup
# 5. For human input: Claude asks via Matrix; reply injected via matrix_listener # 5. For human input: Claude asks via Matrix; reply injected via matrix_listener
# 6. Monitor session until Claude exits or idle timeout reached # 6. Cleanup on terminal phase: kill tmux, docker compose down, remove temp files
# #
# Session: action-{issue_num} (tmux) # Session: action-{issue_num} (tmux)
# Log: action/action-poll-{project}.log # Log: action/action-poll-{project}.log
@ -21,16 +24,52 @@ export PROJECT_TOML="${2:-${PROJECT_TOML:-}}"
source "$(dirname "$0")/../lib/env.sh" source "$(dirname "$0")/../lib/env.sh"
source "$(dirname "$0")/../lib/agent-session.sh" source "$(dirname "$0")/../lib/agent-session.sh"
# shellcheck source=../dev/phase-handler.sh
source "$(dirname "$0")/../dev/phase-handler.sh"
SESSION_NAME="action-${ISSUE}" SESSION_NAME="action-${ISSUE}"
LOCKFILE="/tmp/action-agent-${ISSUE}.lock" LOCKFILE="/tmp/action-agent-${ISSUE}.lock"
LOGFILE="${FACTORY_ROOT}/action/action-poll-${PROJECT_NAME:-harb}.log" LOGFILE="${FACTORY_ROOT}/action/action-poll-${PROJECT_NAME:-harb}.log"
THREAD_FILE="/tmp/action-thread-${ISSUE}" THREAD_FILE="/tmp/action-thread-${ISSUE}"
IDLE_TIMEOUT="${ACTION_IDLE_TIMEOUT:-14400}" # 4h default IDLE_TIMEOUT="${ACTION_IDLE_TIMEOUT:-14400}" # 4h default
# --- Phase handler globals (agent-specific; defaults in phase-handler.sh) ---
# shellcheck disable=SC2034 # used by phase-handler.sh
API="${CODEBERG_API}"
BRANCH="action/issue-${ISSUE}"
# shellcheck disable=SC2034 # used by phase-handler.sh
WORKTREE="${PROJECT_REPO_ROOT}"
PHASE_FILE="/tmp/action-session-${PROJECT_NAME:-harb}-${ISSUE}.phase"
IMPL_SUMMARY_FILE="/tmp/action-impl-summary-${PROJECT_NAME:-harb}-${ISSUE}.txt"
PREFLIGHT_RESULT="/tmp/action-preflight-${ISSUE}.json"
log() { log() {
printf '[%s] #%s %s\n' "$(date -u '+%Y-%m-%d %H:%M:%S UTC')" "$ISSUE" "$*" >> "$LOGFILE" printf '[%s] action#%s %s\n' "$(date -u '+%Y-%m-%d %H:%M:%S UTC')" "$ISSUE" "$*" >> "$LOGFILE"
} }
notify() {
local thread_id=""
[ -f "${THREAD_FILE:-}" ] && thread_id=$(cat "$THREAD_FILE" 2>/dev/null || true)
matrix_send "action" "⚡ #${ISSUE}: $*" "${thread_id}" 2>/dev/null || true
}
notify_ctx() {
local plain="$1" html="$2" thread_id=""
[ -f "${THREAD_FILE:-}" ] && thread_id=$(cat "$THREAD_FILE" 2>/dev/null || true)
if [ -n "$thread_id" ]; then
matrix_send_ctx "action" "⚡ #${ISSUE}: ${plain}" "⚡ #${ISSUE}: ${html}" "${thread_id}" 2>/dev/null || true
else
matrix_send "action" "⚡ #${ISSUE}: ${plain}" "" "${ISSUE}" 2>/dev/null || true
fi
}
status() {
log "$*"
}
# --- Action-specific stubs for phase-handler.sh ---
cleanup_worktree() { :; } # action agent uses PROJECT_REPO_ROOT directly — no separate git worktree to remove
cleanup_labels() { :; } # action agent doesn't use in-progress labels
# --- Concurrency lock (per issue) --- # --- Concurrency lock (per issue) ---
if [ -f "$LOCKFILE" ]; then if [ -f "$LOCKFILE" ]; then
LOCK_PID=$(cat "$LOCKFILE" 2>/dev/null || echo "") LOCK_PID=$(cat "$LOCKFILE" 2>/dev/null || echo "")
@ -45,6 +84,9 @@ echo $$ > "$LOCKFILE"
cleanup() { cleanup() {
rm -f "$LOCKFILE" rm -f "$LOCKFILE"
agent_kill_session "$SESSION_NAME" agent_kill_session "$SESSION_NAME"
# Best-effort docker cleanup for containers started during this action
(cd "${PROJECT_REPO_ROOT}" 2>/dev/null && docker compose down 2>/dev/null) || true
rm -f "$PHASE_FILE" "$IMPL_SUMMARY_FILE" "$PREFLIGHT_RESULT"
} }
trap cleanup EXIT trap cleanup EXIT
@ -129,8 +171,11 @@ if [ -n "$THREAD_ID" ]; then
are routed back to this session." are routed back to this session."
fi fi
# Build phase protocol from shared function (Path B covered in Instructions section above)
PHASE_PROTOCOL_INSTRUCTIONS="$(build_phase_protocol_prompt "$PHASE_FILE" "$IMPL_SUMMARY_FILE" "$BRANCH")"
INITIAL_PROMPT="You are an action agent. Your job is to execute the action formula INITIAL_PROMPT="You are an action agent. Your job is to execute the action formula
in the issue below and then close the issue. in the issue below.
## Issue #${ISSUE}: ${ISSUE_TITLE} ## Issue #${ISSUE}: ${ISSUE_TITLE}
@ -153,24 +198,35 @@ ${PRIOR_SECTION}## Instructions
what you need, then wait. A human will reply and the reply will be injected what you need, then wait. A human will reply and the reply will be injected
into this session automatically.${THREAD_HINT} into this session automatically.${THREAD_HINT}
5. When all steps are complete, close issue #${ISSUE} with a summary: ### Path A: If this action produces code changes (e.g. config updates, baselines):
curl -sf -X PATCH \\ - Work in the project repo: cd ${PROJECT_REPO_ROOT}
-H \"Authorization: token \${CODEBERG_TOKEN}\" \\ - Create and switch to branch: git checkout -b ${BRANCH}
-H 'Content-Type: application/json' \\ - Make your changes, commit, and push: git push origin ${BRANCH}
\"${CODEBERG_API}/issues/${ISSUE}\" \\ - Follow the phase protocol below — the orchestrator handles PR creation,
-d '{\"state\": \"closed\"}' CI monitoring, and review injection.
6. Environment variables available in your bash sessions: ### Path B: If this action produces no code changes (investigation, report):
- Post results as a comment on issue #${ISSUE}.
- Close the issue:
curl -sf -X PATCH \\
-H \"Authorization: token \${CODEBERG_TOKEN}\" \\
-H 'Content-Type: application/json' \\
\"${CODEBERG_API}/issues/${ISSUE}\" \\
-d '{\"state\": \"closed\"}'
- Signal completion: echo \"PHASE:done\" > \"${PHASE_FILE}\"
5. Environment variables available in your bash sessions:
CODEBERG_TOKEN, CODEBERG_API, CODEBERG_REPO, CODEBERG_WEB, PROJECT_NAME CODEBERG_TOKEN, CODEBERG_API, CODEBERG_REPO, CODEBERG_WEB, PROJECT_NAME
(all sourced from ${FACTORY_ROOT}/.env) (all sourced from ${FACTORY_ROOT}/.env)
**Important**: You do NOT need to create PRs or write a phase file. Just execute If the prior comments above show work already completed, resume from where it
the formula steps, post comments, and close the issue when done. If the prior left off.
comments above show work already completed, resume from where it left off."
${PHASE_PROTOCOL_INSTRUCTIONS}"
# --- Create tmux session --- # --- Create tmux session ---
log "creating tmux session: ${SESSION_NAME}" log "creating tmux session: ${SESSION_NAME}"
if ! create_agent_session "${SESSION_NAME}" "${FACTORY_ROOT}"; then if ! create_agent_session "${SESSION_NAME}" "${FACTORY_ROOT}" "${PHASE_FILE}"; then
log "ERROR: failed to create tmux session" log "ERROR: failed to create tmux session"
exit 1 exit 1
fi fi
@ -182,31 +238,30 @@ log "initial prompt injected into session"
matrix_send "action" "⚡ #${ISSUE}: session started — ${ISSUE_TITLE}" \ matrix_send "action" "⚡ #${ISSUE}: session started — ${ISSUE_TITLE}" \
"${THREAD_ID}" 2>/dev/null || true "${THREAD_ID}" 2>/dev/null || true
# --- Monitor session until Claude exits or idle timeout --- # --- Monitor phase loop (shared with dev-agent) ---
log "monitoring session: ${SESSION_NAME} (idle_timeout=${IDLE_TIMEOUT}s)" status "monitoring phase: ${PHASE_FILE} (action agent)"
IDLE_ELAPSED=0 monitor_phase_loop "$PHASE_FILE" "$IDLE_TIMEOUT" _on_phase_change "$SESSION_NAME"
POLL_INTERVAL=30
IDLE_MARKER="/tmp/claude-idle-${SESSION_NAME}.ts"
while tmux has-session -t "${SESSION_NAME}" 2>/dev/null; do # Handle exit reason from monitor_phase_loop
sleep "$POLL_INTERVAL" case "${_MONITOR_LOOP_EXIT:-}" in
idle_timeout)
# Use the Stop hook idle marker to distinguish active vs idle: notify_ctx \
# marker exists → Claude finished responding and is at the prompt (idle) "session idle for $((IDLE_TIMEOUT / 3600))h — killed" \
# marker absent → Claude is mid-turn or hasn't started yet (active) "session idle for $((IDLE_TIMEOUT / 3600))h — killed"
if [ -f "$IDLE_MARKER" ]; then # Escalate to supervisor (idle_prompt already escalated via _on_phase_change callback)
IDLE_ELAPSED=$((IDLE_ELAPSED + POLL_INTERVAL)) echo "{\"issue\":${ISSUE},\"pr\":${PR_NUMBER:-0},\"reason\":\"idle_timeout\",\"ts\":\"$(date -u +%Y-%m-%dT%H:%M:%SZ)\"}" \
else >> "${FACTORY_ROOT}/supervisor/escalations-${PROJECT_NAME}.jsonl"
IDLE_ELAPSED=0 rm -f "$PHASE_FILE" "$IMPL_SUMMARY_FILE" "$THREAD_FILE"
fi ;;
idle_prompt)
if [ "$IDLE_ELAPSED" -ge "$IDLE_TIMEOUT" ]; then # Notification + escalation already handled by _on_phase_change(PHASE:failed) callback
log "idle timeout (${IDLE_TIMEOUT}s) — killing session for issue #${ISSUE}" rm -f "$PHASE_FILE" "$IMPL_SUMMARY_FILE" "$THREAD_FILE"
matrix_send "action" "⚠️ #${ISSUE}: session idle for $((IDLE_TIMEOUT / 3600))h — killed" \ ;;
"${THREAD_ID}" 2>/dev/null || true done)
agent_kill_session "${SESSION_NAME}" # Belt-and-suspenders: callback handles primary cleanup,
break # but ensure sentinel files are removed if callback was interrupted
fi rm -f "$PHASE_FILE" "$IMPL_SUMMARY_FILE" "$THREAD_FILE"
done ;;
esac
log "action-agent finished for issue #${ISSUE}" log "action-agent finished for issue #${ISSUE}"

View file

@ -1,24 +1,97 @@
#!/usr/bin/env bash #!/usr/bin/env bash
# dev/phase-handler.sh — Phase callback functions for dev-agent.sh # dev/phase-handler.sh — Phase callback functions for dev-agent.sh
# #
# Source this file from dev-agent.sh after lib/agent-session.sh is loaded. # Source this file from agent orchestrators after lib/agent-session.sh is loaded.
# Defines: post_refusal_comment(), _on_phase_change() # Defines: post_refusal_comment(), _on_phase_change(), build_phase_protocol_prompt()
# #
# Required globals from dev-agent.sh: # Required globals (set by calling agent before or after sourcing):
# ISSUE, CODEBERG_TOKEN, API, CODEBERG_WEB, PROJECT_NAME, FACTORY_ROOT # ISSUE, CODEBERG_TOKEN, API, CODEBERG_WEB, PROJECT_NAME, FACTORY_ROOT
# PR_NUMBER, BRANCH, PHASE_FILE, WORKTREE, IMPL_SUMMARY_FILE, THREAD_FILE # BRANCH, PHASE_FILE, WORKTREE, IMPL_SUMMARY_FILE, THREAD_FILE
# PRIMARY_BRANCH, SESSION_NAME, LOGFILE, ISSUE_TITLE # PRIMARY_BRANCH, SESSION_NAME, LOGFILE, ISSUE_TITLE
# CI_POLL_TIMEOUT, MAX_CI_FIXES, MAX_REVIEW_ROUNDS, REVIEW_POLL_TIMEOUT
# CI_RETRY_COUNT, CI_FIX_COUNT, REVIEW_ROUND, CLAIMED
# WOODPECKER_REPO_ID, WOODPECKER_TOKEN, WOODPECKER_SERVER # WOODPECKER_REPO_ID, WOODPECKER_TOKEN, WOODPECKER_SERVER
# #
# Calls back to dev-agent.sh-defined helpers: # Globals with defaults (agents can override after sourcing):
# cleanup_worktree(), cleanup_labels() # PR_NUMBER, CI_POLL_TIMEOUT, MAX_CI_FIXES, MAX_REVIEW_ROUNDS,
# REVIEW_POLL_TIMEOUT, CI_RETRY_COUNT, CI_FIX_COUNT, REVIEW_ROUND,
# CLAIMED, PHASE_POLL_INTERVAL
#
# Calls back to agent-defined helpers:
# cleanup_worktree(), cleanup_labels(), notify(), notify_ctx(), status(), log()
# #
# shellcheck shell=bash # shellcheck shell=bash
# shellcheck disable=SC2154 # globals are set in dev-agent.sh before calling # shellcheck disable=SC2154 # globals are set in dev-agent.sh before calling
# shellcheck disable=SC2034 # CLAIMED is read by cleanup() in dev-agent.sh # shellcheck disable=SC2034 # CLAIMED is read by cleanup() in dev-agent.sh
# --- Default globals (agents can override after sourcing) ---
: "${CI_POLL_TIMEOUT:=1800}"
: "${REVIEW_POLL_TIMEOUT:=10800}"
: "${MAX_CI_FIXES:=3}"
: "${MAX_REVIEW_ROUNDS:=5}"
: "${CI_RETRY_COUNT:=0}"
: "${CI_FIX_COUNT:=0}"
: "${REVIEW_ROUND:=0}"
: "${PR_NUMBER:=}"
: "${CLAIMED:=false}"
: "${PHASE_POLL_INTERVAL:=30}"
# --- Build phase protocol prompt (shared across agents) ---
# Generates the phase-signaling instructions for Claude prompts.
# Args: phase_file summary_file branch
# Output: The protocol text (stdout)
build_phase_protocol_prompt() {
local _pf="$1" _sf="$2" _br="$3"
cat <<_PHASE_PROTOCOL_EOF_
## Phase-Signaling Protocol (REQUIRED)
You are running in a persistent tmux session managed by an orchestrator.
Communicate progress by writing to the phase file. The orchestrator watches
this file and injects events (CI results, review feedback) back into this session.
### Key files
\`\`\`
PHASE_FILE="${_pf}"
SUMMARY_FILE="${_sf}"
\`\`\`
### Phase transitions — write these exactly:
**After committing and pushing your branch:**
\`\`\`bash
git push origin ${_br}
# Write a short summary of what you implemented:
printf '%s' "<your summary>" > "\${SUMMARY_FILE}"
# Signal the orchestrator to create the PR and watch for CI:
echo "PHASE:awaiting_ci" > "${_pf}"
\`\`\`
Then STOP and wait. The orchestrator will inject CI results.
**When you receive a "CI passed" injection:**
\`\`\`bash
echo "PHASE:awaiting_review" > "${_pf}"
\`\`\`
Then STOP and wait. The orchestrator will inject review feedback.
**When you receive a "CI failed:" injection:**
Fix the CI issue, commit, push, then:
\`\`\`bash
echo "PHASE:awaiting_ci" > "${_pf}"
\`\`\`
Then STOP and wait.
**When you receive a "Review: REQUEST_CHANGES" injection:**
Address ALL review feedback, commit, push, then:
\`\`\`bash
echo "PHASE:awaiting_ci" > "${_pf}"
\`\`\`
(CI runs again after each push — always write awaiting_ci, not awaiting_review)
**On unrecoverable failure:**
\`\`\`bash
printf 'PHASE:failed\nReason: %s\n' "describe what failed" > "${_pf}"
\`\`\`
_PHASE_PROTOCOL_EOF_
}
# --- Merge helper --- # --- Merge helper ---
# do_merge — attempt to merge PR via Codeberg API. # do_merge — attempt to merge PR via Codeberg API.
# Args: pr_num # Args: pr_num
@ -489,12 +562,17 @@ Instructions:
# ── PHASE: done ───────────────────────────────────────────────────────────── # ── PHASE: done ─────────────────────────────────────────────────────────────
# PR merged and issue closed (by orchestrator or Claude). Just clean up local state. # PR merged and issue closed (by orchestrator or Claude). Just clean up local state.
elif [ "$phase" = "PHASE:done" ]; then elif [ "$phase" = "PHASE:done" ]; then
status "phase done — PR #${PR_NUMBER:-?} merged, cleaning up" if [ -n "${PR_NUMBER:-}" ]; then
status "phase done — PR #${PR_NUMBER} merged, cleaning up"
# Notify Matrix (issue already closed and labels removed via API) notify_ctx \
notify_ctx \ "✅ PR #${PR_NUMBER} merged! Issue #${ISSUE} done." \
"✅ PR #${PR_NUMBER:-?} merged! Issue #${ISSUE} done." \ "✅ PR <a href='${CODEBERG_WEB}/pulls/${PR_NUMBER}'>#${PR_NUMBER}</a> merged! <a href='${CODEBERG_WEB}/issues/${ISSUE}'>Issue #${ISSUE}</a> done."
"✅ PR <a href='${CODEBERG_WEB}/pulls/${PR_NUMBER:-?}'>#${PR_NUMBER:-?}</a> merged! <a href='${CODEBERG_WEB}/issues/${ISSUE}'>Issue #${ISSUE}</a> done." else
status "phase done — issue #${ISSUE} complete, cleaning up"
notify_ctx \
"✅ Issue #${ISSUE} done." \
"✅ <a href='${CODEBERG_WEB}/issues/${ISSUE}'>Issue #${ISSUE}</a> done."
fi
# Belt-and-suspenders: ensure in-progress label removed (idempotent) # Belt-and-suspenders: ensure in-progress label removed (idempotent)
cleanup_labels cleanup_labels