From dc545a817b346bdc9d7ded8f4cec6f01e09a8cd9 Mon Sep 17 00:00:00 2001 From: Agent Date: Tue, 31 Mar 2026 19:42:25 +0000 Subject: [PATCH 1/2] fix: chore(26a): delete action-agent.sh, action-poll.sh, and action/AGENTS.md (#65) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Delete action/ directory and all its contents - Remove action-bot from bin/disinto bot token mapping and collaborator lists - Remove FORGE_ACTION_TOKEN from lib/env.sh and .env.example - Remove action-bot from FORGE_BOT_USERNAMES in lib/env.sh and .env.example - Update .woodpecker/agent-smoke.sh to remove action script checks - Update AGENTS.md: remove action agent from description and table - Update lib/AGENTS.md: remove action-agent references from sourced by columns - Update docs/PHASE-PROTOCOL.md: remove action-agent reference - Update docs/AGENT-DESIGN.md: remove action-agent from agent table - Update planner/AGENTS.md: update action formula execution reference - Update README.md: update formula-driven execution reference Part of #26 — retire action-agent system. --- .env.example | 3 +- .woodpecker/agent-smoke.sh | 2 - AGENTS.md | 10 +- README.md | 2 +- action/AGENTS.md | 34 ---- action/action-agent.sh | 323 ------------------------------------- action/action-poll.sh | 75 --------- bin/disinto | 7 +- docs/AGENT-DESIGN.md | 1 - docs/PHASE-PROTOCOL.md | 2 +- lib/AGENTS.md | 14 +- lib/env.sh | 3 +- planner/AGENTS.md | 2 +- 13 files changed, 19 insertions(+), 459 deletions(-) delete mode 100644 action/AGENTS.md delete mode 100755 action/action-agent.sh delete mode 100755 action/action-poll.sh diff --git a/.env.example b/.env.example index 7ca5ba6..7f70675 100644 --- a/.env.example +++ b/.env.example @@ -26,8 +26,7 @@ FORGE_GARDENER_TOKEN= # [SECRET] gardener-bot API token FORGE_VAULT_TOKEN= # [SECRET] vault-bot API token FORGE_SUPERVISOR_TOKEN= # [SECRET] supervisor-bot API token FORGE_PREDICTOR_TOKEN= # [SECRET] predictor-bot API token -FORGE_ACTION_TOKEN= # [SECRET] action-bot API token -FORGE_BOT_USERNAMES=dev-bot,review-bot,planner-bot,gardener-bot,vault-bot,supervisor-bot,predictor-bot,action-bot +FORGE_BOT_USERNAMES=dev-bot,review-bot,planner-bot,gardener-bot,vault-bot,supervisor-bot,predictor-bot # ── Backwards compatibility ─────────────────────────────────────────────── # If CODEBERG_TOKEN is set but FORGE_TOKEN is not, env.sh falls back to diff --git a/.woodpecker/agent-smoke.sh b/.woodpecker/agent-smoke.sh index 9a37bf4..eddfe87 100644 --- a/.woodpecker/agent-smoke.sh +++ b/.woodpecker/agent-smoke.sh @@ -214,8 +214,6 @@ check_script vault/vault-agent.sh check_script vault/vault-fire.sh check_script vault/vault-poll.sh check_script vault/vault-reject.sh -check_script action/action-poll.sh -check_script action/action-agent.sh check_script supervisor/supervisor-run.sh check_script supervisor/preflight.sh check_script predictor/predictor-run.sh diff --git a/AGENTS.md b/AGENTS.md index 04a0ac1..7fe6be8 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -3,8 +3,8 @@ ## What this repo is -Disinto is an autonomous code factory. It manages eight agents (dev, review, -gardener, supervisor, planner, predictor, action, vault) that pick up issues from forge, +Disinto is an autonomous code factory. It manages seven agents (dev, review, +gardener, supervisor, planner, predictor, vault) that pick up issues from forge, implement them, review PRs, plan from the vision, gate dangerous actions, and keep the system healthy — all via cron and `claude -p`. @@ -23,7 +23,6 @@ disinto/ (code repo) │ preflight.sh — pre-flight data collection for supervisor formula │ supervisor-poll.sh — legacy bash orchestrator (superseded) ├── vault/ vault-poll.sh, vault-agent.sh, vault-fire.sh — action gating + procurement -├── action/ action-poll.sh, action-agent.sh — operational task execution ├── lib/ env.sh, agent-session.sh, ci-helpers.sh, ci-debug.sh, load-project.sh, parse-deps.sh, guard.sh, mirrors.sh, pr-lifecycle.sh, issue-lifecycle.sh, worktree.sh, build-graph.py ├── projects/ *.toml.example — templates; *.toml — local per-box config (gitignored) ├── formulas/ Issue templates (TOML specs for multi-step agent tasks) @@ -90,7 +89,6 @@ bash dev/phase-test.sh | Supervisor | `supervisor/` | Health monitoring | [supervisor/AGENTS.md](supervisor/AGENTS.md) | | Planner | `planner/` | Strategic planning | [planner/AGENTS.md](planner/AGENTS.md) | | Predictor | `predictor/` | Infrastructure pattern detection | [predictor/AGENTS.md](predictor/AGENTS.md) | -| Action | `action/` | Operational task execution | [action/AGENTS.md](action/AGENTS.md) | | Vault | `vault/` | Action gating + resource procurement | [vault/AGENTS.md](vault/AGENTS.md) | See [lib/AGENTS.md](lib/AGENTS.md) for the full shared helper reference. @@ -108,14 +106,14 @@ Issues flow: `backlog` → `in-progress` → PR → CI → review → merge → | `backlog` | Issue is queued for implementation. Dev-poll picks the first ready one. | Planner, gardener, humans | | `priority` | Queue tier above plain backlog. Issues with both `priority` and `backlog` are picked before plain `backlog` issues. FIFO within each tier. | Planner, humans | | `in-progress` | Dev-agent is actively working on this issue. Only one issue per project is in-progress at a time. | dev-agent.sh (claims issue) | -| `blocked` | Issue is stuck — agent session failed, crashed, timed out, or CI exhausted. Diagnostic comment on the issue has details. Also used for unmet dependencies. | dev-agent.sh, action-agent.sh, dev-poll.sh (on failure) | +| `blocked` | Issue is stuck — agent session failed, crashed, timed out, or CI exhausted. Diagnostic comment on the issue has details. Also used for unmet dependencies. | dev-agent.sh, dev-poll.sh (on failure) | | `tech-debt` | Pre-existing issue flagged by AI reviewer, not introduced by a PR. | review-pr.sh (auto-created follow-ups) | | `underspecified` | Dev-agent refused the issue as too large or vague. | dev-poll.sh (on preflight `too_large`), dev-agent.sh (on mid-run `too_large` refusal) | | `vision` | Goal anchors — high-level objectives from VISION.md. | Planner, humans | | `prediction/unreviewed` | Unprocessed prediction filed by predictor. | predictor-run.sh | | `prediction/dismissed` | Prediction triaged as DISMISS — planner disagrees, closed with reason. | Planner (triage-predictions step) | | `prediction/actioned` | Prediction promoted or dismissed by planner. | Planner (triage-predictions step) | -| `action` | Operational task for the action-agent to execute via formula. | Planner, humans | +| `action` | Operational task for the dispatcher to execute via formula. | Planner, humans | ### Dependency conventions diff --git a/README.md b/README.md index 6a5479e..abb47a1 100644 --- a/README.md +++ b/README.md @@ -123,7 +123,7 @@ disinto/ │ └── best-practices.md # Gardener knowledge base ├── planner/ │ ├── planner-poll.sh # Cron entry: weekly vision gap analysis -│ └── (formula-driven) # run-planner.toml executed by action-agent +│ └── (formula-driven) # run-planner.toml executed by dispatcher ├── vault/ │ ├── vault-poll.sh # Cron entry: process pending dangerous actions │ ├── vault-agent.sh # Classifies and routes actions (claude -p) diff --git a/action/AGENTS.md b/action/AGENTS.md deleted file mode 100644 index 55dadae..0000000 --- a/action/AGENTS.md +++ /dev/null @@ -1,34 +0,0 @@ - -# Action Agent - -**Role**: Execute operational tasks described by action formulas — run scripts, -call APIs, send messages, collect human approval. Shares the same phase handler -as the dev-agent: if an action produces code changes, the orchestrator creates a -PR and drives the CI/review loop; otherwise Claude closes the issue directly. - -**Trigger**: `action-poll.sh` runs every 10 min via cron. Sources `lib/guard.sh` -and calls `check_active action` first — skips if `$FACTORY_ROOT/state/.action-active` -is absent. Then scans for open issues labeled `action` that have no active tmux -session, and spawns `action-agent.sh `. - -**Key files**: -- `action/action-poll.sh` — Cron scheduler: finds open action issues with no active tmux session, spawns action-agent.sh -- `action/action-agent.sh` — Orchestrator: fetches issue body + prior comments, **checks all dependencies via `lib/parse-deps.sh` before spawning** (skips silently if any dep is still open), creates tmux session (`action-{project}-{issue_num}`) with interactive `claude`, injects formula prompt with phase protocol, enters `monitor_phase_loop` (shared via `dev/phase-handler.sh`) for CI/review lifecycle or direct completion - -**Session lifecycle**: -1. `action-poll.sh` finds open `action` issues with no active tmux session. -2. Spawns `action-agent.sh `. -3. Agent creates tmux session `action-{project}-{issue_num}`, injects prompt (formula + prior comments + phase protocol). -4. Agent enters `monitor_phase_loop` (shared with dev-agent via `dev/phase-handler.sh`). -5. **Path A (git output):** Claude pushes branch → `PHASE:awaiting_ci` → handler creates PR, polls CI → injects failures → Claude fixes → push → re-poll → CI passes → `PHASE:awaiting_review` → handler polls reviews → injects REQUEST_CHANGES → Claude fixes → approved → merge → cleanup. -6. **Path B (no git output):** Claude posts results as comment, closes issue → `PHASE:done` → handler cleans up (kill session, docker compose down, remove temp files). -7. For human input: Claude writes `PHASE:escalate`; human responds via vault/forge. - -**Crash recovery**: on `PHASE:crashed` or non-zero exit, the worktree is **preserved** (not destroyed) for debugging. Location logged. Supervisor housekeeping removes stale crashed worktrees older than 24h. - -**Environment variables consumed**: -- `FORGE_TOKEN`, `FORGE_ACTION_TOKEN` (falls back to FORGE_TOKEN), `FORGE_REPO`, `FORGE_API`, `FORGE_URL`, `PROJECT_NAME`, `FORGE_WEB` -- `ACTION_IDLE_TIMEOUT` — Max seconds before killing idle session (default 14400 = 4h) -- `ACTION_MAX_LIFETIME` — Max total session wall-clock seconds (default 28800 = 8h); caps session independently of idle timeout - -**FORGE_REMOTE**: `action-agent.sh` auto-detects the git remote for `FORGE_URL` (same logic as dev-agent). Exported as `FORGE_REMOTE`, used for worktree creation and push instructions injected into the Claude prompt. diff --git a/action/action-agent.sh b/action/action-agent.sh deleted file mode 100755 index 38d7d39..0000000 --- a/action/action-agent.sh +++ /dev/null @@ -1,323 +0,0 @@ -#!/usr/bin/env bash -# ============================================================================= -# action-agent.sh — Synchronous action agent: SDK + shared libraries -# -# Synchronous bash loop using claude -p (one-shot invocation). -# No tmux sessions, no phase files — the bash script IS the state machine. -# -# Usage: ./action-agent.sh [project.toml] -# -# Flow: -# 1. Preflight: issue_check_deps(), memory guard, concurrency lock -# 2. Parse model from YAML front matter in issue body (custom model selection) -# 3. Worktree: worktree_create() for action isolation -# 4. Load formula from issue body -# 5. Build prompt: formula + prior non-bot comments (resume context) -# 6. agent_run(worktree, prompt) → Claude executes action, may push -# 7. If pushed: pr_walk_to_merge() from lib/pr-lifecycle.sh -# 8. Cleanup: worktree_cleanup(), issue_close() -# -# Action-specific (stays in runner): -# - YAML front matter parsing (model selection) -# - Bot username filtering for prior comments -# - Lifetime watchdog (MAX_LIFETIME=8h wall-clock cap) -# - Child process cleanup (docker compose, background jobs) -# -# From shared libraries: -# - Issue lifecycle: lib/issue-lifecycle.sh -# - Worktree: lib/worktree.sh -# - PR lifecycle: lib/pr-lifecycle.sh -# - Agent SDK: lib/agent-sdk.sh -# -# Log: action/action-poll-{project}.log -# ============================================================================= -set -euo pipefail - -ISSUE="${1:?Usage: action-agent.sh [project.toml]}" -export PROJECT_TOML="${2:-${PROJECT_TOML:-}}" - -SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)" -FACTORY_ROOT="$(dirname "$SCRIPT_DIR")" - -# shellcheck source=../lib/env.sh -source "$FACTORY_ROOT/lib/env.sh" -# Use action-bot's own Forgejo identity (#747) -FORGE_TOKEN="${FORGE_ACTION_TOKEN:-${FORGE_TOKEN}}" -# shellcheck source=../lib/ci-helpers.sh -source "$FACTORY_ROOT/lib/ci-helpers.sh" -# shellcheck source=../lib/worktree.sh -source "$FACTORY_ROOT/lib/worktree.sh" -# shellcheck source=../lib/issue-lifecycle.sh -source "$FACTORY_ROOT/lib/issue-lifecycle.sh" -# shellcheck source=../lib/agent-sdk.sh -source "$FACTORY_ROOT/lib/agent-sdk.sh" -# shellcheck source=../lib/pr-lifecycle.sh -source "$FACTORY_ROOT/lib/pr-lifecycle.sh" - -BRANCH="action/issue-${ISSUE}" -WORKTREE="/tmp/action-${ISSUE}-$(date +%s)" -LOCKFILE="/tmp/action-agent-${ISSUE}.lock" -LOGFILE="${DISINTO_LOG_DIR}/action/action-poll-${PROJECT_NAME:-default}.log" -# shellcheck disable=SC2034 # consumed by agent-sdk.sh -SID_FILE="/tmp/action-session-${PROJECT_NAME:-default}-${ISSUE}.sid" -MAX_LIFETIME="${ACTION_MAX_LIFETIME:-28800}" # 8h default wall-clock cap -SESSION_START_EPOCH=$(date +%s) - -log() { - printf '[%s] action#%s %s\n' "$(date -u '+%Y-%m-%d %H:%M:%S UTC')" "$ISSUE" "$*" >> "$LOGFILE" -} - -# --- Concurrency lock (per issue) --- -if [ -f "$LOCKFILE" ]; then - LOCK_PID=$(cat "$LOCKFILE" 2>/dev/null || echo "") - if [ -n "$LOCK_PID" ] && kill -0 "$LOCK_PID" 2>/dev/null; then - log "SKIP: action-agent already running for #${ISSUE} (PID ${LOCK_PID})" - exit 0 - fi - rm -f "$LOCKFILE" -fi -echo $$ > "$LOCKFILE" - -cleanup() { - local exit_code=$? - # Kill lifetime watchdog if running - if [ -n "${LIFETIME_WATCHDOG_PID:-}" ] && kill -0 "$LIFETIME_WATCHDOG_PID" 2>/dev/null; then - kill "$LIFETIME_WATCHDOG_PID" 2>/dev/null || true - wait "$LIFETIME_WATCHDOG_PID" 2>/dev/null || true - fi - rm -f "$LOCKFILE" - # Kill any remaining child processes spawned during the run - local children - children=$(jobs -p 2>/dev/null) || true - if [ -n "$children" ]; then - # shellcheck disable=SC2086 # intentional word splitting - kill $children 2>/dev/null || true - # shellcheck disable=SC2086 - wait $children 2>/dev/null || true - fi - # Best-effort docker cleanup for containers started during this action - (cd "${WORKTREE}" 2>/dev/null && docker compose down 2>/dev/null) || true - # Preserve worktree on crash for debugging; clean up on success - if [ "$exit_code" -ne 0 ]; then - worktree_preserve "$WORKTREE" "crashed (exit=$exit_code)" - else - worktree_cleanup "$WORKTREE" - fi - rm -f "$SID_FILE" -} -trap cleanup EXIT - -# --- Memory guard --- -memory_guard 2000 - -# --- Fetch issue --- -log "fetching issue #${ISSUE}" -ISSUE_JSON=$(curl -sf -H "Authorization: token ${FORGE_TOKEN}" \ - "${FORGE_API}/issues/${ISSUE}") || true - -if [ -z "$ISSUE_JSON" ] || ! printf '%s' "$ISSUE_JSON" | jq -e '.id' >/dev/null 2>&1; then - log "ERROR: failed to fetch issue #${ISSUE}" - exit 1 -fi - -ISSUE_TITLE=$(printf '%s' "$ISSUE_JSON" | jq -r '.title') -ISSUE_BODY=$(printf '%s' "$ISSUE_JSON" | jq -r '.body // ""') -ISSUE_STATE=$(printf '%s' "$ISSUE_JSON" | jq -r '.state') - -if [ "$ISSUE_STATE" != "open" ]; then - log "SKIP: issue #${ISSUE} is ${ISSUE_STATE}" - exit 0 -fi - -log "Issue: ${ISSUE_TITLE}" - -# --- Dependency check (shared library) --- -if ! issue_check_deps "$ISSUE"; then - log "SKIP: issue #${ISSUE} blocked by: ${_ISSUE_BLOCKED_BY[*]}" - exit 0 -fi - -# --- Extract model from YAML front matter (if present) --- -YAML_MODEL=$(printf '%s' "$ISSUE_BODY" | \ - sed -n '/^---$/,/^---$/p' | grep '^model:' | awk '{print $2}' | tr -d '"' || true) -if [ -n "$YAML_MODEL" ]; then - export CLAUDE_MODEL="$YAML_MODEL" - log "model from front matter: ${YAML_MODEL}" -fi - -# --- Resolve bot username(s) for comment filtering --- -_bot_login=$(curl -sf -H "Authorization: token ${FORGE_TOKEN}" \ - "${FORGE_API%%/repos*}/user" | jq -r '.login // empty' 2>/dev/null || true) - -# Build list: token owner + any extra names from FORGE_BOT_USERNAMES (comma-separated) -_bot_logins="${_bot_login}" -if [ -n "${FORGE_BOT_USERNAMES:-}" ]; then - _bot_logins="${_bot_logins:+${_bot_logins},}${FORGE_BOT_USERNAMES}" -fi - -# --- Fetch existing comments (resume context, excluding bot comments) --- -COMMENTS_JSON=$(curl -sf -H "Authorization: token ${FORGE_TOKEN}" \ - "${FORGE_API}/issues/${ISSUE}/comments?limit=50") || true - -PRIOR_COMMENTS="" -if [ -n "$COMMENTS_JSON" ] && [ "$COMMENTS_JSON" != "null" ] && [ "$COMMENTS_JSON" != "[]" ]; then - PRIOR_COMMENTS=$(printf '%s' "$COMMENTS_JSON" | \ - jq -r --arg bots "$_bot_logins" \ - '($bots | split(",") | map(select(. != ""))) as $bl | - .[] | select(.user.login as $u | $bl | index($u) | not) | - "[\(.user.login) at \(.created_at[:19])]\n\(.body)\n---"' 2>/dev/null || true) -fi - -# --- Determine git remote --- -cd "${PROJECT_REPO_ROOT}" -_forge_host=$(echo "$FORGE_URL" | sed 's|https\?://||; s|/.*||') -FORGE_REMOTE=$(git remote -v | awk -v host="$_forge_host" '$2 ~ host && /\(push\)/ {print $1; exit}') -FORGE_REMOTE="${FORGE_REMOTE:-origin}" -export FORGE_REMOTE - -# --- Create isolated worktree --- -log "creating worktree: ${WORKTREE}" -git fetch "${FORGE_REMOTE}" "${PRIMARY_BRANCH}" 2>/dev/null || true -if ! worktree_create "$WORKTREE" "$BRANCH"; then - log "ERROR: worktree creation failed" - exit 1 -fi -log "worktree ready: ${WORKTREE}" - -# --- Build prompt --- -PRIOR_SECTION="" -if [ -n "$PRIOR_COMMENTS" ]; then - PRIOR_SECTION="## Prior comments (resume context) - -${PRIOR_COMMENTS} - -" -fi - -GIT_INSTRUCTIONS=$(build_phase_protocol_prompt "$BRANCH" "$FORGE_REMOTE") - -PROMPT="You are an action agent. Your job is to execute the action formula -in the issue below. - -## Issue #${ISSUE}: ${ISSUE_TITLE} - -${ISSUE_BODY} - -${PRIOR_SECTION}## Instructions - -1. Read the action formula steps in the issue body carefully. - -2. Execute each step in order using your Bash tool and any other tools available. - -3. Post progress as comments on issue #${ISSUE} after significant steps: - curl -sf -X POST \\ - -H \"Authorization: token \${FORGE_TOKEN}\" \\ - -H 'Content-Type: application/json' \\ - \"${FORGE_API}/issues/${ISSUE}/comments\" \\ - -d \"{\\\"body\\\": \\\"your comment here\\\"}\" - -4. If a step requires human input or approval, post a comment explaining what - is needed and stop — the orchestrator will block the issue. - -### Path A: If this action produces code changes (e.g. config updates, baselines): - - You are already in an isolated worktree at: ${WORKTREE} - - You are on branch: ${BRANCH} - - Make your changes, commit, and push: git push ${FORGE_REMOTE} ${BRANCH} - - **IMPORTANT:** The worktree is destroyed after completion. Push all - results before finishing — unpushed work will be lost. - -### Path B: If this action produces no code changes (investigation, report): - - Post results as a comment on issue #${ISSUE}. - - **IMPORTANT:** The worktree is destroyed after completion. Copy any - files you need to persistent paths before finishing. - -5. Environment variables available in your bash sessions: - FORGE_TOKEN, FORGE_API, FORGE_REPO, FORGE_WEB, PROJECT_NAME - (all sourced from ${FACTORY_ROOT}/.env) - -### CRITICAL: Never embed secrets in issue bodies, comments, or PR descriptions - - NEVER put API keys, tokens, passwords, or private keys in issue text or comments. - - Always reference secrets via env var names (e.g. \\\$BASE_RPC_URL, \\\${FORGE_TOKEN}). - - If a formula step needs a secret, read it from .env or the environment at runtime. - - Before posting any comment, verify it contains no credentials, hex keys > 32 chars, - or URLs with embedded API keys. - -If the prior comments above show work already completed, resume from where it -left off. - -${GIT_INSTRUCTIONS}" - -# --- Wall-clock lifetime watchdog (background) --- -# Caps total run time independently of claude -p timeout. When the cap is -# hit the watchdog kills the main process, which triggers cleanup via trap. -_lifetime_watchdog() { - local remaining=$(( MAX_LIFETIME - ($(date +%s) - SESSION_START_EPOCH) )) - [ "$remaining" -le 0 ] && remaining=1 - sleep "$remaining" - local hours=$(( MAX_LIFETIME / 3600 )) - log "MAX_LIFETIME (${hours}h) reached — killing agent" - # Post summary comment on issue - local body="Action agent killed: wall-clock lifetime cap (${hours}h) reached." - curl -sf -X POST \ - -H "Authorization: token ${FORGE_TOKEN}" \ - -H 'Content-Type: application/json' \ - "${FORGE_API}/issues/${ISSUE}/comments" \ - -d "{\"body\": \"${body}\"}" >/dev/null 2>&1 || true - kill $$ 2>/dev/null || true -} -_lifetime_watchdog & -LIFETIME_WATCHDOG_PID=$! - -# --- Run agent --- -log "running agent (worktree: ${WORKTREE})" -agent_run --worktree "$WORKTREE" "$PROMPT" -log "agent_run complete" - -# --- Detect if branch was pushed (Path A vs Path B) --- -PUSHED=false -# Check if remote branch exists -git fetch "${FORGE_REMOTE}" "$BRANCH" 2>/dev/null || true -if git rev-parse --verify "${FORGE_REMOTE}/${BRANCH}" >/dev/null 2>&1; then - PUSHED=true -fi -# Fallback: check local commits ahead of base -if [ "$PUSHED" = false ]; then - if git -C "$WORKTREE" log "${FORGE_REMOTE}/${PRIMARY_BRANCH}..${BRANCH}" --oneline 2>/dev/null | grep -q .; then - PUSHED=true - fi -fi - -if [ "$PUSHED" = true ]; then - # --- Path A: code changes pushed — create PR and walk to merge --- - log "branch pushed — creating PR" - PR_NUMBER="" - PR_NUMBER=$(pr_create "$BRANCH" "action: ${ISSUE_TITLE}" \ - "Closes #${ISSUE} - -Automated action execution by action-agent.") || true - - if [ -n "$PR_NUMBER" ]; then - log "walking PR #${PR_NUMBER} to merge" - pr_walk_to_merge "$PR_NUMBER" "$_AGENT_SESSION_ID" "$WORKTREE" || true - - case "${_PR_WALK_EXIT_REASON:-}" in - merged) - log "PR #${PR_NUMBER} merged — closing issue" - issue_close "$ISSUE" - ;; - *) - log "PR #${PR_NUMBER} not merged (reason: ${_PR_WALK_EXIT_REASON:-unknown})" - issue_block "$ISSUE" "pr_not_merged: ${_PR_WALK_EXIT_REASON:-unknown}" - ;; - esac - else - log "ERROR: failed to create PR" - issue_block "$ISSUE" "pr_creation_failed" - fi -else - # --- Path B: no code changes — close issue directly --- - log "no branch pushed — closing issue (Path B)" - issue_close "$ISSUE" -fi - -log "action-agent finished for issue #${ISSUE}" diff --git a/action/action-poll.sh b/action/action-poll.sh deleted file mode 100755 index 8d67c47..0000000 --- a/action/action-poll.sh +++ /dev/null @@ -1,75 +0,0 @@ -#!/usr/bin/env bash -# action-poll.sh — Cron scheduler: find open 'action' issues, spawn action-agent -# -# An issue is ready for action if: -# - It is open and labeled 'action' -# - No tmux session named action-{project}-{issue_num} is already active -# -# Usage: -# cron every 10min -# action-poll.sh [projects/foo.toml] # optional project config - -set -euo pipefail - -export PROJECT_TOML="${1:-}" -source "$(dirname "$0")/../lib/env.sh" -# Use action-bot's own Forgejo identity (#747) -FORGE_TOKEN="${FORGE_ACTION_TOKEN:-${FORGE_TOKEN}}" -# shellcheck source=../lib/guard.sh -source "$(dirname "$0")/../lib/guard.sh" -check_active action - -LOGFILE="${DISINTO_LOG_DIR}/action/action-poll-${PROJECT_NAME:-default}.log" -SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)" - -log() { - printf '[%s] poll: %s\n' "$(date -u '+%Y-%m-%d %H:%M:%S UTC')" "$*" >> "$LOGFILE" -} - -# --- Memory guard --- -memory_guard 2000 - -# --- Find open 'action' issues --- -log "scanning for open action issues" -ACTION_ISSUES=$(curl -sf -H "Authorization: token ${FORGE_TOKEN}" \ - "${FORGE_API}/issues?state=open&labels=action&limit=50&type=issues") || true - -if [ -z "$ACTION_ISSUES" ] || [ "$ACTION_ISSUES" = "null" ]; then - log "no action issues found" - exit 0 -fi - -COUNT=$(printf '%s' "$ACTION_ISSUES" | jq 'length') -if [ "$COUNT" -eq 0 ]; then - log "no action issues found" - exit 0 -fi - -log "found ${COUNT} open action issue(s)" - -# Spawn action-agent for each issue that has no active tmux session. -# Only one agent is spawned per poll to avoid memory pressure; the next -# poll picks up remaining issues. -for i in $(seq 0 $((COUNT - 1))); do - ISSUE_NUM=$(printf '%s' "$ACTION_ISSUES" | jq -r ".[$i].number") - SESSION="action-${PROJECT_NAME}-${ISSUE_NUM}" - - if tmux has-session -t "$SESSION" 2>/dev/null; then - log "issue #${ISSUE_NUM}: session ${SESSION} already active, skipping" - continue - fi - - LOCKFILE="/tmp/action-agent-${ISSUE_NUM}.lock" - if [ -f "$LOCKFILE" ]; then - LOCK_PID=$(cat "$LOCKFILE" 2>/dev/null || echo "") - if [ -n "$LOCK_PID" ] && kill -0 "$LOCK_PID" 2>/dev/null; then - log "issue #${ISSUE_NUM}: agent starting (PID ${LOCK_PID}), skipping" - continue - fi - fi - - log "spawning action-agent for issue #${ISSUE_NUM}" - nohup "${SCRIPT_DIR}/action-agent.sh" "$ISSUE_NUM" "$PROJECT_TOML" >> "$LOGFILE" 2>&1 & - log "started action-agent PID $! for issue #${ISSUE_NUM}" - break -done diff --git a/bin/disinto b/bin/disinto index f58ebfb..7a30cc4 100755 --- a/bin/disinto +++ b/bin/disinto @@ -695,13 +695,12 @@ setup_forge() { [vault-bot]="FORGE_VAULT_TOKEN" [supervisor-bot]="FORGE_SUPERVISOR_TOKEN" [predictor-bot]="FORGE_PREDICTOR_TOKEN" - [action-bot]="FORGE_ACTION_TOKEN" ) local env_file="${FACTORY_ROOT}/.env" local bot_user bot_pass token token_var - for bot_user in dev-bot review-bot planner-bot gardener-bot vault-bot supervisor-bot predictor-bot action-bot; do + for bot_user in dev-bot review-bot planner-bot gardener-bot vault-bot supervisor-bot predictor-bot; do bot_pass="bot-$(head -c 16 /dev/urandom | base64 | tr -dc 'a-zA-Z0-9' | head -c 20)" token_var="${bot_token_vars[$bot_user]}" @@ -812,7 +811,7 @@ setup_forge() { fi # Add all bot users as collaborators - for bot_user in dev-bot review-bot planner-bot gardener-bot vault-bot supervisor-bot predictor-bot action-bot; do + for bot_user in dev-bot review-bot planner-bot gardener-bot vault-bot supervisor-bot predictor-bot; do curl -sf -X PUT \ -H "Authorization: token ${admin_token:-${FORGE_TOKEN}}" \ -H "Content-Type: application/json" \ @@ -860,7 +859,7 @@ setup_ops_repo() { # Add all bot users as collaborators local bot_user - for bot_user in dev-bot review-bot planner-bot gardener-bot vault-bot supervisor-bot predictor-bot action-bot; do + for bot_user in dev-bot review-bot planner-bot gardener-bot vault-bot supervisor-bot predictor-bot; do curl -sf -X PUT \ -H "Authorization: token ${admin_token:-${FORGE_TOKEN}}" \ -H "Content-Type: application/json" \ diff --git a/docs/AGENT-DESIGN.md b/docs/AGENT-DESIGN.md index 107affa..7af8a38 100644 --- a/docs/AGENT-DESIGN.md +++ b/docs/AGENT-DESIGN.md @@ -114,4 +114,3 @@ When reviewing PRs or designing new agents, ask: | gardener | 1242 (agent 471 + poll 771) | Medium — backlog triage, duplicate detection, tech-debt scoring | Poll is heavy orchestration; agent is prompt-driven | | vault | 442 (4 scripts) | Medium — approval flow, human gate decisions | Intentionally bash-heavy (security gate should be deterministic) | | planner | 382 | Medium — AGENTS.md update, gap analysis | Tmux+formula (done, #232) | -| action-agent | 192 | Light — formula execution | Close to target | diff --git a/docs/PHASE-PROTOCOL.md b/docs/PHASE-PROTOCOL.md index 40d1661..73c9a5f 100644 --- a/docs/PHASE-PROTOCOL.md +++ b/docs/PHASE-PROTOCOL.md @@ -117,7 +117,7 @@ signal to the phase file. - **Post-loop exit handler (`case $_MONITOR_LOOP_EXIT`):** Must include an `idle_prompt)` branch. Typical actions: log the event, clean up temp files, and (for agents that use escalation) write an escalation entry or notify via - vault/forge. See `dev/dev-agent.sh`, `action/action-agent.sh`, and + vault/forge. See `dev/dev-agent.sh` and `gardener/gardener-agent.sh` for reference implementations. ## Crash Recovery diff --git a/lib/AGENTS.md b/lib/AGENTS.md index 7bfc736..cb558bc 100644 --- a/lib/AGENTS.md +++ b/lib/AGENTS.md @@ -6,19 +6,19 @@ sourced as needed. | File | What it provides | Sourced by | |---|---|---| -| `lib/env.sh` | Loads `.env`, sets `FACTORY_ROOT`, exports project config (`FORGE_REPO`, `PROJECT_NAME`, etc.), defines `log()`, `forge_api()`, `forge_api_all()` (accepts optional second TOKEN parameter, defaults to `$FORGE_TOKEN`), `woodpecker_api()`, `wpdb()`, `memory_guard()` (skips agent if RAM < threshold). Auto-loads project TOML if `PROJECT_TOML` is set. Exports per-agent tokens (`FORGE_PLANNER_TOKEN`, `FORGE_GARDENER_TOKEN`, `FORGE_VAULT_TOKEN`, `FORGE_SUPERVISOR_TOKEN`, `FORGE_PREDICTOR_TOKEN`, `FORGE_ACTION_TOKEN`) — each falls back to `$FORGE_TOKEN` if not set. **Vault-only token guard (AD-006)**: `unset GITHUB_TOKEN CLAWHUB_TOKEN` so agents never hold external-action tokens — only the runner container receives them. **Container note**: when `DISINTO_CONTAINER=1`, `.env` is NOT re-sourced — compose already injects env vars (including `FORGE_URL=http://forgejo:3000`) and re-sourcing would clobber them. | Every agent | +| `lib/env.sh` | Loads `.env`, sets `FACTORY_ROOT`, exports project config (`FORGE_REPO`, `PROJECT_NAME`, etc.), defines `log()`, `forge_api()`, `forge_api_all()` (accepts optional second TOKEN parameter, defaults to `$FORGE_TOKEN`), `woodpecker_api()`, `wpdb()`, `memory_guard()` (skips agent if RAM < threshold). Auto-loads project TOML if `PROJECT_TOML` is set. Exports per-agent tokens (`FORGE_PLANNER_TOKEN`, `FORGE_GARDENER_TOKEN`, `FORGE_VAULT_TOKEN`, `FORGE_SUPERVISOR_TOKEN`, `FORGE_PREDICTOR_TOKEN`) — each falls back to `$FORGE_TOKEN` if not set. **Vault-only token guard (AD-006)**: `unset GITHUB_TOKEN CLAWHUB_TOKEN` so agents never hold external-action tokens — only the runner container receives them. **Container note**: when `DISINTO_CONTAINER=1`, `.env` is NOT re-sourced — compose already injects env vars (including `FORGE_URL=http://forgejo:3000`) and re-sourcing would clobber them. | Every agent | | `lib/ci-helpers.sh` | `ci_passed()` — returns 0 if CI state is "success" (or no CI configured). `ci_required_for_pr()` — returns 0 if PR has code files (CI required), 1 if non-code only (CI not required). `is_infra_step()` — returns 0 if a single CI step failure matches infra heuristics (clone/git exit 128, any exit 137, log timeout patterns). `classify_pipeline_failure()` — returns "infra \" if any failed Woodpecker step matches infra heuristics via `is_infra_step()`, else "code". `ensure_priority_label()` — looks up (or creates) the `priority` label and returns its ID; caches in `_PRIORITY_LABEL_ID`. `ci_commit_status ` — queries Woodpecker directly for CI state, falls back to forge commit status API. `ci_pipeline_number ` — returns the Woodpecker pipeline number for a commit, falls back to parsing forge status `target_url`. `ci_promote ` — promotes a pipeline to a named Woodpecker environment (vault-gated deployment: vault approves, vault-fire calls this). | dev-poll, review-poll, review-pr, supervisor-poll | | `lib/ci-debug.sh` | CLI tool for Woodpecker CI: `list`, `status`, `logs`, `failures` subcommands. Not sourced — run directly. | Humans / dev-agent (tool access) | | `lib/load-project.sh` | Parses a `projects/*.toml` file into env vars (`PROJECT_NAME`, `FORGE_REPO`, `WOODPECKER_REPO_ID`, monitoring toggles, mirror config, etc.). | env.sh (when `PROJECT_TOML` is set), supervisor-poll (per-project iteration) | | `lib/parse-deps.sh` | Extracts dependency issue numbers from an issue body (stdin → stdout, one number per line). Matches `## Dependencies` / `## Depends on` / `## Blocked by` sections and inline `depends on #N` / `blocked by #N` patterns. Inline scan skips fenced code blocks to prevent false positives from code examples in issue bodies. Not sourced — executed via `bash lib/parse-deps.sh`. | dev-poll, supervisor-poll | -| `lib/formula-session.sh` | `acquire_cron_lock()`, `check_memory()`, `load_formula()`, `build_context_block()`, `consume_escalation_reply()`, `start_formula_session()`, `formula_phase_callback()`, `build_prompt_footer()`, `build_graph_section()`, `run_formula_and_monitor(AGENT [TIMEOUT] [CALLBACK])` — shared helpers for formula-driven cron agents (lock, memory guard, formula loading, prompt assembly, tmux session, monitor loop, crash recovery). `build_graph_section()` generates the structural-analysis section (runs `lib/build-graph.py`, formats JSON output) — previously duplicated in planner-run.sh and predictor-run.sh, now shared here. `formula_phase_callback()` handles `PHASE:escalate` (unified escalation path — kills the session). `run_formula_and_monitor` accepts an optional CALLBACK (default: `formula_phase_callback`) so callers can install custom merge-through or escalation handlers. `cleanup_stale_crashed_worktrees()` — thin wrapper around `worktree_cleanup_stale()` from `lib/worktree.sh` (kept for backwards compatibility). | planner-run.sh, predictor-run.sh, gardener-run.sh, supervisor-run.sh, dev-agent.sh, action-agent.sh | -| `lib/guard.sh` | `check_active(agent_name)` — reads `$FACTORY_ROOT/state/.{agent_name}-active`; exits 0 (skip) if the file is absent. Factory is off by default — state files must be created to enable each agent. **Logs a message to stderr** when skipping (`[check_active] SKIP: state file not found`), so agent dropout is visible in cron logs. Sourced by dev-poll.sh, review-poll.sh, action-poll.sh, predictor-run.sh, supervisor-run.sh. | cron entry points | +| `lib/formula-session.sh` | `acquire_cron_lock()`, `check_memory()`, `load_formula()`, `build_context_block()`, `consume_escalation_reply()`, `start_formula_session()`, `formula_phase_callback()`, `build_prompt_footer()`, `build_graph_section()`, `run_formula_and_monitor(AGENT [TIMEOUT] [CALLBACK])` — shared helpers for formula-driven cron agents (lock, memory guard, formula loading, prompt assembly, tmux session, monitor loop, crash recovery). `build_graph_section()` generates the structural-analysis section (runs `lib/build-graph.py`, formats JSON output) — previously duplicated in planner-run.sh and predictor-run.sh, now shared here. `formula_phase_callback()` handles `PHASE:escalate` (unified escalation path — kills the session). `run_formula_and_monitor` accepts an optional CALLBACK (default: `formula_phase_callback`) so callers can install custom merge-through or escalation handlers. `cleanup_stale_crashed_worktrees()` — thin wrapper around `worktree_cleanup_stale()` from `lib/worktree.sh` (kept for backwards compatibility). | planner-run.sh, predictor-run.sh, gardener-run.sh, supervisor-run.sh, dev-agent.sh | +| `lib/guard.sh` | `check_active(agent_name)` — reads `$FACTORY_ROOT/state/.{agent_name}-active`; exits 0 (skip) if the file is absent. Factory is off by default — state files must be created to enable each agent. **Logs a message to stderr** when skipping (`[check_active] SKIP: state file not found`), so agent dropout is visible in cron logs. Sourced by dev-poll.sh, review-poll.sh, predictor-run.sh, supervisor-run.sh. | cron entry points | | `lib/mirrors.sh` | `mirror_push()` — pushes `$PRIMARY_BRANCH` + tags to all configured mirror remotes (fire-and-forget background pushes). Reads `MIRROR_NAMES` and `MIRROR_*` vars exported by `load-project.sh` from the `[mirrors]` TOML section. Failures are logged but never block the pipeline. Sourced by dev-poll.sh and dev/phase-handler.sh — called after every successful merge. | dev-poll.sh, phase-handler.sh | | `lib/build-graph.py` | Python tool: parses VISION.md, prerequisites.md (from ops repo), AGENTS.md, formulas/*.toml, evidence/ (from ops repo), and forge issues/labels into a NetworkX DiGraph. Runs structural analyses (orphaned objectives, stale prerequisites, thin evidence, circular deps) and outputs a JSON report. Used by `review-pr.sh` (per-PR changed-file analysis) and `predictor-run.sh` (full-project analysis) to provide structural context to Claude. | review-pr.sh, predictor-run.sh | | `lib/secret-scan.sh` | `scan_for_secrets()` — detects potential secrets (API keys, bearer tokens, private keys, URLs with embedded credentials) in text; returns 1 if secrets found. `redact_secrets()` — replaces detected secret patterns with `[REDACTED]`. | file-action-issue.sh, phase-handler.sh | | `lib/file-action-issue.sh` | `file_action_issue()` — dedup check, secret scan, label lookup, and issue creation for formula-driven cron wrappers. Sets `FILED_ISSUE_NUM` on success. Returns 4 if secrets detected in body. | (available for future use) | | `lib/tea-helpers.sh` | `tea_file_issue(title, body, labels...)` — create issue via tea CLI with secret scanning; sets `FILED_ISSUE_NUM`. `tea_relabel(issue_num, labels...)` — replace labels using tea's `edit` subcommand (not `label`). `tea_comment(issue_num, body)` — add comment with secret scanning. `tea_close(issue_num)` — close issue. All use `TEA_LOGIN` and `FORGE_REPO` from env.sh. Labels by name (no ID lookup). Tea binary download verified via sha256 checksum. Sourced by env.sh when `tea` binary is available. | env.sh (conditional) | -| `lib/worktree.sh` | Reusable git worktree management: `worktree_create(path, branch, [base_ref])` — create worktree, checkout base, fetch submodules. `worktree_recover(path, branch, [remote])` — detect existing worktree, reuse if on correct branch (sets `_WORKTREE_REUSED`), otherwise clean and recreate. `worktree_cleanup(path)` — `git worktree remove --force`, clear Claude Code project cache (`~/.claude/projects/` matching path). `worktree_cleanup_stale([max_age_hours])` — scan `/tmp` for orphaned worktrees older than threshold, skip preserved and active tmux worktrees, prune. `worktree_preserve(path, reason)` — mark worktree as preserved for debugging (writes `.worktree-preserved` marker, skipped by stale cleanup). | dev-agent.sh, action-agent.sh, supervisor-run.sh, planner-run.sh, predictor-run.sh, gardener-run.sh | -| `lib/pr-lifecycle.sh` | Reusable PR lifecycle library: `pr_create()`, `pr_find_by_branch()`, `pr_poll_ci()`, `pr_poll_review()`, `pr_merge()`, `pr_is_merged()`, `pr_walk_to_merge()`, `build_phase_protocol_prompt()`. Requires `lib/ci-helpers.sh`. | dev-agent.sh (future), action-agent.sh (future) | -| `lib/issue-lifecycle.sh` | Reusable issue lifecycle library: `issue_claim()` (add in-progress, remove backlog), `issue_release()` (remove in-progress, add backlog), `issue_block()` (post diagnostic comment with secret redaction, add blocked label), `issue_close()`, `issue_check_deps()` (parse deps, check transitive closure; sets `_ISSUE_BLOCKED_BY`, `_ISSUE_SUGGESTION`), `issue_suggest_next()` (find next unblocked backlog issue; sets `_ISSUE_NEXT`), `issue_post_refusal()` (structured refusal comment with dedup). Label IDs cached in globals on first lookup. Sources `lib/secret-scan.sh`. | dev-agent.sh (future), action-agent.sh (future) | -| `lib/agent-session.sh` | Shared tmux + Claude session helpers: `create_agent_session()`, `inject_formula()`, `agent_wait_for_claude_ready()`, `agent_inject_into_session()`, `agent_kill_session()`, `monitor_phase_loop()`, `read_phase()`, `write_compact_context()`. `create_agent_session(session, workdir, [phase_file])` optionally installs a PostToolUse hook (matcher `Bash\|Write`) that detects phase file writes in real-time — when Claude writes to the phase file, the hook writes a marker so `monitor_phase_loop` reacts on the next poll instead of waiting for mtime changes. Also installs a StopFailure hook (matcher `rate_limit\|server_error\|authentication_failed\|billing_error`) that writes `PHASE:failed` with an `api_error` reason to the phase file and touches the phase-changed marker, so the orchestrator discovers API errors within one poll cycle instead of waiting for idle timeout. Also installs a SessionStart hook (matcher `compact`) that re-injects phase protocol instructions after context compaction — callers write the context file via `write_compact_context(phase_file, content)`, and the hook (`on-compact-reinject.sh`) outputs the file content to stdout so Claude retains critical instructions. When `phase_file` is set, passes it to the idle stop hook (`on-idle-stop.sh`) so the hook can **nudge Claude** (up to 2 times) if Claude returns to the prompt without writing to the phase file — the hook injects a tmux reminder asking Claude to signal PHASE:done or PHASE:awaiting_ci. The PreToolUse guard hook (`on-pretooluse-guard.sh`) receives the session name as a third argument — formula agents (`gardener-*`, `planner-*`, `predictor-*`, `supervisor-*`) are identified this way and allowed to access `FACTORY_ROOT` from worktrees (they need env.sh, AGENTS.md, formulas/, lib/). **OAuth flock**: when `DISINTO_CONTAINER=1`, Claude CLI is wrapped in `flock -w 300 ~/.claude/session.lock` to queue concurrent token refresh attempts and prevent rotation races across agents sharing the same credentials. `monitor_phase_loop` sets `_MONITOR_LOOP_EXIT` to one of: `done`, `idle_timeout`, `idle_prompt` (Claude returned to `>` for 3 consecutive polls without writing any phase — callback invoked with `PHASE:failed`, session already dead), `crashed`, or `PHASE:escalate` / other `PHASE:*` string. **Unified escalation**: `PHASE:escalate` is the signal that a session needs human input (renamed from `PHASE:needs_human`). **Callers must handle `idle_prompt`** in both their callback and their post-loop exit handler — see [`docs/PHASE-PROTOCOL.md` idle_prompt](docs/PHASE-PROTOCOL.md#idle_prompt-exit-reason) for the full contract. | dev-agent.sh, action-agent.sh | +| `lib/worktree.sh` | Reusable git worktree management: `worktree_create(path, branch, [base_ref])` — create worktree, checkout base, fetch submodules. `worktree_recover(path, branch, [remote])` — detect existing worktree, reuse if on correct branch (sets `_WORKTREE_REUSED`), otherwise clean and recreate. `worktree_cleanup(path)` — `git worktree remove --force`, clear Claude Code project cache (`~/.claude/projects/` matching path). `worktree_cleanup_stale([max_age_hours])` — scan `/tmp` for orphaned worktrees older than threshold, skip preserved and active tmux worktrees, prune. `worktree_preserve(path, reason)` — mark worktree as preserved for debugging (writes `.worktree-preserved` marker, skipped by stale cleanup). | dev-agent.sh, supervisor-run.sh, planner-run.sh, predictor-run.sh, gardener-run.sh | +| `lib/pr-lifecycle.sh` | Reusable PR lifecycle library: `pr_create()`, `pr_find_by_branch()`, `pr_poll_ci()`, `pr_poll_review()`, `pr_merge()`, `pr_is_merged()`, `pr_walk_to_merge()`, `build_phase_protocol_prompt()`. Requires `lib/ci-helpers.sh`. | dev-agent.sh (future) | +| `lib/issue-lifecycle.sh` | Reusable issue lifecycle library: `issue_claim()` (add in-progress, remove backlog), `issue_release()` (remove in-progress, add backlog), `issue_block()` (post diagnostic comment with secret redaction, add blocked label), `issue_close()`, `issue_check_deps()` (parse deps, check transitive closure; sets `_ISSUE_BLOCKED_BY`, `_ISSUE_SUGGESTION`), `issue_suggest_next()` (find next unblocked backlog issue; sets `_ISSUE_NEXT`), `issue_post_refusal()` (structured refusal comment with dedup). Label IDs cached in globals on first lookup. Sources `lib/secret-scan.sh`. | dev-agent.sh (future) | +| `lib/agent-session.sh` | Shared tmux + Claude session helpers: `create_agent_session()`, `inject_formula()`, `agent_wait_for_claude_ready()`, `agent_inject_into_session()`, `agent_kill_session()`, `monitor_phase_loop()`, `read_phase()`, `write_compact_context()`. `create_agent_session(session, workdir, [phase_file])` optionally installs a PostToolUse hook (matcher `Bash\|Write`) that detects phase file writes in real-time — when Claude writes to the phase file, the hook writes a marker so `monitor_phase_loop` reacts on the next poll instead of waiting for mtime changes. Also installs a StopFailure hook (matcher `rate_limit\|server_error\|authentication_failed\|billing_error`) that writes `PHASE:failed` with an `api_error` reason to the phase file and touches the phase-changed marker, so the orchestrator discovers API errors within one poll cycle instead of waiting for idle timeout. Also installs a SessionStart hook (matcher `compact`) that re-injects phase protocol instructions after context compaction — callers write the context file via `write_compact_context(phase_file, content)`, and the hook (`on-compact-reinject.sh`) outputs the file content to stdout so Claude retains critical instructions. When `phase_file` is set, passes it to the idle stop hook (`on-idle-stop.sh`) so the hook can **nudge Claude** (up to 2 times) if Claude returns to the prompt without writing to the phase file — the hook injects a tmux reminder asking Claude to signal PHASE:done or PHASE:awaiting_ci. The PreToolUse guard hook (`on-pretooluse-guard.sh`) receives the session name as a third argument — formula agents (`gardener-*`, `planner-*`, `predictor-*`, `supervisor-*`) are identified this way and allowed to access `FACTORY_ROOT` from worktrees (they need env.sh, AGENTS.md, formulas/, lib/). **OAuth flock**: when `DISINTO_CONTAINER=1`, Claude CLI is wrapped in `flock -w 300 ~/.claude/session.lock` to queue concurrent token refresh attempts and prevent rotation races across agents sharing the same credentials. `monitor_phase_loop` sets `_MONITOR_LOOP_EXIT` to one of: `done`, `idle_timeout`, `idle_prompt` (Claude returned to `>` for 3 consecutive polls without writing any phase — callback invoked with `PHASE:failed`, session already dead), `crashed`, or `PHASE:escalate` / other `PHASE:*` string. **Unified escalation**: `PHASE:escalate` is the signal that a session needs human input (renamed from `PHASE:needs_human`). **Callers must handle `idle_prompt`** in both their callback and their post-loop exit handler — see [`docs/PHASE-PROTOCOL.md` idle_prompt](docs/PHASE-PROTOCOL.md#idle_prompt-exit-reason) for the full contract. | dev-agent.sh | diff --git a/lib/env.sh b/lib/env.sh index fb479ec..a2c98a9 100755 --- a/lib/env.sh +++ b/lib/env.sh @@ -95,10 +95,9 @@ export FORGE_GARDENER_TOKEN="${FORGE_GARDENER_TOKEN:-${FORGE_TOKEN}}" export FORGE_VAULT_TOKEN="${FORGE_VAULT_TOKEN:-${FORGE_TOKEN}}" export FORGE_SUPERVISOR_TOKEN="${FORGE_SUPERVISOR_TOKEN:-${FORGE_TOKEN}}" export FORGE_PREDICTOR_TOKEN="${FORGE_PREDICTOR_TOKEN:-${FORGE_TOKEN}}" -export FORGE_ACTION_TOKEN="${FORGE_ACTION_TOKEN:-${FORGE_TOKEN}}" # Bot usernames filter: FORGE_BOT_USERNAMES > legacy CODEBERG_BOT_USERNAMES -export FORGE_BOT_USERNAMES="${FORGE_BOT_USERNAMES:-${CODEBERG_BOT_USERNAMES:-dev-bot,review-bot,planner-bot,gardener-bot,vault-bot,supervisor-bot,predictor-bot,action-bot}}" +export FORGE_BOT_USERNAMES="${FORGE_BOT_USERNAMES:-${CODEBERG_BOT_USERNAMES:-dev-bot,review-bot,planner-bot,gardener-bot,vault-bot,supervisor-bot,predictor-bot}}" export CODEBERG_BOT_USERNAMES="${FORGE_BOT_USERNAMES}" # backwards compat # Project config (FORGE_* preferred, CODEBERG_* fallback) diff --git a/planner/AGENTS.md b/planner/AGENTS.md index 9749afd..4f53f9f 100644 --- a/planner/AGENTS.md +++ b/planner/AGENTS.md @@ -23,7 +23,7 @@ need human decisions or external resources are filed as vault procurement items (`$OPS_REPO_ROOT/vault/pending/*.md`) instead of being escalated. Phase 3 (file-at-constraints): identify the top 3 unresolved prerequisites that block the most downstream objectives — file issues as either `backlog` (code changes, -dev-agent) or `action` (run existing formula, action-agent). **Stuck issues +dev-agent) or `action` (run existing formula, dispatcher). **Stuck issues (detected BOUNCED/LABEL_CHURN) are dispatched to the `groom-backlog` formula in breakdown mode instead of being re-promoted** — this breaks the ping-pong loop by splitting them into dev-agent-sized sub-issues. **Human-blocked issues -- 2.49.1 From d9a60301275af3d702873a71fcdddd00212ec1bc Mon Sep 17 00:00:00 2001 From: Agent Date: Tue, 31 Mar 2026 19:55:00 +0000 Subject: [PATCH 2/2] fix: remove remaining action-agent references from docs and configs - Remove action-agent card from site/docs/architecture.html - Remove action/ directory line from architecture.html - Update formula comments to reference dispatcher instead of action-agent - Remove action/action.log from log scan loops in preflight.sh and collect-metrics.sh - Remove action from find command in agent-smoke.sh --- .woodpecker/agent-smoke.sh | 2 +- formulas/run-publish-site.toml | 2 +- formulas/run-rent-a-human.toml | 2 +- site/collect-metrics.sh | 2 +- site/docs/architecture.html | 6 ------ supervisor/preflight.sh | 3 +-- 6 files changed, 5 insertions(+), 12 deletions(-) diff --git a/.woodpecker/agent-smoke.sh b/.woodpecker/agent-smoke.sh index eddfe87..6d1d76b 100644 --- a/.woodpecker/agent-smoke.sh +++ b/.woodpecker/agent-smoke.sh @@ -84,7 +84,7 @@ while IFS= read -r -d '' f; do printf 'FAIL [syntax] %s\n' "$f" FAILED=1 fi -done < <(find dev gardener review planner supervisor lib vault action -name "*.sh" -print0 2>/dev/null) +done < <(find dev gardener review planner supervisor lib vault -name "*.sh" -print0 2>/dev/null) echo "syntax check done" # ── 2. Function-resolution check ───────────────────────────────────────────── diff --git a/formulas/run-publish-site.toml b/formulas/run-publish-site.toml index 2de4455..9a7c1e7 100644 --- a/formulas/run-publish-site.toml +++ b/formulas/run-publish-site.toml @@ -3,7 +3,7 @@ # Trigger: action issue created by planner (gap analysis), dev-poll (post-merge # hook detecting site/ changes), or gardener (periodic SHA drift check). # -# The action-agent picks up the issue, executes these steps, posts results +# The dispatcher picks up the issue, executes these steps, posts results # as a comment, and closes the issue. name = "run-publish-site" diff --git a/formulas/run-rent-a-human.toml b/formulas/run-rent-a-human.toml index 9009418..41b8f1f 100644 --- a/formulas/run-rent-a-human.toml +++ b/formulas/run-rent-a-human.toml @@ -5,7 +5,7 @@ # the action and notifies the human for one-click copy-paste execution. # # Trigger: action issue created by planner or any formula. -# The action-agent picks up the issue, executes these steps, writes a draft +# The dispatcher picks up the issue, executes these steps, writes a draft # to vault/outreach/{platform}/drafts/, notifies the human via the forge, # and closes the issue. # diff --git a/site/collect-metrics.sh b/site/collect-metrics.sh index a52bbcc..31e2ea6 100644 --- a/site/collect-metrics.sh +++ b/site/collect-metrics.sh @@ -188,7 +188,7 @@ collect_agent_metrics() { local agent_name log_path age_min last_active for log_entry in dev/dev-agent.log review/review.log gardener/gardener.log \ planner/planner.log predictor/predictor.log supervisor/supervisor.log \ - action/action.log vault/vault.log; do + vault/vault.log; do agent_name=$(basename "$(dirname "$log_entry")") log_path="${FACTORY_ROOT}/${log_entry}" if [ -f "$log_path" ]; then diff --git a/site/docs/architecture.html b/site/docs/architecture.html index 2bce787..c35edf3 100644 --- a/site/docs/architecture.html +++ b/site/docs/architecture.html @@ -397,11 +397,6 @@
Detects infrastructure patterns — recurring failures, resource trends, emerging issues. Files predictions for triage.
Cron: daily
-
-
action-agent
-
Executes operational tasks defined as formulas — site deployments, data migrations, any multi-step procedure.
-
Cron: every 5 min
-
vault
Safety gate. Reviews dangerous actions before they execute. Auto-approves safe operations, escalates risky ones to a human.
@@ -525,7 +520,6 @@ disinto/ ├── planner/ planner-run.sh (weekly cron executor) ├── supervisor/ supervisor-run.sh (health monitoring) ├── vault/ vault-poll.sh, vault-agent.sh, vault-fire.sh -├── action/ action-poll.sh, action-agent.sh ├── lib/ env.sh, agent-session.sh, ci-helpers.sh ├── projects/ *.toml per-project config ├── formulas/ TOML specs for multi-step agent tasks diff --git a/supervisor/preflight.sh b/supervisor/preflight.sh index ba740b7..e9e4de2 100755 --- a/supervisor/preflight.sh +++ b/supervisor/preflight.sh @@ -132,8 +132,7 @@ echo "" echo "## Recent Agent Logs" for _log in supervisor/supervisor.log dev/dev-agent.log review/review.log \ - gardener/gardener.log planner/planner.log predictor/predictor.log \ - action/action.log; do + gardener/gardener.log planner/planner.log predictor/predictor.log; do _logpath="${FACTORY_ROOT}/${_log}" if [ -f "$_logpath" ]; then _log_age_min=$(( ($(date +%s) - $(stat -c %Y "$_logpath" 2>/dev/null || echo 0)) / 60 )) -- 2.49.1