fix: chore(26a): delete action-agent.sh, action-poll.sh, and action/AGENTS.md (#65) #72
13 changed files with 19 additions and 459 deletions
|
|
@ -26,8 +26,7 @@ FORGE_GARDENER_TOKEN= # [SECRET] gardener-bot API token
|
|||
FORGE_VAULT_TOKEN= # [SECRET] vault-bot API token
|
||||
FORGE_SUPERVISOR_TOKEN= # [SECRET] supervisor-bot API token
|
||||
FORGE_PREDICTOR_TOKEN= # [SECRET] predictor-bot API token
|
||||
FORGE_ACTION_TOKEN= # [SECRET] action-bot API token
|
||||
FORGE_BOT_USERNAMES=dev-bot,review-bot,planner-bot,gardener-bot,vault-bot,supervisor-bot,predictor-bot,action-bot
|
||||
FORGE_BOT_USERNAMES=dev-bot,review-bot,planner-bot,gardener-bot,vault-bot,supervisor-bot,predictor-bot
|
||||
|
||||
# ── Backwards compatibility ───────────────────────────────────────────────
|
||||
# If CODEBERG_TOKEN is set but FORGE_TOKEN is not, env.sh falls back to
|
||||
|
|
|
|||
|
|
@ -214,8 +214,6 @@ check_script vault/vault-agent.sh
|
|||
check_script vault/vault-fire.sh
|
||||
check_script vault/vault-poll.sh
|
||||
check_script vault/vault-reject.sh
|
||||
check_script action/action-poll.sh
|
||||
check_script action/action-agent.sh
|
||||
check_script supervisor/supervisor-run.sh
|
||||
check_script supervisor/preflight.sh
|
||||
check_script predictor/predictor-run.sh
|
||||
|
|
|
|||
10
AGENTS.md
10
AGENTS.md
|
|
@ -3,8 +3,8 @@
|
|||
|
||||
## What this repo is
|
||||
|
||||
Disinto is an autonomous code factory. It manages eight agents (dev, review,
|
||||
gardener, supervisor, planner, predictor, action, vault) that pick up issues from forge,
|
||||
Disinto is an autonomous code factory. It manages seven agents (dev, review,
|
||||
gardener, supervisor, planner, predictor, vault) that pick up issues from forge,
|
||||
implement them, review PRs, plan from the vision, gate dangerous actions, and
|
||||
keep the system healthy — all via cron and `claude -p`.
|
||||
|
||||
|
|
@ -23,7 +23,6 @@ disinto/ (code repo)
|
|||
│ preflight.sh — pre-flight data collection for supervisor formula
|
||||
│ supervisor-poll.sh — legacy bash orchestrator (superseded)
|
||||
├── vault/ vault-poll.sh, vault-agent.sh, vault-fire.sh — action gating + procurement
|
||||
├── action/ action-poll.sh, action-agent.sh — operational task execution
|
||||
├── lib/ env.sh, agent-session.sh, ci-helpers.sh, ci-debug.sh, load-project.sh, parse-deps.sh, guard.sh, mirrors.sh, pr-lifecycle.sh, issue-lifecycle.sh, worktree.sh, build-graph.py
|
||||
├── projects/ *.toml.example — templates; *.toml — local per-box config (gitignored)
|
||||
├── formulas/ Issue templates (TOML specs for multi-step agent tasks)
|
||||
|
|
@ -90,7 +89,6 @@ bash dev/phase-test.sh
|
|||
| Supervisor | `supervisor/` | Health monitoring | [supervisor/AGENTS.md](supervisor/AGENTS.md) |
|
||||
| Planner | `planner/` | Strategic planning | [planner/AGENTS.md](planner/AGENTS.md) |
|
||||
| Predictor | `predictor/` | Infrastructure pattern detection | [predictor/AGENTS.md](predictor/AGENTS.md) |
|
||||
| Action | `action/` | Operational task execution | [action/AGENTS.md](action/AGENTS.md) |
|
||||
| Vault | `vault/` | Action gating + resource procurement | [vault/AGENTS.md](vault/AGENTS.md) |
|
||||
|
||||
See [lib/AGENTS.md](lib/AGENTS.md) for the full shared helper reference.
|
||||
|
|
@ -108,14 +106,14 @@ Issues flow: `backlog` → `in-progress` → PR → CI → review → merge →
|
|||
| `backlog` | Issue is queued for implementation. Dev-poll picks the first ready one. | Planner, gardener, humans |
|
||||
| `priority` | Queue tier above plain backlog. Issues with both `priority` and `backlog` are picked before plain `backlog` issues. FIFO within each tier. | Planner, humans |
|
||||
| `in-progress` | Dev-agent is actively working on this issue. Only one issue per project is in-progress at a time. | dev-agent.sh (claims issue) |
|
||||
| `blocked` | Issue is stuck — agent session failed, crashed, timed out, or CI exhausted. Diagnostic comment on the issue has details. Also used for unmet dependencies. | dev-agent.sh, action-agent.sh, dev-poll.sh (on failure) |
|
||||
| `blocked` | Issue is stuck — agent session failed, crashed, timed out, or CI exhausted. Diagnostic comment on the issue has details. Also used for unmet dependencies. | dev-agent.sh, dev-poll.sh (on failure) |
|
||||
| `tech-debt` | Pre-existing issue flagged by AI reviewer, not introduced by a PR. | review-pr.sh (auto-created follow-ups) |
|
||||
| `underspecified` | Dev-agent refused the issue as too large or vague. | dev-poll.sh (on preflight `too_large`), dev-agent.sh (on mid-run `too_large` refusal) |
|
||||
| `vision` | Goal anchors — high-level objectives from VISION.md. | Planner, humans |
|
||||
| `prediction/unreviewed` | Unprocessed prediction filed by predictor. | predictor-run.sh |
|
||||
| `prediction/dismissed` | Prediction triaged as DISMISS — planner disagrees, closed with reason. | Planner (triage-predictions step) |
|
||||
| `prediction/actioned` | Prediction promoted or dismissed by planner. | Planner (triage-predictions step) |
|
||||
| `action` | Operational task for the action-agent to execute via formula. | Planner, humans |
|
||||
| `action` | Operational task for the dispatcher to execute via formula. | Planner, humans |
|
||||
|
||||
### Dependency conventions
|
||||
|
||||
|
|
|
|||
|
|
@ -123,7 +123,7 @@ disinto/
|
|||
│ └── best-practices.md # Gardener knowledge base
|
||||
├── planner/
|
||||
│ ├── planner-poll.sh # Cron entry: weekly vision gap analysis
|
||||
│ └── (formula-driven) # run-planner.toml executed by action-agent
|
||||
│ └── (formula-driven) # run-planner.toml executed by dispatcher
|
||||
├── vault/
|
||||
│ ├── vault-poll.sh # Cron entry: process pending dangerous actions
|
||||
│ ├── vault-agent.sh # Classifies and routes actions (claude -p)
|
||||
|
|
|
|||
|
|
@ -1,34 +0,0 @@
|
|||
<!-- last-reviewed: f32707ba659de278a3af434e3549fb8a8dce9d3a -->
|
||||
# Action Agent
|
||||
|
||||
**Role**: Execute operational tasks described by action formulas — run scripts,
|
||||
call APIs, send messages, collect human approval. Shares the same phase handler
|
||||
as the dev-agent: if an action produces code changes, the orchestrator creates a
|
||||
PR and drives the CI/review loop; otherwise Claude closes the issue directly.
|
||||
|
||||
**Trigger**: `action-poll.sh` runs every 10 min via cron. Sources `lib/guard.sh`
|
||||
and calls `check_active action` first — skips if `$FACTORY_ROOT/state/.action-active`
|
||||
is absent. Then scans for open issues labeled `action` that have no active tmux
|
||||
session, and spawns `action-agent.sh <issue-number>`.
|
||||
|
||||
**Key files**:
|
||||
- `action/action-poll.sh` — Cron scheduler: finds open action issues with no active tmux session, spawns action-agent.sh
|
||||
- `action/action-agent.sh` — Orchestrator: fetches issue body + prior comments, **checks all dependencies via `lib/parse-deps.sh` before spawning** (skips silently if any dep is still open), creates tmux session (`action-{project}-{issue_num}`) with interactive `claude`, injects formula prompt with phase protocol, enters `monitor_phase_loop` (shared via `dev/phase-handler.sh`) for CI/review lifecycle or direct completion
|
||||
|
||||
**Session lifecycle**:
|
||||
1. `action-poll.sh` finds open `action` issues with no active tmux session.
|
||||
2. Spawns `action-agent.sh <issue_num>`.
|
||||
3. Agent creates tmux session `action-{project}-{issue_num}`, injects prompt (formula + prior comments + phase protocol).
|
||||
4. Agent enters `monitor_phase_loop` (shared with dev-agent via `dev/phase-handler.sh`).
|
||||
5. **Path A (git output):** Claude pushes branch → `PHASE:awaiting_ci` → handler creates PR, polls CI → injects failures → Claude fixes → push → re-poll → CI passes → `PHASE:awaiting_review` → handler polls reviews → injects REQUEST_CHANGES → Claude fixes → approved → merge → cleanup.
|
||||
6. **Path B (no git output):** Claude posts results as comment, closes issue → `PHASE:done` → handler cleans up (kill session, docker compose down, remove temp files).
|
||||
7. For human input: Claude writes `PHASE:escalate`; human responds via vault/forge.
|
||||
|
||||
**Crash recovery**: on `PHASE:crashed` or non-zero exit, the worktree is **preserved** (not destroyed) for debugging. Location logged. Supervisor housekeeping removes stale crashed worktrees older than 24h.
|
||||
|
||||
**Environment variables consumed**:
|
||||
- `FORGE_TOKEN`, `FORGE_ACTION_TOKEN` (falls back to FORGE_TOKEN), `FORGE_REPO`, `FORGE_API`, `FORGE_URL`, `PROJECT_NAME`, `FORGE_WEB`
|
||||
- `ACTION_IDLE_TIMEOUT` — Max seconds before killing idle session (default 14400 = 4h)
|
||||
- `ACTION_MAX_LIFETIME` — Max total session wall-clock seconds (default 28800 = 8h); caps session independently of idle timeout
|
||||
|
||||
**FORGE_REMOTE**: `action-agent.sh` auto-detects the git remote for `FORGE_URL` (same logic as dev-agent). Exported as `FORGE_REMOTE`, used for worktree creation and push instructions injected into the Claude prompt.
|
||||
|
|
@ -1,323 +0,0 @@
|
|||
#!/usr/bin/env bash
|
||||
# =============================================================================
|
||||
# action-agent.sh — Synchronous action agent: SDK + shared libraries
|
||||
#
|
||||
# Synchronous bash loop using claude -p (one-shot invocation).
|
||||
# No tmux sessions, no phase files — the bash script IS the state machine.
|
||||
#
|
||||
# Usage: ./action-agent.sh <issue-number> [project.toml]
|
||||
#
|
||||
# Flow:
|
||||
# 1. Preflight: issue_check_deps(), memory guard, concurrency lock
|
||||
# 2. Parse model from YAML front matter in issue body (custom model selection)
|
||||
# 3. Worktree: worktree_create() for action isolation
|
||||
# 4. Load formula from issue body
|
||||
# 5. Build prompt: formula + prior non-bot comments (resume context)
|
||||
# 6. agent_run(worktree, prompt) → Claude executes action, may push
|
||||
# 7. If pushed: pr_walk_to_merge() from lib/pr-lifecycle.sh
|
||||
# 8. Cleanup: worktree_cleanup(), issue_close()
|
||||
#
|
||||
# Action-specific (stays in runner):
|
||||
# - YAML front matter parsing (model selection)
|
||||
# - Bot username filtering for prior comments
|
||||
# - Lifetime watchdog (MAX_LIFETIME=8h wall-clock cap)
|
||||
# - Child process cleanup (docker compose, background jobs)
|
||||
#
|
||||
# From shared libraries:
|
||||
# - Issue lifecycle: lib/issue-lifecycle.sh
|
||||
# - Worktree: lib/worktree.sh
|
||||
# - PR lifecycle: lib/pr-lifecycle.sh
|
||||
# - Agent SDK: lib/agent-sdk.sh
|
||||
#
|
||||
# Log: action/action-poll-{project}.log
|
||||
# =============================================================================
|
||||
set -euo pipefail
|
||||
|
||||
ISSUE="${1:?Usage: action-agent.sh <issue-number> [project.toml]}"
|
||||
export PROJECT_TOML="${2:-${PROJECT_TOML:-}}"
|
||||
|
||||
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
|
||||
FACTORY_ROOT="$(dirname "$SCRIPT_DIR")"
|
||||
|
||||
# shellcheck source=../lib/env.sh
|
||||
source "$FACTORY_ROOT/lib/env.sh"
|
||||
# Use action-bot's own Forgejo identity (#747)
|
||||
FORGE_TOKEN="${FORGE_ACTION_TOKEN:-${FORGE_TOKEN}}"
|
||||
# shellcheck source=../lib/ci-helpers.sh
|
||||
source "$FACTORY_ROOT/lib/ci-helpers.sh"
|
||||
# shellcheck source=../lib/worktree.sh
|
||||
source "$FACTORY_ROOT/lib/worktree.sh"
|
||||
# shellcheck source=../lib/issue-lifecycle.sh
|
||||
source "$FACTORY_ROOT/lib/issue-lifecycle.sh"
|
||||
# shellcheck source=../lib/agent-sdk.sh
|
||||
source "$FACTORY_ROOT/lib/agent-sdk.sh"
|
||||
# shellcheck source=../lib/pr-lifecycle.sh
|
||||
source "$FACTORY_ROOT/lib/pr-lifecycle.sh"
|
||||
|
||||
BRANCH="action/issue-${ISSUE}"
|
||||
WORKTREE="/tmp/action-${ISSUE}-$(date +%s)"
|
||||
LOCKFILE="/tmp/action-agent-${ISSUE}.lock"
|
||||
LOGFILE="${DISINTO_LOG_DIR}/action/action-poll-${PROJECT_NAME:-default}.log"
|
||||
# shellcheck disable=SC2034 # consumed by agent-sdk.sh
|
||||
SID_FILE="/tmp/action-session-${PROJECT_NAME:-default}-${ISSUE}.sid"
|
||||
MAX_LIFETIME="${ACTION_MAX_LIFETIME:-28800}" # 8h default wall-clock cap
|
||||
SESSION_START_EPOCH=$(date +%s)
|
||||
|
||||
log() {
|
||||
printf '[%s] action#%s %s\n' "$(date -u '+%Y-%m-%d %H:%M:%S UTC')" "$ISSUE" "$*" >> "$LOGFILE"
|
||||
}
|
||||
|
||||
# --- Concurrency lock (per issue) ---
|
||||
if [ -f "$LOCKFILE" ]; then
|
||||
LOCK_PID=$(cat "$LOCKFILE" 2>/dev/null || echo "")
|
||||
if [ -n "$LOCK_PID" ] && kill -0 "$LOCK_PID" 2>/dev/null; then
|
||||
log "SKIP: action-agent already running for #${ISSUE} (PID ${LOCK_PID})"
|
||||
exit 0
|
||||
fi
|
||||
rm -f "$LOCKFILE"
|
||||
fi
|
||||
echo $$ > "$LOCKFILE"
|
||||
|
||||
cleanup() {
|
||||
local exit_code=$?
|
||||
# Kill lifetime watchdog if running
|
||||
if [ -n "${LIFETIME_WATCHDOG_PID:-}" ] && kill -0 "$LIFETIME_WATCHDOG_PID" 2>/dev/null; then
|
||||
kill "$LIFETIME_WATCHDOG_PID" 2>/dev/null || true
|
||||
wait "$LIFETIME_WATCHDOG_PID" 2>/dev/null || true
|
||||
fi
|
||||
rm -f "$LOCKFILE"
|
||||
# Kill any remaining child processes spawned during the run
|
||||
local children
|
||||
children=$(jobs -p 2>/dev/null) || true
|
||||
if [ -n "$children" ]; then
|
||||
# shellcheck disable=SC2086 # intentional word splitting
|
||||
kill $children 2>/dev/null || true
|
||||
# shellcheck disable=SC2086
|
||||
wait $children 2>/dev/null || true
|
||||
fi
|
||||
# Best-effort docker cleanup for containers started during this action
|
||||
(cd "${WORKTREE}" 2>/dev/null && docker compose down 2>/dev/null) || true
|
||||
# Preserve worktree on crash for debugging; clean up on success
|
||||
if [ "$exit_code" -ne 0 ]; then
|
||||
worktree_preserve "$WORKTREE" "crashed (exit=$exit_code)"
|
||||
else
|
||||
worktree_cleanup "$WORKTREE"
|
||||
fi
|
||||
rm -f "$SID_FILE"
|
||||
}
|
||||
trap cleanup EXIT
|
||||
|
||||
# --- Memory guard ---
|
||||
memory_guard 2000
|
||||
|
||||
# --- Fetch issue ---
|
||||
log "fetching issue #${ISSUE}"
|
||||
ISSUE_JSON=$(curl -sf -H "Authorization: token ${FORGE_TOKEN}" \
|
||||
"${FORGE_API}/issues/${ISSUE}") || true
|
||||
|
||||
if [ -z "$ISSUE_JSON" ] || ! printf '%s' "$ISSUE_JSON" | jq -e '.id' >/dev/null 2>&1; then
|
||||
log "ERROR: failed to fetch issue #${ISSUE}"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
ISSUE_TITLE=$(printf '%s' "$ISSUE_JSON" | jq -r '.title')
|
||||
ISSUE_BODY=$(printf '%s' "$ISSUE_JSON" | jq -r '.body // ""')
|
||||
ISSUE_STATE=$(printf '%s' "$ISSUE_JSON" | jq -r '.state')
|
||||
|
||||
if [ "$ISSUE_STATE" != "open" ]; then
|
||||
log "SKIP: issue #${ISSUE} is ${ISSUE_STATE}"
|
||||
exit 0
|
||||
fi
|
||||
|
||||
log "Issue: ${ISSUE_TITLE}"
|
||||
|
||||
# --- Dependency check (shared library) ---
|
||||
if ! issue_check_deps "$ISSUE"; then
|
||||
log "SKIP: issue #${ISSUE} blocked by: ${_ISSUE_BLOCKED_BY[*]}"
|
||||
exit 0
|
||||
fi
|
||||
|
||||
# --- Extract model from YAML front matter (if present) ---
|
||||
YAML_MODEL=$(printf '%s' "$ISSUE_BODY" | \
|
||||
sed -n '/^---$/,/^---$/p' | grep '^model:' | awk '{print $2}' | tr -d '"' || true)
|
||||
if [ -n "$YAML_MODEL" ]; then
|
||||
export CLAUDE_MODEL="$YAML_MODEL"
|
||||
log "model from front matter: ${YAML_MODEL}"
|
||||
fi
|
||||
|
||||
# --- Resolve bot username(s) for comment filtering ---
|
||||
_bot_login=$(curl -sf -H "Authorization: token ${FORGE_TOKEN}" \
|
||||
"${FORGE_API%%/repos*}/user" | jq -r '.login // empty' 2>/dev/null || true)
|
||||
|
||||
# Build list: token owner + any extra names from FORGE_BOT_USERNAMES (comma-separated)
|
||||
_bot_logins="${_bot_login}"
|
||||
if [ -n "${FORGE_BOT_USERNAMES:-}" ]; then
|
||||
_bot_logins="${_bot_logins:+${_bot_logins},}${FORGE_BOT_USERNAMES}"
|
||||
fi
|
||||
|
||||
# --- Fetch existing comments (resume context, excluding bot comments) ---
|
||||
COMMENTS_JSON=$(curl -sf -H "Authorization: token ${FORGE_TOKEN}" \
|
||||
"${FORGE_API}/issues/${ISSUE}/comments?limit=50") || true
|
||||
|
||||
PRIOR_COMMENTS=""
|
||||
if [ -n "$COMMENTS_JSON" ] && [ "$COMMENTS_JSON" != "null" ] && [ "$COMMENTS_JSON" != "[]" ]; then
|
||||
PRIOR_COMMENTS=$(printf '%s' "$COMMENTS_JSON" | \
|
||||
jq -r --arg bots "$_bot_logins" \
|
||||
'($bots | split(",") | map(select(. != ""))) as $bl |
|
||||
.[] | select(.user.login as $u | $bl | index($u) | not) |
|
||||
"[\(.user.login) at \(.created_at[:19])]\n\(.body)\n---"' 2>/dev/null || true)
|
||||
fi
|
||||
|
||||
# --- Determine git remote ---
|
||||
cd "${PROJECT_REPO_ROOT}"
|
||||
_forge_host=$(echo "$FORGE_URL" | sed 's|https\?://||; s|/.*||')
|
||||
FORGE_REMOTE=$(git remote -v | awk -v host="$_forge_host" '$2 ~ host && /\(push\)/ {print $1; exit}')
|
||||
FORGE_REMOTE="${FORGE_REMOTE:-origin}"
|
||||
export FORGE_REMOTE
|
||||
|
||||
# --- Create isolated worktree ---
|
||||
log "creating worktree: ${WORKTREE}"
|
||||
git fetch "${FORGE_REMOTE}" "${PRIMARY_BRANCH}" 2>/dev/null || true
|
||||
if ! worktree_create "$WORKTREE" "$BRANCH"; then
|
||||
log "ERROR: worktree creation failed"
|
||||
exit 1
|
||||
fi
|
||||
log "worktree ready: ${WORKTREE}"
|
||||
|
||||
# --- Build prompt ---
|
||||
PRIOR_SECTION=""
|
||||
if [ -n "$PRIOR_COMMENTS" ]; then
|
||||
PRIOR_SECTION="## Prior comments (resume context)
|
||||
|
||||
${PRIOR_COMMENTS}
|
||||
|
||||
"
|
||||
fi
|
||||
|
||||
GIT_INSTRUCTIONS=$(build_phase_protocol_prompt "$BRANCH" "$FORGE_REMOTE")
|
||||
|
||||
PROMPT="You are an action agent. Your job is to execute the action formula
|
||||
in the issue below.
|
||||
|
||||
## Issue #${ISSUE}: ${ISSUE_TITLE}
|
||||
|
||||
${ISSUE_BODY}
|
||||
|
||||
${PRIOR_SECTION}## Instructions
|
||||
|
||||
1. Read the action formula steps in the issue body carefully.
|
||||
|
||||
2. Execute each step in order using your Bash tool and any other tools available.
|
||||
|
||||
3. Post progress as comments on issue #${ISSUE} after significant steps:
|
||||
curl -sf -X POST \\
|
||||
-H \"Authorization: token \${FORGE_TOKEN}\" \\
|
||||
-H 'Content-Type: application/json' \\
|
||||
\"${FORGE_API}/issues/${ISSUE}/comments\" \\
|
||||
-d \"{\\\"body\\\": \\\"your comment here\\\"}\"
|
||||
|
||||
4. If a step requires human input or approval, post a comment explaining what
|
||||
is needed and stop — the orchestrator will block the issue.
|
||||
|
||||
### Path A: If this action produces code changes (e.g. config updates, baselines):
|
||||
- You are already in an isolated worktree at: ${WORKTREE}
|
||||
- You are on branch: ${BRANCH}
|
||||
- Make your changes, commit, and push: git push ${FORGE_REMOTE} ${BRANCH}
|
||||
- **IMPORTANT:** The worktree is destroyed after completion. Push all
|
||||
results before finishing — unpushed work will be lost.
|
||||
|
||||
### Path B: If this action produces no code changes (investigation, report):
|
||||
- Post results as a comment on issue #${ISSUE}.
|
||||
- **IMPORTANT:** The worktree is destroyed after completion. Copy any
|
||||
files you need to persistent paths before finishing.
|
||||
|
||||
5. Environment variables available in your bash sessions:
|
||||
FORGE_TOKEN, FORGE_API, FORGE_REPO, FORGE_WEB, PROJECT_NAME
|
||||
(all sourced from ${FACTORY_ROOT}/.env)
|
||||
|
||||
### CRITICAL: Never embed secrets in issue bodies, comments, or PR descriptions
|
||||
- NEVER put API keys, tokens, passwords, or private keys in issue text or comments.
|
||||
- Always reference secrets via env var names (e.g. \\\$BASE_RPC_URL, \\\${FORGE_TOKEN}).
|
||||
- If a formula step needs a secret, read it from .env or the environment at runtime.
|
||||
- Before posting any comment, verify it contains no credentials, hex keys > 32 chars,
|
||||
or URLs with embedded API keys.
|
||||
|
||||
If the prior comments above show work already completed, resume from where it
|
||||
left off.
|
||||
|
||||
${GIT_INSTRUCTIONS}"
|
||||
|
||||
# --- Wall-clock lifetime watchdog (background) ---
|
||||
# Caps total run time independently of claude -p timeout. When the cap is
|
||||
# hit the watchdog kills the main process, which triggers cleanup via trap.
|
||||
_lifetime_watchdog() {
|
||||
local remaining=$(( MAX_LIFETIME - ($(date +%s) - SESSION_START_EPOCH) ))
|
||||
[ "$remaining" -le 0 ] && remaining=1
|
||||
sleep "$remaining"
|
||||
local hours=$(( MAX_LIFETIME / 3600 ))
|
||||
log "MAX_LIFETIME (${hours}h) reached — killing agent"
|
||||
# Post summary comment on issue
|
||||
local body="Action agent killed: wall-clock lifetime cap (${hours}h) reached."
|
||||
curl -sf -X POST \
|
||||
-H "Authorization: token ${FORGE_TOKEN}" \
|
||||
-H 'Content-Type: application/json' \
|
||||
"${FORGE_API}/issues/${ISSUE}/comments" \
|
||||
-d "{\"body\": \"${body}\"}" >/dev/null 2>&1 || true
|
||||
kill $$ 2>/dev/null || true
|
||||
}
|
||||
_lifetime_watchdog &
|
||||
LIFETIME_WATCHDOG_PID=$!
|
||||
|
||||
# --- Run agent ---
|
||||
log "running agent (worktree: ${WORKTREE})"
|
||||
agent_run --worktree "$WORKTREE" "$PROMPT"
|
||||
log "agent_run complete"
|
||||
|
||||
# --- Detect if branch was pushed (Path A vs Path B) ---
|
||||
PUSHED=false
|
||||
# Check if remote branch exists
|
||||
git fetch "${FORGE_REMOTE}" "$BRANCH" 2>/dev/null || true
|
||||
if git rev-parse --verify "${FORGE_REMOTE}/${BRANCH}" >/dev/null 2>&1; then
|
||||
PUSHED=true
|
||||
fi
|
||||
# Fallback: check local commits ahead of base
|
||||
if [ "$PUSHED" = false ]; then
|
||||
if git -C "$WORKTREE" log "${FORGE_REMOTE}/${PRIMARY_BRANCH}..${BRANCH}" --oneline 2>/dev/null | grep -q .; then
|
||||
PUSHED=true
|
||||
fi
|
||||
fi
|
||||
|
||||
if [ "$PUSHED" = true ]; then
|
||||
# --- Path A: code changes pushed — create PR and walk to merge ---
|
||||
log "branch pushed — creating PR"
|
||||
PR_NUMBER=""
|
||||
PR_NUMBER=$(pr_create "$BRANCH" "action: ${ISSUE_TITLE}" \
|
||||
"Closes #${ISSUE}
|
||||
|
||||
Automated action execution by action-agent.") || true
|
||||
|
||||
if [ -n "$PR_NUMBER" ]; then
|
||||
log "walking PR #${PR_NUMBER} to merge"
|
||||
pr_walk_to_merge "$PR_NUMBER" "$_AGENT_SESSION_ID" "$WORKTREE" || true
|
||||
|
||||
case "${_PR_WALK_EXIT_REASON:-}" in
|
||||
merged)
|
||||
log "PR #${PR_NUMBER} merged — closing issue"
|
||||
issue_close "$ISSUE"
|
||||
;;
|
||||
*)
|
||||
log "PR #${PR_NUMBER} not merged (reason: ${_PR_WALK_EXIT_REASON:-unknown})"
|
||||
issue_block "$ISSUE" "pr_not_merged: ${_PR_WALK_EXIT_REASON:-unknown}"
|
||||
;;
|
||||
esac
|
||||
else
|
||||
log "ERROR: failed to create PR"
|
||||
issue_block "$ISSUE" "pr_creation_failed"
|
||||
fi
|
||||
else
|
||||
# --- Path B: no code changes — close issue directly ---
|
||||
log "no branch pushed — closing issue (Path B)"
|
||||
issue_close "$ISSUE"
|
||||
fi
|
||||
|
||||
log "action-agent finished for issue #${ISSUE}"
|
||||
|
|
@ -1,75 +0,0 @@
|
|||
#!/usr/bin/env bash
|
||||
# action-poll.sh — Cron scheduler: find open 'action' issues, spawn action-agent
|
||||
#
|
||||
# An issue is ready for action if:
|
||||
# - It is open and labeled 'action'
|
||||
# - No tmux session named action-{project}-{issue_num} is already active
|
||||
#
|
||||
# Usage:
|
||||
# cron every 10min
|
||||
# action-poll.sh [projects/foo.toml] # optional project config
|
||||
|
||||
set -euo pipefail
|
||||
|
||||
export PROJECT_TOML="${1:-}"
|
||||
source "$(dirname "$0")/../lib/env.sh"
|
||||
# Use action-bot's own Forgejo identity (#747)
|
||||
FORGE_TOKEN="${FORGE_ACTION_TOKEN:-${FORGE_TOKEN}}"
|
||||
# shellcheck source=../lib/guard.sh
|
||||
source "$(dirname "$0")/../lib/guard.sh"
|
||||
check_active action
|
||||
|
||||
LOGFILE="${DISINTO_LOG_DIR}/action/action-poll-${PROJECT_NAME:-default}.log"
|
||||
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
|
||||
|
||||
log() {
|
||||
printf '[%s] poll: %s\n' "$(date -u '+%Y-%m-%d %H:%M:%S UTC')" "$*" >> "$LOGFILE"
|
||||
}
|
||||
|
||||
# --- Memory guard ---
|
||||
memory_guard 2000
|
||||
|
||||
# --- Find open 'action' issues ---
|
||||
log "scanning for open action issues"
|
||||
ACTION_ISSUES=$(curl -sf -H "Authorization: token ${FORGE_TOKEN}" \
|
||||
"${FORGE_API}/issues?state=open&labels=action&limit=50&type=issues") || true
|
||||
|
||||
if [ -z "$ACTION_ISSUES" ] || [ "$ACTION_ISSUES" = "null" ]; then
|
||||
log "no action issues found"
|
||||
exit 0
|
||||
fi
|
||||
|
||||
COUNT=$(printf '%s' "$ACTION_ISSUES" | jq 'length')
|
||||
if [ "$COUNT" -eq 0 ]; then
|
||||
log "no action issues found"
|
||||
exit 0
|
||||
fi
|
||||
|
||||
log "found ${COUNT} open action issue(s)"
|
||||
|
||||
# Spawn action-agent for each issue that has no active tmux session.
|
||||
# Only one agent is spawned per poll to avoid memory pressure; the next
|
||||
# poll picks up remaining issues.
|
||||
for i in $(seq 0 $((COUNT - 1))); do
|
||||
ISSUE_NUM=$(printf '%s' "$ACTION_ISSUES" | jq -r ".[$i].number")
|
||||
SESSION="action-${PROJECT_NAME}-${ISSUE_NUM}"
|
||||
|
||||
if tmux has-session -t "$SESSION" 2>/dev/null; then
|
||||
log "issue #${ISSUE_NUM}: session ${SESSION} already active, skipping"
|
||||
continue
|
||||
fi
|
||||
|
||||
LOCKFILE="/tmp/action-agent-${ISSUE_NUM}.lock"
|
||||
if [ -f "$LOCKFILE" ]; then
|
||||
LOCK_PID=$(cat "$LOCKFILE" 2>/dev/null || echo "")
|
||||
if [ -n "$LOCK_PID" ] && kill -0 "$LOCK_PID" 2>/dev/null; then
|
||||
log "issue #${ISSUE_NUM}: agent starting (PID ${LOCK_PID}), skipping"
|
||||
continue
|
||||
fi
|
||||
fi
|
||||
|
||||
log "spawning action-agent for issue #${ISSUE_NUM}"
|
||||
nohup "${SCRIPT_DIR}/action-agent.sh" "$ISSUE_NUM" "$PROJECT_TOML" >> "$LOGFILE" 2>&1 &
|
||||
log "started action-agent PID $! for issue #${ISSUE_NUM}"
|
||||
break
|
||||
done
|
||||
|
|
@ -695,13 +695,12 @@ setup_forge() {
|
|||
[vault-bot]="FORGE_VAULT_TOKEN"
|
||||
[supervisor-bot]="FORGE_SUPERVISOR_TOKEN"
|
||||
[predictor-bot]="FORGE_PREDICTOR_TOKEN"
|
||||
[action-bot]="FORGE_ACTION_TOKEN"
|
||||
)
|
||||
|
||||
local env_file="${FACTORY_ROOT}/.env"
|
||||
local bot_user bot_pass token token_var
|
||||
|
||||
for bot_user in dev-bot review-bot planner-bot gardener-bot vault-bot supervisor-bot predictor-bot action-bot; do
|
||||
for bot_user in dev-bot review-bot planner-bot gardener-bot vault-bot supervisor-bot predictor-bot; do
|
||||
bot_pass="bot-$(head -c 16 /dev/urandom | base64 | tr -dc 'a-zA-Z0-9' | head -c 20)"
|
||||
token_var="${bot_token_vars[$bot_user]}"
|
||||
|
||||
|
|
@ -812,7 +811,7 @@ setup_forge() {
|
|||
fi
|
||||
|
||||
# Add all bot users as collaborators
|
||||
for bot_user in dev-bot review-bot planner-bot gardener-bot vault-bot supervisor-bot predictor-bot action-bot; do
|
||||
for bot_user in dev-bot review-bot planner-bot gardener-bot vault-bot supervisor-bot predictor-bot; do
|
||||
curl -sf -X PUT \
|
||||
-H "Authorization: token ${admin_token:-${FORGE_TOKEN}}" \
|
||||
-H "Content-Type: application/json" \
|
||||
|
|
@ -860,7 +859,7 @@ setup_ops_repo() {
|
|||
|
||||
# Add all bot users as collaborators
|
||||
local bot_user
|
||||
for bot_user in dev-bot review-bot planner-bot gardener-bot vault-bot supervisor-bot predictor-bot action-bot; do
|
||||
for bot_user in dev-bot review-bot planner-bot gardener-bot vault-bot supervisor-bot predictor-bot; do
|
||||
curl -sf -X PUT \
|
||||
-H "Authorization: token ${admin_token:-${FORGE_TOKEN}}" \
|
||||
-H "Content-Type: application/json" \
|
||||
|
|
|
|||
|
|
@ -114,4 +114,3 @@ When reviewing PRs or designing new agents, ask:
|
|||
| gardener | 1242 (agent 471 + poll 771) | Medium — backlog triage, duplicate detection, tech-debt scoring | Poll is heavy orchestration; agent is prompt-driven |
|
||||
| vault | 442 (4 scripts) | Medium — approval flow, human gate decisions | Intentionally bash-heavy (security gate should be deterministic) |
|
||||
| planner | 382 | Medium — AGENTS.md update, gap analysis | Tmux+formula (done, #232) |
|
||||
| action-agent | 192 | Light — formula execution | Close to target |
|
||||
|
|
|
|||
|
|
@ -117,7 +117,7 @@ signal to the phase file.
|
|||
- **Post-loop exit handler (`case $_MONITOR_LOOP_EXIT`):** Must include an
|
||||
`idle_prompt)` branch. Typical actions: log the event, clean up temp files,
|
||||
and (for agents that use escalation) write an escalation entry or notify via
|
||||
vault/forge. See `dev/dev-agent.sh`, `action/action-agent.sh`, and
|
||||
vault/forge. See `dev/dev-agent.sh` and
|
||||
`gardener/gardener-agent.sh` for reference implementations.
|
||||
|
||||
## Crash Recovery
|
||||
|
|
|
|||
|
|
@ -6,19 +6,19 @@ sourced as needed.
|
|||
|
||||
| File | What it provides | Sourced by |
|
||||
|---|---|---|
|
||||
| `lib/env.sh` | Loads `.env`, sets `FACTORY_ROOT`, exports project config (`FORGE_REPO`, `PROJECT_NAME`, etc.), defines `log()`, `forge_api()`, `forge_api_all()` (accepts optional second TOKEN parameter, defaults to `$FORGE_TOKEN`), `woodpecker_api()`, `wpdb()`, `memory_guard()` (skips agent if RAM < threshold). Auto-loads project TOML if `PROJECT_TOML` is set. Exports per-agent tokens (`FORGE_PLANNER_TOKEN`, `FORGE_GARDENER_TOKEN`, `FORGE_VAULT_TOKEN`, `FORGE_SUPERVISOR_TOKEN`, `FORGE_PREDICTOR_TOKEN`, `FORGE_ACTION_TOKEN`) — each falls back to `$FORGE_TOKEN` if not set. **Vault-only token guard (AD-006)**: `unset GITHUB_TOKEN CLAWHUB_TOKEN` so agents never hold external-action tokens — only the runner container receives them. **Container note**: when `DISINTO_CONTAINER=1`, `.env` is NOT re-sourced — compose already injects env vars (including `FORGE_URL=http://forgejo:3000`) and re-sourcing would clobber them. | Every agent |
|
||||
| `lib/env.sh` | Loads `.env`, sets `FACTORY_ROOT`, exports project config (`FORGE_REPO`, `PROJECT_NAME`, etc.), defines `log()`, `forge_api()`, `forge_api_all()` (accepts optional second TOKEN parameter, defaults to `$FORGE_TOKEN`), `woodpecker_api()`, `wpdb()`, `memory_guard()` (skips agent if RAM < threshold). Auto-loads project TOML if `PROJECT_TOML` is set. Exports per-agent tokens (`FORGE_PLANNER_TOKEN`, `FORGE_GARDENER_TOKEN`, `FORGE_VAULT_TOKEN`, `FORGE_SUPERVISOR_TOKEN`, `FORGE_PREDICTOR_TOKEN`) — each falls back to `$FORGE_TOKEN` if not set. **Vault-only token guard (AD-006)**: `unset GITHUB_TOKEN CLAWHUB_TOKEN` so agents never hold external-action tokens — only the runner container receives them. **Container note**: when `DISINTO_CONTAINER=1`, `.env` is NOT re-sourced — compose already injects env vars (including `FORGE_URL=http://forgejo:3000`) and re-sourcing would clobber them. | Every agent |
|
||||
| `lib/ci-helpers.sh` | `ci_passed()` — returns 0 if CI state is "success" (or no CI configured). `ci_required_for_pr()` — returns 0 if PR has code files (CI required), 1 if non-code only (CI not required). `is_infra_step()` — returns 0 if a single CI step failure matches infra heuristics (clone/git exit 128, any exit 137, log timeout patterns). `classify_pipeline_failure()` — returns "infra \<reason>" if any failed Woodpecker step matches infra heuristics via `is_infra_step()`, else "code". `ensure_priority_label()` — looks up (or creates) the `priority` label and returns its ID; caches in `_PRIORITY_LABEL_ID`. `ci_commit_status <sha>` — queries Woodpecker directly for CI state, falls back to forge commit status API. `ci_pipeline_number <sha>` — returns the Woodpecker pipeline number for a commit, falls back to parsing forge status `target_url`. `ci_promote <repo_id> <pipeline_num> <environment>` — promotes a pipeline to a named Woodpecker environment (vault-gated deployment: vault approves, vault-fire calls this). | dev-poll, review-poll, review-pr, supervisor-poll |
|
||||
| `lib/ci-debug.sh` | CLI tool for Woodpecker CI: `list`, `status`, `logs`, `failures` subcommands. Not sourced — run directly. | Humans / dev-agent (tool access) |
|
||||
| `lib/load-project.sh` | Parses a `projects/*.toml` file into env vars (`PROJECT_NAME`, `FORGE_REPO`, `WOODPECKER_REPO_ID`, monitoring toggles, mirror config, etc.). | env.sh (when `PROJECT_TOML` is set), supervisor-poll (per-project iteration) |
|
||||
| `lib/parse-deps.sh` | Extracts dependency issue numbers from an issue body (stdin → stdout, one number per line). Matches `## Dependencies` / `## Depends on` / `## Blocked by` sections and inline `depends on #N` / `blocked by #N` patterns. Inline scan skips fenced code blocks to prevent false positives from code examples in issue bodies. Not sourced — executed via `bash lib/parse-deps.sh`. | dev-poll, supervisor-poll |
|
||||
| `lib/formula-session.sh` | `acquire_cron_lock()`, `check_memory()`, `load_formula()`, `build_context_block()`, `consume_escalation_reply()`, `start_formula_session()`, `formula_phase_callback()`, `build_prompt_footer()`, `build_graph_section()`, `run_formula_and_monitor(AGENT [TIMEOUT] [CALLBACK])` — shared helpers for formula-driven cron agents (lock, memory guard, formula loading, prompt assembly, tmux session, monitor loop, crash recovery). `build_graph_section()` generates the structural-analysis section (runs `lib/build-graph.py`, formats JSON output) — previously duplicated in planner-run.sh and predictor-run.sh, now shared here. `formula_phase_callback()` handles `PHASE:escalate` (unified escalation path — kills the session). `run_formula_and_monitor` accepts an optional CALLBACK (default: `formula_phase_callback`) so callers can install custom merge-through or escalation handlers. `cleanup_stale_crashed_worktrees()` — thin wrapper around `worktree_cleanup_stale()` from `lib/worktree.sh` (kept for backwards compatibility). | planner-run.sh, predictor-run.sh, gardener-run.sh, supervisor-run.sh, dev-agent.sh, action-agent.sh |
|
||||
| `lib/guard.sh` | `check_active(agent_name)` — reads `$FACTORY_ROOT/state/.{agent_name}-active`; exits 0 (skip) if the file is absent. Factory is off by default — state files must be created to enable each agent. **Logs a message to stderr** when skipping (`[check_active] SKIP: state file not found`), so agent dropout is visible in cron logs. Sourced by dev-poll.sh, review-poll.sh, action-poll.sh, predictor-run.sh, supervisor-run.sh. | cron entry points |
|
||||
| `lib/formula-session.sh` | `acquire_cron_lock()`, `check_memory()`, `load_formula()`, `build_context_block()`, `consume_escalation_reply()`, `start_formula_session()`, `formula_phase_callback()`, `build_prompt_footer()`, `build_graph_section()`, `run_formula_and_monitor(AGENT [TIMEOUT] [CALLBACK])` — shared helpers for formula-driven cron agents (lock, memory guard, formula loading, prompt assembly, tmux session, monitor loop, crash recovery). `build_graph_section()` generates the structural-analysis section (runs `lib/build-graph.py`, formats JSON output) — previously duplicated in planner-run.sh and predictor-run.sh, now shared here. `formula_phase_callback()` handles `PHASE:escalate` (unified escalation path — kills the session). `run_formula_and_monitor` accepts an optional CALLBACK (default: `formula_phase_callback`) so callers can install custom merge-through or escalation handlers. `cleanup_stale_crashed_worktrees()` — thin wrapper around `worktree_cleanup_stale()` from `lib/worktree.sh` (kept for backwards compatibility). | planner-run.sh, predictor-run.sh, gardener-run.sh, supervisor-run.sh, dev-agent.sh |
|
||||
| `lib/guard.sh` | `check_active(agent_name)` — reads `$FACTORY_ROOT/state/.{agent_name}-active`; exits 0 (skip) if the file is absent. Factory is off by default — state files must be created to enable each agent. **Logs a message to stderr** when skipping (`[check_active] SKIP: state file not found`), so agent dropout is visible in cron logs. Sourced by dev-poll.sh, review-poll.sh, predictor-run.sh, supervisor-run.sh. | cron entry points |
|
||||
| `lib/mirrors.sh` | `mirror_push()` — pushes `$PRIMARY_BRANCH` + tags to all configured mirror remotes (fire-and-forget background pushes). Reads `MIRROR_NAMES` and `MIRROR_*` vars exported by `load-project.sh` from the `[mirrors]` TOML section. Failures are logged but never block the pipeline. Sourced by dev-poll.sh and dev/phase-handler.sh — called after every successful merge. | dev-poll.sh, phase-handler.sh |
|
||||
| `lib/build-graph.py` | Python tool: parses VISION.md, prerequisites.md (from ops repo), AGENTS.md, formulas/*.toml, evidence/ (from ops repo), and forge issues/labels into a NetworkX DiGraph. Runs structural analyses (orphaned objectives, stale prerequisites, thin evidence, circular deps) and outputs a JSON report. Used by `review-pr.sh` (per-PR changed-file analysis) and `predictor-run.sh` (full-project analysis) to provide structural context to Claude. | review-pr.sh, predictor-run.sh |
|
||||
| `lib/secret-scan.sh` | `scan_for_secrets()` — detects potential secrets (API keys, bearer tokens, private keys, URLs with embedded credentials) in text; returns 1 if secrets found. `redact_secrets()` — replaces detected secret patterns with `[REDACTED]`. | file-action-issue.sh, phase-handler.sh |
|
||||
| `lib/file-action-issue.sh` | `file_action_issue()` — dedup check, secret scan, label lookup, and issue creation for formula-driven cron wrappers. Sets `FILED_ISSUE_NUM` on success. Returns 4 if secrets detected in body. | (available for future use) |
|
||||
| `lib/tea-helpers.sh` | `tea_file_issue(title, body, labels...)` — create issue via tea CLI with secret scanning; sets `FILED_ISSUE_NUM`. `tea_relabel(issue_num, labels...)` — replace labels using tea's `edit` subcommand (not `label`). `tea_comment(issue_num, body)` — add comment with secret scanning. `tea_close(issue_num)` — close issue. All use `TEA_LOGIN` and `FORGE_REPO` from env.sh. Labels by name (no ID lookup). Tea binary download verified via sha256 checksum. Sourced by env.sh when `tea` binary is available. | env.sh (conditional) |
|
||||
| `lib/worktree.sh` | Reusable git worktree management: `worktree_create(path, branch, [base_ref])` — create worktree, checkout base, fetch submodules. `worktree_recover(path, branch, [remote])` — detect existing worktree, reuse if on correct branch (sets `_WORKTREE_REUSED`), otherwise clean and recreate. `worktree_cleanup(path)` — `git worktree remove --force`, clear Claude Code project cache (`~/.claude/projects/` matching path). `worktree_cleanup_stale([max_age_hours])` — scan `/tmp` for orphaned worktrees older than threshold, skip preserved and active tmux worktrees, prune. `worktree_preserve(path, reason)` — mark worktree as preserved for debugging (writes `.worktree-preserved` marker, skipped by stale cleanup). | dev-agent.sh, action-agent.sh, supervisor-run.sh, planner-run.sh, predictor-run.sh, gardener-run.sh |
|
||||
| `lib/pr-lifecycle.sh` | Reusable PR lifecycle library: `pr_create()`, `pr_find_by_branch()`, `pr_poll_ci()`, `pr_poll_review()`, `pr_merge()`, `pr_is_merged()`, `pr_walk_to_merge()`, `build_phase_protocol_prompt()`. Requires `lib/ci-helpers.sh`. | dev-agent.sh (future), action-agent.sh (future) |
|
||||
| `lib/issue-lifecycle.sh` | Reusable issue lifecycle library: `issue_claim()` (add in-progress, remove backlog), `issue_release()` (remove in-progress, add backlog), `issue_block()` (post diagnostic comment with secret redaction, add blocked label), `issue_close()`, `issue_check_deps()` (parse deps, check transitive closure; sets `_ISSUE_BLOCKED_BY`, `_ISSUE_SUGGESTION`), `issue_suggest_next()` (find next unblocked backlog issue; sets `_ISSUE_NEXT`), `issue_post_refusal()` (structured refusal comment with dedup). Label IDs cached in globals on first lookup. Sources `lib/secret-scan.sh`. | dev-agent.sh (future), action-agent.sh (future) |
|
||||
| `lib/agent-session.sh` | Shared tmux + Claude session helpers: `create_agent_session()`, `inject_formula()`, `agent_wait_for_claude_ready()`, `agent_inject_into_session()`, `agent_kill_session()`, `monitor_phase_loop()`, `read_phase()`, `write_compact_context()`. `create_agent_session(session, workdir, [phase_file])` optionally installs a PostToolUse hook (matcher `Bash\|Write`) that detects phase file writes in real-time — when Claude writes to the phase file, the hook writes a marker so `monitor_phase_loop` reacts on the next poll instead of waiting for mtime changes. Also installs a StopFailure hook (matcher `rate_limit\|server_error\|authentication_failed\|billing_error`) that writes `PHASE:failed` with an `api_error` reason to the phase file and touches the phase-changed marker, so the orchestrator discovers API errors within one poll cycle instead of waiting for idle timeout. Also installs a SessionStart hook (matcher `compact`) that re-injects phase protocol instructions after context compaction — callers write the context file via `write_compact_context(phase_file, content)`, and the hook (`on-compact-reinject.sh`) outputs the file content to stdout so Claude retains critical instructions. When `phase_file` is set, passes it to the idle stop hook (`on-idle-stop.sh`) so the hook can **nudge Claude** (up to 2 times) if Claude returns to the prompt without writing to the phase file — the hook injects a tmux reminder asking Claude to signal PHASE:done or PHASE:awaiting_ci. The PreToolUse guard hook (`on-pretooluse-guard.sh`) receives the session name as a third argument — formula agents (`gardener-*`, `planner-*`, `predictor-*`, `supervisor-*`) are identified this way and allowed to access `FACTORY_ROOT` from worktrees (they need env.sh, AGENTS.md, formulas/, lib/). **OAuth flock**: when `DISINTO_CONTAINER=1`, Claude CLI is wrapped in `flock -w 300 ~/.claude/session.lock` to queue concurrent token refresh attempts and prevent rotation races across agents sharing the same credentials. `monitor_phase_loop` sets `_MONITOR_LOOP_EXIT` to one of: `done`, `idle_timeout`, `idle_prompt` (Claude returned to `>` for 3 consecutive polls without writing any phase — callback invoked with `PHASE:failed`, session already dead), `crashed`, or `PHASE:escalate` / other `PHASE:*` string. **Unified escalation**: `PHASE:escalate` is the signal that a session needs human input (renamed from `PHASE:needs_human`). **Callers must handle `idle_prompt`** in both their callback and their post-loop exit handler — see [`docs/PHASE-PROTOCOL.md` idle_prompt](docs/PHASE-PROTOCOL.md#idle_prompt-exit-reason) for the full contract. | dev-agent.sh, action-agent.sh |
|
||||
| `lib/worktree.sh` | Reusable git worktree management: `worktree_create(path, branch, [base_ref])` — create worktree, checkout base, fetch submodules. `worktree_recover(path, branch, [remote])` — detect existing worktree, reuse if on correct branch (sets `_WORKTREE_REUSED`), otherwise clean and recreate. `worktree_cleanup(path)` — `git worktree remove --force`, clear Claude Code project cache (`~/.claude/projects/` matching path). `worktree_cleanup_stale([max_age_hours])` — scan `/tmp` for orphaned worktrees older than threshold, skip preserved and active tmux worktrees, prune. `worktree_preserve(path, reason)` — mark worktree as preserved for debugging (writes `.worktree-preserved` marker, skipped by stale cleanup). | dev-agent.sh, supervisor-run.sh, planner-run.sh, predictor-run.sh, gardener-run.sh |
|
||||
| `lib/pr-lifecycle.sh` | Reusable PR lifecycle library: `pr_create()`, `pr_find_by_branch()`, `pr_poll_ci()`, `pr_poll_review()`, `pr_merge()`, `pr_is_merged()`, `pr_walk_to_merge()`, `build_phase_protocol_prompt()`. Requires `lib/ci-helpers.sh`. | dev-agent.sh (future) |
|
||||
| `lib/issue-lifecycle.sh` | Reusable issue lifecycle library: `issue_claim()` (add in-progress, remove backlog), `issue_release()` (remove in-progress, add backlog), `issue_block()` (post diagnostic comment with secret redaction, add blocked label), `issue_close()`, `issue_check_deps()` (parse deps, check transitive closure; sets `_ISSUE_BLOCKED_BY`, `_ISSUE_SUGGESTION`), `issue_suggest_next()` (find next unblocked backlog issue; sets `_ISSUE_NEXT`), `issue_post_refusal()` (structured refusal comment with dedup). Label IDs cached in globals on first lookup. Sources `lib/secret-scan.sh`. | dev-agent.sh (future) |
|
||||
| `lib/agent-session.sh` | Shared tmux + Claude session helpers: `create_agent_session()`, `inject_formula()`, `agent_wait_for_claude_ready()`, `agent_inject_into_session()`, `agent_kill_session()`, `monitor_phase_loop()`, `read_phase()`, `write_compact_context()`. `create_agent_session(session, workdir, [phase_file])` optionally installs a PostToolUse hook (matcher `Bash\|Write`) that detects phase file writes in real-time — when Claude writes to the phase file, the hook writes a marker so `monitor_phase_loop` reacts on the next poll instead of waiting for mtime changes. Also installs a StopFailure hook (matcher `rate_limit\|server_error\|authentication_failed\|billing_error`) that writes `PHASE:failed` with an `api_error` reason to the phase file and touches the phase-changed marker, so the orchestrator discovers API errors within one poll cycle instead of waiting for idle timeout. Also installs a SessionStart hook (matcher `compact`) that re-injects phase protocol instructions after context compaction — callers write the context file via `write_compact_context(phase_file, content)`, and the hook (`on-compact-reinject.sh`) outputs the file content to stdout so Claude retains critical instructions. When `phase_file` is set, passes it to the idle stop hook (`on-idle-stop.sh`) so the hook can **nudge Claude** (up to 2 times) if Claude returns to the prompt without writing to the phase file — the hook injects a tmux reminder asking Claude to signal PHASE:done or PHASE:awaiting_ci. The PreToolUse guard hook (`on-pretooluse-guard.sh`) receives the session name as a third argument — formula agents (`gardener-*`, `planner-*`, `predictor-*`, `supervisor-*`) are identified this way and allowed to access `FACTORY_ROOT` from worktrees (they need env.sh, AGENTS.md, formulas/, lib/). **OAuth flock**: when `DISINTO_CONTAINER=1`, Claude CLI is wrapped in `flock -w 300 ~/.claude/session.lock` to queue concurrent token refresh attempts and prevent rotation races across agents sharing the same credentials. `monitor_phase_loop` sets `_MONITOR_LOOP_EXIT` to one of: `done`, `idle_timeout`, `idle_prompt` (Claude returned to `>` for 3 consecutive polls without writing any phase — callback invoked with `PHASE:failed`, session already dead), `crashed`, or `PHASE:escalate` / other `PHASE:*` string. **Unified escalation**: `PHASE:escalate` is the signal that a session needs human input (renamed from `PHASE:needs_human`). **Callers must handle `idle_prompt`** in both their callback and their post-loop exit handler — see [`docs/PHASE-PROTOCOL.md` idle_prompt](docs/PHASE-PROTOCOL.md#idle_prompt-exit-reason) for the full contract. | dev-agent.sh |
|
||||
|
|
|
|||
|
|
@ -95,10 +95,9 @@ export FORGE_GARDENER_TOKEN="${FORGE_GARDENER_TOKEN:-${FORGE_TOKEN}}"
|
|||
export FORGE_VAULT_TOKEN="${FORGE_VAULT_TOKEN:-${FORGE_TOKEN}}"
|
||||
export FORGE_SUPERVISOR_TOKEN="${FORGE_SUPERVISOR_TOKEN:-${FORGE_TOKEN}}"
|
||||
export FORGE_PREDICTOR_TOKEN="${FORGE_PREDICTOR_TOKEN:-${FORGE_TOKEN}}"
|
||||
export FORGE_ACTION_TOKEN="${FORGE_ACTION_TOKEN:-${FORGE_TOKEN}}"
|
||||
|
||||
# Bot usernames filter: FORGE_BOT_USERNAMES > legacy CODEBERG_BOT_USERNAMES
|
||||
export FORGE_BOT_USERNAMES="${FORGE_BOT_USERNAMES:-${CODEBERG_BOT_USERNAMES:-dev-bot,review-bot,planner-bot,gardener-bot,vault-bot,supervisor-bot,predictor-bot,action-bot}}"
|
||||
export FORGE_BOT_USERNAMES="${FORGE_BOT_USERNAMES:-${CODEBERG_BOT_USERNAMES:-dev-bot,review-bot,planner-bot,gardener-bot,vault-bot,supervisor-bot,predictor-bot}}"
|
||||
export CODEBERG_BOT_USERNAMES="${FORGE_BOT_USERNAMES}" # backwards compat
|
||||
|
||||
# Project config (FORGE_* preferred, CODEBERG_* fallback)
|
||||
|
|
|
|||
|
|
@ -23,7 +23,7 @@ need human decisions or external resources are filed as vault procurement items
|
|||
(`$OPS_REPO_ROOT/vault/pending/*.md`) instead of being escalated. Phase 3
|
||||
(file-at-constraints): identify the top 3 unresolved prerequisites that block
|
||||
the most downstream objectives — file issues as either `backlog` (code changes,
|
||||
dev-agent) or `action` (run existing formula, action-agent). **Stuck issues
|
||||
dev-agent) or `action` (run existing formula, dispatcher). **Stuck issues
|
||||
(detected BOUNCED/LABEL_CHURN) are dispatched to the `groom-backlog` formula
|
||||
in breakdown mode instead of being re-promoted** — this breaks the ping-pong
|
||||
loop by splitting them into dev-agent-sized sub-issues. **Human-blocked issues
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue