fix: feat: supervisor as formula-driven agent — cron + Matrix escalation (#245)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
openhands 2026-03-21 00:22:37 +00:00
parent 7fe5ed0381
commit d8244742f1
5 changed files with 585 additions and 12 deletions

@@ -22,7 +22,10 @@ disinto/
├── planner/ planner-run.sh — direct cron executor for run-planner formula
│ planner/journal/ — daily raw logs from each planner run
│ prediction-poll.sh, prediction-agent.sh — legacy predictor (superseded by predictor/)
├── supervisor/ supervisor-poll.sh — health monitoring
├── supervisor/ supervisor-run.sh — formula-driven health monitoring (cron wrapper)
│ preflight.sh — pre-flight data collection for supervisor formula
│ supervisor/journal/ — daily health logs from each run
│ supervisor-poll.sh — legacy bash orchestrator (superseded)
├── vault/ vault-poll.sh, vault-agent.sh, vault-fire.sh — action gating
├── action/ action-poll.sh, action-agent.sh — operational task execution
├── lib/ env.sh, agent-session.sh, ci-helpers.sh, ci-debug.sh, load-project.sh, parse-deps.sh, matrix_listener.sh
@@ -138,26 +141,56 @@ issue is filed against).
### Supervisor (`supervisor/`)
**Role**: Health monitoring and auto-remediation. Two-layer architecture:
(1) factory infrastructure checks (RAM, disk, swap, docker, stale processes)
that run once, and (2) per-project checks (CI, PRs, dev-agent health,
circular deps, stale deps) that iterate over `projects/*.toml`.
**Role**: Health monitoring and auto-remediation, executed as a formula-driven
Claude agent. Collects system and project metrics via a bash pre-flight script,
then runs an interactive Claude session (sonnet) that assesses health, auto-fixes
issues, escalates via Matrix, and writes a daily journal.
**Trigger**: `supervisor-poll.sh` runs every 10 min via cron.
**Trigger**: `supervisor-run.sh` runs every 20 min via cron. It creates a tmux
session with `claude --model sonnet`, injects `formulas/run-supervisor.toml`
with pre-collected metrics as context, monitors the phase file, and cleans up
on completion or timeout (20 min max session). No action issues — the supervisor
runs directly from cron like the planner and predictor.
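
The phase-file monitoring with a hard timeout can be pictured roughly as follows. This is a minimal sketch only: the real loop lives in `lib/agent-session.sh` (`monitor_phase_loop`), which is not shown here, and the function name `wait_for_phase_done` and its argument order are assumptions made for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: poll a phase file until its first line is PHASE:done
# or the timeout expires. Not the repo's monitor_phase_loop.
# Arguments: phase file path, timeout in seconds, poll interval in seconds.
wait_for_phase_done() {
  local phase_file=$1 timeout=$2 interval=$3
  local deadline=$(( $(date +%s) + timeout ))
  while :; do
    # Success as soon as the agent has written the done signal.
    if [ -f "$phase_file" ] && [ "$(head -1 "$phase_file")" = "PHASE:done" ]; then
      return 0
    fi
    # Give up once the deadline has passed.
    [ "$(date +%s)" -ge "$deadline" ] && return 1
    sleep "$interval"
  done
}
```

With the values used by `supervisor-run.sh`, a call would look like `wait_for_phase_done "$PHASE_FILE" 1200 15`; a non-zero return is the point at which the wrapper would tear down the tmux session.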
**Key files**:
- `supervisor/supervisor-poll.sh` — All checks + auto-fixes (kill stale processes, rotate logs, drop caches, docker prune, abort stale rebases) then invokes `claude -p` for unresolved alerts
- `supervisor/update-prompt.sh` — Updates the supervisor prompt file
- `supervisor/PROMPT.md` — System prompt for the supervisor's Claude invocation
- `supervisor/supervisor-run.sh` — Cron wrapper + orchestrator: lock, memory guard,
runs preflight.sh, sources disinto project config, creates tmux session, injects
formula prompt with metrics, monitors phase file, handles crash recovery via
`run_formula_and_monitor`
- `supervisor/preflight.sh` — Data collection: system resources (RAM, disk, swap,
load), Docker status, active tmux sessions + phase files, lock files, agent log
tails, CI pipeline status, open PRs, issue counts, stale worktrees, pending
escalations, Matrix escalation replies
- `formulas/run-supervisor.toml` — Execution spec: five steps (preflight review,
health-assessment, decide-actions, report, journal) with `needs` dependencies.
Claude evaluates all metrics and takes actions in a single interactive session
- `supervisor/journal/*.md` — Daily health logs from each supervisor run (local,
committed periodically)
- `supervisor/PROMPT.md` — Best-practices reference for remediation actions
- `supervisor/best-practices/*.md` — Domain-specific remediation guides (memory,
disk, CI, git, dev-agent, review-agent, codeberg)
- `supervisor/supervisor-poll.sh` — Legacy bash orchestrator (superseded by
supervisor-run.sh + formula)
**Alert priorities**: P0 (memory crisis), P1 (disk), P2 (factory stopped/stalled),
P3 (degraded PRs, circular deps, stale deps), P4 (housekeeping).
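
To make the two resource-based priorities concrete, here is a small sketch of how the P0 and P1 thresholds from `formulas/run-supervisor.toml` could be expressed in bash. In the actual design Claude evaluates the metrics itself; the helper `classify_resources` is hypothetical, and treating 3GB as 3072MB is an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical helper: map raw resource numbers to the highest matching priority.
# Arguments: available RAM (MB), swap used (MB), disk usage (%).
classify_resources() {
  local avail_mb=$1 swap_mb=$2 disk_pct=$3
  # P0: RAM available < 500MB, or swap > 3GB while RAM available < 2000MB
  if [ "$avail_mb" -lt 500 ] || { [ "$swap_mb" -gt 3072 ] && [ "$avail_mb" -lt 2000 ]; }; then
    echo "P0"
  # P1: disk usage > 80%
  elif [ "$disk_pct" -gt 80 ]; then
    echo "P1"
  else
    echo "OK"
  fi
}

classify_resources 400 0 50      # low RAM
classify_resources 8000 0 85     # disk pressure
classify_resources 8000 0 50     # healthy
```

P2 to P4 depend on CI, git, and agent state rather than simple numeric thresholds, so they are left out of this sketch.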
**Matrix integration**: The supervisor has its own Matrix thread. Posts health
summaries when there are changes, escalates P0-P2 issues, and processes replies
from humans ("ignore disk warning", "kill that agent", "what's stuck?"). The
Matrix listener routes thread replies to `/tmp/supervisor-escalation-reply`,
which `supervisor-run.sh` consumes atomically on each run.
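
The atomic consumption mentioned above relies on renaming the reply file before reading it. A minimal sketch of that pattern, assuming the writer and the temporary name live on the same filesystem (the function name `consume_reply` is illustrative, not part of the repo):

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the consume-by-rename pattern.
# Argument: path of the reply file. Prints the reply (if any) and removes it.
consume_reply() {
  local reply_file=$1
  local claimed="${reply_file}.consumed.$$"
  # Nothing to do when the file is missing or empty.
  [ -s "$reply_file" ] || return 0
  # A rename within one filesystem is atomic, so a concurrent writer either
  # appends to the file before it is claimed or starts a fresh one afterwards.
  if mv "$reply_file" "$claimed" 2>/dev/null; then
    cat "$claimed"
    rm -f "$claimed"
  fi
}
```

Because the listener can create a new file right after the rename, a reply that arrives mid-run is picked up by the next cron run instead of being lost.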
**Environment variables consumed**:
- All from `lib/env.sh` + per-project TOML overrides
- `CODEBERG_TOKEN`, `CODEBERG_REPO`, `CODEBERG_API`, `PROJECT_NAME`, `PROJECT_REPO_ROOT`
- `PRIMARY_BRANCH`, `CLAUDE_MODEL` (set to sonnet by supervisor-run.sh)
- `WOODPECKER_TOKEN`, `WOODPECKER_SERVER`, `WOODPECKER_DB_PASSWORD`, `WOODPECKER_DB_USER`, `WOODPECKER_DB_HOST`, `WOODPECKER_DB_NAME` — CI database queries
- `CHECK_PRS`, `CHECK_DEV_AGENT`, `CHECK_PIPELINE_STALL` — Per-project monitoring toggles (from TOML `[monitoring]` section)
- `CHECK_INFRA_RETRY` — Infra failure retry toggle (env var only, defaults to `true`; not configurable via project TOML)
- `MATRIX_TOKEN`, `MATRIX_ROOM_ID`, `MATRIX_HOMESERVER` — Matrix notifications + human input
**Lifecycle**: supervisor-run.sh (cron */20) → lock + memory guard → run
preflight.sh (collect metrics) → consume escalation replies → load formula +
context → create tmux session → Claude assesses health, auto-fixes, posts
Matrix summary, writes journal → `PHASE:done`.
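
The lock step at the start of the lifecycle can be sketched as a PID-file check. The repo's `acquire_cron_lock` is not shown in this commit, so the function below (`try_lock`) is a hypothetical stand-in that only illustrates the idea of refusing to start while a live process holds the lock.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of a PID-file lock; the repo's acquire_cron_lock may differ.
# Returns 0 if the lock was taken, 1 if a live process already holds it.
try_lock() {
  local lock_file=$1 pid
  if [ -f "$lock_file" ]; then
    pid=$(cat "$lock_file" 2>/dev/null || true)
    # kill -0 sends no signal; it only tests whether the PID can be signalled.
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
      return 1
    fi
  fi
  # Lock is free or stale: record our own PID.
  echo "$$" > "$lock_file"
}
```

This check-then-write sequence is not race-free; it is usually adequate for a job that cron starts once every 20 minutes, but `flock` would be the sturdier choice if runs could start simultaneously.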
### Planner (`planner/`)

@@ -0,0 +1,237 @@
# formulas/run-supervisor.toml — Supervisor formula (health monitoring + remediation)
#
# Executed by supervisor/supervisor-run.sh via cron (every 20 minutes).
# supervisor-run.sh creates a tmux session with Claude (sonnet) and injects
# this formula with pre-collected metrics as context.
#
# Steps: preflight → health-assessment → decide-actions → report → journal
#
# Key differences from planner/gardener:
# - Runs every 20min — lightweight health check
# - Primarily READS state, rarely WRITES (no PRs, just Matrix + journal)
# - Reactive to escalations — processes pending escalation events
# - Conversation memory via Matrix thread and journal
name = "run-supervisor"
description = "Factory health monitoring: assess metrics, fix issues, report via Matrix, write journal"
version = 1
model = "sonnet"
[context]
files = ["AGENTS.md"]
[[steps]]
id = "preflight"
title = "Review pre-collected metrics"
description = """
The pre-flight metrics have already been collected by supervisor/preflight.sh
and injected into your prompt above. Review them now.
1. Read the injected metrics data carefully (System Resources, Docker,
Active Sessions, Phase Files, Lock Files, Agent Logs, CI Pipelines,
Open PRs, Issue Status, Stale Worktrees, Pending Escalations,
Escalation Replies).
2. If there are escalation replies from Matrix (human messages), note them;
you will act on them in the decide-actions step.
3. Read the supervisor journal for recent history:
JOURNAL_FILE="$FACTORY_ROOT/supervisor/journal/$(date -u +%Y-%m-%d).md"
if [ -f "$JOURNAL_FILE" ]; then cat "$JOURNAL_FILE"; fi
4. Note any values that cross these thresholds:
- RAM available < 500MB, or swap > 3GB with RAM available < 2000MB → P0 (memory crisis)
- Disk > 80% → P1 (disk pressure)
- Agent sessions dead, CI stuck/pending, git in bad state → P2 (factory stopped)
- PRs stale, unreviewed, or with merge conflicts → P3 (factory degraded)
- Stale worktrees, old lock files → P4 (housekeeping)
"""
[[steps]]
id = "health-assessment"
title = "Evaluate health of each subsystem"
description = """
Categorize every finding from the metrics into priority levels.
### P0 — Memory crisis
- RAM available < 500MB
- Swap used > 3GB AND RAM available < 2000MB
### P1 — Disk pressure
- Disk usage > 80%
### P2 — Factory stopped / stalled
- CI pipelines stuck running > 20min or pending > 30min
- Dev-agent lock file present but process dead
- Dev-agent status unchanged for > 30min
- Git repo on wrong branch or in broken rebase state
- Pipeline stalled: backlog issues exist but no agent ran for > 20min
- Dev-agent blocked: last N polls all report "no ready issues"
- Dev sessions in PHASE:needs_human for > 24h
### P3 — Factory degraded
- PRs with CI pass but merge conflict (needs rebase)
- PRs with CI failure stale > 30min
- PRs with CI pass but no review for > 60min
- Circular dependency deadlocks in backlog
- Stale dependencies (blocked by issues open > 30 days)
### P4 — Housekeeping
- Stale worktrees > 2h old with no active process
- Lock files for dead processes
- Stale claude processes (> 3h old)
List each finding with its priority level. If everything looks healthy,
note "All systems healthy" and proceed.
"""
needs = ["preflight"]
[[steps]]
id = "decide-actions"
title = "Fix what you can, escalate what you cannot"
description = """
For each finding from the health assessment, decide and execute an action.
### Auto-fixable (execute these directly)
**P0 Memory crisis:**
# Kill stale one-shot claude processes (>3h old)
pgrep -f "claude -p" --older 10800 2>/dev/null | xargs kill 2>/dev/null || true
# Drop filesystem caches
sync && echo 3 | sudo tee /proc/sys/vm/drop_caches >/dev/null 2>&1 || true
**P1 Disk pressure:**
# Docker cleanup
sudo docker system prune -f >/dev/null 2>&1 || true
# Truncate logs > 10MB
for f in "$FACTORY_ROOT"/{dev,review,supervisor,gardener,planner,predictor}/*.log; do
[ -f "$f" ] && [ "$(du -k "$f" | cut -f1)" -gt 10240 ] && truncate -s 0 "$f"
done
**P2 Dead lock files:**
rm -f /path/to/stale.lock
**P2 Stale rebase:**
cd "$PROJECT_REPO_ROOT"
git rebase --abort 2>/dev/null
git checkout "$PRIMARY_BRANCH" 2>/dev/null
**P2 Wrong branch:**
cd "$PROJECT_REPO_ROOT"
git checkout "$PRIMARY_BRANCH" 2>/dev/null
**P4 Stale worktrees:**
git -C "$PROJECT_REPO_ROOT" worktree remove --force /tmp/stale-worktree 2>/dev/null
git -C "$PROJECT_REPO_ROOT" worktree prune 2>/dev/null
**P4 Stale claude processes:**
pgrep -f "claude -p" --older 10800 2>/dev/null | xargs kill 2>/dev/null || true
### Escalation replies (from Matrix)
If there are escalation replies from a human, act on them:
- "ignore X" → note in journal, do not alert on X this run
- "kill that agent" → identify and kill the referenced session
- "what's stuck?" → include detailed status in the Matrix report
- Other instructions → follow them, use best judgment
### Cannot auto-fix → escalate
For P0-P2 issues that persist after auto-fix attempts, or issues requiring
human judgment, prepare an escalation message for the report step.
Read the relevant best-practices file before taking action:
cat "$FACTORY_ROOT/supervisor/best-practices/memory.md" # P0
cat "$FACTORY_ROOT/supervisor/best-practices/disk.md" # P1
cat "$FACTORY_ROOT/supervisor/best-practices/ci.md" # P2 CI
cat "$FACTORY_ROOT/supervisor/best-practices/dev-agent.md" # P2 agent
cat "$FACTORY_ROOT/supervisor/best-practices/git.md" # P2 git
Track what you fixed and what needs escalation for the report step.
"""
needs = ["health-assessment"]
[[steps]]
id = "report"
title = "Post health summary to Matrix"
description = """
Post a status summary to Matrix. Use the matrix_send function:
source "$FACTORY_ROOT/lib/env.sh"
matrix_send "supervisor" "<message>"
### When everything is healthy
Post a brief "all clear" only if the PREVIOUS run had alerts (check journal).
Do NOT post "all clear" every 20 minutes; that would be noise.
### When there are findings
Post a summary grouped by priority:
matrix_send "supervisor" "Supervisor health check:
Fixed:
- <what was auto-fixed>
Alerts:
- [P2] <description>
- [P3] <description>
Status: RAM=<X>MB Disk=<Y>% Load=<Z>"
### When escalation is needed (P0-P2 unresolved)
Escalate with a clear call to action:
matrix_send "supervisor" "ESCALATE: <what's wrong and why you can't fix it>
Suggested action: <what the human should do>"
### Responding to escalation replies
If you acted on a human's reply, confirm what you did:
matrix_send "supervisor" "Acted on your reply: <summary of action taken>"
Keep messages concise. Do not post identical messages to what was posted
in the previous run (check journal for prior messages).
"""
needs = ["decide-actions"]
[[steps]]
id = "journal"
title = "Write health journal entry"
description = """
Append a timestamped entry to the supervisor journal.
File path:
$FACTORY_ROOT/supervisor/journal/$(date -u +%Y-%m-%d).md
If the file already exists (multiple runs per day), append a new section.
If it does not exist, create it.
Format:
## Supervisor run — HH:MM UTC
### Health status
- RAM: <X>MB available, Swap: <X>MB
- Disk: <X>%
- Load: <X>
- Docker: <N> containers
### Findings
- [P<N>] <finding> → <action taken or "escalated">
(or "No issues found — all systems healthy")
### Actions taken
- <what was fixed>
(or "No actions needed")
### Escalation replies processed
- <human said X, did Y>
(or "None")
Keep each entry concise: 15-25 lines max. This journal provides
run-to-run context so future supervisor runs can detect trends
(e.g., "disk has been >75% for 3 consecutive runs").
IMPORTANT: Do NOT commit or push the journal; it is a local working file.
The journal directory is committed to git periodically by other agents.
After writing the journal, write the phase signal:
echo 'PHASE:done' > "$PHASE_FILE"
"""
needs = ["report"]

supervisor/preflight.sh Executable file

@@ -0,0 +1,188 @@
#!/usr/bin/env bash
# =============================================================================
# preflight.sh — Collect system and project metrics for the supervisor formula
#
# Outputs structured text to stdout. Called by supervisor-run.sh before
# launching the Claude session. The output is injected into the prompt.
#
# Usage:
# bash supervisor/preflight.sh [projects/disinto.toml]
# =============================================================================
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
FACTORY_ROOT="$(dirname "$SCRIPT_DIR")"
export PROJECT_TOML="${1:-$FACTORY_ROOT/projects/disinto.toml}"
# shellcheck source=../lib/env.sh
source "$FACTORY_ROOT/lib/env.sh"
# shellcheck source=../lib/ci-helpers.sh
source "$FACTORY_ROOT/lib/ci-helpers.sh"
# ── System Resources ─────────────────────────────────────────────────────
echo "## System Resources"
_avail_mb=$(free -m | awk '/Mem:/{print $7}')
_total_mb=$(free -m | awk '/Mem:/{print $2}')
_swap_used=$(free -m | awk '/Swap:/{print $3}')
_disk_pct=$(df -h / | awk 'NR==2{print $5}' | tr -d '%')
_disk_used=$(df -h / | awk 'NR==2{print $3}')
_disk_total=$(df -h / | awk 'NR==2{print $2}')
_load=$(cat /proc/loadavg 2>/dev/null || echo "unknown")
echo "RAM: ${_avail_mb}MB available / ${_total_mb}MB total, Swap: ${_swap_used}MB used"
echo "Disk: ${_disk_pct}% used (${_disk_used}/${_disk_total} on /)"
echo "Load: ${_load}"
echo ""
# ── Docker ────────────────────────────────────────────────────────────────
echo "## Docker"
if command -v docker &>/dev/null; then
docker ps --format 'table {{.Names}}\t{{.Status}}' 2>/dev/null || echo "Docker query failed"
else
echo "Docker not available"
fi
echo ""
# ── Active Sessions + Phase Files ─────────────────────────────────────────
echo "## Active Sessions"
if tmux list-sessions 2>/dev/null; then
:
else
echo "No tmux sessions"
fi
echo ""
echo "## Phase Files"
_found_phase=false
for _pf in /tmp/*-session-*.phase; do
[ -f "$_pf" ] || continue
_found_phase=true
_phase_content=$(head -1 "$_pf" 2>/dev/null || echo "unreadable")
_phase_age_min=$(( ($(date +%s) - $(stat -c %Y "$_pf" 2>/dev/null || echo 0)) / 60 ))
echo " $(basename "$_pf"): ${_phase_content} (${_phase_age_min}min ago)"
done
[ "$_found_phase" = false ] && echo " None"
echo ""
# ── Lock Files ────────────────────────────────────────────────────────────
echo "## Lock Files"
_found_lock=false
for _lf in /tmp/*-poll.lock /tmp/*-run.lock /tmp/dev-agent-*.lock; do
[ -f "$_lf" ] || continue
_found_lock=true
_pid=$(cat "$_lf" 2>/dev/null || true)
_age_min=$(( ($(date +%s) - $(stat -c %Y "$_lf" 2>/dev/null || echo 0)) / 60 ))
_alive="dead"
[ -n "${_pid:-}" ] && kill -0 "$_pid" 2>/dev/null && _alive="alive"
echo " $(basename "$_lf"): PID=${_pid:-?} ${_alive} age=${_age_min}min"
done
[ "$_found_lock" = false ] && echo " None"
echo ""
# ── Agent Logs (last 15 lines each) ──────────────────────────────────────
echo "## Recent Agent Logs"
for _log in supervisor/supervisor.log dev/dev-agent.log review/review.log \
gardener/gardener.log planner/planner.log predictor/predictor.log \
action/action.log; do
_logpath="${FACTORY_ROOT}/${_log}"
if [ -f "$_logpath" ]; then
_log_age_min=$(( ($(date +%s) - $(stat -c %Y "$_logpath" 2>/dev/null || echo 0)) / 60 ))
echo "### ${_log} (last modified ${_log_age_min}min ago)"
tail -15 "$_logpath" 2>/dev/null || echo "(read failed)"
echo ""
fi
done
# ── CI Pipelines ──────────────────────────────────────────────────────────
echo "## CI Pipelines (${PROJECT_NAME})"
_recent_ci=$(wpdb -A -c "
SELECT number, status, branch,
ROUND(EXTRACT(EPOCH FROM (to_timestamp(finished) - to_timestamp(started)))/60)::int as dur_min
FROM pipelines
WHERE repo_id = ${WOODPECKER_REPO_ID}
AND finished > 0
AND to_timestamp(finished) > now() - interval '24 hours'
ORDER BY number DESC LIMIT 10;" 2>/dev/null || echo "CI database query failed")
echo "$_recent_ci"
_stuck=$(wpdb -c "
SELECT count(*) FROM pipelines
WHERE repo_id=${WOODPECKER_REPO_ID}
AND status='running'
AND EXTRACT(EPOCH FROM now() - to_timestamp(started)) > 1200;" 2>/dev/null | xargs || echo "?")
_pending=$(wpdb -c "
SELECT count(*) FROM pipelines
WHERE repo_id=${WOODPECKER_REPO_ID}
AND status='pending'
AND EXTRACT(EPOCH FROM now() - to_timestamp(created)) > 1800;" 2>/dev/null | xargs || echo "?")
echo "Stuck (>20min): ${_stuck}"
echo "Pending (>30min): ${_pending}"
echo ""
# ── Open PRs ──────────────────────────────────────────────────────────────
echo "## Open PRs (${PROJECT_NAME})"
_open_prs=$(codeberg_api GET "/pulls?state=open&limit=10" 2>/dev/null || echo "[]")
echo "$_open_prs" | jq -r '.[] | "#\(.number) [\(.head.ref)] \(.title) — updated \(.updated_at)"' 2>/dev/null || echo "No open PRs or query failed"
echo ""
# ── Backlog + In-Progress ─────────────────────────────────────────────────
echo "## Issue Status (${PROJECT_NAME})"
# limit=50 so that jq 'length' yields a real count (capped at one page); limit=1 could only ever report 0 or 1
_backlog_count=$(codeberg_api GET "/issues?state=open&labels=backlog&type=issues&limit=50" 2>/dev/null | jq 'length' 2>/dev/null || echo "?")
_in_progress_count=$(codeberg_api GET "/issues?state=open&labels=in-progress&type=issues&limit=50" 2>/dev/null | jq 'length' 2>/dev/null || echo "?")
_blocked_count=$(codeberg_api GET "/issues?state=open&labels=blocked&type=issues&limit=50" 2>/dev/null | jq 'length' 2>/dev/null || echo "?")
echo "Backlog: ${_backlog_count}, In-progress: ${_in_progress_count}, Blocked: ${_blocked_count}"
echo ""
# ── Stale Worktrees ───────────────────────────────────────────────────────
echo "## Stale Worktrees"
_found_wt=false
for _wt in /tmp/*-worktree-* /tmp/*-review-*; do
[ -d "$_wt" ] || continue
_found_wt=true
_wt_age_min=$(( ($(date +%s) - $(stat -c %Y "$_wt" 2>/dev/null || echo 0)) / 60 ))
echo " $(basename "$_wt"): ${_wt_age_min}min old"
done
[ "$_found_wt" = false ] && echo " None"
echo ""
# ── Pending Escalations ──────────────────────────────────────────────────
echo "## Pending Escalations"
_found_esc=false
for _esc_file in "${FACTORY_ROOT}/supervisor/escalations-"*.jsonl; do
[ -f "$_esc_file" ] || continue
[[ "$_esc_file" == *.done.jsonl ]] && continue
_esc_count=$(wc -l < "$_esc_file" 2>/dev/null || echo 0)
[ "${_esc_count:-0}" -gt 0 ] || continue
_found_esc=true
echo "### $(basename "$_esc_file") (${_esc_count} entries)"
cat "$_esc_file"
echo ""
done
[ "$_found_esc" = false ] && echo " None"
echo ""
# ── Escalation Replies from Matrix ────────────────────────────────────────
echo "## Escalation Replies (from Matrix)"
if [ -s /tmp/supervisor-escalation-reply ]; then
cat /tmp/supervisor-escalation-reply
echo ""
echo "(This reply will be consumed after this run)"
else
echo " None"
fi
echo ""
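
The age calculation repeated for phase files, lock files, logs, and worktrees in preflight.sh could be factored into one helper. The sketch below is an illustration only: `age_min` is a hypothetical name, and it deliberately differs from the script above by returning a failure when `stat` fails instead of falling back to an mtime of 0 (which would report the age since the epoch).

```bash
#!/usr/bin/env bash
# Hypothetical helper: age of a file or directory in whole minutes, based on mtime.
# Relies on GNU stat (-c %Y), as preflight.sh does.
age_min() {
  local path=$1 now mtime
  now=$(date +%s)
  # Fail instead of guessing when the path cannot be inspected.
  mtime=$(stat -c %Y "$path" 2>/dev/null) || return 1
  echo $(( (now - mtime) / 60 ))
}
```

A caller would use it as `_age=$(age_min "$_lf") || _age="?"`, which keeps an unreadable file from showing up as millions of minutes old.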

supervisor/supervisor-run.sh Executable file

@@ -0,0 +1,115 @@
#!/usr/bin/env bash
# =============================================================================
# supervisor-run.sh — Cron wrapper: supervisor execution via Claude + formula
#
# Runs every 20 minutes (or on-demand). Guards against concurrent runs and
# low memory. Collects metrics via preflight.sh, then creates a tmux session
# with Claude (sonnet) reading formulas/run-supervisor.toml.
#
# Replaces supervisor-poll.sh (bash orchestrator + claude -p one-shot) with
# formula-driven interactive Claude session matching the planner/predictor
# pattern.
#
# Usage:
# supervisor-run.sh [projects/disinto.toml] # project config (default: disinto)
#
# Cron: */20 * * * * cd /path/to/dark-factory && bash supervisor/supervisor-run.sh
# =============================================================================
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
FACTORY_ROOT="$(dirname "$SCRIPT_DIR")"
# Accept project config from argument; default to disinto
export PROJECT_TOML="${1:-$FACTORY_ROOT/projects/disinto.toml}"
# shellcheck source=../lib/env.sh
source "$FACTORY_ROOT/lib/env.sh"
# shellcheck source=../lib/agent-session.sh
source "$FACTORY_ROOT/lib/agent-session.sh"
# shellcheck source=../lib/formula-session.sh
source "$FACTORY_ROOT/lib/formula-session.sh"
LOG_FILE="$SCRIPT_DIR/supervisor.log"
# shellcheck disable=SC2034 # consumed by run_formula_and_monitor
SESSION_NAME="supervisor-${PROJECT_NAME}"
PHASE_FILE="/tmp/supervisor-session-${PROJECT_NAME}.phase"
# shellcheck disable=SC2034 # read by monitor_phase_loop in lib/agent-session.sh
PHASE_POLL_INTERVAL=15
SCRATCH_FILE="/tmp/supervisor-${PROJECT_NAME}-scratch.md"
log() { echo "[$(date -u +%Y-%m-%dT%H:%M:%S)Z] $*" >> "$LOG_FILE"; }
# ── Guards ────────────────────────────────────────────────────────────────
acquire_cron_lock "/tmp/supervisor-run.lock"
check_memory 2000
log "--- Supervisor run start ---"
# ── Collect pre-flight metrics ────────────────────────────────────────────
log "Running preflight.sh"
PREFLIGHT_OUTPUT=""
if PREFLIGHT_OUTPUT=$(bash "$SCRIPT_DIR/preflight.sh" "$PROJECT_TOML" 2>&1); then
log "Preflight collected ($(echo "$PREFLIGHT_OUTPUT" | wc -l) lines)"
else
log "WARNING: preflight.sh failed, continuing with partial data"
fi
# ── Consume escalation replies ────────────────────────────────────────────
# Move the file atomically so matrix_listener can write a new one
ESCALATION_REPLY=""
if [ -s /tmp/supervisor-escalation-reply ]; then
_reply_tmp="/tmp/supervisor-escalation-reply.consumed.$$"
if mv /tmp/supervisor-escalation-reply "$_reply_tmp" 2>/dev/null; then
ESCALATION_REPLY=$(cat "$_reply_tmp")
rm -f "$_reply_tmp"
log "Consumed escalation reply: $(echo "$ESCALATION_REPLY" | head -1)"
fi
fi
# ── Load formula + context ───────────────────────────────────────────────
load_formula "$FACTORY_ROOT/formulas/run-supervisor.toml"
build_context_block AGENTS.md
# ── Read scratch file (compaction survival) ───────────────────────────────
SCRATCH_CONTEXT=$(read_scratch_context "$SCRATCH_FILE")
SCRATCH_INSTRUCTION=$(build_scratch_instruction "$SCRATCH_FILE")
# ── Build prompt ─────────────────────────────────────────────────────────
build_prompt_footer
# shellcheck disable=SC2034 # consumed by run_formula_and_monitor
PROMPT="You are the supervisor agent for ${CODEBERG_REPO}. Work through the formula below. You MUST write PHASE:done to '${PHASE_FILE}' when finished — the orchestrator will time you out if you return to the prompt without signalling.
You have full shell access and --dangerously-skip-permissions.
Fix what you can. Escalate what you cannot. Do NOT ask permission — act first, report after.
## Pre-flight metrics (collected $(date -u +%H:%M) UTC)
${PREFLIGHT_OUTPUT}
${ESCALATION_REPLY:+
## Escalation Reply (from Matrix — human message)
${ESCALATION_REPLY}
Act on this reply in the decide-actions step.
}
## Project context
${CONTEXT_BLOCK}
${SCRATCH_CONTEXT:+${SCRATCH_CONTEXT}
}
## Formula
${FORMULA_CONTENT}
${SCRATCH_INSTRUCTION}
${PROMPT_FOOTER}"
# ── Run session ──────────────────────────────────────────────────────────
export CLAUDE_MODEL="sonnet"
run_formula_and_monitor "supervisor" 1200
# ── Cleanup scratch file on normal exit ──────────────────────────────────
# FINAL_PHASE already set by run_formula_and_monitor
if [ "${FINAL_PHASE:-}" = "PHASE:done" ]; then
rm -f "$SCRATCH_FILE"
fi
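
The prompt built in supervisor-run.sh uses bash's `${VAR:+word}` expansion so that the escalation section appears only when a reply was consumed. A reduced sketch of that technique, with a hypothetical `build_prompt` function and shortened section titles:

```bash
#!/usr/bin/env bash
# Sketch: build a prompt whose optional section appears only when a reply exists.
# build_prompt and its section titles are illustrative, not the repo's code.
build_prompt() {
  local metrics=$1 reply=${2:-}
  # ${reply:+...} expands to the enclosed text only if reply is set and non-empty.
  printf '%s\n' "## Pre-flight metrics
${metrics}
${reply:+
## Escalation Reply
${reply}
}
## Formula"
}

build_prompt "RAM: 4096MB available" "kill that agent"
```

When the second argument is empty, the whole block between `${reply:+` and the closing `}` disappears, so the prompt never carries an empty "Escalation Reply" heading.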