fix: Remove escalation — planner routes through vault instead (#721)

Remove ESCALATED signal and escalation handling from planner, supervisor,
and gardener. When blocked on external resources or human decisions, these
agents now file vault procurement items (vault/pending/*.md) instead of
escalating directly to the human.

Changes:
- Planner formula: ESCALATED signal replaced with HUMAN_BLOCKED; files
  vault items and marks prerequisites as blocked-on-vault
- Supervisor formula/prompt: escalation sections replaced with vault item
  filing; preflight now reports pending vault items instead of escalation
  replies
- Gardener formula: ESCALATE action replaced with VAULT action; files
  vault/pending/*.md for human decisions
- Groom-backlog formula: same ESCALATE→VAULT replacement
- Gardener shell: PHASE:escalate replaced with PHASE:failed for merge
  blocks and CI exhaustion; escalation reply consumption removed
- Supervisor shell: escalation reply consumption removed from both
  supervisor-run.sh and legacy supervisor-poll.sh
- Prerequisite tree: #466 updated from "escalated" to "blocked-on-vault"

The vault is the factory's only interface to the human for resources and
approvals. Dev/action agents retain PHASE:escalate for operational session
issues (CI timeouts, merge blocks) which are a different mechanism.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
openhands 2026-03-26 09:09:58 +00:00
parent 850a8d743f
commit f2064ba67c
11 changed files with 117 additions and 113 deletions

View file

@ -53,7 +53,7 @@ stuck. Your job: figure out the correct dependency direction and fix the wrong o
actually depends on which
3. Edit the issue that has the incorrect dep to remove the `#NNN` reference from its
`## Dependencies` section (replace with `- None` if it was the only dep)
4. If the correct direction is unclear from code, escalate with both issue summaries
4. If the correct direction is unclear from code, file a vault item with both issue summaries
Use the forge API to edit issue bodies:
```bash
@ -70,25 +70,35 @@ obsolete or misprioritized. Investigate:
1. Check if dep #B is still relevant (read its body, check if the code it targets changed)
2. If the dep is obsolete → remove it from #A's `## Dependencies` section
3. If the dep is still needed → escalate, suggesting to prioritize #B or split #A
3. If the dep is still needed → file a vault item, suggesting to prioritize #B or split #A
### Dev-agent blocked (P2)
When you see "Dev-agent blocked: last N polls all report 'no ready issues'":
1. Check if circular deps exist (they'll appear as separate P3 alerts)
2. Check if all backlog issues depend on a single unmerged issue — if so, escalate
to prioritize that blocker
3. If no clear blocker, escalate with the list of blocked issues and their deps
2. Check if all backlog issues depend on a single unmerged issue — if so, file a vault
item to prioritize that blocker
3. If no clear blocker, file a vault item with the list of blocked issues and their deps
## Escalation
## When you cannot fix it
If you can't fix it, escalate via Matrix:
File a vault procurement item so the human is notified through the vault:
```bash
source ${FACTORY_ROOT}/lib/env.sh
matrix_send "supervisor" "🏭 ESCALATE: <what's wrong and why you can't fix it>"
cat > "${PROJECT_REPO_ROOT}/vault/pending/supervisor-$(date -u +%Y%m%d-%H%M)-issue.md" <<'VAULT_EOF'
# <What is needed>
## What
<description of the problem and why the supervisor cannot fix it>
## Why
<impact on factory health>
## Unblocks
- Factory health: <what this resolves>
VAULT_EOF
```
Do NOT escalate if you can fix it. Do NOT ask permission. Fix first, report after.
The vault-poll will notify the human and track the request.
Do NOT talk to the human directly. The vault is the factory's only interface
to the human for resources and approvals. Fix first, report after.
## Output
@ -97,7 +107,7 @@ FIXED: <what you did>
```
or
```
ESCALATE: <what's wrong>
VAULT: filed vault/pending/<id>.md — <what's needed>
```
## Learning

View file

@ -214,14 +214,15 @@ else
fi
echo ""
# ── Escalation Replies from Matrix ────────────────────────────────────────
# ── Pending Vault Items ───────────────────────────────────────────────────
echo "## Escalation Replies (from Matrix)"
if [ -s /tmp/supervisor-escalation-reply ]; then
cat /tmp/supervisor-escalation-reply
echo ""
echo "(Reply already consumed by supervisor-run.sh before this session)"
else
echo " None"
fi
echo "## Pending Vault Items"
_found_vault=false
for _vf in "${PROJECT_REPO_ROOT}/vault/pending/"*.md; do
[ -f "$_vf" ] || continue
_found_vault=true
_vtitle=$(grep -m1 '^# ' "$_vf" | sed 's/^# //' || basename "$_vf")
echo " $(basename "$_vf"): ${_vtitle}"
done
[ "$_found_vault" = false ] && echo " None"
echo ""

View file

@ -80,14 +80,6 @@ status() {
flog "$*"
}
# ── Check for escalation replies from Matrix ──────────────────────────────
ESCALATION_REPLY=""
if [ -s /tmp/supervisor-escalation-reply ]; then
ESCALATION_REPLY=$(cat /tmp/supervisor-escalation-reply)
rm -f /tmp/supervisor-escalation-reply
flog "Got escalation reply: $(echo "$ESCALATION_REPLY" | head -1)"
fi
# Alerts by priority
P0_ALERTS=""
P1_ALERTS=""
@ -813,13 +805,7 @@ Disk: $(df -h / | awk 'NR==2{printf "%s used of %s (%s)", $3, $2, $5}')
Docker: $(sudo docker ps --format '{{.Names}}' 2>/dev/null | wc -l) containers running
Claude procs: $(pgrep -f "claude" 2>/dev/null | wc -l)
$(if [ -n "$ESCALATION_REPLY" ]; then echo "
## Human Response to Previous Escalation
${ESCALATION_REPLY}
Act on this response."; fi)
Fix what you can. Escalate what you can't. Read the relevant best-practices file first."
Fix what you can. File vault items for what you can't. Read the relevant best-practices file first."
CLAUDE_OUTPUT=$(timeout 300 claude -p --model sonnet --dangerously-skip-permissions \
"$CLAUDE_PROMPT" 2>&1) || true

View file

@ -59,9 +59,6 @@ else
log "WARNING: preflight.sh failed, continuing with partial data"
fi
# ── Consume escalation replies ────────────────────────────────────────────
consume_escalation_reply "supervisor"
# ── Load formula + context ───────────────────────────────────────────────
load_formula "$FACTORY_ROOT/formulas/run-supervisor.toml"
build_context_block AGENTS.md
@ -77,16 +74,11 @@ build_prompt_footer
PROMPT="You are the supervisor agent for ${FORGE_REPO}. Work through the formula below. You MUST write PHASE:done to '${PHASE_FILE}' when finished — the orchestrator will time you out if you return to the prompt without signalling.
You have full shell access and --dangerously-skip-permissions.
Fix what you can. Escalate what you cannot. Do NOT ask permission — act first, report after.
Fix what you can. File vault items for what you cannot. Do NOT ask permission — act first, report after.
## Pre-flight metrics (collected $(date -u +%H:%M) UTC)
${PREFLIGHT_OUTPUT}
${ESCALATION_REPLY:+
## Escalation Reply (from Matrix — human message)
${ESCALATION_REPLY}
Act on this reply in the decide-actions step.
}
## Project context
${CONTEXT_BLOCK}
${SCRATCH_CONTEXT:+${SCRATCH_CONTEXT}