fix: Remove escalation — planner routes through vault instead (#721)

Remove ESCALATED signal and escalation handling from planner, supervisor,
and gardener. When blocked on external resources or human decisions, these
agents now file vault procurement items (vault/pending/*.md) instead of
escalating directly to the human.

Changes:
- Planner formula: ESCALATED signal replaced with HUMAN_BLOCKED; files
  vault items and marks prerequisites as blocked-on-vault
- Supervisor formula/prompt: escalation sections replaced with vault item
  filing; preflight now reports pending vault items instead of escalation
  replies
- Gardener formula: ESCALATE action replaced with VAULT action; files
  vault/pending/*.md for human decisions
- Groom-backlog formula: same ESCALATE→VAULT replacement
- Gardener shell: PHASE:escalate replaced with PHASE:failed for merge
  blocks and CI exhaustion; escalation reply consumption removed
- Supervisor shell: escalation reply consumption removed from both
  supervisor-run.sh and legacy supervisor-poll.sh
- Prerequisite tree: #466 updated from "escalated" to "blocked-on-vault"

The vault is the factory's only interface to the human for resources and
approvals. Dev/action agents retain PHASE:escalate for operational session
issues (CI timeouts, merge blocks) which are a different mechanism.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
openhands 2026-03-26 09:09:58 +00:00
parent 850a8d743f
commit f2064ba67c
11 changed files with 117 additions and 113 deletions

View file

@ -96,7 +96,7 @@ The dev-agent is completely starved until they are promoted or resolved.
For each tier-0 issue:
- Read the full body: curl -sf -H "Authorization: token $FORGE_TOKEN" "$FORGE_API/issues/{number}"
- If resolvable: promote to backlog add acceptance criteria, affected files, relabel
- If needs human decision: add to ESCALATE block
- If needs human decision: file a vault procurement item (vault/pending/<id>.md)
- If invalid / wontfix: close with explanation comment
After completing all tier-0, re-fetch to check for new blockers:
@ -135,8 +135,16 @@ DUPLICATE (>80% overlap after reading both bodies — confirm before closing):
Close: curl -X PATCH ... /issues/NNN -d '{"state":"closed"}'
Write: echo "ACTION: closed #NNN as duplicate of #OLDER" >> "$RESULT_FILE"
ESCALATE (ambiguous scope, architectural question, needs human decision):
Collect into the ESCALATE block written to the result file at the end.
VAULT (ambiguous scope, architectural question, needs human decision):
File a vault procurement item at $PROJECT_REPO_ROOT/vault/pending/<id>.md:
# <What decision or resource is needed>
## What
<description>
## Why
<which issue this unblocks>
## Unblocks
- #NNN — <title>
Log: echo "VAULT: filed vault/pending/<id>.md for #NNN — <reason>" >> "$RESULT_FILE"
Dust vs ore rules:
Dust: comment fix, variable rename, whitespace/formatting, single-line edit, trivial cleanup with no behavior change
@ -179,7 +187,7 @@ Re-fetch ALL open tech-debt issues and count them:
Check each tier:
tier-0 count == 0 (HARD REQUIREMENT factory is blocked until zero)
tier-1 all processed or escalated
tier-1 all processed or routed to vault
tier-2 all classified
If tier-0 > 0:
@ -195,8 +203,7 @@ If all tiers clear, write the completion summary and signal done:
echo "ACTION: grooming complete — 0 tech-debt remaining" >> "$RESULT_FILE"
echo 'PHASE:done' > "$PHASE_FILE"
Escalation format (for items needing human decision write to result file):
printf 'ESCALATE\n1. #NNN "title" — reason (a) option1 (b) option2 (c) option3\n' >> "$RESULT_FILE"
Vault items filed during this run are picked up by vault-poll automatically.
On unrecoverable error (API unavailable, repeated failures):
printf 'PHASE:failed\nReason: %s\n' 'describe what failed' > "$PHASE_FILE"

View file

@ -119,8 +119,16 @@ DUST (trivial — single-line edit, rename, comment, style, whitespace):
Do NOT close dust issues the dust-bundling step auto-bundles groups
of 3+ into one backlog issue.
ESCALATE (needs human decision):
printf 'ESCALATE\n1. #NNN "title" — reason (a) option1 (b) option2\n' >> "$RESULT_FILE"
VAULT (needs human decision or external resource):
File a vault procurement item at $PROJECT_REPO_ROOT/vault/pending/<id>.md:
# <What decision or resource is needed>
## What
<description>
## Why
<which issue this unblocks>
## Unblocks
- #NNN — <title>
Log: echo "VAULT: filed vault/pending/<id>.md for #NNN — <reason>" >> "$RESULT_FILE"
CLEAN (only if truly nothing to do):
echo 'CLEAN' >> "$RESULT_FILE"
@ -150,7 +158,7 @@ Sibling dependency rule (CRITICAL):
Only close for clear, unambiguous violations. If the issue is
borderline or could be interpreted as compatible, leave it open
and ESCALATE instead.
and file a VAULT item for human decision instead.
8. Quality gate backlog label enforcement:
For each open issue labeled 'backlog', verify it has the required
@ -178,7 +186,7 @@ Processing order:
2. AD alignment check close backlog issues that violate architecture decisions
3. Quality gate strip backlog from issues missing acceptance criteria or affected files
4. Process tech-debt issues by score (impact/effort)
5. Classify remaining items as dust or escalate
5. Classify remaining items as dust or route to vault
Do NOT bundle dust yourself the dust-bundling step handles accumulation,
dedup, TTL expiry, and bundling into backlog issues.

View file

@ -123,8 +123,9 @@ Update the tree:
Bounce/stuck detection for issues in the tree, fetch recent comments:
curl -sf -H "Authorization: token $FORGE_TOKEN" \
"$FORGE_API/issues/<number>/comments?limit=10"
Signals: BOUNCED (too_large, underspecified), ESCALATED (needs human decision),
Signals: BOUNCED (too_large, underspecified),
LABEL_CHURN (3+ relabels between backlog/underspecified).
If an issue needs a human decision or external resource, it is HUMAN_BLOCKED.
Track as stuck_issues[] for constraint filing below.
Hold the updated tree in memory written to disk in journal-and-commit.
@ -148,7 +149,17 @@ Graph bottlenecks (high betweenness centrality) and thin objectives inform ranki
Stuck issue handling:
- BOUNCED/LABEL_CHURN: do NOT re-promote. Dispatch groom-backlog formula instead:
tea_file_issue "chore: break down #<N> — bounced <count>x" "<body>" "action"
- ESCALATED: skip, mark in tree as "escalated — awaiting human decision"
- HUMAN_BLOCKED (needs human decision or external resource): file a vault
procurement item instead of skipping. Write vault/pending/<resource-id>.md:
# <What is needed>
## What
<description of the resource or decision needed>
## Why
<which objective/issue this unblocks>
## Unblocks
- #<issue> — <title>
Then mark the prerequisite in the tree as "blocked-on-vault (vault/pending/<id>.md)".
Do NOT skip or mark as "awaiting human decision" the vault owns the human interface.
Filing gate (for non-stuck constraints):
1. Check if issue already exists (match by #number in tree or title search)

View file

@ -9,7 +9,7 @@
# Key differences from planner/gardener:
# - Runs every 20min — lightweight health check
# - Primarily READS state, rarely WRITES (no PRs, just Matrix + journal)
# - Reactive to escalations — processes pending escalation events
# - Checks vault state for pending procurement items
# - Conversation memory via Matrix thread and journal
name = "run-supervisor"
@ -29,14 +29,14 @@ and injected into your prompt above. Review them now.
1. Read the injected metrics data carefully (System Resources, Docker,
Active Sessions, Phase Files, Stale Phase Cleanup, Lock Files, Agent Logs,
CI Pipelines, Open PRs, Issue Status, Stale Worktrees, Pending Escalations,
Escalation Replies).
CI Pipelines, Open PRs, Issue Status, Stale Worktrees).
Note: preflight.sh auto-removes PHASE:escalate files for closed issues
(24h grace period). Check the "Stale Phase Cleanup" section for any
files cleaned or in grace period this run.
2. If there are escalation replies from Matrix (human messages), note them
you will act on them in the decide-actions step.
2. Check vault state: read vault/pending/*.md for any procurement items
the planner has filed. Note items relevant to the health assessment
(e.g. a blocked resource that explains why the pipeline is stalled).
3. Read the supervisor journal for recent history:
JOURNAL_FILE="$FACTORY_ROOT/supervisor/journal/$(date -u +%Y-%m-%d).md"
@ -70,9 +70,9 @@ Categorize every finding from the metrics into priority levels.
- Git repo on wrong branch or in broken rebase state
- Pipeline stalled: backlog issues exist but no agent ran for > 20min
- Dev-agent blocked: last N polls all report "no ready issues"
- Dev/action sessions in PHASE:escalate for > 24h (escalation timeout)
- Dev/action sessions in PHASE:escalate for > 24h (session timeout)
(Note: PHASE:escalate files for closed issues are auto-cleaned by preflight;
this check covers escalations where the issue is still open)
this check covers sessions where the issue is still open)
### P3 — Factory degraded
- PRs stale: CI finished >20min ago AND no git push to the PR branch since CI completed
@ -92,7 +92,7 @@ needs = ["preflight"]
[[steps]]
id = "decide-actions"
title = "Fix what you can, escalate what you cannot"
title = "Fix what you can, file vault items for what you cannot"
description = """
For each finding from the health assessment, decide and execute an action.
@ -145,20 +145,21 @@ For each finding from the health assessment, decide and execute an action.
tmux send-keys -t "$SESSION" "# [supervisor] PR stale >20min — CI finished, please push or update" Enter
fi
If no active tmux session exists, note it in the journal for the next dev-poll cycle.
Do NOT escalate stale PRs to Matrix unless they remain stale for >3 consecutive runs.
Do NOT file vault items for stale PRs unless they remain stale for >3 consecutive runs.
### Escalation replies (from Matrix)
If there are escalation replies from a human, act on them:
- "ignore X" note in journal, do not alert on X this run
- "kill that agent" identify and kill the referenced session
- "what's stuck?" include detailed status in the Matrix report
- Other instructions follow them, use best judgment
### Cannot auto-fix → escalate
### Cannot auto-fix → file vault item
For P0-P2 issues that persist after auto-fix attempts, or issues requiring
human judgment, prepare an escalation message for the report step.
human judgment, file a vault procurement item:
Write $PROJECT_REPO_ROOT/vault/pending/supervisor-<issue-slug>.md:
# <What is needed>
## What
<description of the problem and why the supervisor cannot fix it>
## Why
<impact on factory health reference the priority level>
## Unblocks
- Factory health: <what this resolves>
The vault-poll will notify the human and track the request.
Read the relevant best-practices file before taking action:
cat "$FACTORY_ROOT/supervisor/best-practices/memory.md" # P0
@ -167,7 +168,7 @@ Read the relevant best-practices file before taking action:
cat "$FACTORY_ROOT/supervisor/best-practices/dev-agent.md" # P2 agent
cat "$FACTORY_ROOT/supervisor/best-practices/git.md" # P2 git
Track what you fixed and what needs escalation for the report step.
Track what you fixed and what vault items you filed for the report step.
"""
needs = ["health-assessment"]
@ -196,15 +197,14 @@ Post a summary grouped by priority:
Status: RAM=<X>MB Disk=<Y>% Load=<Z>"
### When escalation is needed (P0-P2 unresolved)
Escalate with a clear call to action:
matrix_send "supervisor" "ESCALATE: <what's wrong and why you can't fix it>
### When vault items were filed (P0-P2 unresolved)
Note the vault items in the status summary:
matrix_send "supervisor" "Supervisor health check:
Suggested action: <what the human should do>"
Filed vault items:
- vault/pending/<id>.md <summary>
### Responding to escalation replies
If you acted on a human's reply, confirm what you did:
matrix_send "supervisor" "Acted on your reply: <summary of action taken>"
Status: RAM=<X>MB Disk=<Y>% Load=<Z>"
Keep messages concise. Do not post identical messages to what was posted
in the previous run (check journal for prior messages).
@ -233,15 +233,15 @@ Format:
- Docker: <N> containers
### Findings
- [P<N>] <finding> <action taken or "escalated">
- [P<N>] <finding> <action taken or "filed vault item">
(or "No issues found — all systems healthy")
### Actions taken
- <what was fixed>
(or "No actions needed")
### Escalation replies processed
- <human said X, did Y>
### Vault items filed
- vault/pending/<id>.md <reason>
(or "None")
Keep each entry concise 15-25 lines max. This journal provides