fix: Remove escalation — planner routes through vault instead (#721)
Remove ESCALATED signal and escalation handling from planner, supervisor, and gardener. When blocked on external resources or human decisions, these agents now file vault procurement items (vault/pending/*.md) instead of escalating directly to the human.

Changes:
- Planner formula: ESCALATED signal replaced with HUMAN_BLOCKED; files vault items and marks prerequisites as blocked-on-vault
- Supervisor formula/prompt: escalation sections replaced with vault item filing; preflight now reports pending vault items instead of escalation replies
- Gardener formula: ESCALATE action replaced with VAULT action; files vault/pending/*.md for human decisions
- Groom-backlog formula: same ESCALATE→VAULT replacement
- Gardener shell: PHASE:escalate replaced with PHASE:failed for merge blocks and CI exhaustion; escalation reply consumption removed
- Supervisor shell: escalation reply consumption removed from both supervisor-run.sh and legacy supervisor-poll.sh
- Prerequisite tree: #466 updated from "escalated" to "blocked-on-vault"

The vault is the factory's only interface to the human for resources and approvals. Dev/action agents retain PHASE:escalate for operational session issues (CI timeouts, merge blocks) which are a different mechanism.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
parent
850a8d743f
commit
f2064ba67c
11 changed files with 117 additions and 113 deletions
|
|
@ -96,7 +96,7 @@ The dev-agent is completely starved until they are promoted or resolved.
|
|||
For each tier-0 issue:
|
||||
- Read the full body: curl -sf -H "Authorization: token $FORGE_TOKEN" "$FORGE_API/issues/{number}"
|
||||
- If resolvable: promote to backlog — add acceptance criteria, affected files, relabel
|
||||
- If needs human decision: add to ESCALATE block
|
||||
- If needs human decision: file a vault procurement item (vault/pending/<id>.md)
|
||||
- If invalid / wontfix: close with explanation comment
|
||||
|
||||
After completing all tier-0, re-fetch to check for new blockers:
|
||||
|
|
@ -135,8 +135,16 @@ DUPLICATE (>80% overlap after reading both bodies — confirm before closing):
|
|||
Close: curl -X PATCH ... /issues/NNN -d '{"state":"closed"}'
|
||||
Write: echo "ACTION: closed #NNN as duplicate of #OLDER" >> "$RESULT_FILE"
|
||||
|
||||
ESCALATE (ambiguous scope, architectural question, needs human decision):
|
||||
Collect into the ESCALATE block written to the result file at the end.
|
||||
VAULT (ambiguous scope, architectural question, needs human decision):
|
||||
File a vault procurement item at $PROJECT_REPO_ROOT/vault/pending/<id>.md:
|
||||
# <What decision or resource is needed>
|
||||
## What
|
||||
<description>
|
||||
## Why
|
||||
<which issue this unblocks>
|
||||
## Unblocks
|
||||
- #NNN — <title>
|
||||
Log: echo "VAULT: filed vault/pending/<id>.md for #NNN — <reason>" >> "$RESULT_FILE"
|
||||
|
||||
Dust vs ore rules:
|
||||
Dust: comment fix, variable rename, whitespace/formatting, single-line edit, trivial cleanup with no behavior change
|
||||
|
|
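The VAULT action in the hunk above describes the procurement item as a small Markdown file with `What`, `Why`, and `Unblocks` sections plus a log line in the result file. A minimal sketch of that step as a shell helper follows. The function name `file_vault_item`, its argument order, and the skip-if-exists behaviour are assumptions made for illustration; only the file location, the section headings, and the log format come from the formula text.

```shell
#!/usr/bin/env bash
# Hypothetical helper: the name, argument order, and skip-if-exists rule are
# assumptions for illustration and are not part of the commit.
# Usage: file_vault_item <id> <title> <what> <why> <issue-number> <issue-title>
file_vault_item() {
  local id="$1" title="$2" what="$3" why="$4" issue="$5" issue_title="$6"
  local dir="${PROJECT_REPO_ROOT:?PROJECT_REPO_ROOT must be set}/vault/pending"
  local file="$dir/$id.md"

  mkdir -p "$dir"

  # Assumption: an item that is already pending is left untouched.
  if [ -e "$file" ]; then
    echo "vault/pending/$id.md already exists, skipping" >&2
    return 0
  fi

  # Sections follow the template given in the formula text.
  printf '# %s\n## What\n%s\n## Why\n%s\n## Unblocks\n- #%s — %s\n' \
    "$title" "$what" "$why" "$issue" "$issue_title" > "$file"

  # Log line in the format the formula prescribes; falls back to /dev/null
  # when RESULT_FILE is not set.
  echo "VAULT: filed vault/pending/$id.md for #$issue — $title" \
    >> "${RESULT_FILE:-/dev/null}"
}
```

A call might look like `file_vault_item demo-466 "Decide demo approach" "external repo vs local demo" "unblocks the example project" 466 "Example project"`. How `<id>` is chosen is not specified in the diff, so the caller supplies it here.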
@ -179,7 +187,7 @@ Re-fetch ALL open tech-debt issues and count them:
|
|||
|
||||
Check each tier:
|
||||
tier-0 count == 0 (HARD REQUIREMENT — factory is blocked until zero)
|
||||
tier-1 all processed or escalated
|
||||
tier-1 all processed or routed to vault
|
||||
tier-2 all classified
|
||||
|
||||
If tier-0 > 0:
|
||||
|
|
@ -195,8 +203,7 @@ If all tiers clear, write the completion summary and signal done:
|
|||
echo "ACTION: grooming complete — 0 tech-debt remaining" >> "$RESULT_FILE"
|
||||
echo 'PHASE:done' > "$PHASE_FILE"
|
||||
|
||||
Escalation format (for items needing human decision — write to result file):
|
||||
printf 'ESCALATE\n1. #NNN "title" — reason (a) option1 (b) option2 (c) option3\n' >> "$RESULT_FILE"
|
||||
Vault items filed during this run are picked up by vault-poll automatically.
|
||||
|
||||
On unrecoverable error (API unavailable, repeated failures):
|
||||
printf 'PHASE:failed\nReason: %s\n' 'describe what failed' > "$PHASE_FILE"
|
||||
|
|
|
|||
|
|
@ -119,8 +119,16 @@ DUST (trivial — single-line edit, rename, comment, style, whitespace):
|
|||
Do NOT close dust issues — the dust-bundling step auto-bundles groups
|
||||
of 3+ into one backlog issue.
|
||||
|
||||
ESCALATE (needs human decision):
|
||||
printf 'ESCALATE\n1. #NNN "title" — reason (a) option1 (b) option2\n' >> "$RESULT_FILE"
|
||||
VAULT (needs human decision or external resource):
|
||||
File a vault procurement item at $PROJECT_REPO_ROOT/vault/pending/<id>.md:
|
||||
# <What decision or resource is needed>
|
||||
## What
|
||||
<description>
|
||||
## Why
|
||||
<which issue this unblocks>
|
||||
## Unblocks
|
||||
- #NNN — <title>
|
||||
Log: echo "VAULT: filed vault/pending/<id>.md for #NNN — <reason>" >> "$RESULT_FILE"
|
||||
|
||||
CLEAN (only if truly nothing to do):
|
||||
echo 'CLEAN' >> "$RESULT_FILE"
|
||||
|
|
@ -150,7 +158,7 @@ Sibling dependency rule (CRITICAL):
|
|||
|
||||
Only close for clear, unambiguous violations. If the issue is
|
||||
borderline or could be interpreted as compatible, leave it open
|
||||
and ESCALATE instead.
|
||||
and file a VAULT item for human decision instead.
|
||||
|
||||
8. Quality gate — backlog label enforcement:
|
||||
For each open issue labeled 'backlog', verify it has the required
|
||||
|
|
@ -178,7 +186,7 @@ Processing order:
|
|||
2. AD alignment check — close backlog issues that violate architecture decisions
|
||||
3. Quality gate — strip backlog from issues missing acceptance criteria or affected files
|
||||
4. Process tech-debt issues by score (impact/effort)
|
||||
5. Classify remaining items as dust or escalate
|
||||
5. Classify remaining items as dust or route to vault
|
||||
|
||||
Do NOT bundle dust yourself — the dust-bundling step handles accumulation,
|
||||
dedup, TTL expiry, and bundling into backlog issues.
|
||||
|
|
|
|||
|
|
@ -123,8 +123,9 @@ Update the tree:
|
|||
Bounce/stuck detection — for issues in the tree, fetch recent comments:
|
||||
curl -sf -H "Authorization: token $FORGE_TOKEN" \
|
||||
"$FORGE_API/issues/<number>/comments?limit=10"
|
||||
Signals: BOUNCED (too_large, underspecified), ESCALATED (needs human decision),
|
||||
Signals: BOUNCED (too_large, underspecified),
|
||||
LABEL_CHURN (3+ relabels between backlog/underspecified).
|
||||
If an issue needs a human decision or external resource, it is HUMAN_BLOCKED.
|
||||
Track as stuck_issues[] for constraint filing below.
|
||||
|
||||
Hold the updated tree in memory — written to disk in journal-and-commit.
|
||||
|
|
@ -148,7 +149,17 @@ Graph bottlenecks (high betweenness centrality) and thin objectives inform ranki
|
|||
Stuck issue handling:
|
||||
- BOUNCED/LABEL_CHURN: do NOT re-promote. Dispatch groom-backlog formula instead:
|
||||
tea_file_issue "chore: break down #<N> — bounced <count>x" "<body>" "action"
|
||||
- ESCALATED: skip, mark in tree as "escalated — awaiting human decision"
|
||||
- HUMAN_BLOCKED (needs human decision or external resource): file a vault
|
||||
procurement item instead of skipping. Write vault/pending/<resource-id>.md:
|
||||
# <What is needed>
|
||||
## What
|
||||
<description of the resource or decision needed>
|
||||
## Why
|
||||
<which objective/issue this unblocks>
|
||||
## Unblocks
|
||||
- #<issue> — <title>
|
||||
Then mark the prerequisite in the tree as "blocked-on-vault (vault/pending/<id>.md)".
|
||||
Do NOT skip or mark as "awaiting human decision" — the vault owns the human interface.
|
||||
|
||||
Filing gate (for non-stuck constraints):
|
||||
1. Check if issue already exists (match by #number in tree or title search)
|
||||
|
|
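The HUMAN_BLOCKED handling above ends with marking the prerequisite in the tree as "blocked-on-vault (vault/pending/<id>.md)". The formula leaves the mechanics to the planner agent, so the following is only a sketch of one way to apply that marker to a Markdown checklist. The function name `mark_blocked_on_vault`, the stdin/stdout filter design, and matching by substring are all assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical filter: reads a prerequisite tree on stdin and writes the
# updated tree to stdout. Name and matching rule are assumptions.
# Usage: mark_blocked_on_vault <substring-of-prerequisite> <vault-id> < tree.md
mark_blocked_on_vault() {
  local needle="$1" vault_id="$2"
  awk -v needle="$needle" -v id="$vault_id" '
    # Touch only unchecked prerequisites that mention the needle and that
    # do not already carry the marker, so repeated runs change nothing.
    index($0, "- [ ]") == 1 && index($0, needle) > 0 && index($0, "blocked-on-vault") == 0 {
      print $0 " — blocked-on-vault (vault/pending/" id ".md)"
      next
    }
    { print }
  '
}
```

Because the filter skips lines that already contain `blocked-on-vault`, running it twice over the same tree yields the same output, which matters for an agent that revisits the tree on every run.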
|
|||
|
|
@ -9,7 +9,7 @@
|
|||
# Key differences from planner/gardener:
|
||||
# - Runs every 20min — lightweight health check
|
||||
# - Primarily READS state, rarely WRITES (no PRs, just Matrix + journal)
|
||||
# - Reactive to escalations — processes pending escalation events
|
||||
# - Checks vault state for pending procurement items
|
||||
# - Conversation memory via Matrix thread and journal
|
||||
|
||||
name = "run-supervisor"
|
||||
|
|
@ -29,14 +29,14 @@ and injected into your prompt above. Review them now.
|
|||
|
||||
1. Read the injected metrics data carefully (System Resources, Docker,
|
||||
Active Sessions, Phase Files, Stale Phase Cleanup, Lock Files, Agent Logs,
|
||||
CI Pipelines, Open PRs, Issue Status, Stale Worktrees, Pending Escalations,
|
||||
Escalation Replies).
|
||||
CI Pipelines, Open PRs, Issue Status, Stale Worktrees).
|
||||
Note: preflight.sh auto-removes PHASE:escalate files for closed issues
|
||||
(24h grace period). Check the "Stale Phase Cleanup" section for any
|
||||
files cleaned or in grace period this run.
|
||||
|
||||
2. If there are escalation replies from Matrix (human messages), note them —
|
||||
you will act on them in the decide-actions step.
|
||||
2. Check vault state: read vault/pending/*.md for any procurement items
|
||||
the planner has filed. Note items relevant to the health assessment
|
||||
(e.g. a blocked resource that explains why the pipeline is stalled).
|
||||
|
||||
3. Read the supervisor journal for recent history:
|
||||
JOURNAL_FILE="$FACTORY_ROOT/supervisor/journal/$(date -u +%Y-%m-%d).md"
|
||||
|
|
@ -70,9 +70,9 @@ Categorize every finding from the metrics into priority levels.
|
|||
- Git repo on wrong branch or in broken rebase state
|
||||
- Pipeline stalled: backlog issues exist but no agent ran for > 20min
|
||||
- Dev-agent blocked: last N polls all report "no ready issues"
|
||||
- Dev/action sessions in PHASE:escalate for > 24h (escalation timeout)
|
||||
- Dev/action sessions in PHASE:escalate for > 24h (session timeout)
|
||||
(Note: PHASE:escalate files for closed issues are auto-cleaned by preflight;
|
||||
this check covers escalations where the issue is still open)
|
||||
this check covers sessions where the issue is still open)
|
||||
|
||||
### P3 — Factory degraded
|
||||
- PRs stale: CI finished >20min ago AND no git push to the PR branch since CI completed
|
||||
|
|
@ -92,7 +92,7 @@ needs = ["preflight"]
|
|||
|
||||
[[steps]]
|
||||
id = "decide-actions"
|
||||
title = "Fix what you can, escalate what you cannot"
|
||||
title = "Fix what you can, file vault items for what you cannot"
|
||||
description = """
|
||||
For each finding from the health assessment, decide and execute an action.
|
||||
|
||||
|
|
@ -145,20 +145,21 @@ For each finding from the health assessment, decide and execute an action.
|
|||
tmux send-keys -t "$SESSION" "# [supervisor] PR stale >20min — CI finished, please push or update" Enter
|
||||
fi
|
||||
If no active tmux session exists, note it in the journal for the next dev-poll cycle.
|
||||
Do NOT escalate stale PRs to Matrix unless they remain stale for >3 consecutive runs.
|
||||
Do NOT file vault items for stale PRs unless they remain stale for >3 consecutive runs.
|
||||
|
||||
### Escalation replies (from Matrix)
|
||||
|
||||
If there are escalation replies from a human, act on them:
|
||||
- "ignore X" → note in journal, do not alert on X this run
|
||||
- "kill that agent" → identify and kill the referenced session
|
||||
- "what's stuck?" → include detailed status in the Matrix report
|
||||
- Other instructions → follow them, use best judgment
|
||||
|
||||
### Cannot auto-fix → escalate
|
||||
### Cannot auto-fix → file vault item
|
||||
|
||||
For P0-P2 issues that persist after auto-fix attempts, or issues requiring
|
||||
human judgment, prepare an escalation message for the report step.
|
||||
human judgment, file a vault procurement item:
|
||||
Write $PROJECT_REPO_ROOT/vault/pending/supervisor-<issue-slug>.md:
|
||||
# <What is needed>
|
||||
## What
|
||||
<description of the problem and why the supervisor cannot fix it>
|
||||
## Why
|
||||
<impact on factory health — reference the priority level>
|
||||
## Unblocks
|
||||
- Factory health: <what this resolves>
|
||||
The vault-poll will notify the human and track the request.
|
||||
|
||||
Read the relevant best-practices file before taking action:
|
||||
cat "$FACTORY_ROOT/supervisor/best-practices/memory.md" # P0
|
||||
|
|
@ -167,7 +168,7 @@ Read the relevant best-practices file before taking action:
|
|||
cat "$FACTORY_ROOT/supervisor/best-practices/dev-agent.md" # P2 agent
|
||||
cat "$FACTORY_ROOT/supervisor/best-practices/git.md" # P2 git
|
||||
|
||||
Track what you fixed and what needs escalation for the report step.
|
||||
Track what you fixed and what vault items you filed for the report step.
|
||||
"""
|
||||
needs = ["health-assessment"]
|
||||
|
||||
|
|
@ -196,15 +197,14 @@ Post a summary grouped by priority:
|
|||
|
||||
Status: RAM=<X>MB Disk=<Y>% Load=<Z>"
|
||||
|
||||
### When escalation is needed (P0-P2 unresolved)
|
||||
Escalate with a clear call to action:
|
||||
matrix_send "supervisor" "ESCALATE: <what's wrong and why you can't fix it>
|
||||
### When vault items were filed (P0-P2 unresolved)
|
||||
Note the vault items in the status summary:
|
||||
matrix_send "supervisor" "Supervisor health check:
|
||||
|
||||
Suggested action: <what the human should do>"
|
||||
Filed vault items:
|
||||
- vault/pending/<id>.md — <summary>
|
||||
|
||||
### Responding to escalation replies
|
||||
If you acted on a human's reply, confirm what you did:
|
||||
matrix_send "supervisor" "Acted on your reply: <summary of action taken>"
|
||||
Status: RAM=<X>MB Disk=<Y>% Load=<Z>"
|
||||
|
||||
Keep messages concise. Do not post identical messages to what was posted
|
||||
in the previous run (check journal for prior messages).
|
||||
|
|
@ -233,15 +233,15 @@ Format:
|
|||
- Docker: <N> containers
|
||||
|
||||
### Findings
|
||||
- [P<N>] <finding> — <action taken or "escalated">
|
||||
- [P<N>] <finding> — <action taken or "filed vault item">
|
||||
(or "No issues found — all systems healthy")
|
||||
|
||||
### Actions taken
|
||||
- <what was fixed>
|
||||
(or "No actions needed")
|
||||
|
||||
### Escalation replies processed
|
||||
- <human said X, did Y>
|
||||
### Vault items filed
|
||||
- vault/pending/<id>.md — <reason>
|
||||
(or "None")
|
||||
|
||||
Keep each entry concise — 15-25 lines max. This journal provides
|
||||
|
|
|
|||
|
|
@ -60,9 +60,6 @@ check_memory 2000
|
|||
|
||||
log "--- Gardener run start ---"
|
||||
|
||||
# ── Consume escalation replies ────────────────────────────────────────────
|
||||
consume_escalation_reply "gardener"
|
||||
|
||||
# ── Load formula + context ───────────────────────────────────────────────
|
||||
load_formula "$FACTORY_ROOT/formulas/run-gardener.toml"
|
||||
build_context_block AGENTS.md
|
||||
|
|
@ -114,13 +111,8 @@ If no file changes in commit-and-pr:
|
|||
PROMPT="You are the issue gardener for ${FORGE_REPO}. Work through the formula below. Follow the phase protocol: if the commit-and-pr step creates a PR, write PHASE:awaiting_ci and wait for orchestrator CI/review/merge handling. If no file changes, write PHASE:done. The orchestrator will time you out if you return to the prompt without signalling.
|
||||
|
||||
You have full shell access and --dangerously-skip-permissions.
|
||||
Fix what you can. Escalate what you cannot. Do NOT ask permission — act first, report after.
|
||||
${ESCALATION_REPLY:+
|
||||
## Escalation Reply (from Matrix — human message)
|
||||
${ESCALATION_REPLY}
|
||||
Fix what you can. File vault items for what you cannot. Do NOT ask permission — act first, report after.
|
||||
|
||||
Act on this reply during the grooming step.
|
||||
}
|
||||
## Project context
|
||||
${CONTEXT_BLOCK}
|
||||
${SCRATCH_CONTEXT:+${SCRATCH_CONTEXT}
|
||||
|
|
@ -337,8 +329,8 @@ _gardener_merge() {
|
|||
printf 'PHASE:done\n' > "$PHASE_FILE"
|
||||
return 0
|
||||
fi
|
||||
log "gardener merge blocked (HTTP 405) — escalating"
|
||||
printf 'PHASE:escalate\nReason: gardener PR #%s merge blocked (HTTP 405)\n' \
|
||||
log "gardener merge blocked (HTTP 405)"
|
||||
printf 'PHASE:failed\nReason: gardener PR #%s merge blocked (HTTP 405)\n' \
|
||||
"$_GARDENER_PR" > "$PHASE_FILE"
|
||||
return 0
|
||||
fi
|
||||
|
|
@ -350,7 +342,7 @@ _gardener_merge() {
|
|||
git fetch origin ${PRIMARY_BRANCH} && git rebase origin/${PRIMARY_BRANCH}
|
||||
git push --force-with-lease origin HEAD
|
||||
echo \"PHASE:awaiting_ci\" > \"${PHASE_FILE}\"
|
||||
If rebase fails, write PHASE:escalate with a reason."
|
||||
If rebase fails, write PHASE:failed with a reason."
|
||||
}
|
||||
|
||||
# shellcheck disable=SC2317 # called indirectly by monitor_phase_loop
|
||||
|
|
@ -468,7 +460,7 @@ Write PHASE:awaiting_review to the phase file, then stop and wait:
|
|||
if ! $ci_done; then
|
||||
log "CI timeout for PR #${_GARDENER_PR}"
|
||||
agent_inject_into_session "${_MONITOR_SESSION:-$SESSION_NAME}" \
|
||||
"CI TIMEOUT: CI did not complete within 15 minutes for PR #${_GARDENER_PR}. Write PHASE:escalate if you cannot proceed."
|
||||
"CI TIMEOUT: CI did not complete within 15 minutes for PR #${_GARDENER_PR}. Write PHASE:failed with a reason if you cannot proceed."
|
||||
return 0
|
||||
fi
|
||||
|
||||
|
|
@ -484,7 +476,7 @@ Write PHASE:awaiting_review to the phase file, then stop and wait:
|
|||
_GARDENER_CI_FIX_COUNT=$(( _GARDENER_CI_FIX_COUNT + 1 ))
|
||||
if [ "$_GARDENER_CI_FIX_COUNT" -gt 3 ]; then
|
||||
log "CI exhausted after ${_GARDENER_CI_FIX_COUNT} attempts"
|
||||
printf 'PHASE:escalate\nReason: gardener CI exhausted after %d attempts\n' \
|
||||
printf 'PHASE:failed\nReason: gardener CI exhausted after %d attempts\n' \
|
||||
"$_GARDENER_CI_FIX_COUNT" > "$PHASE_FILE"
|
||||
return 0
|
||||
fi
|
||||
|
|
@ -625,7 +617,7 @@ Then stop and wait."
|
|||
if [ "$review_elapsed" -ge "$review_timeout" ]; then
|
||||
log "review wait timed out for PR #${_GARDENER_PR}"
|
||||
agent_inject_into_session "${_MONITOR_SESSION:-$SESSION_NAME}" \
|
||||
"No review received after ${review_timeout}s for PR #${_GARDENER_PR}. Write PHASE:escalate if you cannot proceed."
|
||||
"No review received after ${review_timeout}s for PR #${_GARDENER_PR}. Write PHASE:failed with a reason if you cannot proceed."
|
||||
fi
|
||||
}
|
||||
|
||||
|
|
@ -647,13 +639,6 @@ _gardener_on_phase_change() {
|
|||
PHASE:failed)
|
||||
agent_kill_session "${_MONITOR_SESSION:-$SESSION_NAME}"
|
||||
;;
|
||||
PHASE:escalate)
|
||||
local reason
|
||||
reason=$(sed -n '2p' "$PHASE_FILE" 2>/dev/null | sed 's/^Reason: //' || true)
|
||||
log "escalated: ${reason}"
|
||||
matrix_send "gardener" "Gardener escalated: ${reason}" 2>/dev/null || true
|
||||
agent_kill_session "${_MONITOR_SESSION:-$SESSION_NAME}"
|
||||
;;
|
||||
PHASE:crashed)
|
||||
if [ "${_GARDENER_CRASH_COUNT:-0}" -gt 0 ]; then
|
||||
log "ERROR: session crashed again — giving up"
|
||||
|
|
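The gardener hunks above replace `PHASE:escalate` with `PHASE:failed` while keeping the same two-line phase file layout: the phase marker on line 1 and a `Reason:` line on line 2. The sketch below shows writing and reading that layout. The helper names are hypothetical; the `printf` format and the `sed` extraction mirror what appears in the diff.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the function names are assumptions. PHASE_FILE is the
# variable used throughout the diff and must be set by the caller.

# Write a failed phase with a one-line reason.
write_phase_failed() {
  printf 'PHASE:failed\nReason: %s\n' "$1" > "$PHASE_FILE"
}

# Print the phase marker (line 1 of the phase file).
read_phase() {
  sed -n '1p' "$PHASE_FILE" 2>/dev/null
}

# Print the reason: take line 2 and strip the "Reason: " prefix, the same
# extraction the removed PHASE:escalate handler performed.
read_phase_reason() {
  sed -n '2p' "$PHASE_FILE" 2>/dev/null | sed 's/^Reason: //'
}
```

With this layout a monitor can branch on `read_phase` and log `read_phase_reason` without parsing anything beyond the first two lines.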
|
|||
|
|
@ -17,14 +17,18 @@ Dismissed predictions get re-filed by the predictor with stronger evidence
|
|||
if still valid. Phase 2
|
||||
(update-prerequisite-tree): scan repo state + open/closed issues, mark resolved
|
||||
prerequisites, discover new ones, update the tree. **Also scans comments on
|
||||
referenced issues for bounce/stuck signals** (BOUNCED, ESCALATED, LABEL_CHURN)
|
||||
to detect issues ping-ponging between backlog and underspecified. Phase 3
|
||||
referenced issues for bounce/stuck signals** (BOUNCED, LABEL_CHURN)
|
||||
to detect issues ping-ponging between backlog and underspecified. Issues that
|
||||
need human decisions or external resources are filed as vault procurement items
|
||||
(`vault/pending/*.md`) instead of being escalated. Phase 3
|
||||
(file-at-constraints): identify the top 3 unresolved prerequisites that block
|
||||
the most downstream objectives — file issues as either `backlog` (code changes,
|
||||
dev-agent) or `action` (run existing formula, action-agent). **Stuck issues
|
||||
(detected BOUNCED/LABEL_CHURN) are dispatched to the `groom-backlog` formula
|
||||
in breakdown mode instead of being re-promoted** — this breaks the ping-pong
|
||||
loop by splitting them into dev-agent-sized sub-issues.
|
||||
loop by splitting them into dev-agent-sized sub-issues. **Human-blocked issues
|
||||
are routed through the vault** — the planner files a procurement item and marks
|
||||
the prerequisite as blocked-on-vault in the tree.
|
||||
Phase 4 (journal-and-memory): write updated prerequisite tree + daily journal
|
||||
entry (committed to git) and update `planner/MEMORY.md` (committed to git).
|
||||
Phase 5 (commit-and-pr): one commit with all file changes, push, create PR.
|
||||
|
|
|
|||
|
|
@ -24,8 +24,8 @@ Status: DONE — #395 closed
|
|||
|
||||
## Objective: Example project demonstrating full lifecycle (#466)
|
||||
- [x] disinto init working (#393)
|
||||
- [ ] Human decision on implementation approach (external repo vs local demo) ⚠ escalated — awaiting human decision (since 2026-03-23)
|
||||
Status: BLOCKED — bounced by dev-agent (too large), escalated by gardener, 3 days without human response
|
||||
- [ ] Human decision on implementation approach (external repo vs local demo) — blocked-on-vault
|
||||
Status: BLOCKED — bounced by dev-agent (too large), routed to vault for human decision
|
||||
|
||||
## Objective: Landing page communicating value proposition (#534)
|
||||
- [x] disinto init working (#393)
|
||||
|
|
|
|||
|
|
@ -53,7 +53,7 @@ stuck. Your job: figure out the correct dependency direction and fix the wrong o
|
|||
actually depends on which
|
||||
3. Edit the issue that has the incorrect dep to remove the `#NNN` reference from its
|
||||
`## Dependencies` section (replace with `- None` if it was the only dep)
|
||||
4. If the correct direction is unclear from code, escalate with both issue summaries
|
||||
4. If the correct direction is unclear from code, file a vault item with both issue summaries
|
||||
|
||||
Use the forge API to edit issue bodies:
|
||||
```bash
|
||||
|
|
@ -70,25 +70,35 @@ obsolete or misprioritized. Investigate:
|
|||
|
||||
1. Check if dep #B is still relevant (read its body, check if the code it targets changed)
|
||||
2. If the dep is obsolete → remove it from #A's `## Dependencies` section
|
||||
3. If the dep is still needed → escalate, suggesting to prioritize #B or split #A
|
||||
3. If the dep is still needed → file a vault item, suggesting to prioritize #B or split #A
|
||||
|
||||
### Dev-agent blocked (P2)
|
||||
When you see "Dev-agent blocked: last N polls all report 'no ready issues'":
|
||||
|
||||
1. Check if circular deps exist (they'll appear as separate P3 alerts)
|
||||
2. Check if all backlog issues depend on a single unmerged issue — if so, escalate
|
||||
to prioritize that blocker
|
||||
3. If no clear blocker, escalate with the list of blocked issues and their deps
|
||||
2. Check if all backlog issues depend on a single unmerged issue — if so, file a vault
|
||||
item to prioritize that blocker
|
||||
3. If no clear blocker, file a vault item with the list of blocked issues and their deps
|
||||
|
||||
## Escalation
|
||||
## When you cannot fix it
|
||||
|
||||
If you can't fix it, escalate via Matrix:
|
||||
File a vault procurement item so the human is notified through the vault:
|
||||
```bash
|
||||
source ${FACTORY_ROOT}/lib/env.sh
|
||||
matrix_send "supervisor" "🏭 ESCALATE: <what's wrong and why you can't fix it>"
|
||||
cat > "${PROJECT_REPO_ROOT}/vault/pending/supervisor-$(date -u +%Y%m%d-%H%M)-issue.md" <<'VAULT_EOF'
|
||||
# <What is needed>
|
||||
## What
|
||||
<description of the problem and why the supervisor cannot fix it>
|
||||
## Why
|
||||
<impact on factory health>
|
||||
## Unblocks
|
||||
- Factory health: <what this resolves>
|
||||
VAULT_EOF
|
||||
```
|
||||
|
||||
Do NOT escalate if you can fix it. Do NOT ask permission. Fix first, report after.
|
||||
The vault-poll will notify the human and track the request.
|
||||
|
||||
Do NOT talk to the human directly. The vault is the factory's only interface
|
||||
to the human for resources and approvals. Fix first, report after.
|
||||
|
||||
## Output
|
||||
|
||||
|
|
@ -97,7 +107,7 @@ FIXED: <what you did>
|
|||
```
|
||||
or
|
||||
```
|
||||
ESCALATE: <what's wrong>
|
||||
VAULT: filed vault/pending/<id>.md — <what's needed>
|
||||
```
|
||||
|
||||
## Learning
|
||||
|
|
|
|||
|
|
@ -214,14 +214,15 @@ else
|
|||
fi
|
||||
echo ""
|
||||
|
||||
# ── Escalation Replies from Matrix ────────────────────────────────────────
|
||||
# ── Pending Vault Items ───────────────────────────────────────────────────
|
||||
|
||||
echo "## Escalation Replies (from Matrix)"
|
||||
if [ -s /tmp/supervisor-escalation-reply ]; then
|
||||
cat /tmp/supervisor-escalation-reply
|
||||
echo ""
|
||||
echo "(Reply already consumed by supervisor-run.sh before this session)"
|
||||
else
|
||||
echo " None"
|
||||
fi
|
||||
echo "## Pending Vault Items"
|
||||
_found_vault=false
|
||||
for _vf in "${PROJECT_REPO_ROOT}/vault/pending/"*.md; do
|
||||
[ -f "$_vf" ] || continue
|
||||
_found_vault=true
|
||||
_vtitle=$(grep -m1 '^# ' "$_vf" | sed 's/^# //' || basename "$_vf")
|
||||
echo " $(basename "$_vf"): ${_vtitle}"
|
||||
done
|
||||
[ "$_found_vault" = false ] && echo " None"
|
||||
echo ""
|
||||
|
|
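One detail of the listing loop above is worth a note: a pipeline's exit status is that of its last command unless `pipefail` is set, so whether the `|| basename "$_vf"` fallback runs for a file without a `# ` heading depends on shell options that are not visible in this diff. The sketch below is a hypothetical variant that tests the captured title instead, so the fallback does not depend on those options. The function name and the directory argument are assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical refactor of the listing loop into a function; the name and the
# single directory argument are assumptions for illustration.
list_pending_vault_items() {
  local dir="$1" found=false vf title
  for vf in "$dir"/*.md; do
    [ -f "$vf" ] || continue   # an unmatched glob stays literal; skip it
    found=true
    # First level-1 heading with the "# " prefix stripped. The "|| true"
    # keeps a missing heading from aborting under "set -e -o pipefail".
    title=$(grep -m1 '^# ' "$vf" | sed 's/^# //') || true
    # Explicit fallback: use the file name when no heading was found.
    [ -n "$title" ] || title=$(basename "$vf")
    echo "  $(basename "$vf"): ${title}"
  done
  if [ "$found" = false ]; then
    echo "  None"
  fi
}
```

Called as `list_pending_vault_items "${PROJECT_REPO_ROOT}/vault/pending"`, it prints one line per pending item in the same format as the preflight section, or `None` when the directory holds no Markdown files.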
|
|||
|
|
@ -80,14 +80,6 @@ status() {
|
|||
flog "$*"
|
||||
}
|
||||
|
||||
# ── Check for escalation replies from Matrix ──────────────────────────────
|
||||
ESCALATION_REPLY=""
|
||||
if [ -s /tmp/supervisor-escalation-reply ]; then
|
||||
ESCALATION_REPLY=$(cat /tmp/supervisor-escalation-reply)
|
||||
rm -f /tmp/supervisor-escalation-reply
|
||||
flog "Got escalation reply: $(echo "$ESCALATION_REPLY" | head -1)"
|
||||
fi
|
||||
|
||||
# Alerts by priority
|
||||
P0_ALERTS=""
|
||||
P1_ALERTS=""
|
||||
|
|
@ -813,13 +805,7 @@ Disk: $(df -h / | awk 'NR==2{printf "%s used of %s (%s)", $3, $2, $5}')
|
|||
Docker: $(sudo docker ps --format '{{.Names}}' 2>/dev/null | wc -l) containers running
|
||||
Claude procs: $(pgrep -f "claude" 2>/dev/null | wc -l)
|
||||
|
||||
$(if [ -n "$ESCALATION_REPLY" ]; then echo "
|
||||
## Human Response to Previous Escalation
|
||||
${ESCALATION_REPLY}
|
||||
|
||||
Act on this response."; fi)
|
||||
|
||||
Fix what you can. Escalate what you can't. Read the relevant best-practices file first."
|
||||
Fix what you can. File vault items for what you can't. Read the relevant best-practices file first."
|
||||
|
||||
CLAUDE_OUTPUT=$(timeout 300 claude -p --model sonnet --dangerously-skip-permissions \
|
||||
"$CLAUDE_PROMPT" 2>&1) || true
|
||||
|
|
|
|||
|
|
@ -59,9 +59,6 @@ else
|
|||
log "WARNING: preflight.sh failed, continuing with partial data"
|
||||
fi
|
||||
|
||||
# ── Consume escalation replies ────────────────────────────────────────────
|
||||
consume_escalation_reply "supervisor"
|
||||
|
||||
# ── Load formula + context ───────────────────────────────────────────────
|
||||
load_formula "$FACTORY_ROOT/formulas/run-supervisor.toml"
|
||||
build_context_block AGENTS.md
|
||||
|
|
@ -77,16 +74,11 @@ build_prompt_footer
|
|||
PROMPT="You are the supervisor agent for ${FORGE_REPO}. Work through the formula below. You MUST write PHASE:done to '${PHASE_FILE}' when finished — the orchestrator will time you out if you return to the prompt without signalling.
|
||||
|
||||
You have full shell access and --dangerously-skip-permissions.
|
||||
Fix what you can. Escalate what you cannot. Do NOT ask permission — act first, report after.
|
||||
Fix what you can. File vault items for what you cannot. Do NOT ask permission — act first, report after.
|
||||
|
||||
## Pre-flight metrics (collected $(date -u +%H:%M) UTC)
|
||||
${PREFLIGHT_OUTPUT}
|
||||
${ESCALATION_REPLY:+
|
||||
## Escalation Reply (from Matrix — human message)
|
||||
${ESCALATION_REPLY}
|
||||
|
||||
Act on this reply in the decide-actions step.
|
||||
}
|
||||
## Project context
|
||||
${CONTEXT_BLOCK}
|
||||
${SCRATCH_CONTEXT:+${SCRATCH_CONTEXT}
|
||||
|
|
|
|||