fix: refactor: replace escalation JSONL with blocked label + diagnostic comment (#352)

Replace the unreliable escalation JSONL system (supervisor/escalations-*.jsonl
consumed by gardener) with direct blocked label + diagnostic comment on the
original issue.

When a dev-agent or action-agent session fails (PHASE:failed, idle timeout,
crash, CI exhausted):
- Capture last 50 lines from tmux pane via tmux capture-pane
- Post a structured diagnostic comment on the issue (exit reason, timestamp,
  PR number, tmux output)
- Label the issue "blocked" (instead of restoring "backlog")
- Remove in-progress label

Removed:
- Escalation JSONL write paths in dev-agent.sh, phase-handler.sh, dev-poll.sh,
  action-agent.sh
- is_escalated() helper in dev-poll.sh
- Escalation triage (P2f section) in supervisor-poll.sh
- Escalation processing + recipe engine in gardener-poll.sh
- ci-escalation-recipes step from run-gardener.toml formula
- escalations*.jsonl from .gitignore

Added:
- post_blocked_diagnostic() shared helper in phase-handler.sh
- ensure_blocked_label_id() helper (creates label via API if not exists)
- is_blocked() helper in dev-poll.sh (replaces is_escalated)
- Blocked issues listing in supervisor/preflight.sh

Kept:
- Matrix notifications on failure (unchanged)
- CI fix counter logic (still tracks attempts)
- needs_human injection in supervisor/gardener (not escalation-related)
- Gardener grooming (gardener-agent.sh still invoked)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
openhands 2026-03-21 04:18:43 +00:00
parent 0109f0b0c3
commit 61c44d31b1
10 changed files with 181 additions and 990 deletions

View file

@ -158,21 +158,16 @@ done
[ "$_found_wt" = false ] && echo " None"
echo ""
# ── Pending Escalations ──────────────────────────────────────────────────
# ── Blocked Issues ────────────────────────────────────────────────────────
echo "## Pending Escalations"
_found_esc=false
for _esc_file in "${FACTORY_ROOT}/supervisor/escalations-"*.jsonl; do
[ -f "$_esc_file" ] || continue
[[ "$_esc_file" == *.done.jsonl ]] && continue
_esc_count=$(wc -l < "$_esc_file" 2>/dev/null || echo 0)
[ "${_esc_count:-0}" -gt 0 ] || continue
_found_esc=true
echo "### $(basename "$_esc_file") (${_esc_count} entries)"
cat "$_esc_file"
echo ""
done
[ "$_found_esc" = false ] && echo " None"
echo "## Blocked Issues"
_blocked_issues=$(codeberg_api GET "/issues?state=open&labels=blocked&type=issues&limit=50" 2>/dev/null || echo "[]")
_blocked_n=$(echo "$_blocked_issues" | jq 'length' 2>/dev/null || echo 0)
if [ "${_blocked_n:-0}" -gt 0 ]; then
echo "$_blocked_issues" | jq -r '.[] | " #\(.number): \(.title)"' 2>/dev/null || echo " (query failed)"
else
echo " None"
fi
echo ""
# ── Escalation Replies from Matrix ────────────────────────────────────────