494 commits

Author SHA1 Message Date
openhands
cd5f05008b supervisor: learned — Push CI vs PR CI mismatch — agent picks wrong pipeline number 2026-03-21 05:05:02 +00:00
openhands
61c44d31b1 fix: refactor: replace escalation JSONL with blocked label + diagnostic comment (#352)
Replace the unreliable escalation JSONL system (supervisor/escalations-*.jsonl
consumed by gardener) with direct blocked label + diagnostic comment on the
original issue.

When a dev-agent or action-agent session fails (PHASE:failed, idle timeout,
crash, CI exhausted):
- Capture last 50 lines from tmux pane via tmux capture-pane
- Post a structured diagnostic comment on the issue (exit reason, timestamp,
  PR number, tmux output)
- Label the issue "blocked" (instead of restoring "backlog")
- Remove in-progress label

Removed:
- Escalation JSONL write paths in dev-agent.sh, phase-handler.sh, dev-poll.sh,
  action-agent.sh
- is_escalated() helper in dev-poll.sh
- Escalation triage (P2f section) in supervisor-poll.sh
- Escalation processing + recipe engine in gardener-poll.sh
- ci-escalation-recipes step from run-gardener.toml formula
- escalations*.jsonl from .gitignore

Added:
- post_blocked_diagnostic() shared helper in phase-handler.sh
- ensure_blocked_label_id() helper (creates label via API if not exists)
- is_blocked() helper in dev-poll.sh (replaces is_escalated)
- Blocked issues listing in supervisor/preflight.sh

Kept:
- Matrix notifications on failure (unchanged)
- CI fix counter logic (still tracks attempts)
- needs_human injection in supervisor/gardener (not escalation-related)
- Gardener grooming (gardener-agent.sh still invoked)
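
The failure path described in this commit (capture the pane tail, then post a structured comment) could look roughly like the sketch below. `build_blocked_comment`, its argument order, and the comment layout are hypothetical and are not the real `post_blocked_diagnostic()`; only the 50-line tail and the listed fields come from the message above.

```shell
# Sketch only: build_blocked_comment and its arguments are hypothetical names,
# not the real post_blocked_diagnostic() from phase-handler.sh.
# Reads captured pane output on stdin and prints a comment body on stdout.
build_blocked_comment() (
  reason=$1
  pr_number=$2
  timestamp=$3
  printf 'Session blocked\n'
  printf 'Exit reason: %s\n' "$reason"
  printf 'Timestamp: %s\n' "$timestamp"
  printf 'PR: #%s\n' "$pr_number"
  printf 'Last 50 lines of tmux output:\n'
  # Keep only the final 50 lines of whatever was captured.
  tail -n 50
)

# Example wiring (needs a running tmux session, so it is left commented out;
# SESSION and PR_NUMBER are placeholder variables):
#   tmux capture-pane -p -t "$SESSION" -S -50 \
#     | build_blocked_comment idle_timeout "$PR_NUMBER" "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
```

Keeping the formatting separate from the tmux and API calls makes the comment body easy to check without a live session.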

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 04:18:43 +00:00
johba
0109f0b0c3 Merge pull request 'fix: PHASE:needs_human missing from crash-path terminal set in monitor_phase_loop (#342)' (#444) from fix/issue-342 into main 2026-03-21 04:59:02 +01:00
openhands
ab122c9701 fix: PHASE:needs_human missing from crash-path terminal set in monitor_phase_loop (#342)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 03:50:21 +00:00
johba
f511a6c7a7 Merge pull request 'fix: PHASE:crashed unhandled in _on_phase_change / dev-agent callback (#339)' (#443) from fix/issue-339 into main 2026-03-21 04:39:02 +01:00
openhands
a1d47a20f2 fix: eliminate duplicate code blocks flagged by CI dup-detection
Use single-line conditionals for worktree check in PHASE:crashed handler
(phase-handler.sh) to break 5-line window match with idle_timeout case.
Slim dev-agent.sh crashed case to just restore_to_backlog since the
_on_phase_change callback handles full cleanup.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 03:27:35 +00:00
openhands
cb1e45c4a8 supervisor: learned — PR CI vs Push CI mismatch causes silent stall in awaiting_review 2026-03-21 03:24:41 +00:00
openhands
7156f21e12 fix: extract restore_to_backlog() to eliminate duplicate label reset pattern
The cleanup_labels + curl POST + CLAIMED=false pattern was duplicated
across dev-agent.sh (idle_timeout and crashed cases) and phase-handler.sh
(PHASE:crashed handler), triggering duplicate-detection CI failure.

Extract restore_to_backlog() shared helper; call it from all three sites.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-21 02:14:47 +00:00
openhands
7f9cefa847 fix: PHASE:crashed unhandled in _on_phase_change / dev-agent callback (#339)
Add explicit PHASE:crashed case to _on_phase_change in phase-handler.sh:
logs crash, notifies Matrix, escalates to supervisor, restores backlog
label, preserves worktree if PR exists, cleans up temp files.

Add crashed case to dev-agent.sh post-loop case statement for
belt-and-suspenders cleanup matching the callback behavior.

Replaces the dead crash_recovery_failed case that was never triggered.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 01:31:20 +00:00
johba
e8dc145184 Merge pull request 'fix: Inner CI/review wait loops bypass exit_marker fast-path (#338)' (#442) from fix/issue-338 into main 2026-03-21 02:18:59 +01:00
openhands
e7be534c7d fix: Inner CI/review wait loops bypass exit_marker fast-path (#338)
Add exit_marker file check to the CI wait loop and review wait loop in
phase-handler.sh, matching the pattern already used in monitor_phase_loop
(agent-session.sh). This makes crash detection consistent across all
polling paths.
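
A minimal sketch of a polling loop with an exit_marker fast path, assuming the marker is a plain file whose existence means the session has ended. The function name, arguments, and return codes are hypothetical and do not reproduce the loops in phase-handler.sh.

```shell
# Sketch only: wait_for_condition, its arguments, and the return codes are
# hypothetical; the real loops live in phase-handler.sh.
# Usage: wait_for_condition <exit_marker> <max_polls> <interval> <check_cmd...>
# Returns 0 when the check command succeeds, 2 when the exit marker file
# appears, and 1 when the poll budget runs out.
wait_for_condition() (
  exit_marker=$1
  max_polls=$2
  interval=$3
  shift 3
  polls=0
  while [ "$polls" -lt "$max_polls" ]; do
    # Fast path: the session has already exited, so stop polling now.
    if [ -e "$exit_marker" ]; then
      return 2
    fi
    if "$@"; then
      return 0
    fi
    polls=$((polls + 1))
    sleep "$interval"
  done
  return 1
)
```

Checking the marker before each poll means a crashed session is noticed within one interval instead of after the full wait budget.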

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 00:55:38 +00:00
johba
6a499db011 Merge pull request 'fix: feat: supervisor as formula-driven agent — cron + Matrix escalation (#245)' (#441) from fix/issue-245 into main 2026-03-21 01:49:02 +01:00
openhands
52f7c4973e fix: address review — phase signal quoting, issue count limits, reply comment
- Fix critical: use double quotes for $PHASE_FILE in formula phase signal
- Fix low: use limit=50 for backlog/in-progress/blocked issue counts
- Fix nit: correct misleading comment about escalation reply timing

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 00:39:58 +00:00
openhands
bfdc01202c fix: break duplicate window — add priority order line to supervisor prompt
The duplicate detector skips lines starting with # (treats as comments
even inside quoted strings). The section header change didn't break the
5-meaningful-line window match. Adding a non-comment content line does.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 00:32:54 +00:00
openhands
53169f2514 fix: add supervisor and predictor scripts to agent-smoke CI test
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 00:30:22 +00:00
openhands
8ea4e06f8f fix: deduplicate prompt template to pass CI duplicate detection
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 00:26:13 +00:00
openhands
d8244742f1 fix: feat: supervisor as formula-driven agent — cron + Matrix escalation (#245)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 00:22:37 +00:00
johba
7fe5ed0381 Merge pull request 'fix: Stale REQUEST_CHANGES reviews still trigger re-work (#336)' (#440) from fix/issue-336 into main 2026-03-21 01:09:12 +01:00
openhands
e5965e71d4 fix: Stale REQUEST_CHANGES reviews still trigger re-work (#336)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 00:05:09 +00:00
johba
e4b2e0e43d Merge pull request 'fix: No combined wall-clock + idle cap for action-agent sessions (#334)' (#439) from fix/issue-334 into main 2026-03-21 01:01:23 +01:00
openhands
42620a1341 fix: No combined wall-clock + idle cap for action-agent sessions (#334)
Add ACTION_MAX_LIFETIME env var (default 8h) that caps total session
wall-clock time independently of ACTION_IDLE_TIMEOUT.  A background
watchdog sleeps for the remaining lifetime and, when triggered, kills
the tmux session, posts a summary comment on the issue, writes
PHASE:failed with a max_lifetime reason, and escalates to the
supervisor.
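
The watchdog idea can be sketched as follows, assuming the lifetime is tracked in seconds from a recorded start time. `remaining_lifetime` and `start_lifetime_watchdog` are hypothetical names; only ACTION_MAX_LIFETIME and its 8h default come from the commit message, and the expiry action is left to the caller.

```shell
# Sketch only: remaining_lifetime and start_lifetime_watchdog are hypothetical
# names; only ACTION_MAX_LIFETIME and its 8h default come from the commit.
ACTION_MAX_LIFETIME=${ACTION_MAX_LIFETIME:-28800}   # 8h, assumed to be in seconds

# Usage: remaining_lifetime <started_at> <now> <max_lifetime>
# Prints the seconds left before the cap, never below zero.
remaining_lifetime() (
  started_at=$1
  now=$2
  max_lifetime=$3
  remaining=$((started_at + max_lifetime - now))
  if [ "$remaining" -lt 0 ]; then
    remaining=0
  fi
  printf '%s\n' "$remaining"
)

# Usage: start_lifetime_watchdog <started_at> <expiry_cmd...>
# Sleeps in the background for the remaining lifetime, then runs the expiry
# command. The caller keeps WATCHDOG_PID so it can kill the watchdog when the
# session ends normally.
start_lifetime_watchdog() {
  watchdog_delay=$(remaining_lifetime "$1" "$(date +%s)" "$ACTION_MAX_LIFETIME")
  shift
  ( sleep "$watchdog_delay" && "$@" ) &
  WATCHDOG_PID=$!
}
```

Because the delay is computed from the start time rather than from a fixed constant, the cap stays independent of how long the session has been idle.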

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 23:51:46 +00:00
johba
f1b3bf25a4 Merge pull request 'fix: feat: SessionStart compact hook re-injects phase protocol after context compaction (#274)' (#438) from fix/issue-274 into main 2026-03-21 00:45:54 +01:00
openhands
aa89e2b31e fix: move write_compact_context after create_agent_session in gardener-agent
The context file was written before the reset block that deleted it,
making compaction re-injection a no-op for gardener sessions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 23:35:34 +00:00
openhands
e3895ad3ac fix: feat: SessionStart compact hook re-injects phase protocol after context compaction (#274)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 23:27:32 +00:00
johba
5c960f1b6e Merge pull request 'fix: feat: migrate review-agent to formula architecture (#267)' (#437) from fix/issue-267 into main 2026-03-21 00:14:03 +01:00
openhands
aecc8fb8ad fix: feat: migrate review-agent to formula architecture (#267)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 22:59:02 +00:00
johba
3ce70608ba Merge pull request 'fix: Shared monitor_phase_loop idle_prompt behaviour undocumented for future agents (#265)' (#434) from fix/issue-265 into main 2026-03-20 23:29:02 +01:00
openhands
9977c8575f ci: retrigger after force-push resolved branch divergence
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 22:14:55 +00:00
openhands
a3ca4b55b8 fix: Shared monitor_phase_loop idle_prompt behaviour undocumented for future agents (#265)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 22:07:00 +00:00
johba
05884c7d1e Merge pull request 'fix: fix: bundled dust cleanup — lib/matrix_listener.sh (#264)' (#431) from fix/issue-264 into main 2026-03-20 22:59:03 +01:00
openhands
f78fbc1da6 fix: bundled dust cleanup — lib/matrix_listener.sh (#264)
- Remove dead ROOM_ENCODED and EVENT_ID variables from matrix_listener.sh
  (were suppressed with SC2034 instead of removed)
- Remove dead REPO variable from dev-poll.sh and review-poll.sh
- Update header comment in matrix_listener.sh to list all 6 reply-routing
  cases (supervisor, gardener, dev, review, vault, action)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 21:40:31 +00:00
johba
e4d5058172 Merge pull request 'fix: feat: agents flush context to scratch file before compaction (#262)' (#430) from fix/issue-262 into main 2026-03-20 22:33:50 +01:00
openhands
26d20af48c fix: address review — scratch file survives crash, cap read size, fix instruction (#262)
- Remove SCRATCH_FILE from action-agent cleanup() trap so it survives crashes
- Change instruction to note contents already injected (avoid wasted tool call)
- Cap scratch file read at 8KB via head -c 8192
- Move predictor scratch instruction after formula (consistent placement)
- Remove redundant FINAL_PHASE re-reads in planner/predictor
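
The read cap mentioned above can be illustrated with a small helper. `read_scratch_capped` is a hypothetical name and not necessarily how the shared scratch helpers are structured; the 8192-byte limit and the use of `head -c` come from the commit message.

```shell
# Sketch only: read_scratch_capped is a hypothetical name; the 8192-byte cap
# and the use of head -c come from the commit message.
# Usage: read_scratch_capped <scratch_file>
read_scratch_capped() (
  scratch_file=$1
  # A missing scratch file is not an error: there is simply nothing to inject.
  [ -f "$scratch_file" ] || return 0
  # Emit at most the first 8 KB so a runaway file cannot flood the prompt.
  head -c 8192 "$scratch_file"
)
```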

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 20:58:32 +00:00
openhands
6405ac9837 fix: use shared scratch helpers in dev-agent and action-agent to eliminate duplicates (#262)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 20:47:22 +00:00
openhands
8d9e216e33 ci: retrigger after stale status (#262) 2026-03-20 20:22:56 +00:00
openhands
273e5ee53f supervisor: learned — False Positive: Shared Status File Causes Giant Age (29M+ min) 2026-03-20 20:14:15 +00:00
openhands
7199bbf9b5 fix: feat: agents flush context to scratch file before compaction (#262)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 20:12:45 +00:00
johba
edfdae9ad8 Merge pull request 'fix: P4 stale worktree sweep doesn't cover sup-retry-* worktrees (#253)' (#428) from fix/issue-253 into main 2026-03-20 20:54:03 +01:00
openhands
ec5c48ddf2 fix: P4 stale worktree sweep doesn't cover sup-retry-* worktrees (#253)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 19:45:21 +00:00
johba
0c317c0a90 Merge pull request 'fix: P2e and classify_pipeline_failure() use divergent infra heuristics (#251)' (#426) from fix/issue-251 into main 2026-03-20 20:40:59 +01:00
openhands
3cd047a7e0 fix: P2e and classify_pipeline_failure() use divergent infra heuristics (#251)
Extract shared is_infra_step() in lib/ci-helpers.sh capturing the union of
infra-detection heuristics from both P2e and classify_pipeline_failure():
- Clone/git step exit 128 (connection failure)
- Any step exit 137 (OOM/signal 9)
- Log-pattern matching (timeouts, connection failures)

Update classify_pipeline_failure() to use is_infra_step() with log fetching
and "any infra step" aggregation (matching P2e semantics). Simplify P2e to
delegate to classify_pipeline_failure(). Update P2f caller for new output
format ("infra <reason>").
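
One way to picture the shared heuristic is the sketch below. The exit codes 128 and 137 come from the commit message; the argument order, the way a clone/git step is recognised by name, and the specific log patterns are assumptions and may differ from the real `is_infra_step()` in lib/ci-helpers.sh.

```shell
# Sketch only: the argument order and the log patterns are assumptions; the
# real is_infra_step() lives in lib/ci-helpers.sh.
# Usage: is_infra_step <step_name> <exit_code> [log_text]
# Returns 0 if the failure looks like infrastructure, 1 otherwise.
is_infra_step() (
  step_name=$1
  exit_code=$2
  log_text=${3:-}

  # Any step killed with signal 9 (128 + 9 = 137), typically OOM.
  if [ "$exit_code" -eq 137 ]; then
    return 0
  fi

  # A clone/git step failing with 128 (connection failure).
  if [ "$exit_code" -eq 128 ]; then
    case $step_name in
      *clone*|*git*) return 0 ;;
    esac
  fi

  # Log-pattern matching; these patterns are illustrative guesses.
  case $log_text in
    *"timed out"*|*"timeout"*|*"connection refused"*|*"connection reset"*)
      return 0 ;;
  esac
  return 1
)
```

A caller that wants the "any infra step" aggregation would run this once per failed step and treat the pipeline as an infra failure as soon as one call returns 0.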

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 19:19:29 +00:00
johba
a1ee1242ab Merge pull request 'fix: action-agent.sh fetches comments without bot filtering (#243)' (#424) from fix/issue-243 into main 2026-03-20 20:09:38 +01:00
openhands
5157064bf0 fix: action-agent.sh fetches comments without bot filtering (#243)
Resolve the bot username dynamically from CODEBERG_TOKEN via the /user
API endpoint and filter out bot comments from the prior-context section.
Additional bot accounts can be specified via CODEBERG_BOT_USERNAMES env
var (comma-separated).
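
The filtering step can be sketched without any network access by passing the resolved bot login in directly. `is_bot_author` is a hypothetical helper; in the commit the primary login is resolved from CODEBERG_TOKEN via the /user endpoint, and the extra names come from CODEBERG_BOT_USERNAMES.

```shell
# Sketch only: is_bot_author is a hypothetical helper. The primary bot login
# is passed in directly so the example needs no API call.
# Usage: is_bot_author <comment_author> <primary_bot_login> [extra_csv]
# Returns 0 if the author is a bot account, 1 otherwise.
is_bot_author() (
  author=$1
  primary_bot=$2
  extra_csv=${3:-}
  # An empty author never counts as a bot.
  [ -n "$author" ] || return 1
  if [ "$author" = "$primary_bot" ]; then
    return 0
  fi
  # Wrap both sides in commas so only whole names match.
  case ",$extra_csv," in
    *",$author,"*) return 0 ;;
  esac
  return 1
)
```

A caller would pass `"$CODEBERG_BOT_USERNAMES"` as the third argument and skip any comment for which the function returns 0.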

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 19:01:56 +00:00
johba
00d11efa82 Merge pull request 'fix: Stale lock path in existing dev-agent health check (line 373) (#242)' (#422) from fix/issue-242 into main 2026-03-20 19:54:02 +01:00
openhands
8fb6638589 fix: Stale lock path in existing dev-agent health check (line 373) (#242)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 18:44:59 +00:00
johba
d99be4db57 Merge pull request 'fix: lib/matrix_listener.sh: review case reads a separate /tmp/review-thread-map (col 2) instead of the standard THREAD_MAP (col 4) (#238)' (#421) from fix/issue-238 into main 2026-03-20 19:39:02 +01:00
openhands
db66e35556 fix: lib/matrix_listener.sh: review case reads a separate /tmp/review-thread-map (col 2) instead of the standard THREAD_MAP (col 4) (#238)
- matrix_listener.sh: review case now reads PR number from column 4 of
  the standard $THREAD_MAP instead of column 2 of /tmp/review-thread-map
- review-pr.sh: pass PR_NUMBER as context_tag (4th arg) to matrix_send
  so the standard MATRIX_THREAD_MAP has it in column 4; remove separate
  /tmp/review-thread-map write
- review-poll.sh: prune from MATRIX_THREAD_MAP instead of the removed
  /tmp/review-thread-map

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 18:21:42 +00:00
johba
258cc1d1e3 Merge pull request 'fix: dev-poll.sh has no explicit guard for action-labeled issues (#233)' (#414) from fix/issue-233 into main 2026-03-20 19:14:03 +01:00
johba
4377c0812f Merge pull request 'feat: disinto predictor — daily cron-driven formula (#406)' (#417) from action/issue-406 into main 2026-03-20 19:09:02 +01:00
openhands
0aa6528709 fix: address review — WOODPECKER_SERVER var, update AGENTS.md for new predictor
- Fix bug: replace WOODPECKER_URL with WOODPECKER_SERVER throughout
  run-predictor.toml (CI trends were silently skipped)
- Update AGENTS.md: new Predictor section reflecting predictor/ directory,
  formula-based architecture, daily 06:00 cron, supersedes legacy
  prediction-agent.sh
- Update directory layout, formula-session.sh sourced-by list, label table,
  and planner future-direction anchor
- Remove redundant Completion section from formula (PROMPT_FOOTER handles it)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 18:00:21 +00:00