Commit graph

1038 commits

Author SHA1 Message Date
openhands
88d04d9edb fix: feat: review-poll.sh injects review feedback into dev tmux session (#82)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 20:52:51 +00:00
johba
efc1745399 Merge pull request 'fix: feat: tmux session manager in dev-agent.sh (#80)' (#88) from fix/issue-80 into main 2026-03-17 21:46:16 +01:00
openhands
d59c09eb5b fix: address review findings from issue #80 phase protocol
- Add missing MAX_CI_FIXES=3 and MAX_REVIEW_ROUNDS=5 constants to the
  config section; referencing undefined variables with set -euo pipefail
  caused an abort on first CI failure or REQUEST_CHANGES review.

- cleanup() trap now calls kill_tmux_session() so any unexpected exit
  (SIGTERM, errexit, unbound variable) kills the Claude session rather
  than leaving it running autonomously without an orchestrator.

- do_merge() initial CI wait loop now breaks and returns 1 immediately
  on failure/error states, avoiding a full 10-minute poll before a
  merge attempt that would also fail.

- Inner review-poll loop no longer updates LAST_PHASE_MTIME when it
  detects a mid-wait phase-file change; leaving it stale ensures the
  outer loop detects and dispatches the new phase on its next tick
  (previously the phase was silently swallowed).

- post_refusal_comment dedup now fetches the last 5 comments and checks
  any of them, so a human reply between two agent runs no longer causes
  a duplicate refusal comment.

- Remove duplicate DELETE labels/backlog call in claim section.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 20:40:35 +00:00
openhands
db92bc13b5 fix: feat: tmux session manager in dev-agent.sh (#80)
Replace fire-and-forget `claude -p` calls with a persistent tmux session
that Claude Code runs in interactively. The orchestrator (dev-agent.sh)
monitors a phase file and reacts to Claude's signals:

- Session lifecycle: create `dev-{project}-{issue}` tmux session, send
  the full initial prompt (issue body + phase protocol instructions) via
  `tmux load-buffer` / `tmux paste-buffer`, then enter a phase monitor loop.

- Phase monitor loop: polls `/tmp/dev-session-{project}-{issue}.phase`
  every 30s for mtime changes. Handles all five phase sentinels:
  - PHASE:awaiting_ci   → create PR if needed, poll CI, inject result
  - PHASE:awaiting_review → poll for review comment, inject verdict
  - PHASE:needs_human  → send Matrix notification, wait for injection
  - PHASE:done         → call do_merge(), exit on success
  - PHASE:failed       → detect refusal JSON vs genuine failure, post
                          comment / escalate, kill session, restore backlog

- Crash recovery: if the tmux session dies unexpectedly, dev-agent.sh
  restarts it in the same worktree and injects a recovery prompt with
  the last known phase and git diff.

- Idle timeout: 2h with no phase update kills the session gracefully.

- PR creation moved into the PHASE:awaiting_ci handler; Claude pushes the
  branch and writes the phase, orchestrator creates the PR and starts CI.

- Summary file `/tmp/dev-impl-summary-{project}-{issue}.txt` carries the
  implementation summary (for PR body) and refusal JSON between Claude and
  the orchestrator.

- All existing logic preserved: dep preflight, label management, do_merge()
  with rebase retry, CI escalation, prior art detection, log rotation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 20:20:38 +00:00
johba
fc4248cf6b Merge pull request 'fix: feat: auto-pull factory code on every agent spawn (#85)' (#87) from fix/issue-85 into main 2026-03-17 21:03:27 +01:00
openhands
1b29baebc3 fix: feat: auto-pull factory code on every agent spawn (#85)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 19:35:29 +00:00
johba
64b53aaad5 Merge pull request 'fix: feat: define phase-signaling protocol for persistent Claude sessions (#79)' (#84) from fix/issue-79 into main 2026-03-17 20:32:51 +01:00
openhands
2b534bb7ec fix: address review findings from issue #79 phase protocol
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 19:27:11 +00:00
openhands
275b92e8b5 fix: address review findings from issue #79 phase protocol
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 19:21:01 +00:00
openhands
d87b7db8f3 fix: feat: define phase-signaling protocol for persistent Claude sessions (#79)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 18:53:15 +00:00
johba
a74eb8ba08 Merge pull request 'fix: refactor: move escalation processing from supervisor to gardener (#67)' (#73) from fix/issue-67 into main 2026-03-17 19:36:48 +01:00
openhands
df2522a7cb fix: address review findings from issue #67 escalation refactor
- supervisor: skip *.done.jsonl in escalation glob (bug: wildcard matched
  harb.done.jsonl producing spurious 'pending' log noise every cycle)
- supervisor: use wc -l instead of grep -c . for line counting (style nit)
- supervisor: consume gardener-esc-resolved.log via fixed() so escalation
  resolutions appear in end-of-cycle supervisor reporting
- dev-poll: update all 'escalated to supervisor' log/matrix strings to
  'escalated to gardener' (lines 263, 268, 344, 420)
- gardener: track _esc_total_created across all escalation entries and
  write count to supervisor/gardener-esc-resolved.log after processing

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 18:30:57 +00:00
openhands
150ede5605 fix: refactor: move escalation processing from supervisor to gardener (#67)
- dev-poll.sh: write escalations to per-project files
  (supervisor/escalations-{PROJECT_NAME}.jsonl) and add "project" field
  so each project's escalations are isolated; update is_escalated() to
  read from the same per-project paths
- gardener-poll.sh: add escalation processing block that reads the
  per-project escalation file, fetches CI logs via Woodpecker, and
  creates per-file ShellCheck sub-issues or generic CI failure issues
  labeled backlog — runs with the correct CODEBERG_API and
  WOODPECKER_REPO_ID already loaded from the project TOML
- supervisor-poll.sh: remove the escalation processing block; replace
  with a simple flog report counting pending escalations per project

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 17:32:56 +00:00
johba
2797325d00 Merge pull request 'fix: feat: Woodpecker CI pipeline with ShellCheck + duplicate code detection (#45)' (#48) from fix/issue-45 into main 2026-03-17 18:17:22 +01:00
johba
c5dd03b106 Merge pull request 'fix: chore: create formula label in Codeberg + add more formula templates (#22)' (#66) from fix/issue-22 into main 2026-03-17 18:08:55 +01:00
openhands
88eed09e71 fix: address review findings on formula templates and BOOTSTRAP docs
- upgrade-dependency.toml: fix forge upgrade command (forge update, not
  forge install); remove redundant `npm install` after lockfile write;
  simplify description to "Upgrade {{package}} to {{to_version}}" so it
  reads cleanly when from_version is omitted
- add-rpc-method.toml: remove dead `namespace` variable; inline namespace
  derivation logic into register-method step description
- BOOTSTRAP.md: mark formula label entry as requiring feat/formula merge;
  add YAML front matter example so operators know the issue schema

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 16:52:49 +00:00
openhands
d16dc6175d fix: chore: create formula label in Codeberg + add more formula templates (#22)
- Add formulas/upgrade-dependency.toml: multi-ecosystem (npm/cargo/forge) dependency upgrade
  with steps for checking changelog, upgrading, applying breaking changes, and running tests
- Add formulas/add-rpc-method.toml: JSON-RPC method addition with steps for reading
  existing patterns, implementing handler, registering, writing tests, and running tests
- Document `formula` label in BOOTSTRAP.md optional labels table

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 16:40:56 +00:00
openhands
29d76c6d8b fix: make shellcheck non-blocking until existing warnings are fixed
ShellCheck finds real issues in existing code. Making it blocking
means the CI pipeline PR can't pass its own CI (chicken-and-egg).

Report warnings but don't fail — fix them incrementally via backlog.
2026-03-17 16:35:12 +00:00
johba
66e6095468 Merge pull request 'fix: docs: update BOOTSTRAP.md with multi-project setup lessons (#47)' (#64) from fix/issue-47 into main 2026-03-17 17:19:36 +01:00
openhands
566846a5c3 fix: correct multi-project cron comment and gardener TOML docs
- Fix Project B comment: 'dev +11' → 'dev +1' (cron offset is :01, not :11)
- Fix gap claim: cross-project gaps are 2 min, not 3
- Update prose to say '2 minutes' instead of '3-minute offsets'
- Add gardener-poll.sh to the list of scripts accepting a TOML arg
- Add per-project gardener cron lines to the multi-project example

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 16:03:38 +00:00
johba
552b6edbaf Merge pull request 'fix: dev-agent CI wait loop blocks forever without CI' (#62) from fix/agent-ci-wait-no-ci into main
Reviewed-on: https://codeberg.org/johba/disinto/pulls/62
Reviewed-by: review_bot <review_bot@noreply.codeberg.org>
2026-03-17 17:00:54 +01:00
openhands
b6a1d0300a fix: docs: update BOOTSTRAP.md with multi-project setup lessons (#47)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 15:50:34 +00:00
johba
47f2686ad7 Merge pull request 'fix: feat: supervisor breaks down escalated CI failures into sub-issues (#52)' (#61) from fix/issue-52 into main 2026-03-17 16:37:03 +01:00
openhands
8c816d6e7b fix: dev-agent CI wait loop blocks forever for projects without CI
The wait-for-CI loop sleeps 30s × 60 iterations waiting for CI
to report. Projects with WOODPECKER_REPO_ID=0 never get a status,
so the agent times out after 30min without merging approved PRs.

Now detects no-CI early and treats as success immediately.
2026-03-17 15:35:40 +00:00
openhands
13bc948b1d fix: address review findings for escalation race condition, SQL injection, and sc_codes scope
- Race condition: mv escalations.jsonl to a PID-stamped snapshot before
  processing so concurrent dev-poll appends go to a fresh file; rm snapshot
  after loop — no entries are ever silently dropped
- SQL injection: validate ESC_PR_SHA is a 40-char hex string before
  interpolating into the wpdb query
- sc_codes scope: compute per-file from file_errors (already filtered to
  that file) instead of the entire step log; also switch grep to -F so
  dots in filenames are not treated as regex wildcards
- step_pid validation: reject non-integer values from Woodpecker API before
  passing as CLI argument
- Fallback body now distinguishes "CI logs unavailable" from "logs found
  but issue creation API calls failed"
- ESC_GENERIC_FAIL: avoid leading blank line by using conditional separator
  and fix code-block opening newline
- is_escalated(): remove dead esc_file/done_file locals; add Python-level
  int() guard so empty/non-numeric issue or pr values fail cleanly instead
  of producing a syntax error suppressed by 2>/dev/null

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 15:11:53 +00:00
openhands
d9520f48a6 fix: feat: supervisor breaks down escalated CI failures into sub-issues (#52)
- supervisor-poll.sh: replace P3 escalation log with actionable sub-issue creation.
  For each entry in escalations.jsonl: fetch CI logs via woodpecker-cli, create one
  sub-issue per file for ShellCheck failures, one combined issue for other CI failures,
  or a fallback investigation issue if logs are unavailable. Move processed entries to
  escalations.done.jsonl and clear escalations.jsonl.
- dev-poll.sh: add is_escalated() helper that checks both escalations.jsonl and
  escalations.done.jsonl; use it (alongside ci_fix_count >= 3) in all three CI-fix
  spawn paths so escalated PRs are skipped even if the ci-fixes tracker is reset.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 14:32:41 +00:00
johba
531ae5cf71 fix: escalate once then continue to backlog (#59)
Two bugs after #53 merged:
1. Escalation written every poll cycle (4 entries in 30min) — now writes once, bumps counter to 4 to skip
2. Exit after escalation blocked backlog work — now falls through to pick up next issue

Co-authored-by: openhands <openhands@all-hands.dev>
Reviewed-on: https://codeberg.org/johba/disinto/pulls/59
Reviewed-by: review_bot <review_bot@noreply.codeberg.org>
2026-03-17 15:14:48 +01:00
johba
c24adc4ea2 fix: limit CI fix respawn to 3 attempts, then escalate to supervisor (#53)
Dev-poll spawned a fresh agent every 10min for CI failures. Each agent started with CI_FIX_COUNT=0 — infinite loop.

Now tracks attempts per PR in `/tmp/dev-poll-ci-fixes-{project}.json`. After 3 failed rounds:
- Writes escalation to `supervisor/escalations.jsonl`
- Sends Matrix alert
- Stops respawning

Part of #52 (supervisor escalation pipeline).

Co-authored-by: openhands <openhands@all-hands.dev>
Reviewed-on: https://codeberg.org/johba/disinto/pulls/53
Reviewed-by: review_bot <review_bot@noreply.codeberg.org>
2026-03-17 13:15:49 +01:00
openhands
f541bcb073 fix: address AI review findings for CI pipeline and duplicate detection
- Fix anti-pattern regex 2 to match quoted form '"$CI_STATE" != "success"'
  (was r'\$CI_STATE\s*!=\s*"success"', now r'"?\$CI_STATE"?\s*!=\s*"success"')
- Update both anti-pattern messages to say 'extract ci_passed() to lib/'
  instead of implying it already exists as a shared helper in dev-poll.sh
- Add explicit 'when: event: [push, pull_request]' trigger block to ci.yml
- Add '-r' to xargs in shellcheck step to handle zero .sh files gracefully
- Fix operator precedence bug in review-poll.sh:62: scope the OR clause
  with braces so CI_STATE=pending bypass only applies when WOODPECKER_REPO_ID=0

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 10:18:39 +00:00
openhands
ee1af38390 fix: feat: Woodpecker CI pipeline with ShellCheck + duplicate code detection (#45)
- Add .woodpecker/ci.yml: two-step pipeline (shellcheck + duplicate detection)
- Add .woodpecker/detect-duplicates.py: sliding-window hash detection (5-line
  windows, 2+ files) plus grep-based anti-pattern checks (hardcoded CI_STATE,
  hardcoded WOODPECKER_REPO_ID). Runs as failure: ignore so CI stays green
  while findings are visible in logs.
- Add .shellcheckrc: disable SC1090/SC1091 (dynamic source paths are
  intentional; all scripts use the same lib/env.sh pattern)
- Update projects/disinto.toml: woodpecker_repo_id = 4, remove bypass comment

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 10:02:58 +00:00
johba
740bddb2db Merge pull request 'fix: feat: RESOURCES.md — infrastructure manifest for planner resource awareness (#23)' (#33) from fix/issue-23 into main 2026-03-17 10:56:52 +01:00
openhands
567dc4bde0 fix: address review findings for supervisor metrics (#24)
- planner: filter CI and dev metrics by project name to prevent cross-project pollution
- planner: replace fragile awk JSONL filter with jq select()
- supervisor: add codeberg_count_paginated() helper; replace hardcoded limit=50 dev-metric API calls with paginated counts so projects with >50 issues report accurate blocked-ratio data
- supervisor: add 24h age filter to CI metric SQL query so stale pipelines are not re-emitted with a fresh timestamp
- supervisor: replace fragile awk key-order-dependent JSON filter in rotate_metrics() with jq select(); add safety guard to prevent overwriting file with empty result on parse failure
- supervisor: move mkdir -p for metrics dir to startup (once) instead of every emit_metric() call
- supervisor: guard _RAM_TOTAL_MB against empty value in bash arithmetic

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 10:56:37 +01:00
openhands
313b6e134c chore: exclude metrics/supervisor-metrics.jsonl from git tracking
Runtime metrics file should not be tracked in version control.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 10:56:37 +01:00
openhands
53c1fea6ea fix: feat: supervisor metrics logging for planner trend analysis (#24)
- supervisor-poll.sh: append structured JSONL metrics on every poll
  - infra metric (ram_used_pct, disk_used_pct, swap_mb) after Layer 1 checks
  - ci metric (pipeline id, duration_min, status) per project via wpdb query
  - dev metric (issues_in_backlog, issues_blocked, pr_open) per project via Codeberg API
  - rotate_metrics() trims metrics/supervisor-metrics.jsonl to last 30 days on startup
- planner-agent.sh: reads last 7 days of metrics before Phase 2 gap analysis
  - computes avg CI duration, success rate, RAM/disk utilization, blocked ratio
  - injects summary into gap analysis prompt as "Operational metrics" section
  - instructs planner to create optimization issues when metrics conflict with VISION.md

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 10:56:37 +01:00
johba
183059d43f Merge pull request 'fix: ci_passed() helper — fix all CI gates in dev-poll for no-CI projects' (#44) from fix/ci-passed-helper into main
Reviewed-on: https://codeberg.org/johba/disinto/pulls/44
2026-03-17 10:51:59 +01:00
openhands
ef77c56217 fix: extract ci_passed() helper — fix all CI gates for no-CI projects
dev-poll.sh had 5 places checking CI_STATE='success', all blocking
projects without CI. Extracted ci_passed() helper that treats
empty/pending/unknown as pass when WOODPECKER_REPO_ID=0.
2026-03-17 09:51:18 +00:00
johba
915ff45cc6 Merge pull request 'fix: dev-agent merge gate blocks projects without CI' (#43) from fix/dev-agent-merge-no-ci into main
Reviewed-on: https://codeberg.org/johba/disinto/pulls/43
2026-03-17 10:49:19 +01:00
openhands
ad9d68e525 fix: dev-agent merge gate requires CI even for projects without CI
Same pattern as review-poll — projects with WOODPECKER_REPO_ID=0
treat empty/unknown CI as pass for the merge gate.
2026-03-17 09:48:13 +00:00
johba
0490a4b8d8 Merge pull request 'fix: auto-close issues when dev-agent detects already_done' (#42) from fix/close-already-done into main
Reviewed-on: https://codeberg.org/johba/disinto/pulls/42
2026-03-17 10:39:22 +01:00
openhands
9445e36a1e fix: auto-close issues when dev-agent detects already_done
Previously the agent unclaimed the issue but left it open, causing
an infinite claim/refuse/unclaim loop on every poll cycle.
2026-03-17 09:38:08 +00:00
johba
18f57c5cc4 Merge pull request 'fix: enforce single-threaded pipeline per project' (#40) from fix/single-threaded-per-project into main
Reviewed-on: https://codeberg.org/johba/disinto/pulls/40
2026-03-17 10:24:03 +01:00
openhands
1b3559bba7 fix: enforce single-threaded pipeline per project
Don't start new issues while open PRs are waiting for review/CI.
This prevents dev-agent from churning through backlog issues
without reviews landing first.
2026-03-17 09:17:02 +00:00
johba
bff73ebcf7 Merge pull request 'fix: review-pr.sh also needs CI bypass for projects without CI' (#37) from fix/review-pr-ci-bypass into main
Reviewed-on: https://codeberg.org/johba/disinto/pulls/37
2026-03-17 10:10:42 +01:00
openhands
98a9ada9c3 fix: review-pr.sh also needs CI bypass for projects without CI
review-poll.sh was fixed but review-pr.sh had its own CI gate that
still blocked. Both checks now skip CI requirement when
WOODPECKER_REPO_ID=0.
2026-03-17 09:10:06 +00:00
johba
cfd4619e81 Merge pull request 'fix: review-poll skips PRs when project has no CI' (#36) from fix/review-no-ci into main
Reviewed-on: https://codeberg.org/johba/disinto/pulls/36
2026-03-17 10:06:44 +01:00
openhands
51d2b81ef4 fix: review-poll skips PRs when project has no CI
Projects with woodpecker_repo_id=0 (like disinto) have no CI status.
Review-poll treated empty CI state as failure and skipped all PRs.
Now treats empty/pending CI as pass when no CI is configured.
2026-03-17 09:05:43 +00:00
johba
273803f47b Merge pull request 'fix: TMPDIR unbound variable crashes already_done handler' (#35) from fix/tmpdir-crash into main
Reviewed-on: https://codeberg.org/johba/disinto/pulls/35
2026-03-17 10:01:10 +01:00
openhands
ea033d3f04 fix: TMPDIR unbound variable crashes already_done handler
TMPDIR is not guaranteed to be set. Replaced with /tmp/ directly.
This caused harb dev-agent to crash when posting refusal comments,
leaving issues stuck in a retry loop.
2026-03-17 09:00:43 +00:00
johba
e2262b5737 Merge pull request 'feat: per-project Matrix room from project TOML' (#34) from feat/per-project-matrix into main
Reviewed-on: https://codeberg.org/johba/disinto/pulls/34
2026-03-17 09:59:34 +01:00
openhands
ed57eff704 feat: per-project Matrix room — load MATRIX_ROOM_ID from project TOML
Each project can specify its own Matrix room for notifications.
- harb → #harb-dev:matrix.allf.in
- disinto → #disinto-dev:matrix.allf.in
2026-03-17 08:56:00 +00:00