Commit graph

87 commits

Author SHA1 Message Date
openhands
deeedd0cbf fix: CODEBERG_WEB not exported from lib/env.sh — other agents may hit the same gap (#129)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 09:40:20 +00:00
openhands
19a245fe5e fix: Coordinate review injection between review-poll.sh and dev-agent.sh to prevent double-injection (#90)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 09:01:50 +00:00
openhands
9fa4846581 fix: ALL_COMMENTS fetch is capped at limit=50 — watermark search may miss reviews on high-comment PRs (#100)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 08:13:43 +00:00
openhands
88f2268bc6 fix: idle timeout does not escalate — session dies silently (#123)
1. Timeout handler (dev-agent.sh): write escalation to project-suffixed
   file, restore backlog label, clean up phase file on idle timeout.
2. Fix escalation file naming: escalations.jsonl → escalations-${PROJECT_NAME}.jsonl
   everywhere in dev-agent.sh so gardener actually picks them up.
3. Gardener (gardener-poll.sh): handle idle_timeout reason before CI-specific
   recipe logic — create investigation sub-issue instead of silently returning.
4. Update .gitignore to match new escalations-*.jsonl pattern.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-18 07:02:33 +00:00
openhands
2446543545 fix: feat/formula not merged but formula templates and label docs already on main (#69)
- dev-agent.sh: add explicit guard that skips formula-labeled issues with a
  clear log message instead of silently producing no formula behavior
- BOOTSTRAP.md: rewrite formula label entry to state it is not yet functional
  and that dev-agent will skip such issues until feat/formula is merged

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-18 02:17:18 +00:00
openhands
bd02330b22 fix: shellcheck TODO has no enforcement — || true may never be removed (#71)
- Fix SC2164: add || exit 1 to bare cd in update-prompt.sh
- Fix SC2155: separate declare and assign in env.sh, supervisor-poll.sh, dev-agent.sh
- Fix SC2034: inline suppression for vars used by sourced helpers
- Remove unused `mergeable` declaration, rename unused loop var to `_w`
- Remove || true from shellcheck CI step — failures are now blocking
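
The SC2164 and SC2155 fixes listed above follow standard ShellCheck guidance. A minimal sketch, assuming hypothetical helper names (`enter_dir`, `read_first_line`) that do not come from the repository:

```shell
# SC2164: a bare `cd` can fail silently, so guard it.
# Inside a function `return 1` is used; at script top level the
# `|| exit 1` form named in the commit message applies instead.
enter_dir() {
  cd "$1" || return 1
}

# SC2155: `local var=$(cmd)` hides cmd's exit status, because `local`
# itself returns 0. Declaring first and assigning second keeps it visible.
read_first_line() {
  local line
  line=$(head -n 1 "$1") || return 1
  printf '%s\n' "$line"
}
```

With the split declaration, a failing `head` propagates to the caller instead of being masked.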

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-18 01:53:02 +00:00
openhands
8034b50315 fix: address review findings from issue #76
- Fix double-injection bug: flat-file write only when direct tmux inject didn't happen
- Fix ci_exhausted href='#' fallback to use CODEBERG_WEB/pulls/N
- Remove duplicate $THREAD_FILE in rm command
- HTML-escape CI snippet before embedding in <pre> block
- notify_ctx falls back to plain matrix_send when no thread exists
- Thread root uses HTML-formatted message for consistency
- Deduplicate _ci_pipeline_url variable
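
The HTML-escaping item above could be done along these lines. This is only a sketch: `html_escape` is a hypothetical name, and the actual helper in the repository may differ.

```shell
# Escape the characters that are significant in HTML so that log text
# can be embedded in a <pre> block. `&` must be handled first, otherwise
# the entities produced by the later rules would be escaped again.
html_escape() {
  sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}
```

Usage: `printf '%s' "$ci_snippet" | html_escape`, where `ci_snippet` stands for whatever variable holds the raw CI output.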

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-18 00:42:00 +00:00
openhands
814706bf90 fix: feat: Matrix notifications — contextual, linked, conversational (#76)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-18 00:20:11 +00:00
openhands
bfe0c09b5c fix: address review findings from issue #81
- Fix dev-agent.sh comment: gardener-poll.sh is the backup injector, not review-poll.sh
- Add renotify marker cleanup to gardener injection path
- Use atomic mv to claim reply file, preventing double-injection race between supervisor and gardener
- Add break after supervisor injection for symmetry with gardener
- Remove overly prescriptive PHASE:awaiting_ci hardcode from injection instructions
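
The atomic-`mv` claim mentioned above can be sketched as follows. The function name and the `.claimed.<pid>` suffix are assumptions for illustration; only the use of `mv` to claim the reply file comes from the commit message.

```shell
# Try to claim a reply file by renaming it to a name unique to this process.
# A rename within one filesystem is atomic, so when two pollers race only
# one mv succeeds; the other finds the source gone and backs off.
claim_reply_file() {
  local src=$1
  local claimed="${src}.claimed.$$"
  if mv "$src" "$claimed" 2>/dev/null; then
    printf '%s\n' "$claimed"
    return 0
  fi
  return 1
}
```

The caller injects the contents of the claimed path only when the function returns 0, which is what prevents the supervisor and gardener from both injecting the same reply.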

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 22:40:54 +00:00
openhands
48683e508c fix: feat: supervisor-poll.sh and gardener-poll.sh inject human replies into needs_human dev sessions (#81)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 22:33:28 +00:00
openhands
515c528e10 fix: poll for claude readiness before injecting prompt into tmux
Replace fixed sleep(3) + paste-buffer race with a wait_for_claude_ready()
function that polls the tmux pane for the ❯ prompt (up to 120s). This
fixes the bug where the initial prompt was pasted before Claude Code
finished initializing, resulting in a stuck session with an empty prompt.

Observed on issue #81: session sat idle for 42+ minutes because the
paste arrived during Claude's startup splash screen.

Changes:
- Add wait_for_claude_ready() that polls tmux capture-pane for ❯
- Call it inside inject_into_session() before every paste
- Use inject_into_session() for initial prompt (was inline paste-buffer)
- Remove fixed sleep(3) from session creation and recovery paths
- Fail hard if claude doesn't become ready within timeout
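
A rough sketch of the polling approach described above, not the actual function from dev-agent.sh. The argument order, the polling interval, and the `CAPTURE_CMD` override (added here so the loop can be exercised without tmux) are all assumptions.

```shell
# Poll a tmux pane until the prompt character appears or the timeout elapses.
# Arguments: session name, timeout in seconds (default 120), interval (default 2).
wait_for_claude_ready() {
  local session=$1
  local timeout=${2:-120}
  local interval=${3:-2}
  local waited=0
  local pane
  while [ "$waited" -lt "$timeout" ]; do
    # `tmux capture-pane -p -t <target>` prints the pane contents to stdout.
    pane=$(${CAPTURE_CMD:-tmux capture-pane -p -t} "$session" 2>/dev/null) || pane=""
    case $pane in
      *'❯'*) return 0 ;;
    esac
    sleep "$interval"
    waited=$((waited + interval))
  done
  # Not ready within the timeout: the caller is expected to fail hard.
  return 1
}
```

A caller would run `wait_for_claude_ready "$session" || exit 1` before each paste.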

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 23:23:12 +01:00
openhands
d59c09eb5b fix: address review findings from issue #80 phase protocol
- Add missing MAX_CI_FIXES=3 and MAX_REVIEW_ROUNDS=5 constants to the
  config section; referencing undefined variables with set -euo pipefail
  caused an abort on first CI failure or REQUEST_CHANGES review.

- cleanup() trap now calls kill_tmux_session() so any unexpected exit
  (SIGTERM, errexit, unbound variable) kills the Claude session rather
  than leaving it running autonomously without an orchestrator.

- do_merge() initial CI wait loop now breaks and returns 1 immediately
  on failure/error states, avoiding a full 10-minute poll before a
  merge attempt that would also fail.

- Inner review-poll loop no longer updates LAST_PHASE_MTIME when it
  detects a mid-wait phase-file change; leaving it stale ensures the
  outer loop detects and dispatches the new phase on its next tick
  (previously the phase was silently swallowed).

- post_refusal_comment dedup now fetches the last 5 comments and checks
  any of them, so a human reply between two agent runs no longer causes
  a duplicate refusal comment.

- Remove duplicate DELETE labels/backlog call in claim section.
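
The first item above (undefined variables aborting under `set -u`) can be illustrated with a short sketch. It uses the default-if-unset form rather than the plain assignments the commit describes, and `ci_budget_left` is a hypothetical helper:

```shell
set -u
# Under `set -u`, expanding an unset variable aborts the script.
# `: "${VAR:=default}"` assigns the default only when VAR is unset or
# empty, so a value supplied by the environment still wins.
: "${MAX_CI_FIXES:=3}"
: "${MAX_REVIEW_ROUNDS:=5}"

# Succeeds while the number of CI fix attempts is below the budget.
ci_budget_left() {
  local attempts=$1
  [ "$attempts" -lt "$MAX_CI_FIXES" ]
}
```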

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 20:40:35 +00:00
openhands
db92bc13b5 fix: feat: tmux session manager in dev-agent.sh (#80)
Replace fire-and-forget `claude -p` calls with a persistent tmux session
that Claude Code runs in interactively. The orchestrator (dev-agent.sh)
monitors a phase file and reacts to Claude's signals:

- Session lifecycle: create `dev-{project}-{issue}` tmux session, send
  the full initial prompt (issue body + phase protocol instructions) via
  `tmux load-buffer` / `tmux paste-buffer`, then enter a phase monitor loop.

- Phase monitor loop: polls `/tmp/dev-session-{project}-{issue}.phase`
  every 30s for mtime changes. Handles all five phase sentinels:
  - PHASE:awaiting_ci   → create PR if needed, poll CI, inject result
  - PHASE:awaiting_review → poll for review comment, inject verdict
  - PHASE:needs_human  → send Matrix notification, wait for injection
  - PHASE:done         → call do_merge(), exit on success
  - PHASE:failed       → detect refusal JSON vs genuine failure, post
                          comment / escalate, kill session, restore backlog

- Crash recovery: if the tmux session dies unexpectedly, dev-agent.sh
  restarts it in the same worktree and injects a recovery prompt with
  the last known phase and git diff.

- Idle timeout: 2h with no phase update kills the session gracefully.

- PR creation moved into the PHASE:awaiting_ci handler; Claude pushes the
  branch and writes the phase, orchestrator creates the PR and starts CI.

- Summary file `/tmp/dev-impl-summary-{project}-{issue}.txt` carries the
  implementation summary (for PR body) and refusal JSON between Claude and
  the orchestrator.

- All existing logic preserved: dep preflight, label management, do_merge()
  with rebase retry, CI escalation, prior art detection, log rotation.
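
The dispatch step of the phase monitor loop might look roughly like the sketch below. The five sentinels come from the commit message; the function name and the echoed action names are placeholders, and the 30s mtime polling around it is omitted.

```shell
# Read the first line of the phase file and map the sentinel to an action.
# The echoed names stand in for the real handlers, which are not shown here.
dispatch_phase() {
  local phase_file=$1
  local phase
  phase=$(head -n 1 "$phase_file" 2>/dev/null) || return 1
  case $phase in
    PHASE:awaiting_ci)     echo "poll_ci" ;;
    PHASE:awaiting_review) echo "poll_review" ;;
    PHASE:needs_human)     echo "notify_human" ;;
    PHASE:done)            echo "merge" ;;
    PHASE:failed)          echo "handle_failure" ;;
    *)                     echo "unknown"; return 1 ;;
  esac
}
```

Keeping the mapping in one `case` statement makes it easy to see that every sentinel has a handler and that anything else is rejected.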

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 20:20:38 +00:00
openhands
1b29baebc3 fix: feat: auto-pull factory code on every agent spawn (#85)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 19:35:29 +00:00
openhands
8c816d6e7b fix: dev-agent CI wait loop blocks forever for projects without CI
The wait-for-CI loop sleeps 30s × 60 iterations waiting for CI
to report. Projects with WOODPECKER_REPO_ID=0 never get a status,
so the agent times out after 30min without merging approved PRs.

Now detects no-CI early and treats as success immediately.
2026-03-17 15:35:40 +00:00
johba
915ff45cc6 Merge pull request 'fix: dev-agent merge gate blocks projects without CI' (#43) from fix/dev-agent-merge-no-ci into main
Reviewed-on: https://codeberg.org/johba/disinto/pulls/43
2026-03-17 10:49:19 +01:00
openhands
ad9d68e525 fix: dev-agent merge gate requires CI even for projects without CI
Same pattern as review-poll — projects with WOODPECKER_REPO_ID=0
treat empty/unknown CI as pass for the merge gate.
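
A minimal sketch of such a merge gate, assuming the CI state arrives as a plain string and that an unset `WOODPECKER_REPO_ID` should behave like `0`. The function name `ci_gate_passes` is hypothetical.

```shell
# Decide whether the CI state allows a merge.
# For projects without CI (WOODPECKER_REPO_ID=0) no status ever arrives,
# so an empty or "unknown" state counts as a pass.
ci_gate_passes() {
  local state=$1
  local repo_id=${WOODPECKER_REPO_ID:-0}
  if [ "$state" = "success" ]; then
    return 0
  fi
  if [ "$repo_id" = "0" ]; then
    case $state in
      ''|unknown) return 0 ;;
    esac
  fi
  return 1
}
```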
2026-03-17 09:48:13 +00:00
openhands
9445e36a1e fix: auto-close issues when dev-agent detects already_done
Previously the agent unclaimed the issue but left it open, causing
an infinite claim/refuse/unclaim loop on every poll cycle.
2026-03-17 09:38:08 +00:00
openhands
ea033d3f04 fix: TMPDIR unbound variable crashes already_done handler
TMPDIR is not guaranteed to be set. Replaced with /tmp/ directly.
This caused harb dev-agent to crash when posting refusal comments,
leaving issues stuck in a retry loop.
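
The commit hard-codes `/tmp/`. An alternative that also survives `set -u` is a default expansion, sketched here with a hypothetical helper name:

```shell
set -u
# ${TMPDIR:-/tmp} expands to /tmp when TMPDIR is unset or empty,
# so the expansion cannot trigger an "unbound variable" abort.
tmp_path_for() {
  printf '%s/%s\n' "${TMPDIR:-/tmp}" "$1"
}
```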
2026-03-17 09:00:43 +00:00
openhands
b376fbc25e fix: dev-agent.sh also needs per-project lock file
dev-poll.sh was fixed but dev-agent.sh still used hardcoded
/tmp/dev-agent.lock. The disinto agent locked out harb's agent.
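
One way to build a per-project lock is sketched below. The path pattern and the noclobber-based locking are assumptions; the commit only states that the lock file became per-project.

```shell
# Build a lock path that includes the project name, so agents for
# different projects do not block each other.
lock_path() {
  printf '/tmp/dev-agent-%s.lock\n' "$1"
}

# Take the lock with noclobber: the redirect fails if the file already
# exists, which makes create-if-absent a single step.
acquire_lock() {
  local lock=$1
  ( set -o noclobber; echo "$$" > "$lock" ) 2>/dev/null
}
```

Usage: `acquire_lock "$(lock_path "$PROJECT_NAME")" || exit 0`, with the lock removed again on exit.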
2026-03-17 08:41:15 +00:00
openhands
5aa0b42481 fix: dev-agent preflight treats ## Related refs as dependencies
The broad regex `(?:^|\n)\s*-\s*#\K[0-9]+` matched ANY bullet with #NNN,
including ## Related sections. This caused #893 (and likely others) to be
permanently blocked by sibling issues that aren't actual dependencies.

Now only extracts deps from:
- Inline 'depends on #NNN' / 'blocked by #NNN' phrases
- ## Dependencies / ## Depends on / ## Blocked by sections

This matches the same logic used by dev-poll.sh get_deps().
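
The narrower extraction rules can be sketched with awk. This illustrates the two rules above and is not the actual `get_deps()` code. It reads an issue body on stdin and prints one dependency number per line.

```shell
extract_deps() {
  awk '
    # A heading line: remember whether it opens a dependency section.
    /^#+[ \t]/ {
      h = tolower($0)
      in_deps = (h ~ /^#+[ \t]+(dependencies|depends on|blocked by)/)
      next
    }
    {
      line = tolower($0)
      if (in_deps) {
        # Inside a dependency section every #NNN counts.
        while (match(line, /#[0-9]+/)) {
          print substr(line, RSTART + 1, RLENGTH - 1)
          line = substr(line, RSTART + RLENGTH)
        }
      } else {
        # Elsewhere only the explicit phrases count.
        while (match(line, /(depends on|blocked by) #[0-9]+/)) {
          m = substr(line, RSTART, RLENGTH)
          sub(/^[^#]*#/, "", m)
          print m
          line = substr(line, RSTART + RLENGTH)
        }
      }
    }
  ' | sort -un
}
```

Bullets under a `## Related` heading are ignored because that heading resets the section flag.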
2026-03-17 05:58:54 +00:00
johba
77cb4c4643 refactor: rename factory/ → supervisor/, factory-poll → supervisor-poll
The supervisor agent was confusingly named "factory" (same as the
project). Rename directory, script, log, lock, status, and escalation
files. Update all references across scripts and docs.

FACTORY_ROOT env var unchanged (refers to project root, not agent).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-15 18:06:25 +01:00
openhands
0a0d5e8e24 fix: inline merge+rebase in recovery path (do_merge not yet defined)
do_merge() is defined at line 876, but recovery mode calls it at
line ~498. Bash requires functions to be defined before use.
Inlined the merge→rebase→re-approve→retry logic directly.
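
The ordering constraint can be reproduced in a few lines. The function name is illustrative only.

```shell
# Bash looks up a function name when the call executes, not when the file
# is parsed. Calling before the definition has run fails with
# "command not found" (exit status 127).
( early_call ) 2>/dev/null || echo "early_call is not defined yet"

early_call() { echo "now defined"; }

# After the definition has been executed, the same call succeeds.
early_call
```

A function body may still refer to functions defined further down the file, provided the call itself happens after those definitions have run.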
2026-03-15 14:10:21 +00:00
openhands
2c527cef4a fix: dev-agent handles approved+stuck PRs in recovery mode
1. Recovery mode: if PR already has approval + green CI, try merge
   immediately instead of entering the review wait loop forever.
2. do_merge: on 405/merge failure, rebase → force push → wait CI →
   re-approve via review_bot → retry merge. Covers the stale-approval
   dismissal problem end-to-end.
3. Codeberg mergeable field is unreliable — rebase on any merge failure.
2026-03-15 14:09:33 +00:00
johba
9b0c1e6c30 feat: add planner-agent, remove STATE.md append from dev-agent
- Remove write_state_entry/append_state_log from dev-agent (#10)
- Add planner-agent.sh: rebuilds STATE.md from git history + closed
  issues, then gap-analyses against VISION.md to create backlog
  issues (#6, #7)
- Add planner-poll.sh: cron wrapper with lock + memory guard

STATE.md is now solely owned by the planner — one compact snapshot
rebuilt each run, not an ever-growing append log.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-15 11:45:16 +01:00
johba
f215fbe3cf feat: add Matrix coordination channel, replace openclaw (Closes #8)
Add matrix_send() to lib/env.sh and matrix_listener.sh daemon for
real-time notifications, threaded escalations, and human-in-the-loop
replies. All agents now notify via Matrix instead of openclaw.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 16:25:33 +01:00
johba
90ef03a304 refactor: make all scripts multi-project via env vars
Replace hardcoded harb references across the entire codebase:
- HARB_REPO_ROOT → PROJECT_REPO_ROOT (with deprecated alias)
- Derive PROJECT_NAME from CODEBERG_REPO slug
- Add PRIMARY_BRANCH (master/main), WOODPECKER_REPO_ID env vars
- Parameterize worktree prefixes, docker container names, branch refs
- Genericize agent prompts (gardener, factory supervisor)
- Update best-practices docs to use $-vars, prefix harb lessons

All project-specific values now flow from .env → lib/env.sh → scripts.
Backward-compatible: existing harb setups work without .env changes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 13:49:09 +01:00
openhands
2b3c488f1c fix: STATE.md entries without status prefix — reads as done once merged
2026-03-14 12:14:42 +00:00
openhands
7fd913596b fix: write_state_entry defined after call site — crashes dev-agent
Function was defined at line 867 but called at line 550. Bash requires
functions to be defined before invocation. Moved to top with other
helpers. Also removed duplicate definition.
2026-03-14 11:01:05 +00:00
openhands
0f979fd6c9 fix: stuck PRs priority + STATE.md in first commit + 405 bug in dev-poll
1. PRIORITY 1.5 in dev-poll: scan ALL open PRs for REQUEST_CHANGES or CI
   failure before picking new backlog issues. Stuck PRs get fixed first
   to avoid complex rebases piling up.

2. STATE.md written in worktree before claude starts (included in first
   commit, not a separate push that dismisses stale approvals).

3. Removed HTTP 405 from merge success check in dev-poll.sh (was fixed
   in dev-agent.sh but not here — 2 occurrences).
2026-03-14 07:34:47 +00:00
openhands
8e3b72d13f feat: dev-agent auto-rebase before merge
When PR has merge conflicts (mergeable=false), attempt git rebase
before merge. If rebase fails, abort and escalate via notify.

Flow: approval → check mergeable → rebase if needed → wait CI → merge

Resolves the serial seed PR bottleneck where append-only files
(manifest.jsonl) create trivial conflicts that block the pipeline.
2026-03-13 19:56:12 +00:00
openhands
0132c7acc4 fix: 405 treated as merge success + STATE.md push dismissed approvals
Root cause: Two bugs combined to silently close PRs without merging.

1. HTTP 405 ('not allowed to merge') was in the success condition
   alongside 200/204. Codeberg returns 405 when branch protection
   blocks the merge (e.g., stale approvals).

2. append_state_log pushed a new commit AFTER review_bot approved,
   but BEFORE the merge attempt. With dismiss_stale_approvals=true,
   the new commit automatically dismissed the approval → 405.

Impact: 6 PRs (#683, #688, #692, #695, #696, #699) were 'merged'
(logged as success, branch deleted, issue closed) but never actually
merged into master. All work was lost.

Fixes:
- Remove 405 from merge success check
- Move STATE.md append out of pre-merge path
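
The corrected success check can be sketched as follows. The function name is hypothetical; the status codes are the ones named above.

```shell
# Only 200 and 204 count as a successful merge. 405 (merge not allowed,
# for example because approvals were dismissed) falls through to failure.
merge_succeeded() {
  case $1 in
    200|204) return 0 ;;
    *)       return 1 ;;
  esac
}
```

Branch deletion and issue closing should only run when this check returns 0.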
2026-03-13 17:41:10 +00:00
openhands
499f6d8828 feat: STATE.md append before merge, lives in harb repo
Moved from dark-factory to harb. Dev-agent appends one line to
STATE.md on the PR branch right before merge — goes through
review like any other change.
2026-03-13 10:27:10 +00:00
openhands
ed58874890 feat: STATE.md append on merge (dark-factory#5)
After each successful PR merge, dev-agent appends one line to
STATE.md: - [date] what now exists (#PR)

Lives in dark-factory repo (harb master is protected).
Planner will collapse this into a compact snapshot later.
2026-03-13 10:25:00 +00:00
openhands
64b464b01b feat: dev-agent → supervisor escalation via escalations.jsonl
When dev-agent exhausts CI fix budget, writes escalation marker.
Supervisor picks it up next poll and invokes claude -p to diagnose
(flaky test? rate limit? real bug?) and decide: fix, close PR, or
escalate to human.
2026-03-13 06:51:53 +00:00
openhands
4895ad1989 refactor: ci-debug to lib, rewrite README
- Moved ci-debug.sh from dev/ to lib/ (shared utility)
- README: fixed supervisor description (all alerts go to claude)
- README: replaced implementation details with actual design principles
2026-03-12 18:10:25 +00:00
openhands
cb24968d9b feat: dark factory — autonomous CI/CD agents for harb
Three agents extracted from ~/scripts/harb-{dev,review}/:

- dev/ — pull-based dev agent (find ready issues → implement → PR → merge)
- review/ — AI code review (structured verdicts, follow-up issues)
- factory/ — supervisor (bash health checks, auto-fix, escalation)

All secrets externalized to .env (see .env.example).
Shared env/helpers in lib/env.sh.
2026-03-12 12:44:15 +00:00