Commit graph

71 commits

Author SHA1 Message Date
openhands
d9520f48a6 fix: feat: supervisor breaks down escalated CI failures into sub-issues (#52)
- supervisor-poll.sh: replace P3 escalation log with actionable sub-issue creation.
  For each entry in escalations.jsonl: fetch CI logs via woodpecker-cli, create one
  sub-issue per file for ShellCheck failures, one combined issue for other CI failures,
  or a fallback investigation issue if logs are unavailable. Move processed entries to
  escalations.done.jsonl and clear escalations.jsonl.
- dev-poll.sh: add is_escalated() helper that checks both escalations.jsonl and
  escalations.done.jsonl; use it (alongside ci_fix_count >= 3) in all three CI-fix
  spawn paths so escalated PRs are skipped even if the ci-fixes tracker is reset.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 14:32:41 +00:00
johba
531ae5cf71 fix: escalate once then continue to backlog (#59)
Two bugs after #53 merged:
1. Escalation written every poll cycle (4 entries in 30min) — now writes once, bumps counter to 4 to skip
2. Exit after escalation blocked backlog work — now falls through to pick up next issue

Co-authored-by: openhands <openhands@all-hands.dev>
Reviewed-on: https://codeberg.org/johba/disinto/pulls/59
Reviewed-by: review_bot <review_bot@noreply.codeberg.org>
2026-03-17 15:14:48 +01:00
johba
c24adc4ea2 fix: limit CI fix respawn to 3 attempts, then escalate to supervisor (#53)
Dev-poll spawned a fresh agent every 10min for CI failures. Each agent started with CI_FIX_COUNT=0 — infinite loop.

Now tracks attempts per PR in `/tmp/dev-poll-ci-fixes-{project}.json`. After 3 failed rounds:
- Writes escalation to `supervisor/escalations.jsonl`
- Sends Matrix alert
- Stops respawning

Part of #52 (supervisor escalation pipeline).

Co-authored-by: openhands <openhands@all-hands.dev>
Reviewed-on: https://codeberg.org/johba/disinto/pulls/53
Reviewed-by: review_bot <review_bot@noreply.codeberg.org>
2026-03-17 13:15:49 +01:00
openhands
ef77c56217 fix: extract ci_passed() helper — fix all CI gates for no-CI projects
dev-poll.sh had 5 places checking CI_STATE='success', all blocking
projects without CI. Extracted ci_passed() helper that treats
empty/pending/unknown as pass when WOODPECKER_REPO_ID=0.
2026-03-17 09:51:18 +00:00
openhands
1b3559bba7 fix: enforce single-threaded pipeline per project
Don't start new issues while open PRs are waiting for review/CI.
This prevents dev-agent from churning through backlog issues
without reviews landing first.
2026-03-17 09:17:02 +00:00
openhands
249eef86c1 fix: per-project lock and log files for dev-poll
Hardcoded /tmp/dev-agent.lock meant harb and disinto dev-polls shared
a lock — one project's running agent blocked the other. Now uses
/tmp/dev-agent-{project}.lock and dev-agent-{project}.log.
2026-03-17 08:18:24 +00:00
johba
9050413994 refactor: split supervisor into infra + per-project, make poll scripts config-driven
Supervisor split (#26):
- Layer 1 (infra): P0 memory, P1 disk, P4 housekeeping — runs once, project-agnostic
- Layer 2 (per-project): P2 CI/dev-agent, P3 PRs/deps — iterates projects/*.toml
- Adding a new project requires only a new TOML file, no code changes

Poll scripts accept project TOML arg (#27):
- dev-poll.sh, review-poll.sh, gardener-poll.sh accept optional project TOML as $1
- env.sh loads PROJECT_TOML if set, overriding .env defaults
- Cron: `dev-poll.sh projects/versi.toml` targets that project

New files:
- lib/load-project.sh: TOML to env var loader (Python tomllib)
- projects/versi.toml: current project config extracted from .env

Backwards compatible: scripts without a TOML arg fall back to .env config.

Closes #26, Closes #27

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 08:57:18 +01:00
johba
98f0c40106 refactor: rewrite parse-deps.py as pure bash, remove only Python from repo
Replace lib/parse-deps.py with lib/parse-deps.sh to keep the toolchain
all-bash. Rewrite supervisor P3b cycle detection and P3c stale dep check
as pure bash using associative arrays and DFS.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 21:22:53 +01:00
johba
6cf580c010 refactor: extract shared dep parser to lib/parse-deps.py (Closes #20)
Single source of truth for dependency parsing, replacing three copies:
- dev-poll.sh get_deps() now calls parse-deps.py
- supervisor P3b/P3c import parse_deps() via importlib

Supports stdin, argument, and --json modes for different callers.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 21:16:49 +01:00
johba
77cb4c4643 refactor: rename factory/ → supervisor/, factory-poll → supervisor-poll
The supervisor agent was confusingly named "factory" (same as the
project). Rename directory, script, log, lock, status, and escalation
files. Update all references across scripts and docs.

FACTORY_ROOT env var unchanged (refers to project root, not agent).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-15 18:06:25 +01:00
openhands
c5fdd8ac50 fix: always rebase on merge failure, don't trust mergeable field
Codeberg's mergeable field flickers between true/false — unreliable
for deciding whether to rebase. Just attempt rebase on any non-200/204.
Worst case it's a no-op. Also added git fetch before rebase.
2026-03-15 10:51:09 +00:00
openhands
c22f1acbdf fix: add matrix notifications for silent failure paths
dev-poll.sh:
- Merge conflicts (rebase attempt + outcome)
- Non-conflict merge failures (HTTP code)
- Low memory skip
- New issue launch

review-poll.sh:
- Review script failure
2026-03-15 10:27:23 +00:00
openhands
b4d14c4c98 fix: auto-rebase on merge conflict (mergeable=false)
When merge returns non-200, check mergeable flag. If false,
rebase the PR branch onto master via worktree. If rebase fails,
spawn dev-agent to resolve. Prevents infinite 405 retry loops.

Extracted try_merge_or_rebase() helper used at all 3 merge points.
2026-03-15 10:21:40 +00:00
johba
f215fbe3cf feat: add Matrix coordination channel, replace openclaw (Closes #8)
Add matrix_send() to lib/env.sh and matrix_listener.sh daemon for
real-time notifications, threaded escalations, and human-in-the-loop
replies. All agents now notify via Matrix instead of openclaw.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 16:25:33 +01:00
johba
90ef03a304 refactor: make all scripts multi-project via env vars
Replace hardcoded harb references across the entire codebase:
- HARB_REPO_ROOT → PROJECT_REPO_ROOT (with deprecated alias)
- Derive PROJECT_NAME from CODEBERG_REPO slug
- Add PRIMARY_BRANCH (master/main), WOODPECKER_REPO_ID env vars
- Parameterize worktree prefixes, docker container names, branch refs
- Genericize agent prompts (gardener, factory supervisor)
- Update best-practices docs to use $-vars, prefix harb lessons

All project-specific values now flow from .env → lib/env.sh → scripts.
Backward-compatible: existing harb setups work without .env changes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 13:49:09 +01:00
openhands
a3f0f7f6f3 fix: stuck PR issue extraction — check title, body (Closes #N), log skip
PRs #684 and #710 had no issue number in branch name or title.
Now also checks PR body for 'Closes #NNN'. If still no issue found,
logs a skip (dev-agent requires an issue number to work).
2026-03-14 12:01:36 +00:00
openhands
30b31c76aa fix: stuck PR detection only matched fix/issue-NNN branches
PRs with custom branch names (fix/fitness-factory-address,
chore/seed-consolidation) were invisible to priority 1.5.
Now also extracts issue number from PR title (#NNN) as fallback.
2026-03-14 11:12:27 +00:00
openhands
0f979fd6c9 fix: stuck PRs priority + STATE.md in first commit + 405 bug in dev-poll
1. PRIORITY 1.5 in dev-poll: scan ALL open PRs for REQUEST_CHANGES or CI
   failure before picking new backlog issues. Stuck PRs get fixed first
   to avoid complex rebases piling up.

2. STATE.md written in worktree before claude starts (included in first
   commit, not a separate push that dismisses stale approvals).

3. Removed HTTP 405 from merge success check in dev-poll.sh (was fixed
   in dev-agent.sh but not here — 2 occurrences).
2026-03-14 07:34:47 +00:00
openhands
98210cc302 fix: dep check — trust closed state, drop merged-PR search
The merged-PR search was over-engineered and caused false negatives
(couldn't match PR to issue when title/body didn't contain #NNN).
Issue closed = dep satisfied. Factory only closes after merging.
2026-03-13 11:25:35 +00:00
openhands
d61dead3f1 fix: dep check fallback — also check PR with same number as issue
Codeberg uses shared issue/PR numbering. When a PR IS the dep issue
(e.g. PR #665 fixes issue #665), the title search misses it.
Fallback checks if pulls/{dep_num} is merged.
2026-03-13 11:24:14 +00:00
openhands
cb24968d9b feat: dark factory — autonomous CI/CD agents for harb
Three agents extracted from ~/scripts/harb-{dev,review}/:

- dev/ — pull-based dev agent (find ready issues → implement → PR → merge)
- review/ — AI code review (structured verdicts, follow-up issues)
- factory/ — supervisor (bash health checks, auto-fix, escalation)

All secrets externalized to .env (see .env.example).
Shared env/helpers in lib/env.sh.
2026-03-12 12:44:15 +00:00