Commit graph

49 commits

Author SHA1 Message Date
openhands
8193e7bc96 fix: extract build_formula_issue_body to eliminate duplicate code blocks
Move TOML frontmatter construction into a shared helper in
lib/file-action-issue.sh, used by both gardener-poll.sh and
gardener-run.sh. Fixes CI duplicate-detection failure.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 10:23:00 +00:00
openhands
1782cbd610 fix: gardener-poll.sh needs to file action issues (not call gardener-agent.sh directly) (#367)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 10:19:27 +00:00
openhands
082a472b9e fix: gardener creates investigation issues for already-closed escalations (#289)
Filter stale escalation entries in gardener-poll.sh before passing them
to the agent session. For each escalation reply line, extract referenced
issue numbers (#NNN) and check their current state via the API. Discard
entries where all referenced issues are already closed, preventing the
gardener from creating investigation issues for resolved problems.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 09:27:31 +00:00
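
The stale-escalation filter described in 082a472b9e might look roughly like the sketch below. The names `issue_state` and `keep_escalation_line` are hypothetical, the API lookup is replaced by a hard-coded stub, and keeping lines that reference no issues at all is an assumption the commit message does not confirm.

```shell
# Hypothetical stand-in for the API lookup: prints "open" or "closed".
# A real version would query the forge API for the issue's current state.
issue_state() {
  case "$1" in
    289|352) echo closed ;;
    *)       echo open ;;
  esac
}

# Exit 0 if the line should be kept: it references no issues (assumption),
# or at least one referenced issue is still open.
keep_escalation_line() {
  refs=$(printf '%s\n' "$1" | grep -oE '#[0-9]+' | tr -d '#' | sort -nu)
  [ -z "$refs" ] && return 0
  for n in $refs; do
    [ "$(issue_state "$n")" = "open" ] && return 0
  done
  return 1
}
```

A caller would loop over the reply lines and pass on only those for which `keep_escalation_line` succeeds.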
openhands
7b6b56d761 fix: address review — restore +x, guard double comment, update stale docs (#352)
- Restore executable bit on gardener/gardener-poll.sh (cron invokes it directly)
- Add _BLOCKED_POSTED guard to prevent duplicate diagnostic comments when
  _on_phase_change(PHASE:crashed) and the belt-and-suspenders exit
  handler both call post_blocked_diagnostic()
- Update stale documentation:
  - gardener-run.sh: remove "CI escalation recipes" from issue body
  - AGENTS.md: update directory layout comment for gardener-poll.sh
  - gardener-poll.sh: remove recipe engine description from header

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 05:55:27 +00:00
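
A minimal sketch of the kind of guard 7b6b56d761 describes, assuming `_BLOCKED_POSTED` is a plain shell flag checked at the top of `post_blocked_diagnostic()`. The function body and the `_COMMENTS_POSTED` counter are illustrative only and not taken from the repository.

```shell
_BLOCKED_POSTED=0
_COMMENTS_POSTED=0   # illustration only: counts comments actually sent

post_blocked_diagnostic() {
  # Second and later calls in the same process are no-ops.
  [ "$_BLOCKED_POSTED" -eq 1 ] && return 0
  _BLOCKED_POSTED=1
  _COMMENTS_POSTED=$((_COMMENTS_POSTED + 1))
  echo "posting diagnostic: $1"
}

post_blocked_diagnostic "PHASE:crashed"   # e.g. from the phase-change callback
post_blocked_diagnostic "exit handler"    # e.g. from the exit handler; skipped
```

The flag only works if both callers run in the same shell process; a call made inside `$(...)` would set it in a subshell and lose it.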
openhands
61c44d31b1 fix: refactor: replace escalation JSONL with blocked label + diagnostic comment (#352)
Replace the unreliable escalation JSONL system (supervisor/escalations-*.jsonl
consumed by gardener) with direct blocked label + diagnostic comment on the
original issue.

When a dev-agent or action-agent session fails (PHASE:failed, idle timeout,
crash, CI exhausted):
- Capture last 50 lines from tmux pane via tmux capture-pane
- Post a structured diagnostic comment on the issue (exit reason, timestamp,
  PR number, tmux output)
- Label the issue "blocked" (instead of restoring "backlog")
- Remove in-progress label

Removed:
- Escalation JSONL write paths in dev-agent.sh, phase-handler.sh, dev-poll.sh,
  action-agent.sh
- is_escalated() helper in dev-poll.sh
- Escalation triage (P2f section) in supervisor-poll.sh
- Escalation processing + recipe engine in gardener-poll.sh
- ci-escalation-recipes step from run-gardener.toml formula
- escalations*.jsonl from .gitignore

Added:
- post_blocked_diagnostic() shared helper in phase-handler.sh
- ensure_blocked_label_id() helper (creates label via API if not exists)
- is_blocked() helper in dev-poll.sh (replaces is_escalated)
- Blocked issues listing in supervisor/preflight.sh

Kept:
- Matrix notifications on failure (unchanged)
- CI fix counter logic (still tracks attempts)
- needs_human injection in supervisor/gardener (not escalation-related)
- Gardener grooming (gardener-agent.sh still invoked)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 04:18:43 +00:00
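
The failure path in 61c44d31b1 (capture the pane tail, then post a structured comment) could be sketched as follows. `capture_pane_tail` and `build_diagnostic_body` are hypothetical names, the comment layout is an assumption, and the API calls that post the comment and swap labels are left out.

```shell
# Last 50 lines of a tmux pane. -p prints to stdout, -S -50 starts the
# capture 50 lines back in history; tail trims the result to 50 lines.
capture_pane_tail() {
  tmux capture-pane -p -t "$1" -S -50 2>/dev/null | tail -n 50
}

# $1 exit reason, $2 PR number (may be empty), $3 captured output
build_diagnostic_body() {
  printf '%s\n\n' "### Session blocked"
  printf '%s\n' "- Exit reason: $1"
  printf '%s\n' "- Timestamp: $(date -u +%Y-%m-%dT%H:%M:%SZ)"
  printf '%s\n' "- PR: ${2:-none}"
  printf '\nLast tmux output:\n\n%s\n' "$3"
}

# Usage (not executed here; the session name is made up):
#   body=$(build_diagnostic_body "idle_timeout" "123" "$(capture_pane_tail dev-session)")
```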
openhands
aa89e2b31e fix: move write_compact_context after create_agent_session in gardener-agent
The context file was written before the reset block that deleted it,
making compaction re-injection a no-op for gardener sessions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 23:35:34 +00:00
openhands
e3895ad3ac fix: feat: SessionStart compact hook re-injects phase protocol after context compaction (#274)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 23:27:32 +00:00
openhands
7199bbf9b5 fix: feat: agents flush context to scratch file before compaction (#262)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 20:12:45 +00:00
openhands
5bac4a8409 fix: extract lib/formula-session.sh to eliminate duplicate code blocks
Shared helpers for formula-driven cron agents: lock, memory guard,
formula loading, context building, session startup, crash recovery.

- planner-run.sh uses shared helpers instead of inline code
- gardener-agent.sh delegates crash recovery to formula_phase_callback
- agent-smoke.sh updated for renamed planner script + new lib file

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 13:53:33 +00:00
openhands
cc6a958245 fix: address review — guard grooming in gardener-poll.sh, doc fixes
- Add --recipes-only flag to gardener-poll.sh to skip grooming call when
  invoked by the formula's ci-escalation-recipes step (prevents double-run)
- Update formula step to pass --recipes-only
- Add lib/file-action-issue.sh to AGENTS.md shared helpers table
- Clarify TOML arg scope in gardener trigger description
- Fix log prefixes in gardener-run.sh (poll: → run:)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 13:02:33 +00:00
openhands
59b6d76afa fix: extract file_action_issue helper to eliminate duplicate code blocks
CI duplicate-detection flagged shared action-issue filing pattern between
gardener-run.sh and planner-poll.sh. Extract into lib/file-action-issue.sh
and refactor both scripts to use it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 12:49:08 +00:00
openhands
eb90a42095 fix: gardener runs as cron-driven formula — runtime wrapper (#246)
Add gardener-run.sh as a thin cron wrapper that files an action issue
referencing formulas/run-gardener.toml, following the same pattern as
planner-poll.sh. The action-agent picks up the issue and executes the
gardener formula steps in an interactive Claude session.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 12:44:02 +00:00
openhands
ac04dc29a6 fix: feat: PostToolUse hook detects phase file writes in real-time (eliminates polling latency) (#278)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 17:55:06 +00:00
openhands
45745d2bfd fix: gardener-poll escalation consumer does not handle idle_prompt reason (#268)
Widen the escalation dispatch pattern from `idle_timeout*` to also match
`idle_prompt*`. When an idle_prompt escalation arrives, the gardener now
creates an investigation sub-issue with a tailored description (session
returned to prompt without writing a phase signal) instead of silently
falling through to the recipe engine.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 12:55:16 +00:00
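
The widened dispatch from 45745d2bfd can be pictured as a `case` pattern with two alternatives. This is only a sketch: `classify_escalation` and its return strings are hypothetical, not the script's actual structure.

```shell
# Route an escalation reason to a handler name.
classify_escalation() {
  case "$1" in
    idle_timeout*|idle_prompt*) echo investigate ;;  # create investigation sub-issue
    *)                          echo recipe ;;       # fall through to recipe engine
  esac
}
```

Before the change, only the `idle_timeout*` alternative would have been present, so an `idle_prompt` reason landed in the default branch.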
johba
d5c2c213a3 fix: bug: gardener hangs forever when Claude finishes without writing phase file (#261) (#263)
Fixes #261

## Changes
Fixed gardener hanging forever when Claude skips the phase protocol. Three changes:
1. gardener-agent.sh: replaced the 999999s timeout with 7200s (2h, matching dev-agent).
2. lib/agent-session.sh: added idle-prompt detection to monitor_phase_loop. If Claude returns to the ❯ prompt for 3 consecutive polls with no phase file written, the loop exits immediately with _MONITOR_LOOP_EXIT=idle_prompt. This only fires when the phase file is empty, so awaiting_ci/review waits are unaffected.
3. gardener prompt: removed the 'no time limit' wording and replaced it with an explicit phase-write requirement.

Co-authored-by: openhands <openhands@all-hands.dev>
Reviewed-on: https://codeberg.org/johba/disinto/pulls/263
Reviewed-by: Disinto_bot <disinto_bot@noreply.codeberg.org>
2026-03-19 13:47:10 +01:00
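
The idle-prompt rule from d5c2c213a3 (three consecutive polls at the prompt with an empty phase file) amounts to a small counter. The sketch below assumes the caller has already decided whether the pane shows the prompt; `check_idle_prompt` and `IDLE_POLLS` are hypothetical names.

```shell
IDLE_POLLS=0

# $1: "yes" if the pane currently shows the prompt, anything else otherwise
# $2: path to the phase file
# Exit 0 once the idle condition has held for 3 consecutive polls.
check_idle_prompt() {
  if [ "$1" = "yes" ] && [ ! -s "$2" ]; then
    IDLE_POLLS=$((IDLE_POLLS + 1))
  else
    IDLE_POLLS=0   # any activity or a written phase resets the streak
  fi
  [ "$IDLE_POLLS" -ge 3 ]
}
```

Because `[ ! -s file ]` is true for a missing or empty file, a session waiting in a written phase such as awaiting_ci never accumulates idle polls.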
openhands
e853949b47 fix: Callbacks can't see the resolved _session from monitor_phase_loop (#200)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-19 01:05:21 +00:00
openhands
833b07ed6e fix: labels:["backlog"] passes string name to Codeberg API that expects integer IDs (#164)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 20:36:39 +00:00
openhands
7456af65e9 fix: feat: gardener formula — groom-backlog.toml with verify loop, remove timeouts (#183)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 18:42:30 +00:00
openhands
d83098f382 fix: pass SESSION_NAME to all agent-session.sh function calls
Library functions need explicit session name argument — they no longer
have closure over $SESSION_NAME from the parent script.

- agent_kill_session: add $SESSION_NAME to all 11 call sites
- agent_inject_into_session: add $SESSION_NAME to all call sites in
  phase-handler.sh and gardener-agent.sh
- agent_kill_session: guard against missing arg (defensive)
2026-03-18 16:24:58 +00:00
openhands
ae3e742f9f fix: rename function calls to match agent-session.sh exports (#176)
kill_tmux_session → agent_kill_session
inject_into_session → agent_inject_into_session
wait_for_claude_ready → agent_wait_for_claude_ready

Also restore status() function lost during #160 refactor.

Fixes dev-agent and gardener-agent crash on startup:
  line 149: status: command not found
  line 280: kill_tmux_session: command not found
2026-03-18 16:10:12 +00:00
johba
6d5cc4458f fix: feat: gardener-agent.sh — tmux + Claude interactive gardener using agent-session.sh (#159) (#163)
Fixes #159

## Changes
Add gardener-agent.sh (tmux+Claude) and lib/agent-session.sh (shared helpers). gardener-poll.sh slimmed to cron wrapper; grooming delegated to new agent; recipe engine for CI escalations unchanged.

Co-authored-by: openhands <openhands@all-hands.dev>
Reviewed-on: https://codeberg.org/johba/disinto/pulls/163
Reviewed-by: review_bot <review_bot@noreply.codeberg.org>
2026-03-18 16:21:07 +01:00
openhands
ff3e790f51 fix: remove head -10 cap and update tech-debt problem label (#151)
Remove the head -10 cap from TECH_DEBT_ISSUES so Claude sees all
tech-debt issues, not just the first 10. Apply a head -50 guard on
the list passed in PROBLEMS to avoid oversized prompts while still
feeding far more than the old cap. Update the problem label to drop
"max 10 per run" text which contradicted the zero-tech-debt objective.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 11:03:29 +00:00
openhands
716bea9d7c fix: gardener objective: zero tech-debt issues per run (#151)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 10:45:31 +00:00
openhands
90762d8de3 fix: address review feedback — CODEBERG_WEB unbound, title prefix, emoji
- Replace ${CODEBERG_WEB} with inline https://codeberg.org/${CODEBERG_REPO}
  to avoid unbound variable crash in gardener-poll.sh (set -euo pipefail)
- Change sub-issue title prefix from fix: to chore: since it's an
  investigation task, not a code fix
- Add emoji prefix to idle_timeout matrix notification for consistency

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-18 07:18:57 +00:00
openhands
88f2268bc6 fix: idle timeout does not escalate — session dies silently (#123)
1. Timeout handler (dev-agent.sh): write escalation to project-suffixed
   file, restore backlog label, clean up phase file on idle timeout.
2. Fix escalation file naming: escalations.jsonl → escalations-${PROJECT_NAME}.jsonl
   everywhere in dev-agent.sh so gardener actually picks them up.
3. Gardener (gardener-poll.sh): handle idle_timeout reason before CI-specific
   recipe logic — create investigation sub-issue instead of silently returning.
4. Update .gitignore to match new escalations-*.jsonl pattern.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-18 07:02:33 +00:00
openhands
d6e91b2466 fix: address review feedback — recipe engine robustness and correctness
- Bug: chicken-egg-ci create-per-file-issues was aliased to shellcheck-only
  function. Added generic playbook_lint_per_file() that handles any linter
  output format. Renamed action to lint-per-file.
- Bug: cascade-rebase fired retry-merge synchronously after async rebase.
  Removed retry-merge and re-approve from recipe — rebase settles, CI reruns,
  normal flow handles merge on subsequent cycle.
- Warning: jq calls on PR data lacked || true under set -euo pipefail. Fixed.
- Warning: playbook_rebase_pr and playbook_retrigger_ci incremented
  _PB_SUB_CREATED before confirming API success. Now check HTTP status code.
- Warning: Python import tomllib fails on < 3.11. Added try/except fallback
  to tomli package.
- Nit: failures_on_unchanged regex broadened to handle generic linter formats
  (file.sh:line:col patterns in addition to ShellCheck's "In file line N:").
- Info: match_recipe now logs Python stderr on error instead of silently
  falling back to generic recipe.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-18 03:05:09 +00:00
openhands
cb8a9bc6e5 fix: restore executable permission on gardener-poll.sh
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-18 02:53:08 +00:00
openhands
f293dd6269 fix: feat: gardener escalation recipes — pattern-matched playbooks for CI failures (#68)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-18 02:53:03 +00:00
openhands
a8d10931f6 fix: address review findings from issue #74
- Add dedup guard: skip dust entries for issues already in dust.jsonl
- Inject already-staged issue list into LLM prompt to prevent re-emission
- Guard mv after jq: only overwrite dust.jsonl if jq succeeded
- Use sort -nu for numeric dedup of issue numbers
- Compute bundle count from distinct issues, not raw entries
- Add 30-day TTL expiry for sub-threshold dust groups
- Fix inconsistent heading levels in bundle body (all ###)
- Add scope note to PROMPT.md (human docs only, not injected)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-18 01:41:14 +00:00
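
Two of the guards listed in a8d10931f6 can be sketched generically: numeric dedup with `sort -nu`, and replacing a file only when the rewriting command succeeded. `dedup_issue_numbers` and `rewrite_file` are hypothetical helpers; the real script runs jq against dust.jsonl, which this sketch takes as an arbitrary filter command.

```shell
# Numeric, unique sort of issue numbers given as arguments.
dedup_issue_numbers() {
  printf '%s\n' "$@" | sort -nu
}

# rewrite_file TARGET CMD [ARGS...]
# Runs CMD with TARGET on stdin and replaces TARGET only if CMD succeeds.
rewrite_file() {
  target=$1
  shift
  tmp="$target.tmp.$$"
  if "$@" < "$target" > "$tmp"; then
    mv "$tmp" "$target"
  else
    rm -f "$tmp"      # leave the original untouched on failure
    return 1
  fi
}

# Usage (not executed here; the jq filter is made up):
#   rewrite_file gardener/dust.jsonl jq -c 'select(.issue != 74)'
```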
openhands
530ce7319f fix: feat: gardener bundles dust into ore before promoting to backlog (#74)
- Add dust/ore rule to gardener LLM prompt: trivial tech-debt (comment
  fix, rename, style-only, single-line) outputs DUST: JSON instead of
  promoting individually
- Parse DUST lines from LLM output, validate JSON, append to dust.jsonl
  with timestamp
- After evaluation pass: check groups with 3+ items, create bundled
  backlog issue, close source issues with cross-reference
- Add gardener/dust.jsonl to .gitignore
- Create gardener/PROMPT.md documenting the dust vs ore philosophy

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-18 01:33:09 +00:00
openhands
63e60de9d6 fix: address round 2 review findings from issue #81
- Move atomic mv inside gardener loop so reply is only claimed when a
  matching needs_human session exists (fixes reply-loss regression)
- Delay rm of claimed file until after successful injection in both
  supervisor and gardener (OOM/SIGKILL leaves file recoverable)
- Fix matrix_listener ack message: 'next poll' instead of 'next supervisor poll'

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 22:59:05 +00:00
openhands
bfe0c09b5c fix: address review findings from issue #81
- Fix dev-agent.sh comment: gardener-poll.sh is the backup injector, not review-poll.sh
- Add renotify marker cleanup to gardener injection path
- Use atomic mv to claim reply file, preventing double-injection race between supervisor and gardener
- Add break after supervisor injection for symmetry with gardener
- Remove overly prescriptive PHASE:awaiting_ci hardcode from injection instructions

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 22:40:54 +00:00
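
The "atomic mv to claim" idea in bfe0c09b5c relies on a rename within one filesystem either succeeding for exactly one caller or failing because the source is gone. A minimal sketch, assuming a hypothetical `claim_reply` helper and a `.claimed.<pid>` suffix that the commit does not specify:

```shell
# Try to claim a reply file. On success, prints the claimed path.
# On failure (another process already renamed it), returns 1.
claim_reply() {
  claimed="$1.claimed.$$"
  mv "$1" "$claimed" 2>/dev/null || return 1
  printf '%s\n' "$claimed"
}

# Usage (not executed here; inject_reply and the path are made up):
#   if f=$(claim_reply "/tmp/reply-123"); then
#     inject_reply "$f" && rm -f "$f"   # delete only after successful injection
#   fi
```

Deleting the claimed file only after injection succeeds matches the follow-up fix in 63e60de9d6, where a kill mid-injection leaves the file recoverable.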
openhands
48683e508c fix: feat: supervisor-poll.sh and gardener-poll.sh inject human replies into needs_human dev sessions (#81)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 22:33:28 +00:00
openhands
df2522a7cb fix: address review findings from issue #67 escalation refactor
- supervisor: skip *.done.jsonl in escalation glob (bug: wildcard matched
  harb.done.jsonl producing spurious 'pending' log noise every cycle)
- supervisor: use wc -l instead of grep -c . for line counting (style nit)
- supervisor: consume gardener-esc-resolved.log via fixed() so escalation
  resolutions appear in end-of-cycle supervisor reporting
- dev-poll: update all 'escalated to supervisor' log/matrix strings to
  'escalated to gardener' (lines 263, 268, 344, 420)
- gardener: track _esc_total_created across all escalation entries and
  write count to supervisor/gardener-esc-resolved.log after processing

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 18:30:57 +00:00
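
The glob fix in df2522a7cb can be illustrated with a loop that skips `*.done.jsonl` and counts lines with `wc -l`. The `escalations-<project>.jsonl` naming follows the neighbouring commits; `count_pending` and the output format are hypothetical.

```shell
# Print "<file> <line count>" for each pending escalation file in dir $1.
count_pending() {
  for f in "$1"/escalations-*.jsonl; do
    [ -e "$f" ] || continue                        # glob matched nothing
    case "$f" in *.done.jsonl) continue ;; esac    # already-processed files
    n=$(wc -l < "$f" | tr -d ' ')                  # strip padding some wc builds add
    printf '%s %s\n' "$(basename "$f")" "$n"
  done
}
```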
openhands
150ede5605 fix: refactor: move escalation processing from supervisor to gardener (#67)
- dev-poll.sh: write escalations to per-project files
  (supervisor/escalations-{PROJECT_NAME}.jsonl) and add "project" field
  so each project's escalations are isolated; update is_escalated() to
  read from the same per-project paths
- gardener-poll.sh: add escalation processing block that reads the
  per-project escalation file, fetches CI logs via Woodpecker, and
  creates per-file ShellCheck sub-issues or generic CI failure issues
  labeled backlog — runs with the correct CODEBERG_API and
  WOODPECKER_REPO_ID already loaded from the project TOML
- supervisor-poll.sh: remove the escalation processing block; replace
  with a simple flog report counting pending escalations per project

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 17:32:56 +00:00
johba
9050413994 refactor: split supervisor into infra + per-project, make poll scripts config-driven
Supervisor split (#26):
- Layer 1 (infra): P0 memory, P1 disk, P4 housekeeping — runs once, project-agnostic
- Layer 2 (per-project): P2 CI/dev-agent, P3 PRs/deps — iterates projects/*.toml
- Adding a new project requires only a new TOML file, no code changes

Poll scripts accept project TOML arg (#27):
- dev-poll.sh, review-poll.sh, gardener-poll.sh accept optional project TOML as $1
- env.sh loads PROJECT_TOML if set, overriding .env defaults
- Cron: `dev-poll.sh projects/versi.toml` targets that project

New files:
- lib/load-project.sh: TOML to env var loader (Python tomllib)
- projects/versi.toml: current project config extracted from .env

Backwards compatible: scripts without a TOML arg fall back to .env config.

Closes #26, Closes #27

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 08:57:18 +01:00
openhands
2fd18b086d feat: gardener prioritizes blocker issues that starve the factory
Adds PRIORITY_blockers_starving_factory detection: scans backlog issues'
deps, finds any that are open but not labeled backlog, and puts them
at the top of the Claude prompt as highest priority for promotion.

Previously, gardener promoted random tech-debt issues while the actual
blockers (e.g. #650, #563, #714, #743) were ignored, leaving all
backlog items permanently stuck.
2026-03-16 21:06:01 +00:00
johba
acab6c95c8 feat: supervisor detects dep deadlocks, stale deps, and dev-agent blocked states
Add three new supervisor checks:
- P2c: alert when dev-agent reports "no ready issues" for 6+ consecutive polls
- P3b: detect circular dependency deadlocks via DFS cycle detection
- P3c: flag backlog issues blocked by deps open >30 days

Update supervisor PROMPT.md with guidance for Claude to resolve circular deps
by reading code context, and handle stale deps by checking relevance.

Gardener prompt now forbids bidirectional deps between sibling issues and
requires ## Related (not ## Dependencies) for cross-references.

Closes #16, Closes #17

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 21:07:02 +01:00
johba
1675e17502 fix: replace PRODUCT-TRUTH.md/ARCHITECTURE.md refs with AGENTS.md
These docs never existed — gardener and review-pr referenced them
as if they did. AGENTS.md tree is now the single architecture
reference for all agents.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-15 17:41:10 +01:00
johba
f215fbe3cf feat: add Matrix coordination channel, replace openclaw (Closes #8)
Add matrix_send() to lib/env.sh and matrix_listener.sh daemon for
real-time notifications, threaded escalations, and human-in-the-loop
replies. All agents now notify via Matrix instead of openclaw.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 16:25:33 +01:00
johba
90ef03a304 refactor: make all scripts multi-project via env vars
Replace hardcoded harb references across the entire codebase:
- HARB_REPO_ROOT → PROJECT_REPO_ROOT (with deprecated alias)
- Derive PROJECT_NAME from CODEBERG_REPO slug
- Add PRIMARY_BRANCH (master/main), WOODPECKER_REPO_ID env vars
- Parameterize worktree prefixes, docker container names, branch refs
- Genericize agent prompts (gardener, factory supervisor)
- Update best-practices docs to use $-vars, prefix harb lessons

All project-specific values now flow from .env → lib/env.sh → scripts.
Backward-compatible: existing harb setups work without .env changes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 13:49:09 +01:00
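
Deriving the project name from the repo slug and keeping the old variable as an alias, as 90ef03a304 describes, might look like this. It assumes the slug has the form `owner/repo`; `derive_project_name` and `resolve_repo_root` are hypothetical helpers, and the `$HOME/<name>` default is invented for the example.

```shell
# "owner/repo" -> "repo" (strip everything up to the last slash)
derive_project_name() {
  printf '%s\n' "${1##*/}"
}

# Prefer the new variable, fall back to the deprecated one, then a default.
# $1 new value, $2 deprecated value, $3 project name
resolve_repo_root() {
  printf '%s\n' "${1:-${2:-$HOME/$3}}"
}

# Usage (values are examples only):
#   PROJECT_NAME=$(derive_project_name "$CODEBERG_REPO")
#   PROJECT_REPO_ROOT=$(resolve_repo_root "${PROJECT_REPO_ROOT:-}" "${HARB_REPO_ROOT:-}" "$PROJECT_NAME")
#   HARB_REPO_ROOT="$PROJECT_REPO_ROOT"   # deprecated alias kept for old setups
```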
openhands
c3f24460a7 fix: gardener must ACTION or ESCALATE every tech-debt issue, never skip
Claude was silently skipping ambiguous issues instead of escalating.
Made output format mandatory and explicit: every issue in the list
must result in ACTION (promoted) or ESCALATE (needs decision).
2026-03-14 08:40:19 +00:00
openhands
793dafdb8a fix: gardener uses curl+CODEBERG_TOKEN instead of codeberg_api function
codeberg_api is a bash function in the gardener script's own process,
not available to claude-p's tool execution environment. Claude was
silently failing to call it and returning CLEAN.

Switch to curl commands with $CODEBERG_TOKEN env var that claude-p
can actually execute via its bash tool.
2026-03-13 22:35:30 +00:00
openhands
d137862813 fix: gardener tech-debt promotion not surfaced as problem
Tech-debt→backlog promotion was only in prompt text, not in the
problem list. Claude focused on detected problems (dupes, thin issues)
and printed CLEAN, ignoring the primary mission.

Fix: explicitly list up to 10 tech-debt issues in the problem list
so claude sees them as actionable items.

Also bumped --max-turns from 10 to 30 — promoting issues requires
reading + editing + relabeling via API, needs more turns.
2026-03-13 20:50:16 +00:00
openhands
cdbe668b0d security: gardener uses codeberg_api helper, never exposes tokens
Prompt now references codeberg_api function instead of raw curl+token.
Explicit instruction to never echo/log credentials.
2026-03-13 09:33:38 +00:00
openhands
4ce16f30dc feat: gardener primary mission — promote tech-debt to actionable backlog
Reads source files + repo docs to understand each issue, adds
acceptance criteria + affected files + deps, relabels backlog.
Max 10 per run. Escalates ambiguous scope with options.
2026-03-13 09:32:39 +00:00
openhands
4f34e2dd01 fix: gardener reads repo docs (PRODUCT-TRUTH, ARCHITECTURE, AGENTS) before making decisions
2026-03-13 09:30:49 +00:00
openhands
d9de5b3708 fix: gardener dupe detection — strip series prefixes (LLM seed, Push3 evolution)
2026-03-13 09:22:44 +00:00
openhands
174187f6a6 feat: issue gardener — daily backlog grooming agent
Bash pre-checks (zero tokens): duplicate titles, thin issues, stale
issues, missing deps. Then claude -p for analysis and action.

Escalates decisions in compact format:
  1. #123 "title" — reason (a) opt1 (b) opt2 (c) opt3

Cron: daily 07:00 UTC. Light touch — grooms, doesn't invent work.
2026-03-13 09:17:09 +00:00