disinto

Author	SHA1	Message	Date
openhands	a66bd91721	fix: Replace Codeberg dependency with local Forgejo instance (#611) - Add setup_forge() to bin/disinto: provisions Forgejo via Docker, creates admin + bot users (dev-bot, review-bot), generates API tokens, creates repo, and pushes code — all automated - Rename env vars: CODEBERG_TOKEN→FORGE_TOKEN, REVIEW_BOT_TOKEN→FORGE_REVIEW_TOKEN, CODEBERG_REPO→FORGE_REPO, CODEBERG_API→FORGE_API, CODEBERG_WEB→FORGE_WEB, CODEBERG_BOT_USERNAMES→FORGE_BOT_USERNAMES (with backwards-compat fallbacks) - Rename API helpers: codeberg_api()→forge_api(), codeberg_api_all()→forge_api_all() (with compat aliases) - Add forge_url field to project TOML; load-project.sh derives FORGE_API/FORGE_WEB from forge_url + repo - Update parse_repo_slug() to accept any host URL, not just codeberg - Forgejo data stored under ~/.disinto/forgejo/ (not in factory repo) - Update all 58 files: agent scripts, formulas, docs, site HTML Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-23 16:57:12 +00:00
openhands	7627aef1c0	fix: P4 sweep uses ${PROJECT_NAME} without fallback, unlike proj_name (#429) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-23 09:10:24 +00:00
openhands	fbe645a305	fix: BACKLOG_NUMS array in supervisor-poll.sh is never queried (#110) The BACKLOG_NUMS associative array was built to track which issue numbers are in the backlog, but the DFS cycle-detection code used NODE_COLOR as a membership guard instead. This meant deps pointing to non-backlog issues were only skipped by coincidence (they weren't in NODE_COLOR either). Three changes: - Remove SC2034 suppression since BACKLOG_NUMS is now actually queried - Initialize NODE_COLOR from BACKLOG_NUMS keys (all backlog issues) instead of DEPS_OF keys (only issues with dependencies), so every backlog issue gets a proper DFS color - Replace the NODE_COLOR membership check with BACKLOG_NUMS in the DFS, so the guard explicitly asks "is this dep a backlog issue?" rather than relying on NODE_COLOR initialization as a proxy Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-23 08:16:13 +00:00
openhands	c642ebf81d	fix: bundled dust cleanup — set-euo-pipefail (#516) Add missing `set -euo pipefail` to three scripts per AGENTS.md conventions: - lib/ci-helpers.sh - lib/parse-deps.sh - supervisor/supervisor-poll.sh Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-21 19:59:55 +00:00
openhands	5822dc89d9	fix: feat: unified escalation — single PHASE:escalate path for all agents (#510) Replace PHASE:needs_human with PHASE:escalate across all agent types. Consolidates 6 overlapping escalation mechanisms into one unified path: detect → notify via Matrix → session stays alive → human reply injected → resume. Key changes: - PHASE:escalate replaces PHASE:needs_human everywhere (16 files) - CI exhausted now escalates instead of immediately marking blocked - Matrix listener routes free-text replies to vault tmux sessions - Vault agent writes PHASE:escalate files for procurement requests - Supervisor monitors PHASE:escalate sessions in health checks - 24h timeout on escalation → blocked label + session killed - All 38 phase protocol tests updated and passing Supersedes #462, #458, #465.	2026-03-21 19:39:04 +00:00
openhands	ac13bf110c	fix: Status file is not per-project in multi-project setups (#423) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-21 16:20:07 +00:00
openhands	61c44d31b1	fix: refactor: replace escalation JSONL with blocked label + diagnostic comment (#352 ) Replace the unreliable escalation JSONL system (supervisor/escalations-.jsonl consumed by gardener) with direct blocked label + diagnostic comment on the original issue. When a dev-agent or action-agent session fails (PHASE:failed, idle timeout, crash, CI exhausted): - Capture last 50 lines from tmux pane via tmux capture-pane - Post a structured diagnostic comment on the issue (exit reason, timestamp, PR number, tmux output) - Label the issue "blocked" (instead of restoring "backlog") - Remove in-progress label Removed: - Escalation JSONL write paths in dev-agent.sh, phase-handler.sh, dev-poll.sh, action-agent.sh - is_escalated() helper in dev-poll.sh - Escalation triage (P2f section) in supervisor-poll.sh - Escalation processing + recipe engine in gardener-poll.sh - ci-escalation-recipes step from run-gardener.toml formula - escalations.jsonl from .gitignore Added: - post_blocked_diagnostic() shared helper in phase-handler.sh - ensure_blocked_label_id() helper (creates label via API if not exists) - is_blocked() helper in dev-poll.sh (replaces is_escalated) - Blocked issues listing in supervisor/preflight.sh Kept: - Matrix notifications on failure (unchanged) - CI fix counter logic (still tracks attempts) - needs_human injection in supervisor/gardener (not escalation-related) - Gardener grooming (gardener-agent.sh still invoked) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-21 04:18:43 +00:00
openhands	ec5c48ddf2	fix: P4 stale worktree sweep doesn't cover sup-retry-* worktrees (#253) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 19:45:21 +00:00
openhands	3cd047a7e0	fix: P2e and classify_pipeline_failure() use divergent infra heuristics (#251) Extract shared is_infra_step() in lib/ci-helpers.sh capturing the union of infra-detection heuristics from both P2e and classify_pipeline_failure(): - Clone/git step exit 128 (connection failure) - Any step exit 137 (OOM/signal 9) - Log-pattern matching (timeouts, connection failures) Update classify_pipeline_failure() to use is_infra_step() with log fetching and "any infra step" aggregation (matching P2e semantics). Simplify P2e to delegate to classify_pipeline_failure(). Update P2f caller for new output format ("infra <reason>"). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 19:19:29 +00:00
openhands	8fb6638589	fix: Stale lock path in existing dev-agent health check (line 373) (#242) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 18:44:59 +00:00
openhands	1c93267193	fix: clear P0/P1 alerts after early send to prevent duplicate Matrix messages After sending P0/P1 alerts immediately, reset the variables so they are excluded from the final consolidated ALL_ALERTS send at the end of the script. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-19 09:23:47 +00:00
openhands	5632138cc3	fix: bug: supervisor never delivers disk alerts — crashes during PR scan (#252) Send P0 and P1 alerts to Matrix immediately after detection, before per-project checks run. Also guard check_project calls with || flog so any API timeout or jq parse failure inside the per-project scan cannot kill the script before alert delivery. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-19 09:16:56 +00:00
openhands	97ffdca95c	fix: address review feedback on escalation triage (#185) - supervisor-poll.sh: check PR state before retrigger; discard stale escalations for closed/merged PRs instead of pushing to their branches - supervisor-poll.sh: bump escalation ts to now on failed retrigger push, so the 30-min cooldown resets and alert flooding is avoided on persistent failures - ci-helpers.sh: require at least one confirmed infra step before returning "infra"; prevents false-positive when all step names are empty strings - ci-helpers.sh: clarify header comment to distinguish per-function requirements - AGENTS.md: document classify_pipeline_failure() in ci-helpers.sh table row Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-19 09:03:03 +00:00
openhands	47eccdb8ae	fix: split case pattern so smoke test recognises ci_exhausted labels (#185) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-19 08:53:32 +00:00
openhands	051ff39144	fix: feat: supervisor auto-retriggers CI after infra-only exhaustion (#185) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-19 08:51:30 +00:00
openhands	1557c17e2f	fix: address review: phase guard, tmux failure safety, paginated PR lookup (#235) - Skip cleanup for sessions in needs_human/awaiting_ci/awaiting_review phases - On tmux display-message failure skip session instead of defaulting to epoch 0 - Use paginated PR lookups (page loop checking page size, not match count) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-19 08:11:51 +00:00
openhands	d982b4592f	fix: feat: supervisor cleans up orphaned tmux sessions + worktrees (#235) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-19 07:51:30 +00:00
openhands	8e600787c1	fix: ci_passed() still lives in dev/dev-poll.sh, not lib/ (#70) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-18 02:05:54 +00:00
openhands	bd02330b22	fix: shellcheck TODO has no enforcement — || true may never be removed (#71) - Fix SC2164: add || exit 1 to bare cd in update-prompt.sh - Fix SC2155: separate declare and assign in env.sh, supervisor-poll.sh, dev-agent.sh - Fix SC2034: inline suppression for vars used by sourced helpers - Remove unused `mergeable` declaration, rename unused loop var to `_w` - Remove || true from shellcheck CI step — failures are now blocking Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-18 01:53:02 +00:00
openhands	1c5d3e7bbd	fix: address review findings from issue #75 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-18 01:18:34 +00:00
openhands	57fdec9504	fix: feat: supervisor auto-retriggers infra CI failures (#75) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-18 01:08:35 +00:00
openhands	63e60de9d6	fix: address round 2 review findings from issue #81 - Move atomic mv inside gardener loop so reply is only claimed when a matching needs_human session exists (fixes reply-loss regression) - Delay rm of claimed file until after successful injection in both supervisor and gardener (OOM/SIGKILL leaves file recoverable) - Fix matrix_listener ack message: 'next poll' instead of 'next supervisor poll' Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-17 22:59:05 +00:00
openhands	bfe0c09b5c	fix: address review findings from issue #81 - Fix dev-agent.sh comment: gardener-poll.sh is the backup injector, not review-poll.sh - Add renotify marker cleanup to gardener injection path - Use atomic mv to claim reply file, preventing double-injection race between supervisor and gardener - Add break after supervisor injection for symmetry with gardener - Remove overly prescriptive PHASE:awaiting_ci hardcode from injection instructions Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-17 22:40:54 +00:00
openhands	48683e508c	fix: feat: supervisor-poll.sh and gardener-poll.sh inject human replies into needs_human dev sessions (#81) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-17 22:33:28 +00:00
openhands	df2522a7cb	fix: address review findings from issue #67 escalation refactor - supervisor: skip *.done.jsonl in escalation glob (bug: wildcard matched harb.done.jsonl producing spurious 'pending' log noise every cycle) - supervisor: use wc -l instead of grep -c . for line counting (style nit) - supervisor: consume gardener-esc-resolved.log via fixed() so escalation resolutions appear in end-of-cycle supervisor reporting - dev-poll: update all 'escalated to supervisor' log/matrix strings to 'escalated to gardener' (lines 263, 268, 344, 420) - gardener: track _esc_total_created across all escalation entries and write count to supervisor/gardener-esc-resolved.log after processing Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-17 18:30:57 +00:00
openhands	150ede5605	fix: refactor: move escalation processing from supervisor to gardener (#67 ) - dev-poll.sh: write escalations to per-project files (supervisor/escalations-{PROJECT_NAME}.jsonl) and add "project" field so each project's escalations are isolated; update is_escalated() to read from the same per-project paths - gardener-poll.sh: add escalation processing block that reads the per-project escalation file, fetches CI logs via Woodpecker, and creates per-file ShellCheck sub-issues or generic CI failure issues labeled backlog — runs with the correct CODEBERG_API and WOODPECKER_REPO_ID already loaded from the project TOML - supervisor-poll.sh: remove the escalation processing block; replace with a simple flog report counting pending escalations per project Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-17 17:32:56 +00:00
openhands	13bc948b1d	fix: address review findings for escalation race condition, SQL injection, and sc_codes scope - Race condition: mv escalations.jsonl to a PID-stamped snapshot before processing so concurrent dev-poll appends go to a fresh file; rm snapshot after loop — no entries are ever silently dropped - SQL injection: validate ESC_PR_SHA is a 40-char hex string before interpolating into the wpdb query - sc_codes scope: compute per-file from file_errors (already filtered to that file) instead of the entire step log; also switch grep to -F so dots in filenames are not treated as regex wildcards - step_pid validation: reject non-integer values from Woodpecker API before passing as CLI argument - Fallback body now distinguishes "CI logs unavailable" from "logs found but issue creation API calls failed" - ESC_GENERIC_FAIL: avoid leading blank line by using conditional separator and fix code-block opening newline - is_escalated(): remove dead esc_file/done_file locals; add Python-level int() guard so empty/non-numeric issue or pr values fail cleanly instead of producing a syntax error suppressed by 2>/dev/null Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-17 15:11:53 +00:00
openhands	d9520f48a6	fix: feat: supervisor breaks down escalated CI failures into sub-issues (#52 ) - supervisor-poll.sh: replace P3 escalation log with actionable sub-issue creation. For each entry in escalations.jsonl: fetch CI logs via woodpecker-cli, create one sub-issue per file for ShellCheck failures, one combined issue for other CI failures, or a fallback investigation issue if logs are unavailable. Move processed entries to escalations.done.jsonl and clear escalations.jsonl. - dev-poll.sh: add is_escalated() helper that checks both escalations.jsonl and escalations.done.jsonl; use it (alongside ci_fix_count >= 3) in all three CI-fix spawn paths so escalated PRs are skipped even if the ci-fixes tracker is reset. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-17 14:32:41 +00:00
openhands	567dc4bde0	fix: address review findings for supervisor metrics (#24 ) - planner: filter CI and dev metrics by project name to prevent cross-project pollution - planner: replace fragile awk JSONL filter with jq select() - supervisor: add codeberg_count_paginated() helper; replace hardcoded limit=50 dev-metric API calls with paginated counts so projects with >50 issues report accurate blocked-ratio data - supervisor: add 24h age filter to CI metric SQL query so stale pipelines are not re-emitted with a fresh timestamp - supervisor: replace fragile awk key-order-dependent JSON filter in rotate_metrics() with jq select(); add safety guard to prevent overwriting file with empty result on parse failure - supervisor: move mkdir -p for metrics dir to startup (once) instead of every emit_metric() call - supervisor: guard _RAM_TOTAL_MB against empty value in bash arithmetic Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-17 10:56:37 +01:00
openhands	53c1fea6ea	fix: feat: supervisor metrics logging for planner trend analysis (#24 ) - supervisor-poll.sh: append structured JSONL metrics on every poll - infra metric (ram_used_pct, disk_used_pct, swap_mb) after Layer 1 checks - ci metric (pipeline id, duration_min, status) per project via wpdb query - dev metric (issues_in_backlog, issues_blocked, pr_open) per project via Codeberg API - rotate_metrics() trims metrics/supervisor-metrics.jsonl to last 30 days on startup - planner-agent.sh: reads last 7 days of metrics before Phase 2 gap analysis - computes avg CI duration, success rate, RAM/disk utilization, blocked ratio - injects summary into gap analysis prompt as "Operational metrics" section - instructs planner to create optimization issues when metrics conflict with VISION.md Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-17 10:56:37 +01:00
johba	9050413994	refactor: split supervisor into infra + per-project, make poll scripts config-driven Supervisor split (#26): - Layer 1 (infra): P0 memory, P1 disk, P4 housekeeping — runs once, project-agnostic - Layer 2 (per-project): P2 CI/dev-agent, P3 PRs/deps — iterates projects/*.toml - Adding a new project requires only a new TOML file, no code changes Poll scripts accept project TOML arg (#27): - dev-poll.sh, review-poll.sh, gardener-poll.sh accept optional project TOML as $1 - env.sh loads PROJECT_TOML if set, overriding .env defaults - Cron: `dev-poll.sh projects/versi.toml` targets that project New files: - lib/load-project.sh: TOML to env var loader (Python tomllib) - projects/versi.toml: current project config extracted from .env Backwards compatible: scripts without a TOML arg fall back to .env config. Closes #26, Closes #27 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-17 08:57:18 +01:00
johba	98f0c40106	refactor: rewrite parse-deps.py as pure bash, remove only Python from repo Replace lib/parse-deps.py with lib/parse-deps.sh to keep the toolchain all-bash. Rewrite supervisor P3b cycle detection and P3c stale dep check as pure bash using associative arrays and DFS. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-16 21:22:53 +01:00
johba	6cf580c010	refactor: extract shared dep parser to lib/parse-deps.py (Closes #20) Single source of truth for dependency parsing, replacing three copies: - dev-poll.sh get_deps() now calls parse-deps.py - supervisor P3b/P3c import parse_deps() via importlib Supports stdin, argument, and --json modes for different callers. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-16 21:16:49 +01:00
johba	acab6c95c8	feat: supervisor detects dep deadlocks, stale deps, and dev-agent blocked states Add three new supervisor checks: - P2c: alert when dev-agent reports "no ready issues" for 6+ consecutive polls - P3b: detect circular dependency deadlocks via DFS cycle detection - P3c: flag backlog issues blocked by deps open >30 days Update supervisor PROMPT.md with guidance for Claude to resolve circular deps by reading code context, and handle stale deps by checking relevance. Gardener prompt now forbids bidirectional deps between sibling issues and requires ## Related (not ## Dependencies) for cross-references. Closes #16, Closes #17 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-16 21:07:02 +01:00
johba	77cb4c4643	refactor: rename factory/ → supervisor/, factory-poll → supervisor-poll The supervisor agent was confusingly named "factory" (same as the project). Rename directory, script, log, lock, status, and escalation files. Update all references across scripts and docs. FACTORY_ROOT env var unchanged (refers to project root, not agent). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-15 18:06:25 +01:00
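
Commits fbe645a305, 98f0c40106 and acab6c95c8 above describe supervisor cycle detection as a depth-first search over bash associative arrays, with BACKLOG_NUMS as the explicit membership guard. The following is only a rough sketch of that idea, not the code in supervisor-poll.sh: the array names come from the commit messages, while the sample data, the dfs function, the numeric colour encoding, and the CYCLE_FOUND flag are assumptions made for illustration. It needs bash 4 or later for associative arrays.

```bash
# Hypothetical sample data: issue number -> space-separated dependency numbers.
# Issue 99 is referenced as a dependency but is not a backlog issue.
declare -A BACKLOG_NUMS=( [1]=1 [2]=1 [3]=1 [4]=1 )
declare -A DEPS_OF=( [1]="2" [2]="3" [3]="1 99" [4]="2" )
declare -A NODE_COLOR=()
CYCLE_FOUND=0

# Colour every backlog issue as unvisited (0), including issues with no deps.
for n in "${!BACKLOG_NUMS[@]}"; do
  NODE_COLOR[$n]=0
done

# dfs NODE
# Colours (assumed encoding): 0 = unvisited, 1 = on the current path, 2 = done.
dfs() {
  local node=$1 dep
  NODE_COLOR[$node]=1
  for dep in ${DEPS_OF[$node]:-}; do
    # Explicit guard: "is this dep a backlog issue?" Skip it otherwise.
    [[ -n "${BACKLOG_NUMS[$dep]:-}" ]] || continue
    case "${NODE_COLOR[$dep]}" in
      1) CYCLE_FOUND=1; echo "cycle edge: #$node -> #$dep" ;;
      0) dfs "$dep" ;;
    esac
  done
  NODE_COLOR[$node]=2
}

for n in "${!BACKLOG_NUMS[@]}"; do
  if [[ "${NODE_COLOR[$n]}" == 0 ]]; then
    dfs "$n"
  fi
done
```

With this sample data the 1 → 2 → 3 → 1 loop is reported whichever issue the search starts from, and issue 99 never receives a colour because the guard rejects it before NODE_COLOR is consulted.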

35 commits
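
Commit 3cd047a7e0 lists three infra-detection heuristics shared through is_infra_step(): exit 128 on a clone/git step, exit 137 on any step, and log-pattern matching. A minimal sketch of such a helper might look like the following. The argument order, the reason strings, and the specific log patterns are assumptions for illustration; the actual signature and patterns in lib/ci-helpers.sh are not shown in the log.

```bash
# is_infra_step STEP_NAME EXIT_CODE [LOG_TEXT]   (hypothetical signature)
# Prints a short reason and returns 0 when the failure looks infra-related;
# returns 1 otherwise.
is_infra_step() {
  local name=$1 code=$2 log=${3:-}

  # Any step killed with signal 9 (128 + 9), typically the OOM killer.
  if [[ "$code" == 137 ]]; then
    echo "oom_or_sigkill"
    return 0
  fi

  # Clone/git step failing with exit 128 (connection failure).
  if [[ "$code" == 128 && "$name" =~ ^(clone|git) ]]; then
    echo "clone_connection_failure"
    return 0
  fi

  # Assumed log patterns for timeouts and connection failures.
  if grep -qiE 'timed out|timeout|connection (refused|reset)' <<<"$log"; then
    echo "log_pattern"
    return 0
  fi

  return 1
}
```

A caller in the style of classify_pipeline_failure() could then report "infra <reason>" when at least one step satisfies the check, for example `reason=$(is_infra_step "$step" "$exit_code" "$log") && echo "infra $reason"`.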