- dev-poll.sh: write escalations to per-project files
(supervisor/escalations-{PROJECT_NAME}.jsonl) and add "project" field
so each project's escalations are isolated; update is_escalated() to
read from the same per-project paths
- gardener-poll.sh: add escalation processing block that reads the
per-project escalation file, fetches CI logs via Woodpecker, and
creates per-file ShellCheck sub-issues or generic CI failure issues
labeled backlog — runs with the correct CODEBERG_API and
WOODPECKER_REPO_ID already loaded from the project TOML
- supervisor-poll.sh: remove the escalation processing block; replace
with a simple flog report counting pending escalations per project
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- upgrade-dependency.toml: fix forge upgrade command (forge update, not
forge install); remove redundant `npm install` after lockfile write;
simplify description to "Upgrade {{package}} to {{to_version}}" so it
reads cleanly when from_version is omitted
- add-rpc-method.toml: remove dead `namespace` variable; inline namespace
derivation logic into register-method step description
- BOOTSTRAP.md: mark formula label entry as requiring feat/formula merge;
add YAML front matter example so operators know the issue schema
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
ShellCheck finds real issues in existing code. Making it blocking
means the CI pipeline PR can't pass its own CI (chicken-and-egg).
Report warnings but don't fail — fix them incrementally via backlog.
- Fix Project B comment: 'dev +11' → 'dev +1' (cron offset is :01, not :11)
- Fix gap claim: cross-project gaps are 2 min, not 3
- Update prose to say '2 minutes' instead of '3-minute offsets'
- Add gardener-poll.sh to the list of scripts accepting a TOML arg
- Add per-project gardener cron lines to the multi-project example
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The wait-for-CI loop sleeps 30s × 60 iterations waiting for CI
to report. Projects with WOODPECKER_REPO_ID=0 never get a status,
so the agent times out after 30min without merging approved PRs.
Now detects no-CI early and treats as success immediately.
- Race condition: mv escalations.jsonl to a PID-stamped snapshot before
processing so concurrent dev-poll appends go to a fresh file; rm snapshot
after loop — no entries are ever silently dropped
- SQL injection: validate ESC_PR_SHA is a 40-char hex string before
interpolating into the wpdb query
- sc_codes scope: compute per-file from file_errors (already filtered to
that file) instead of the entire step log; also switch grep to -F so
dots in filenames are not treated as regex wildcards
- step_pid validation: reject non-integer values from Woodpecker API before
passing as CLI argument
- Fallback body now distinguishes "CI logs unavailable" from "logs found
but issue creation API calls failed"
- ESC_GENERIC_FAIL: avoid leading blank line by using conditional separator
and fix code-block opening newline
- is_escalated(): remove dead esc_file/done_file locals; add Python-level
int() guard so empty/non-numeric issue or pr values fail cleanly instead
of producing a syntax error suppressed by 2>/dev/null
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- supervisor-poll.sh: replace P3 escalation log with actionable sub-issue creation.
For each entry in escalations.jsonl: fetch CI logs via woodpecker-cli, create one
sub-issue per file for ShellCheck failures, one combined issue for other CI failures,
or a fallback investigation issue if logs are unavailable. Move processed entries to
escalations.done.jsonl and clear escalations.jsonl.
- dev-poll.sh: add is_escalated() helper that checks both escalations.jsonl and
escalations.done.jsonl; use it (alongside ci_fix_count >= 3) in all three CI-fix
spawn paths so escalated PRs are skipped even if the ci-fixes tracker is reset.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Two bugs after #53 merged:
1. Escalation written every poll cycle (4 entries in 30min) — now writes once, bumps counter to 4 to skip
2. Exit after escalation blocked backlog work — now falls through to pick up next issue
Co-authored-by: openhands <openhands@all-hands.dev>
Reviewed-on: https://codeberg.org/johba/disinto/pulls/59
Reviewed-by: review_bot <review_bot@noreply.codeberg.org>
Dev-poll spawned a fresh agent every 10min for CI failures. Each agent started with CI_FIX_COUNT=0 — infinite loop.
Now tracks attempts per PR in `/tmp/dev-poll-ci-fixes-{project}.json`. After 3 failed rounds:
- Writes escalation to `supervisor/escalations.jsonl`
- Sends Matrix alert
- Stops respawning
Part of #52 (supervisor escalation pipeline).
Co-authored-by: openhands <openhands@all-hands.dev>
Reviewed-on: https://codeberg.org/johba/disinto/pulls/53
Reviewed-by: review_bot <review_bot@noreply.codeberg.org>
- Fix anti-pattern regex 2 to match quoted form '"$CI_STATE" != "success"'
(was r'\$CI_STATE\s*!=\s*"success"', now r'"?\$CI_STATE"?\s*!=\s*"success"')
- Update both anti-pattern messages to say 'extract ci_passed() to lib/'
instead of implying it already exists as a shared helper in dev-poll.sh
- Add explicit 'when: event: [push, pull_request]' trigger block to ci.yml
- Add '-r' to xargs in shellcheck step to handle zero .sh files gracefully
- Fix operator precedence bug in review-poll.sh:62: scope the OR clause
with braces so CI_STATE=pending bypass only applies when WOODPECKER_REPO_ID=0
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- planner: filter CI and dev metrics by project name to prevent cross-project pollution
- planner: replace fragile awk JSONL filter with jq select()
- supervisor: add codeberg_count_paginated() helper; replace hardcoded limit=50 dev-metric API calls with paginated counts so projects with >50 issues report accurate blocked-ratio data
- supervisor: add 24h age filter to CI metric SQL query so stale pipelines are not re-emitted with a fresh timestamp
- supervisor: replace fragile awk key-order-dependent JSON filter in rotate_metrics() with jq select(); add safety guard to prevent overwriting file with empty result on parse failure
- supervisor: move mkdir -p for metrics dir to startup (once) instead of every emit_metric() call
- supervisor: guard _RAM_TOTAL_MB against empty value in bash arithmetic
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- supervisor-poll.sh: append structured JSONL metrics on every poll
- infra metric (ram_used_pct, disk_used_pct, swap_mb) after Layer 1 checks
- ci metric (pipeline id, duration_min, status) per project via wpdb query
- dev metric (issues_in_backlog, issues_blocked, pr_open) per project via Codeberg API
- rotate_metrics() trims metrics/supervisor-metrics.jsonl to last 30 days on startup
- planner-agent.sh: reads last 7 days of metrics before Phase 2 gap analysis
- computes avg CI duration, success rate, RAM/disk utilization, blocked ratio
- injects summary into gap analysis prompt as "Operational metrics" section
- instructs planner to create optimization issues when metrics conflict with VISION.md
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
dev-poll.sh had 5 places checking CI_STATE='success', all blocking
projects without CI. Extracted ci_passed() helper that treats
empty/pending/unknown as pass when WOODPECKER_REPO_ID=0.
Don't start new issues while open PRs are waiting for review/CI.
This prevents dev-agent from churning through backlog issues
without reviews landing first.
Projects with woodpecker_repo_id=0 (like disinto) have no CI status.
Review-poll treated empty CI state as failure and skipped all PRs.
Now treats empty/pending CI as pass when no CI is configured.
TMPDIR is not guaranteed to be set. Replaced with /tmp/ directly.
This caused harb dev-agent to crash when posting refusal comments,
leaving issues stuck in a retry loop.
- Add RESOURCES.example.md: committed template showing Compute/Domains/Accounts/Budget structure
- Gitignore RESOURCES.md so local infrastructure data is never committed
- Planner phase 2 reads RESOURCES.md from factory root when present
- Planner prompt instructs Claude to reference specific resource aliases in operational issues
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Hardcoded /tmp/dev-agent.lock meant harb and disinto dev-polls shared
a lock — one project's running agent blocked the other. Now uses
/tmp/dev-agent-{project}.lock and dev-agent-{project}.log.
Implements the vault subsystem: a JSONL queue and gate agent that sits
between agent output and irreversible external actions (emails, posts,
API calls, charges).
New files:
- vault/vault-poll.sh: cron entry (*/30), three phases: retry approved,
timeout escalations (48h), invoke vault-agent for new pending actions
- vault/vault-agent.sh: claude -p wrapper that classifies and routes
actions based on risk × reversibility routing table
- vault/vault-fire.sh: two-phase dispatcher (pending→approved→fired)
with per-action locking and webhook-call handler
- vault/vault-reject.sh: moves actions to rejected/ with reason + timestamp
- vault/PROMPT.md: vault-agent system prompt with routing table
Modified:
- lib/matrix_listener.sh: new vault dispatch branch for APPROVE/REJECT
replies to escalation threads
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The broad regex `(?:^|\n)\s*-\s*#\K[0-9]+` matched ANY bullet with #NNN,
including ## Related sections. This caused #893 (and likely others) to be
permanently blocked by sibling issues that aren't actual dependencies.
Now only extracts deps from:
- Inline 'depends on #NNN' / 'blocked by #NNN' phrases
- ## Dependencies / ## Depends on / ## Blocked by sections
This matches the same logic used by dev-poll.sh get_deps().