- upgrade-dependency.toml: fix forge upgrade command (forge update, not
forge install); remove redundant `npm install` after lockfile write;
simplify description to "Upgrade {{package}} to {{to_version}}" so it
reads cleanly when from_version is omitted
- add-rpc-method.toml: remove dead `namespace` variable; inline namespace
derivation logic into register-method step description
- BOOTSTRAP.md: mark formula label entry as requiring feat/formula merge;
add YAML front matter example so operators know the issue schema
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fix Project B comment: 'dev +11' → 'dev +1' (cron offset is :01, not :11)
- Fix gap claim: cross-project gaps are 2 min, not 3
- Update prose to say '2 minutes' instead of '3-minute offsets'
- Add gardener-poll.sh to the list of scripts accepting a TOML arg
- Add per-project gardener cron lines to the multi-project example
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The wait-for-CI loop sleeps 30s × 60 iterations waiting for CI
to report. Projects with WOODPECKER_REPO_ID=0 never get a status,
so the agent times out after 30min without merging approved PRs.
Now detects the no-CI case early and treats it as success immediately.
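A minimal sketch of the early no-CI exit described above, not the actual
dev-agent code. The names wait_for_ci, get_ci_state and FAKE_CI_STATE are
hypothetical, and the state strings (success, failure, error) are assumed.

```bash
# Hypothetical stand-in for the real status lookup; prints the current CI state.
get_ci_state() { printf '%s' "${FAKE_CI_STATE:-}"; }

# $1 = Woodpecker repo id, $2 = max iterations, $3 = seconds between polls
wait_for_ci() {
    local repo_id="$1" max="$2" delay="$3" i=0 state
    # No CI configured: a status will never arrive, so do not wait for one.
    if [ "$repo_id" = "0" ]; then
        echo "no-ci"
        return 0
    fi
    while [ "$i" -lt "$max" ]; do
        state=$(get_ci_state)
        case "$state" in
            success) echo "success"; return 0 ;;
            failure|error) echo "$state"; return 1 ;;
        esac
        sleep "$delay"
        i=$((i + 1))
    done
    echo "timeout"
    return 2
}
```

With the values from the message, `wait_for_ci "$WOODPECKER_REPO_ID" 60 30`
would return at once for repo id 0 instead of sleeping for 30 minutes.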
- Race condition: mv escalations.jsonl to a PID-stamped snapshot before
processing so concurrent dev-poll appends go to a fresh file; rm snapshot
after loop — no entries are ever silently dropped
- SQL injection: validate ESC_PR_SHA is a 40-char hex string before
interpolating into the wpdb query
- sc_codes scope: compute per-file from file_errors (already filtered to
that file) instead of the entire step log; also switch grep to -F so
dots in filenames are not treated as regex wildcards
- step_pid validation: reject non-integer values from Woodpecker API before
passing as CLI argument
- Fallback body now distinguishes "CI logs unavailable" from "logs found
but issue creation API calls failed"
- ESC_GENERIC_FAIL: avoid leading blank line by using conditional separator
and fix code-block opening newline
- is_escalated(): remove dead esc_file/done_file locals; add Python-level
int() guard so empty/non-numeric issue or pr values fail cleanly instead
of producing a syntax error suppressed by 2>/dev/null
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
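Two of the fixes above, sketched under assumptions: the snapshot-before-
processing pattern and the input validation. The function names and the
snapshot file suffix are hypothetical; the real supervisor-poll.sh may
differ in both.

```bash
# Move the live queue aside so concurrent appends land in a fresh file.
# $1 = queue file, $2 = handler command called once per non-empty line.
process_queue() {
    local queue="$1" handler="$2" snapshot="$1.$$.snapshot" line
    [ -s "$queue" ] || return 0
    mv "$queue" "$snapshot" || return 1
    while IFS= read -r line; do
        if [ -n "$line" ]; then "$handler" "$line"; fi
    done < "$snapshot"
    rm -f "$snapshot"
}

# Succeeds only for a 40-character hex string (a full commit SHA).
is_sha40() {
    case "$1" in
        *[!0-9a-fA-F]*) return 1 ;;
    esac
    [ "${#1}" -eq 40 ]
}

# Succeeds only for a non-empty string of decimal digits.
is_uint() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
    esac
    return 0
}
```

A caller would check `is_sha40 "$ESC_PR_SHA"` before building the query and
`is_uint "$step_pid"` before passing the value on the command line.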
- supervisor-poll.sh: replace P3 escalation log with actionable sub-issue creation.
For each entry in escalations.jsonl: fetch CI logs via woodpecker-cli, create one
sub-issue per file for ShellCheck failures, one combined issue for other CI failures,
or a fallback investigation issue if logs are unavailable. Move processed entries to
escalations.done.jsonl and clear escalations.jsonl.
- dev-poll.sh: add is_escalated() helper that checks both escalations.jsonl and
escalations.done.jsonl; use it (alongside ci_fix_count >= 3) in all three CI-fix
spawn paths so escalated PRs are skipped even if the ci-fixes tracker is reset.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
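A rough sketch of the two-file lookup described above. It uses grep in
place of the real helper's parsing, and it assumes each JSONL entry carries
a numeric "pr" field; both are assumptions, as is the directory argument.

```bash
# $1 = PR number, $2 = directory holding the two JSONL files
is_escalated() {
    local pr="$1" dir="$2" file
    # Reject empty or non-numeric input before it reaches the pattern.
    case "$pr" in
        ''|*[!0-9]*) return 1 ;;
    esac
    for file in "$dir/escalations.jsonl" "$dir/escalations.done.jsonl"; do
        [ -f "$file" ] || continue
        # The trailing group keeps PR 4 from matching an entry for PR 42.
        if grep -Eq "\"pr\"[[:space:]]*:[[:space:]]*$pr([^0-9]|\$)" "$file"; then
            return 0
        fi
    done
    return 1
}
```

Checking the done file as well is what keeps an already-processed PR
skipped after the supervisor has cleared escalations.jsonl.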
Two bugs after #53 merged:
1. Escalation written every poll cycle (4 entries in 30min) — now writes once, bumps counter to 4 to skip
2. Exit after escalation blocked backlog work — now falls through to pick up next issue
Co-authored-by: openhands <openhands@all-hands.dev>
Reviewed-on: https://codeberg.org/johba/disinto/pulls/59
Reviewed-by: review_bot <review_bot@noreply.codeberg.org>
Dev-poll spawned a fresh agent every 10min for CI failures. Each agent started with CI_FIX_COUNT=0, so the attempt limit was never reached and the respawning never stopped.
Now tracks attempts per PR in `/tmp/dev-poll-ci-fixes-{project}.json`. After 3 failed rounds:
- Writes escalation to `supervisor/escalations.jsonl`
- Sends Matrix alert
- Stops respawning
Part of #52 (supervisor escalation pipeline).
Co-authored-by: openhands <openhands@all-hands.dev>
Reviewed-on: https://codeberg.org/johba/disinto/pulls/53
Reviewed-by: review_bot <review_bot@noreply.codeberg.org>
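The per-PR attempt tracking above can be sketched as follows. This is a
simplified stand-in: it keeps "pr count" lines in a flat text file rather
than the JSON tracker named in the message, and all function names are
hypothetical.

```bash
MAX_CI_FIX_ROUNDS=3

# Print the recorded attempt count for PR $2 in tracker file $1 (0 if absent).
ci_fix_count() {
    [ -f "$1" ] || { echo 0; return 0; }
    awk -v pr="$2" '$1 == pr { n = $2 } END { print n + 0 }' "$1"
}

# Increment the count for PR $2; write to a temp file, then mv into place.
bump_ci_fix_count() {
    local next tmp="$1.tmp.$$"
    next=$(( $(ci_fix_count "$1" "$2") + 1 ))
    {
        if [ -f "$1" ]; then awk -v pr="$2" '$1 != pr' "$1"; fi
        echo "$2 $next"
    } > "$tmp"
    mv "$tmp" "$1"
}

# Decide what the poller does for PR $2: prints "spawn" or "escalate".
next_action() {
    if [ "$(ci_fix_count "$1" "$2")" -ge "$MAX_CI_FIX_ROUNDS" ]; then
        echo escalate
    else
        echo spawn
    fi
}
```

Because the count lives in a file rather than in the agent's environment,
it survives across the fresh agent processes that each poll spawns.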
- planner: filter CI and dev metrics by project name to prevent cross-project pollution
- planner: replace fragile awk JSONL filter with jq select()
- supervisor: add codeberg_count_paginated() helper; replace hardcoded limit=50 dev-metric API calls with paginated counts so projects with >50 issues report accurate blocked-ratio data
- supervisor: add 24h age filter to CI metric SQL query so stale pipelines are not re-emitted with a fresh timestamp
- supervisor: replace fragile awk key-order-dependent JSON filter in rotate_metrics() with jq select(); add safety guard to prevent overwriting file with empty result on parse failure
- supervisor: move mkdir -p for metrics dir to startup (once) instead of every emit_metric() call
- supervisor: guard _RAM_TOTAL_MB against empty value in bash arithmetic
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
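A sketch of the jq-based filtering and the overwrite guard described above.
The field names .ts (assumed to be epoch seconds) and .project are
assumptions about the metrics schema, and the function names are
hypothetical.

```bash
# Keep only entries for one project; reads JSONL on stdin.
filter_project() {
    jq -c --arg p "$1" 'select(.project == $p)'
}

# Trim file $1 to entries with .ts >= $2. The original file is replaced
# only if jq succeeded and produced output; otherwise it is left untouched.
rotate_jsonl() {
    local file="$1" cutoff="$2" tmp="$1.tmp.$$"
    [ -f "$file" ] || return 0
    if jq -c --argjson cutoff "$cutoff" 'select(.ts >= $cutoff)' "$file" > "$tmp" \
        && [ -s "$tmp" ]; then
        mv "$tmp" "$file"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

For a 30-day window the cutoff could be computed as
`$(( $(date +%s) - 30 * 86400 ))`. Unlike an awk match on raw text, jq
select() does not depend on the order of keys within each line.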
- supervisor-poll.sh: append structured JSONL metrics on every poll
- infra metric (ram_used_pct, disk_used_pct, swap_mb) after Layer 1 checks
- ci metric (pipeline id, duration_min, status) per project via wpdb query
- dev metric (issues_in_backlog, issues_blocked, pr_open) per project via Codeberg API
- rotate_metrics() trims metrics/supervisor-metrics.jsonl to last 30 days on startup
- planner-agent.sh: reads last 7 days of metrics before Phase 2 gap analysis
- computes avg CI duration, success rate, RAM/disk utilization, blocked ratio
- injects summary into gap analysis prompt as "Operational metrics" section
- instructs planner to create optimization issues when metrics conflict with VISION.md
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
dev-poll.sh had 5 places checking CI_STATE='success', all blocking
projects without CI. Extracted ci_passed() helper that treats
empty/pending/unknown as pass when WOODPECKER_REPO_ID=0.
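One possible shape for the helper described above, as a sketch rather than
the code in dev-poll.sh. Passing the state as an argument is an assumption,
and an unset WOODPECKER_REPO_ID is treated here as "CI is configured".

```bash
# $1 = CI state reported for the PR (may be empty)
ci_passed() {
    if [ "$1" = "success" ]; then
        return 0
    fi
    # Without CI there is never a status, so absence of one is not a failure.
    if [ "${WOODPECKER_REPO_ID:-}" = "0" ]; then
        case "$1" in
            ''|pending|unknown) return 0 ;;
        esac
    fi
    return 1
}
```

Each former `CI_STATE='success'` comparison then becomes a call such as
`ci_passed "$CI_STATE"`, so the no-CI rule lives in one place.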
Don't start new issues while open PRs are waiting for review/CI.
This prevents dev-agent from churning through backlog issues
without reviews landing first.
Projects with woodpecker_repo_id=0 (like disinto) have no CI status.
Review-poll treated empty CI state as failure and skipped all PRs.
Now treats empty/pending CI as pass when no CI is configured.
TMPDIR is not guaranteed to be set. When it was unset, the harb
dev-agent crashed while posting refusal comments, leaving issues stuck
in a retry loop. Paths now use /tmp/ directly.
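For comparison, an alternative to hard-coding the path is a default
expansion. This is not what the change above does; tmp_base is a
hypothetical helper shown only to illustrate the failure mode.

```bash
# Print a usable temp directory even when TMPDIR is unset or empty.
tmp_base() {
    local base="${TMPDIR:-/tmp}"
    # Drop one trailing slash (some systems set TMPDIR with one), but keep "/".
    [ "$base" = "/" ] || base="${base%/}"
    printf '%s\n' "$base"
}
```

Under `set -u`, a bare `$TMPDIR` aborts the script when the variable is
unset, whereas `${TMPDIR:-/tmp}` expands safely in both cases.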
- Add RESOURCES.example.md: committed template showing Compute/Domains/Accounts/Budget structure
- Gitignore RESOURCES.md so local infrastructure data is never committed
- Planner phase 2 reads RESOURCES.md from factory root when present
- Planner prompt instructs Claude to reference specific resource aliases in operational issues
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Hardcoded /tmp/dev-agent.lock meant harb and disinto dev-polls shared
a lock — one project's running agent blocked the other. Now uses
/tmp/dev-agent-{project}.lock and dev-agent-{project}.log.
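A sketch of per-project locking along the lines described above. The
noclobber technique, the LOCK_DIR override and the function names are
assumptions; the real scripts may take the lock differently.

```bash
# Lock file path for project $1; LOCK_DIR is a hypothetical test override.
lock_path() {
    printf '%s/dev-agent-%s.lock' "${LOCK_DIR:-/tmp}" "$1"
}

# Succeeds only if no lock exists yet for this project.
acquire_lock() {
    # With noclobber (set -C) the redirect fails if the file already exists.
    ( set -C; echo "$$" > "$(lock_path "$1")" ) 2>/dev/null
}

release_lock() {
    rm -f "$(lock_path "$1")"
}
```

With the project name in the path, a running harb agent holds only
dev-agent-harb.lock, and `acquire_lock disinto` still succeeds.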
Implements the vault subsystem: a JSONL queue and gate agent that sits
between agent output and irreversible external actions (emails, posts,
API calls, charges).
New files:
- vault/vault-poll.sh: cron entry (*/30), three phases: retry approved,
timeout escalations (48h), invoke vault-agent for new pending actions
- vault/vault-agent.sh: claude -p wrapper that classifies and routes
actions based on risk × reversibility routing table
- vault/vault-fire.sh: two-phase dispatcher (pending→approved→fired)
with per-action locking and webhook-call handler
- vault/vault-reject.sh: moves actions to rejected/ with reason + timestamp
- vault/PROMPT.md: vault-agent system prompt with routing table
Modified:
- lib/matrix_listener.sh: new vault dispatch branch for APPROVE/REJECT
replies to escalation threads
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
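The pending→approved→fired flow with per-action locking might look roughly
like this. The directory layout (pending/, approved/, fired/, locks/), the
.json file naming and the return codes are all assumptions made for the
sketch, not taken from vault-fire.sh.

```bash
# Phase 1: $1 = vault directory, $2 = action id
approve_action() {
    mkdir -p "$1/approved" && mv "$1/pending/$2.json" "$1/approved/$2.json"
}

# Phase 2: $1 = vault directory, $2 = action id, $3 = handler command
# Returns 1 if not approved, 2 if locked by another caller, 3 if the handler failed.
fire_action() {
    local src="$1/approved/$2.json" lock="$1/locks/$2.lock" rc
    [ -f "$src" ] || return 1
    mkdir -p "$1/locks" "$1/fired"
    # mkdir is atomic: only one caller can create the lock directory.
    mkdir "$lock" 2>/dev/null || return 2
    if "$3" "$src"; then
        mv "$src" "$1/fired/$2.json"
        rc=0
    else
        rc=3   # the action stays in approved/ so a later poll can retry it
    fi
    rmdir "$lock"
    return "$rc"
}
```

Keeping each state as a directory makes every transition a single mv, so
an action is never visible in two states at once.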
The broad regex `(?:^|\n)\s*-\s*#\K[0-9]+` matched ANY bullet with #NNN,
including ## Related sections. This caused #893 (and likely others) to be
permanently blocked by sibling issues that aren't actual dependencies.
Now only extracts deps from:
- Inline 'depends on #NNN' / 'blocked by #NNN' phrases
- ## Dependencies / ## Depends on / ## Blocked by sections
This matches the same logic used by dev-poll.sh get_deps().
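The narrowed extraction rules can be illustrated with a short awk-based
sketch. It is an approximation written for this note, not the logic from
get_deps(); the function name parse_deps and the case-insensitive matching
are assumptions.

```bash
# Reads an issue body on stdin; prints dependency issue numbers, one per line.
parse_deps() {
    awk '
    {
        line = tolower($0)
        # A heading starts a new section; note whether it is a dependency one.
        if (line ~ /^##+[ \t]/) {
            in_deps = (line ~ /^##+[ \t]+(dependencies|depends on|blocked by)[ \t]*$/)
        }
        rest = line
        if (in_deps) {
            # Inside a dependency section every #NNN counts.
            while (match(rest, /#[0-9]+/)) {
                print substr(rest, RSTART + 1, RLENGTH - 1)
                rest = substr(rest, RSTART + RLENGTH)
            }
        } else {
            # Elsewhere only explicit inline phrases count.
            while (match(rest, /(depends on|blocked by)[ \t]+#[0-9]+/)) {
                m = substr(rest, RSTART, RLENGTH)
                sub(/^[^#]*#/, "", m)
                print m
                rest = substr(rest, RSTART + RLENGTH)
            }
        }
    }' | sort -un
}
```

A bullet such as `- #893` under `## Related` produces no output here,
which is the case the broad regex got wrong.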
Adds PRIORITY_blockers_starving_factory detection: scans backlog issues'
deps, finds any that are open but not labeled backlog, and puts them
at the top of the Claude prompt as highest priority for promotion.
Previously, gardener promoted random tech-debt issues while the actual
blockers (e.g. #650, #563, #714, #743) were ignored, leaving all
backlog items permanently stuck.
Replace lib/parse-deps.py with lib/parse-deps.sh to keep the toolchain
all-bash. Rewrite supervisor P3b cycle detection and P3c stale dep check
as pure bash using associative arrays and DFS.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
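As an illustration of cycle detection with associative arrays and DFS in
pure bash (4.0 or later), here is a minimal sketch. The array layout and
function names are assumptions and are not taken from the supervisor code.

```bash
# deps[A]="B C" means issue A depends on issues B and C.
declare -A deps=() state=()

# state: unset/0 = unvisited, 1 = on the current path, 2 = fully explored.
# Prints the first cycle found and returns 0; returns 1 if none is reachable.
dfs() {
    local node="$1" path="$2" next
    state[$node]=1
    # Unquoted on purpose: the value is a space-separated list of issue ids.
    for next in ${deps[$node]:-}; do
        if [[ ${state[$next]:-0} -eq 1 ]]; then
            echo "cycle: $path -> $next"
            return 0
        fi
        if [[ ${state[$next]:-0} -eq 0 ]] && dfs "$next" "$path -> $next"; then
            return 0
        fi
    done
    state[$node]=2
    return 1
}

# Runs the search from every issue that has not been visited yet.
find_cycle() {
    local node
    state=()
    for node in "${!deps[@]}"; do
        if [[ ${state[$node]:-0} -eq 0 ]] && dfs "$node" "$node"; then
            return 0
        fi
    done
    return 1
}
```

For example, after `deps=([16]="17" [17]="16")` a call to `find_cycle`
succeeds and prints the path it followed. The starting issue depends on
bash's key iteration order, so the exact path printed can vary.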
Single source of truth for dependency parsing, replacing three copies:
- dev-poll.sh get_deps() now calls parse-deps.py
- supervisor P3b/P3c import parse_deps() via importlib
Supports stdin, argument, and --json modes for different callers.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add three new supervisor checks:
- P2c: alert when dev-agent reports "no ready issues" for 6+ consecutive polls
- P3b: detect circular dependency deadlocks via DFS cycle detection
- P3c: flag backlog issues blocked by deps open >30 days
Update supervisor PROMPT.md with guidance for Claude to resolve circular deps
by reading code context, and handle stale deps by checking relevance.
Gardener prompt now forbids bidirectional deps between sibling issues and
requires ## Related (not ## Dependencies) for cross-references.
Closes #16, Closes #17
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The supervisor agent was confusingly named "factory" (same as the
project). Rename directory, script, log, lock, status, and escalation
files. Update all references across scripts and docs.
FACTORY_ROOT env var unchanged (refers to project root, not agent).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>