feat: supervisor detects dep deadlocks, stale deps, and dev-agent blocked states
Add three new supervisor checks: - P2c: alert when dev-agent reports "no ready issues" for 6+ consecutive polls - P3b: detect circular dependency deadlocks via DFS cycle detection - P3c: flag backlog issues blocked by deps open >30 days Update supervisor PROMPT.md with guidance for Claude to resolve circular deps by reading code context, and handle stale deps by checking relevance. Gardener prompt now forbids bidirectional deps between sibling issues and requires ## Related (not ## Dependencies) for cross-references. Closes #16, Closes #17 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
parent
27373a16f3
commit
acab6c95c8
3 changed files with 201 additions and 2 deletions
|
|
@ -7,8 +7,8 @@ You are the supervisor agent for `$CODEBERG_REPO`. You were called because
|
|||
|
||||
1. **P0 — Memory crisis:** RAM <500MB or swap >3GB
|
||||
2. **P1 — Disk pressure:** Disk >80%
|
||||
3. **P2 — Factory stopped:** Dev-agent dead, CI down, git broken
|
||||
4. **P3 — Factory degraded:** Derailed PR, stuck pipeline, unreviewed PRs
|
||||
3. **P2 — Factory stopped:** Dev-agent dead, CI down, git broken, all backlog dep-blocked
|
||||
4. **P3 — Factory degraded:** Derailed PR, stuck pipeline, unreviewed PRs, circular deps, stale deps
|
||||
5. **P4 — Housekeeping:** Stale processes, log rotation
|
||||
|
||||
## What You Can Do
|
||||
|
|
@ -42,6 +42,44 @@ This gives you:
|
|||
- `$FACTORY_ROOT` — path to the disinto repo
|
||||
- `matrix_send <prefix> <message>` — send notifications to the Matrix coordination room
|
||||
|
||||
## Handling Dependency Alerts
|
||||
|
||||
### Circular dependencies (P3)
|
||||
When you see "Circular dependency deadlock: #A -> #B -> #A", the backlog is permanently
|
||||
stuck. Your job: figure out the correct dependency direction and fix the wrong one.
|
||||
|
||||
1. Read both issue bodies: `codeberg_api GET "/issues/A"`, `codeberg_api GET "/issues/B"`
|
||||
2. Read the referenced source files in `$PROJECT_REPO_ROOT` to understand which change
|
||||
actually depends on which
|
||||
3. Edit the issue that has the incorrect dep to remove the `#NNN` reference from its
|
||||
`## Dependencies` section (replace with `- None` if it was the only dep)
|
||||
4. If the correct direction is unclear from code, escalate with both issue summaries
|
||||
|
||||
Use the Codeberg API to edit issue bodies:
|
||||
```bash
|
||||
# Read current body
|
||||
BODY=$(codeberg_api GET "/issues/NNN" | jq -r '.body')
|
||||
# Edit (remove the circular ref, keep other deps)
|
||||
NEW_BODY=$(echo "$BODY" | sed 's/- #XXX/- None/')
|
||||
codeberg_api PATCH "/issues/NNN" -d "$(jq -nc --arg b "$NEW_BODY" '{body:$b}')"
|
||||
```
|
||||
|
||||
### Stale dependencies (P3)
|
||||
When you see "Stale dependency: #A blocked by #B (open N days)", the dep may be
|
||||
obsolete or misprioritized. Investigate:
|
||||
|
||||
1. Check if dep #B is still relevant (read its body, check if the code it targets changed)
|
||||
2. If the dep is obsolete → remove it from #A's `## Dependencies` section
|
||||
3. If the dep is still needed → escalate, suggesting to prioritize #B or split #A
|
||||
|
||||
### Dev-agent blocked (P2)
|
||||
When you see "Dev-agent blocked: last N polls all report 'no ready issues'":
|
||||
|
||||
1. Check if circular deps exist (they'll appear as separate P3 alerts)
|
||||
2. Check if all backlog issues depend on a single unmerged issue — if so, escalate
|
||||
to prioritize that blocker
|
||||
3. If no clear blocker, escalate with the list of blocked issues and their deps
|
||||
|
||||
## Escalation
|
||||
|
||||
If you can't fix it, escalate via Matrix:
|
||||
|
|
|
|||
|
|
@ -225,6 +225,22 @@ if [ "${BACKLOG_COUNT:-0}" -gt 0 ] && [ "${IN_PROGRESS:-0}" -eq 0 ]; then
|
|||
fi
|
||||
fi
|
||||
|
||||
# =============================================================================
|
||||
# P2c: DEV-AGENT PRODUCTIVITY — all backlog blocked for too long
|
||||
# =============================================================================
|
||||
status "P2: checking dev-agent productivity"
|
||||
|
||||
DEV_LOG_FILE="${FACTORY_ROOT}/dev/dev-agent.log"
|
||||
if [ -f "$DEV_LOG_FILE" ]; then
|
||||
# Check if last 6 poll entries all report "no ready issues" (~1 hour at 10min intervals)
|
||||
RECENT_POLLS=$(tail -100 "$DEV_LOG_FILE" | grep "poll:" | tail -6)
|
||||
TOTAL_RECENT=$(echo "$RECENT_POLLS" | grep -c "." || true)
|
||||
BLOCKED_IN_RECENT=$(echo "$RECENT_POLLS" | grep -c "no ready issues" || true)
|
||||
if [ "$TOTAL_RECENT" -ge 6 ] && [ "$BLOCKED_IN_RECENT" -eq "$TOTAL_RECENT" ]; then
|
||||
p2 "Dev-agent blocked: last ${BLOCKED_IN_RECENT} polls all report 'no ready issues' — all backlog issues may be dep-blocked or have circular deps"
|
||||
fi
|
||||
fi
|
||||
|
||||
# =============================================================================
|
||||
# P3: FACTORY DEGRADED — derailed PRs, unreviewed PRs
|
||||
# =============================================================================
|
||||
|
|
@ -273,6 +289,150 @@ for pr in $OPEN_PRS; do
|
|||
fi
|
||||
done
|
||||
|
||||
# =============================================================================
|
||||
# P3b: CIRCULAR DEPENDENCIES — deadlock detection
|
||||
# =============================================================================
|
||||
status "P3: checking for circular dependencies"
|
||||
|
||||
BACKLOG_FOR_DEPS=$(codeberg_api GET "/issues?state=open&labels=backlog&type=issues&limit=50" 2>/dev/null || true)
|
||||
if [ -n "$BACKLOG_FOR_DEPS" ] && [ "$BACKLOG_FOR_DEPS" != "null" ] && [ "$(echo "$BACKLOG_FOR_DEPS" | jq 'length' 2>/dev/null || echo 0)" -gt 0 ]; then
|
||||
|
||||
CYCLES=$(echo "$BACKLOG_FOR_DEPS" | python3 -c '
|
||||
import sys, json, re
|
||||
|
||||
issues = json.load(sys.stdin)
|
||||
|
||||
def parse_deps(body):
|
||||
deps = set()
|
||||
in_section = False
|
||||
for line in (body or "").split("\n"):
|
||||
if re.match(r"^##?\s*(Depends on|Blocked by|Dependencies)", line, re.IGNORECASE):
|
||||
in_section = True
|
||||
continue
|
||||
if in_section and re.match(r"^##?\s", line):
|
||||
in_section = False
|
||||
if in_section:
|
||||
deps.update(int(m) for m in re.findall(r"#(\d+)", line))
|
||||
if re.search(r"(depends on|blocked by)", line, re.IGNORECASE):
|
||||
deps.update(int(m) for m in re.findall(r"#(\d+)", line))
|
||||
return deps
|
||||
|
||||
graph = {}
|
||||
for issue in issues:
|
||||
num = issue["number"]
|
||||
deps = parse_deps(issue.get("body", ""))
|
||||
deps.discard(num)
|
||||
if deps:
|
||||
graph[num] = deps
|
||||
|
||||
WHITE, GRAY, BLACK = 0, 1, 2
|
||||
color = {n: WHITE for n in graph}
|
||||
cycles = []
|
||||
|
||||
def dfs(u, path):
|
||||
color[u] = GRAY
|
||||
path.append(u)
|
||||
for v in graph.get(u, set()):
|
||||
if v not in color:
|
||||
continue
|
||||
if color[v] == GRAY:
|
||||
cycles.append(path[path.index(v):] + [v])
|
||||
elif color[v] == WHITE:
|
||||
dfs(v, path)
|
||||
path.pop()
|
||||
color[u] = BLACK
|
||||
|
||||
for node in list(graph.keys()):
|
||||
if color.get(node) == WHITE:
|
||||
dfs(node, [])
|
||||
|
||||
seen = set()
|
||||
for cycle in cycles:
|
||||
key = tuple(sorted(set(cycle)))
|
||||
if key not in seen:
|
||||
seen.add(key)
|
||||
print(" -> ".join(f"#{n}" for n in cycle))
|
||||
' 2>/dev/null || true)
|
||||
|
||||
if [ -n "$CYCLES" ]; then
|
||||
while IFS= read -r cycle; do
|
||||
[ -z "$cycle" ] && continue
|
||||
p3 "Circular dependency deadlock: ${cycle}"
|
||||
done <<< "$CYCLES"
|
||||
fi
|
||||
|
||||
# ===========================================================================
|
||||
# P3c: STALE DEPENDENCIES — blocked by old open issues (>30 days)
|
||||
# ===========================================================================
|
||||
status "P3: checking for stale dependencies"
|
||||
|
||||
STALE_DEPS=$(echo "$BACKLOG_FOR_DEPS" | CODEBERG_TOKEN="$CODEBERG_TOKEN" CODEBERG_API="$CODEBERG_API" python3 -c '
|
||||
import sys, json, re, os
|
||||
from datetime import datetime, timezone
|
||||
from urllib.request import Request, urlopen
|
||||
|
||||
issues = json.load(sys.stdin)
|
||||
token = os.environ.get("CODEBERG_TOKEN", "")
|
||||
api = os.environ.get("CODEBERG_API", "")
|
||||
issue_map = {i["number"]: i for i in issues}
|
||||
now = datetime.now(timezone.utc)
|
||||
|
||||
def parse_deps(body):
|
||||
deps = set()
|
||||
in_section = False
|
||||
for line in (body or "").split("\n"):
|
||||
if re.match(r"^##?\s*(Depends on|Blocked by|Dependencies)", line, re.IGNORECASE):
|
||||
in_section = True
|
||||
continue
|
||||
if in_section and re.match(r"^##?\s", line):
|
||||
in_section = False
|
||||
if in_section:
|
||||
deps.update(int(m) for m in re.findall(r"#(\d+)", line))
|
||||
if re.search(r"(depends on|blocked by)", line, re.IGNORECASE):
|
||||
deps.update(int(m) for m in re.findall(r"#(\d+)", line))
|
||||
return deps
|
||||
|
||||
checked = {}
|
||||
for issue in issues:
|
||||
num = issue["number"]
|
||||
deps = parse_deps(issue.get("body", ""))
|
||||
deps.discard(num)
|
||||
for dep in deps:
|
||||
if dep in checked:
|
||||
dep_data = checked[dep]
|
||||
elif dep in issue_map:
|
||||
dep_data = issue_map[dep]
|
||||
checked[dep] = dep_data
|
||||
else:
|
||||
try:
|
||||
req = Request(f"{api}/issues/{dep}",
|
||||
headers={"Authorization": f"token {token}"})
|
||||
with urlopen(req, timeout=5) as resp:
|
||||
dep_data = json.loads(resp.read())
|
||||
checked[dep] = dep_data
|
||||
except Exception:
|
||||
continue
|
||||
if dep_data.get("state") != "open":
|
||||
continue
|
||||
created = dep_data.get("created_at", "")
|
||||
try:
|
||||
created_dt = datetime.fromisoformat(created.replace("Z", "+00:00"))
|
||||
age_days = (now - created_dt).days
|
||||
if age_days > 30:
|
||||
dep_title = dep_data.get("title", "")[:50]
|
||||
print(f"#{num} blocked by #{dep} \"{dep_title}\" (open {age_days} days)")
|
||||
except Exception:
|
||||
pass
|
||||
' 2>/dev/null || true)
|
||||
|
||||
if [ -n "$STALE_DEPS" ]; then
|
||||
while IFS= read -r stale; do
|
||||
[ -z "$stale" ] && continue
|
||||
p3 "Stale dependency: ${stale}"
|
||||
done <<< "$STALE_DEPS"
|
||||
fi
|
||||
fi
|
||||
|
||||
# =============================================================================
|
||||
# P4: HOUSEKEEPING — stale processes
|
||||
# =============================================================================
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue