fix: feat: planner reads issue comments to detect bounced/stuck work — delegates spec-out to formula (#595)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 12:16:33 +00:00 · 2026-03-23 12:16:33 +00:00 · 9f0a81145b
commit 9f0a81145b
parent 1c909e58b3
2 changed files with 134 additions and 3 deletions
--- a/formulas/groom-backlog.toml
+++ b/formulas/groom-backlog.toml
@ -1,16 +1,67 @@
 # formulas/groom-backlog.toml — Groom the backlog: triage all tech-debt with verify loop

 name        = "groom-backlog"
-description = "Triage and process all tech-debt issues — blockers first, then by impact score, verify to zero"
-version     = 1
+description = "Triage tech-debt issues OR break down bounced issues dispatched by the planner"
+version     = 2

 [context]
 files = ["README.md", "AGENTS.md", "VISION.md"]

+[[steps]]
+id    = "check-mode"
+title = "Determine operating mode: grooming vs breakdown"
+description = """
+Check the YAML front matter of the dispatching action issue (if any) for
+a `mode` field. Two modes are supported:
+
+1. **breakdown** mode (dispatched by planner for bounced/stuck issues):
+   The front matter will contain:
+     formula: groom-backlog
+     vars:
+       target_issue: <number>
+       mode: breakdown
+       reason: "<why the issue bounced>"
+
+   In this mode, skip the normal tech-debt grooming pipeline. Instead:
+   a. Fetch the target issue:
+        curl -sf -H "Authorization: token $CODEBERG_TOKEN" \
+          "$CODEBERG_API/issues/<target_issue>"
+   b. Fetch ALL comments on the target issue to understand scope and
+      prior bounce reasons:
+        curl -sf -H "Authorization: token $CODEBERG_TOKEN" \
+          "$CODEBERG_API/issues/<target_issue>/comments?limit=50"
+   c. Read the affected files listed in the issue body to understand
+      the actual code scope.
+   d. Break the issue into 2-5 sub-issues, each sized for a single
+      dev-agent session. Each sub-issue MUST include:
+        - ## Problem (scoped piece of the parent issue)
+        - ## Affected files (specific files for this sub-task)
+        - ## Acceptance criteria (at least one checkbox)
+        - ## Dependencies (reference parent or sibling sub-issues if ordered)
+   e. Create the sub-issues via API with the `backlog` label.
+   f. Update the parent issue body to include a "## Sub-issues" section
+      linking to all created sub-issues.
+   g. Remove the `underspecified` label from the parent issue (if present).
+   h. If the parent issue is a meta-issue that is fully covered by
+      sub-issues, add a comment noting it is now tracked via sub-issues.
+   i. Signal completion:
+        echo "ACTION: broke down #<target_issue> into <N> sub-issues" >> "$RESULT_FILE"
+        echo 'PHASE:done' > "$PHASE_FILE"
+
+   After creating sub-issues in breakdown mode, the formula is DONE —
+   do not proceed to the normal tech-debt grooming steps.
+
+2. **grooming** mode (default — no mode field, or mode: grooming):
+   Proceed to the inventory step as normal.
+"""
+
 [[steps]]
 id    = "inventory"
 title = "Fetch, score, and classify all tech-debt issues"
+needs = ["check-mode"]
 description = """
+This step only runs in grooming mode. Skip if in breakdown mode.
+
 Fetch all open tech-debt issues:
  curl -sf -H "Authorization: token $CODEBERG_TOKEN" \
    "$CODEBERG_API/issues?type=issues&state=open&limit=50" | \
--- a/formulas/run-planner.toml
+++ b/formulas/run-planner.toml
@ -213,6 +213,36 @@ Read these inputs:
  - Planner memory (loaded in preflight)
  - Promoted predictions from prediction-triage (add as prerequisites if relevant)

+### Comment scanning for bounce/stuck detection
+
+For each issue referenced in the prerequisite tree (by #number), fetch its
+recent comments to detect signals that the issue is stuck or bouncing:
+
+    curl -sf -H "Authorization: token $CODEBERG_TOKEN" \
+      "$CODEBERG_API/issues/<number>/comments?limit=10"
+
+Scan each comment body for these signals:
+
+  - **BOUNCED**: body contains "too large for single session", "too_large",
+    "underspecified", or "needs splitting" (case-insensitive). This means
+    the dev-agent refused the issue — it needs breakdown before retry.
+  - **ESCALATED**: body contains "escalating for human decision", "needs
+    human decision", or "escalate" from a non-human author. The issue
+    needs steering input.
+  - **UNBLOCKED**: body contains "dependency .* is now closed" or
+    "unblocked". The issue may be ready to work.
+  - **LABEL_CHURN**: the issue has been relabeled between backlog and
+    underspecified (or blocked) 3+ times. Check via label change events
+    or multiple bounce comments. This indicates a ping-pong loop.
+
+Track detected signals in a list: `stuck_issues[]` where each entry is:
+  { issue: <number>, signal: BOUNCED|ESCALATED|LABEL_CHURN, count: <N>,
+    reason: "<summary from comments>" }
+
+These signals feed into the file-at-constraints step to prevent the
+planner from re-promoting stuck issues and to dispatch formula-based
+breakdown instead.
+
 Update the tree by applying these operations:

 1. **Mark resolved prerequisites**: For each prerequisite in the tree,
@ -306,7 +336,51 @@ Action issues count toward the 3-issue constraint budget — they are
 strategic investments, not maintenance. The planner decides what data
 matters based on current constraints, not what formulas exist.

-Filing gate — for each constraint:
+### Stuck issue handling — dispatch to groom-backlog formula
+
+Before filing, cross-reference the top 3 constraints against the
+`stuck_issues[]` list from the update-prerequisite-tree step.
+
+If a constraint issue was detected as BOUNCED or LABEL_CHURN:
+  - Do NOT re-promote it to backlog or add the priority label — this
+    would restart the ping-pong loop.
+  - Instead, dispatch the groom-backlog formula to break it down.
+    Create an action issue that invokes groom-backlog with the stuck
+    issue as target:
+
+      Title: "chore: break down #<number> — bounced <count>x, needs splitting"
+      Body:
+        ---
+        formula: groom-backlog
+        vars:
+          target_issue: <number>
+          mode: breakdown
+          reason: "<reason from stuck_issues entry>"
+        ---
+
+        ## Problem
+        Issue #<number> has bounced <count> time(s) between backlog and
+        underspecified. The dev-agent reports it is too large for a single
+        session. It needs to be broken into dev-agent-sized subtasks.
+
+        ## Affected files
+        - formulas/groom-backlog.toml
+
+        ## Acceptance criteria
+        - [ ] #<number> is split into implementable sub-issues
+        - [ ] Sub-issues have acceptance criteria and affected files
+        - [ ] Original issue updated with links to sub-issues
+
+    Label this action issue with the `action` label (not `backlog`).
+    This counts toward the 3-issue-per-run limit.
+
+If a constraint issue was detected as ESCALATED:
+  - Do NOT file new work. Add a comment to the issue noting the
+    escalation was seen, and mark the prerequisite in the tree as:
+      `[ ] <name> ⚠ escalated — awaiting human decision`
+  - Do NOT count this against the 3-issue limit.
+
+Filing gate — for each constraint (that is NOT stuck):

 1. Check if an issue already exists for this constraint (match by issue
   number reference in the tree, or search open issues by title).
@ -467,6 +541,12 @@ Format:
  2. <prerequisite> — blocks N objectives — issue #NNN
  3. <prerequisite> — blocks N objectives — issue #NNN

+  ## Stuck issues detected
+  - #NNN: BOUNCED (Nx) — dispatched groom-backlog as #MMM
+  - #NNN: ESCALATED — awaiting human decision
+  - #NNN: LABEL_CHURN (Nx) — dispatched groom-backlog as #MMM
+  (or "No stuck issues detected" if none)
+
  ## Issues created
  - #NNN: title — why (constraint for objectives X, Y)
  (or "No new issues — constraints already have open issues" if none)