Merge pull request 'fix: feat: planner reads issue comments to detect bounced/stuck work — delegates spec-out to formula (#595)' (#604) from fix/issue-595 into main

2026-03-23 13:24:03 +01:00 · 64b0412e41
commit 64b0412e41
parent 1c909e58b3 9f0a81145b
2 changed files with 134 additions and 3 deletions
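Reviewer note: the `check-mode` step added to `formulas/groom-backlog.toml` in the diff below asks the agent to read the YAML front matter of the dispatching action issue and choose between breakdown and grooming mode. The following is a minimal sketch of that decision, not code from this commit. It assumes the front matter sits at the very top of the issue body between `---` delimiters and only handles the small key/value shape shown in the formula. The function names `parse_front_matter` and `operating_mode` are hypothetical.

```python
def parse_front_matter(body: str) -> dict:
    """Parse the small YAML subset used by action-issue front matter.

    Assumption: only top-level `key: value` pairs plus one level of
    indented keys under a bare `key:` line (as in the `vars:` block).
    """
    lines = body.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter at all
    result: dict = {}
    current = result
    for line in lines[1:]:
        if line.strip() == "---":
            return result  # closing delimiter reached
        if not line.strip():
            continue
        key, _, value = line.strip().partition(":")
        value = value.strip().strip('"')
        if line.startswith((" ", "\t")):
            current[key] = value  # indented: belongs to the open mapping
        elif value == "":
            current = result.setdefault(key, {})  # bare `key:` opens a mapping
        else:
            result[key] = value
            current = result
    return {}  # unterminated front matter is treated as absent


def operating_mode(body: str) -> str:
    """Return "breakdown" only when vars.mode says so; default to grooming."""
    vars_block = parse_front_matter(body).get("vars")
    if isinstance(vars_block, dict) and vars_block.get("mode") == "breakdown":
        return "breakdown"
    return "grooming"
```

Defaulting to grooming for anything other than an explicit `mode: breakdown` mirrors the formula text, which treats a missing mode field as the normal grooming pipeline.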
--- a/formulas/groom-backlog.toml
+++ b/formulas/groom-backlog.toml
@@ -1,16 +1,67 @@
 # formulas/groom-backlog.toml — Groom the backlog: triage all tech-debt with verify loop
 name        = "groom-backlog"
-description = "Triage and process all tech-debt issues — blockers first, then by impact score, verify to zero"
+description = "Triage tech-debt issues OR break down bounced issues dispatched by the planner"
-version     = 1
+version     = 2
 [context]
 files = ["README.md", "AGENTS.md", "VISION.md"]
 [[steps]]
 id    = "check-mode"
 title = "Determine operating mode: grooming vs breakdown"
 description = """
 Check the YAML front matter of the dispatching action issue (if any) for
 a `mode` field. Two modes are supported:
 1. **breakdown** mode (dispatched by planner for bounced/stuck issues):
   The front matter will contain:
     formula: groom-backlog
     vars:
       target_issue: <number>
       mode: breakdown
       reason: "<why the issue bounced>"
   In this mode, skip the normal tech-debt grooming pipeline. Instead:
   a. Fetch the target issue:
        curl -sf -H "Authorization: token $CODEBERG_TOKEN" \
          "$CODEBERG_API/issues/<target_issue>"
   b. Fetch ALL comments on the target issue to understand scope and
      prior bounce reasons:
        curl -sf -H "Authorization: token $CODEBERG_TOKEN" \
          "$CODEBERG_API/issues/<target_issue>/comments?limit=50"
   c. Read the affected files listed in the issue body to understand
      the actual code scope.
   d. Break the issue into 2-5 sub-issues, each sized for a single
      dev-agent session. Each sub-issue MUST include:
        - ## Problem (scoped piece of the parent issue)
        - ## Affected files (specific files for this sub-task)
        - ## Acceptance criteria (at least one checkbox)
        - ## Dependencies (reference parent or sibling sub-issues if ordered)
   e. Create the sub-issues via API with the `backlog` label.
   f. Update the parent issue body to include a "## Sub-issues" section
      linking to all created sub-issues.
   g. Remove the `underspecified` label from the parent issue (if present).
   h. If the parent issue is a meta-issue that is fully covered by
      sub-issues, add a comment noting it is now tracked via sub-issues.
   i. Signal completion:
        echo "ACTION: broke down #<target_issue> into <N> sub-issues" >> "$RESULT_FILE"
        echo 'PHASE:done' > "$PHASE_FILE"
   After creating sub-issues in breakdown mode, the formula is DONE —
   do not proceed to the normal tech-debt grooming steps.
 2. **grooming** mode (default — no mode field, or mode: grooming):
   Proceed to the inventory step as normal.
 """
 [[steps]]
 id    = "inventory"
 title = "Fetch, score, and classify all tech-debt issues"
 needs = ["check-mode"]
 description = """
 This step only runs in grooming mode. Skip if in breakdown mode.
 Fetch all open tech-debt issues:
  curl -sf -H "Authorization: token $CODEBERG_TOKEN" \
    "$CODEBERG_API/issues?type=issues&state=open&limit=50" | \
--- a/formulas/run-planner.toml
+++ b/formulas/run-planner.toml
@@ -213,6 +213,36 @@ Read these inputs:
  - Planner memory (loaded in preflight)
  - Promoted predictions from prediction-triage (add as prerequisites if relevant)
 ### Comment scanning for bounce/stuck detection
 For each issue referenced in the prerequisite tree (by #number), fetch its
 recent comments to detect signals that the issue is stuck or bouncing:
    curl -sf -H "Authorization: token $CODEBERG_TOKEN" \
      "$CODEBERG_API/issues/<number>/comments?limit=10"
 Scan each comment body for these signals:
  - **BOUNCED**: body contains "too large for single session", "too_large",
    "underspecified", or "needs splitting" (case-insensitive). This means
    the dev-agent refused the issue — it needs breakdown before retry.
  - **ESCALATED**: body contains "escalating for human decision", "needs
    human decision", or "escalate" from a non-human author. The issue
    needs steering input.
  - **UNBLOCKED**: body contains "dependency .* is now closed" or
    "unblocked". The issue may be ready to work.
  - **LABEL_CHURN**: the issue has been relabeled between backlog and
    underspecified (or blocked) 3+ times. Check via label change events
    or multiple bounce comments. This indicates a ping-pong loop.
 Track detected signals in a list: `stuck_issues[]` where each entry is:
  { issue: <number>, signal: BOUNCED|ESCALATED|LABEL_CHURN, count: <N>,
    reason: "<summary from comments>" }
 These signals feed into the file-at-constraints step to prevent the
 planner from re-promoting stuck issues and to dispatch formula-based
 breakdown instead.
 Update the tree by applying these operations:
 1. **Mark resolved prerequisites**: For each prerequisite in the tree,
@@ -306,7 +336,51 @@ Action issues count toward the 3-issue constraint budget — they are
 strategic investments, not maintenance. The planner decides what data
 matters based on current constraints, not what formulas exist.
-Filing gate — for each constraint:
+### Stuck issue handling — dispatch to groom-backlog formula
 Before filing, cross-reference the top 3 constraints against the
 `stuck_issues[]` list from the update-prerequisite-tree step.
 If a constraint issue was detected as BOUNCED or LABEL_CHURN:
  - Do NOT re-promote it to backlog or add the priority label — this
    would restart the ping-pong loop.
  - Instead, dispatch the groom-backlog formula to break it down.
    Create an action issue that invokes groom-backlog with the stuck
    issue as target:
      Title: "chore: break down #<number> — bounced <count>x, needs splitting"
      Body:
        ---
        formula: groom-backlog
        vars:
          target_issue: <number>
          mode: breakdown
          reason: "<reason from stuck_issues entry>"
        ---
        ## Problem
        Issue #<number> has bounced <count> time(s) between backlog and
        underspecified. The dev-agent reports it is too large for a single
        session. It needs to be broken into dev-agent-sized subtasks.
        ## Affected files
        - formulas/groom-backlog.toml
        ## Acceptance criteria
        - [ ] #<number> is split into implementable sub-issues
        - [ ] Sub-issues have acceptance criteria and affected files
        - [ ] Original issue updated with links to sub-issues
    Label this action issue with the `action` label (not `backlog`).
    This counts toward the 3-issue-per-run limit.
 If a constraint issue was detected as ESCALATED:
  - Do NOT file new work. Add a comment to the issue noting the
    escalation was seen, and mark the prerequisite in the tree as:
      `[ ] <name> ⚠ escalated — awaiting human decision`
  - Do NOT count this against the 3-issue limit.
 Filing gate — for each constraint (that is NOT stuck):
 1. Check if an issue already exists for this constraint (match by issue
   number reference in the tree, or search open issues by title).
@@ -467,6 +541,12 @@ Format:
  2. <prerequisite> — blocks N objectives — issue #NNN
  3. <prerequisite> — blocks N objectives — issue #NNN
  ## Stuck issues detected
  - #NNN: BOUNCED (Nx) — dispatched groom-backlog as #MMM
  - #NNN: ESCALATED — awaiting human decision
  - #NNN: LABEL_CHURN (Nx) — dispatched groom-backlog as #MMM
  (or "No stuck issues detected" if none)
  ## Issues created
  - #NNN: title — why (constraint for objectives X, Y)
  (or "No new issues — constraints already have open issues" if none)
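
Reviewer note: the comment-scanning rules added to `formulas/run-planner.toml` above are written as prose for the planner agent. The sketch below shows one way they could be expressed as code, to make the matching rules concrete. It is an illustration under stated assumptions, not part of this commit: each comment is assumed to be a dict with hypothetical `author` and `body` keys, human authors are supplied by the caller, case-insensitive matching is applied to every signal (the formula only states it for BOUNCED), and three or more bounce comments stand in for the label-event check behind LABEL_CHURN. The precedence LABEL_CHURN, then BOUNCED, then ESCALATED is also an assumption.

```python
import re

# Phrases taken from the formula text.
BOUNCED = re.compile(
    r"too large for single session|too_large|underspecified|needs splitting",
    re.IGNORECASE,
)
ESCALATED = re.compile(
    r"escalating for human decision|needs human decision|escalate",
    re.IGNORECASE,
)
UNBLOCKED = re.compile(r"dependency .* is now closed|unblocked", re.IGNORECASE)


def classify(body: str, author_is_human: bool) -> set:
    """Return the set of signals found in a single comment body."""
    signals = set()
    if BOUNCED.search(body):
        signals.add("BOUNCED")
    # ESCALATED only counts when a non-human author wrote the comment.
    if not author_is_human and ESCALATED.search(body):
        signals.add("ESCALATED")
    if UNBLOCKED.search(body):
        signals.add("UNBLOCKED")
    return signals


def stuck_entry(issue: int, comments: list, human_authors=frozenset()):
    """Build one stuck_issues[] entry for an issue, or None if not stuck."""
    bounced, escalated = [], []
    for comment in comments:
        found = classify(comment["body"], comment["author"] in human_authors)
        if "BOUNCED" in found:
            bounced.append(comment["body"])
        if "ESCALATED" in found:
            escalated.append(comment["body"])
    if len(bounced) >= 3:  # assumption: bounce comments approximate label churn
        signal, hits = "LABEL_CHURN", bounced
    elif bounced:
        signal, hits = "BOUNCED", bounced
    elif escalated:
        signal, hits = "ESCALATED", escalated
    else:
        return None
    # The reason is the last matching comment in list order, truncated.
    return {"issue": issue, "signal": signal, "count": len(hits),
            "reason": hits[-1][:80]}
```

UNBLOCKED is classified but never produces an entry, matching the formula, where `stuck_issues[]` only carries BOUNCED, ESCALATED, and LABEL_CHURN.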