fix: dev-poll / reviewer gate on required contexts, not combined commit status #1136

Closed
opened 2026-04-21 16:38:01 +00:00 by dev-bot · 0 comments
Collaborator

Part of the architectural response to the 2026-04-21 CI chaos-monkey cascade (see incident #843 on Codeberg).

Problem

dev-poll and the reviewer-agent both consume the Forgejo combined commit status (/commits/<sha>/status). Combined state is the worst state among all statuses ever attached to the commit, so a single non-required flaky workflow (e.g. edge-subpath / caddy-validate, see #1124) pins it to failure/pending. That wedges the merge and re-poll logic, even though branch protection already declares which contexts actually gate merge.

During the 2026-04-21 incident this turned one stuck optional step into a factory-wide freeze: reviewer refused to merge, dev-poll kept the PR in awaiting-ci, and the admin had to resort to force_merge: true.

Desired behaviour

Agents read the branch-protection required_status_checks.contexts list via

GET /api/v1/repos/{owner}/{repo}/branch_protections/{branch}

and filter the per-context statuses to just that set when deciding green / red / pending. Optional workflows still run and still show up in the PR UI — they just don't block automation.
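
A minimal sketch of the read-and-filter step, under stated assumptions. The function names, FORGE_URL, FORGE_TOKEN and REQUIRED_CONTEXTS_JQ are hypothetical and not part of the existing code. The jq path defaults to the field as named in this issue; the actual Forgejo response may expose the list under a different key (possibly status_check_contexts), so the path is left configurable.

```shell
#!/usr/bin/env bash
# Sketch only: all names below are hypothetical.

# Print the required contexts for a branch, one per line.
# ASSUMPTIONS: FORGE_URL and FORGE_TOKEN are provided by the caller, and the
# jq path may need adjusting to match the forge's real response shape.
fetch_required_contexts() {
  local owner_repo=$1 branch=$2
  local path=${REQUIRED_CONTEXTS_JQ:-.required_status_checks.contexts}
  curl -fsS -H "Authorization: token ${FORGE_TOKEN}" \
    "${FORGE_URL}/api/v1/repos/${owner_repo}/branch_protections/${branch}" |
    jq -r "(${path} // [])[]"
}

# Read "context<TAB>state" lines on stdin and keep only the required ones.
# $1: newline-separated list of required contexts.
# (Such lines could be produced from the per-context status list with jq's
# @tsv; the exact JSON field names are an assumption left to the caller.)
filter_required_statuses() {
  local required=$1 context state
  while IFS=$'\t' read -r context state; do
    [ -n "$context" ] || continue
    if grep -Fxq -- "$context" <<<"$required"; then
      printf '%s\t%s\n' "$context" "$state"
    fi
  done
}
```

With "ci" as the only required context, a pending edge-subpath line on stdin is dropped and only the ci line is passed through to whatever decides green / red / pending.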

Fix sketch

  1. Add ci_required_contexts(repo, branch) helper in lib/ci-helpers.sh returning the context list (cached per poll cycle).
  2. In lib/pr-lifecycle.sh wherever combined state is consulted, replace with a reducer over the required subset only.
  3. In reviewer-agent's merge-eligibility check, same swap.
  4. Keep combined-state fetch for logging/visibility only — never for decisions.
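
Steps 1 and 2 could look roughly like the sketch below. Only the name ci_required_contexts comes from this issue; ci_cache_begin_cycle, ci_required_state, CI_CACHE_DIR and CI_CONTEXTS_FETCHER are hypothetical. The cache is file-based rather than held in a shell variable, because a helper called through $(...) runs in a subshell and would lose any variable it sets.

```shell
#!/usr/bin/env bash
# Hypothetical sketch for lib/ci-helpers.sh (bash 4+ for associative arrays).

# Call once at the top of each poll cycle; starts an empty cache.
# Old cache directories are left for the caller to clean up.
ci_cache_begin_cycle() {
  CI_CACHE_DIR=$(mktemp -d)
}

# ci_required_contexts REPO BRANCH -> required contexts, one per line.
# The API call is delegated to the function named in CI_CONTEXTS_FETCHER
# (an assumption of this sketch), which must print one context per line.
ci_required_contexts() {
  local repo=$1 branch=$2 key file
  # Sanitise the cache key; distinct names could collide in rare cases.
  key=$(printf '%s@%s' "$repo" "$branch" | tr -c 'A-Za-z0-9._@-' '_')
  file="${CI_CACHE_DIR:?call ci_cache_begin_cycle first}/${key}"
  if [ ! -f "$file" ]; then
    if ! "${CI_CONTEXTS_FETCHER:?}" "$repo" "$branch" >"${file}.tmp"; then
      rm -f "${file}.tmp"
      return 1
    fi
    mv "${file}.tmp" "$file"
  fi
  cat "$file"
}

# ci_required_state REQUIRED < "context<TAB>state" lines
# Prints green, red, or pending using only the contexts listed in REQUIRED.
# Later lines for the same context override earlier ones.
ci_required_state() {
  local required=$1 context state ctx result=green
  declare -A seen=()
  while IFS=$'\t' read -r context state; do
    [ -n "$context" ] || continue
    grep -Fxq -- "$context" <<<"$required" || continue
    seen[$context]=$state
  done
  while IFS= read -r ctx; do
    [ -n "$ctx" ] || continue
    case "${seen[$ctx]:-missing}" in
      success) ;;
      failure | error) echo red; return 0 ;;
      *) result=pending ;; # pending, unknown, or not reported yet
    esac
  done <<<"$required"
  echo "$result"
}
```

The reducer walks the required list rather than the reported statuses, so a required context that has not reported yet counts as pending instead of being silently skipped. An empty required list reduces to green in this sketch; whether that is the right default is a design decision for the real helper.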

Acceptance

  • Reproduce the incident shape with a fixture PR where the required ci workflow passes but a non-required workflow is stuck pending: reviewer-agent must proceed to merge, and dev-poll must not block on it.
  • ci_required_contexts helper is unit-tested against a mock forge response.
  • No new calls to combined state from decision code paths (grep check in CI).
  • shellcheck clean.
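
The grep check from the acceptance list might be sketched as follows. The function name and the "combined-status: visibility-only" marker are invented for this example; the marker is one possible way to allow the logging-only call sites that the fix sketch keeps.

```shell
#!/usr/bin/env bash
# Hypothetical CI guard: fail when the given files reference the combined
# commit status endpoint (/commits/<sha>/status). The per-context endpoint
# (/commits/<sha>/statuses) is deliberately not matched.
check_no_combined_status() {
  local hits
  hits=$(grep -HnE 'commits/[^/[:space:]]+/status([^e]|$)' -- "$@" |
    grep -v 'combined-status: visibility-only')
  if [ -n "$hits" ]; then
    printf 'combined commit status used in decision code:\n%s\n' "$hits" >&2
    return 1
  fi
}
```

A CI step could call it with the decision-path files, for example check_no_combined_status lib/pr-lifecycle.sh, and fail the job on a non-zero exit.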

Related

  • Codeberg #843 — incident writeup that motivated this class of fix
  • #1124 — the specific optional step that tripped the incident
  • #894 — supervisor sweeper proposal; complementary (sweeper handles legit-red PRs; this fix prevents optional-red from looking red to agents)

dev-bot added the backlog label 2026-04-21 16:38:01 +00:00
dev-bot self-assigned this 2026-04-21 17:30:54 +00:00
dev-bot added in-progress and removed backlog labels 2026-04-21 17:30:55 +00:00
dev-bot removed their assignment 2026-04-21 18:00:44 +00:00
Reference: disinto-admin/disinto#1136