disinto-ops/sprints/vault-blast-radius-tiers.md

# Sprint: vault blast-radius tiers

## Vision issues
- #419 — Vault: blast-radius based approval tiers

## What this enables
After this sprint, low-tier vault actions execute without waiting for a human. The dispatcher
auto-approves and merges vault PRs classified as `low` in `policy.toml`. Medium and high tiers
keep a human in the loop: medium notifies and allows async review; high blocks until an admin approves.

This removes the bottleneck on low-risk bookkeeping operations while preserving the hard gate
on production deploys, secret operations, and agent self-modification.

## What exists today

The tier infrastructure is fully built. Only the enforcement is missing.

- `vault/policy.toml` — Maps every formula to low/medium/high. Current low tier: groom-backlog,
  triage, reproduce, review-pr. Medium: dev, run-planner, run-gardener, run-predictor,
  run-supervisor, run-architect, upgrade-dependency. High: run-publish-site, run-rent-a-human,
  add-rpc-method, release.
- `vault/classify.sh` — Shell classifier called by `vault-env.sh`. Returns tier for a given formula.
- `vault/SCHEMA.md` — Documents `blast_radius` override field (string: "low"/"medium"/"high")
  that vault action TOMLs can use to override policy defaults.
- `vault/validate.sh` — Validates vault action TOML fields including blast_radius.
- `docker/edge/dispatcher.sh` — Edge dispatcher. Polls ops repo for merged vault PRs and executes
  them. Currently fires ALL merged vault PRs without tier differentiation.
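
The interaction between the `policy.toml` default and the per-action `blast_radius` override can be pictured with a small sketch. The function name `effective_tier` is hypothetical, and the rule that an invalid pair falls back to `high` is an assumption, not something the existing scripts are known to do.

```shell
# Hypothetical helper: pick the tier that applies to one vault action.
# $1 = blast_radius override from the action TOML (may be empty)
# $2 = tier that policy.toml assigns to the action's formula
effective_tier() {
  case "$1" in
    low|medium|high) echo "$1"; return 0 ;;   # a valid override wins
  esac
  case "$2" in
    low|medium|high) echo "$2" ;;             # otherwise use the policy default
    *) echo "high" ;;                         # assumption: nothing valid means high
  esac
}
```

For example, `effective_tier high low` would yield `high`, since a valid override takes precedence over the policy default.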

What's missing: the dispatcher does not read blast_radius, does not auto-approve low-tier PRs,
and does not differentiate notification behavior for medium vs high tier.

## Complexity

Files touched: 3
- `docker/edge/dispatcher.sh` — read blast_radius from vault action TOML; for low tier, call
  Forgejo API to approve + merge the PR directly (admin token); for medium, post "pending async
  review" comment; for high, leave pending (existing behavior)
- `lib/vault.sh` `vault_request()` — include blast_radius in the PR body so the dispatcher
  can read it without re-parsing the TOML
- `docs/VAULT.md` — document the three-tier behavior for operators
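
A minimal sketch of the dispatcher-side change, under stated assumptions: the PR body carries a line of the form `blast_radius: <tier>` (the real format is not decided in this document), and the function names and action labels below are hypothetical placeholders rather than anything in `dispatcher.sh` today.

```shell
# Hypothetical: read the tier from a PR body supplied on stdin.
# Assumes a line such as "blast_radius: low"; prints nothing if absent.
blast_radius_from_body() {
  sed -n 's/^blast_radius:[[:space:]]*\([a-z]*\)[[:space:]]*$/\1/p' | head -n 1
}

# Hypothetical: map a tier to the action the dispatcher would take.
# The labels are illustrative; the real calls would go to the Forgejo API.
tier_action() {
  case "$1" in
    low)    echo "approve-and-merge" ;;
    medium) echo "comment-pending-async-review" ;;
    *)      echo "leave-pending" ;;   # high, empty, or unrecognized
  esac
}
```

Treating an empty or unrecognized tier the same as `high` keeps a malformed PR body from ever reaching the auto-approve path.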

Sub-issues: 3
Gluecode ratio: ~70% gluecode (dispatcher reads existing classify.sh output), ~30% new (auto-approve API call, comment logic)

## Risks

- Admin token for auto-approve: the dispatcher needs an admin-level Forgejo token to approve
  and merge PRs. Currently `FORGE_TOKEN` is used, and branch protection has `admin_enforced: true`,
  which means even admin bots cannot bypass the approval gate. This is the core
  design fork: (a) relax admin_enforced for low-tier PRs, (b) use a separate
  Forgejo "auto-approver" account with admin rights, or (c) bypass the PR workflow entirely
  for low-tier actions (execute directly without a PR).
- Policy drift: as new formulas are added, policy.toml must be updated. If a formula is missing,
  classify.sh should default to "high" (fail safe). The current behavior for an unclassified
  formula is unknown and needs to be verified and hardened.
- Audit trail: low-tier auto-approvals should still leave a record. An auto-approve comment
  ("auto-approved: low blast radius") satisfies this.
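
The fail-safe default for unclassified formulas could look roughly like the sketch below. It assumes policy entries of the form `formula-name = "tier"`, which may not match the real layout of `vault/policy.toml`, and `classify_tier` is a hypothetical name, not the existing `classify.sh` interface.

```shell
# Hypothetical fail-safe lookup.
# $1 = formula name, $2 = path to a policy file
# Assumes lines like:  triage = "low"
classify_tier() {
  tier=$(awk -F'=' -v f="$1" '
    { key = $1; gsub(/[ \t]/, "", key) }
    key == f { val = $2; gsub(/[ \t"]/, "", val); print val; exit }
  ' "$2" 2>/dev/null || true)
  case "$tier" in
    low|medium|high) echo "$tier" ;;
    *) echo "high" ;;   # missing formula, bad value, or unreadable file
  esac
}
```

With this shape, a formula that nobody has classified yet, or a policy file that cannot be read, lands in the `high` tier instead of slipping through as `low`.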

## Cost — new infra to maintain

- One new Forgejo account or token (if auto-approver route chosen) — needs rotation policy
- `policy.toml` maintenance: every new formula must be classified before shipping
- No new services, cron jobs, or containers

## Recommendation

Worth it, but the design fork on the auto-approve mechanism must be resolved before implementation
begins. Resolving it is the questions step.

The cleanest approach is option (c): bypass the PR workflow for low-tier actions entirely.
The dispatcher detects blast_radius=low, executes the formula immediately without creating
a PR, and writes to `vault/fired/` directly. This avoids the admin token problem, preserves
the PR workflow for medium/high, and keeps the audit trail in git. However, it changes the
blast_radius=low behavior from "PR exists but auto-merges" to "no PR, just executes" — operators
need to understand the difference.
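
As a rough illustration of option (c), the sketch below runs a low-tier action directly and leaves a record in a "fired" directory. Everything here is hypothetical: `run_formula` is a stand-in for real execution, `fire_low_tier` and the record format are invented for the example, and committing the record to git is left out.

```shell
# Stand-in for real formula execution (hypothetical).
run_formula() { echo "ran $1"; }

# Hypothetical option (c) path.
# $1 = action id, $2 = formula, $3 = tier, $4 = fired directory
fire_low_tier() {
  if [ "$3" != "low" ]; then
    echo "needs-pr"          # medium/high stay on the PR workflow
    return 0
  fi
  mkdir -p "$4" || return 1
  run_formula "$2" >/dev/null || return 1
  # Audit record written only after the formula ran successfully.
  printf 'formula=%s\ntier=low\napproval=auto-approved: low blast radius\n' "$2" \
    > "$4/$1.record"
  echo "fired"
}
```

Writing the record only after a successful run means a failed action leaves no entry suggesting it fired; whether failures should also be recorded is a separate decision.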

The PR route (option b) is more visible but requires a dedicated account.