Sprint: vault blast-radius tiers
Vision issues
- #419 — Vault: blast-radius based approval tiers
What this enables
After this sprint, low-tier vault actions execute without waiting for a human. The dispatcher
auto-approves and merges vault PRs classified as low in policy.toml. Medium and high tiers
are unchanged: medium notifies and allows async review; high blocks until admin approves.
This removes the bottleneck on low-risk bookkeeping operations while preserving the hard gate on production deploys, secret operations, and agent self-modification.
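As a rough illustration of the three-tier behavior described above, the mapping from tier to dispatcher behavior could be sketched as a single case statement. The function name tier_action and the labels it prints are hypothetical and only name the behaviors; they are not existing commands. Treating an unrecognized tier like high is an assumption made here to keep the gate fail-safe.

```shell
#!/usr/bin/env bash
# Map a blast-radius tier to the dispatcher behavior described above.
# tier_action and the printed labels are illustrative, not existing commands.
tier_action() {
  case "$1" in
    low)    printf 'auto-approve-and-merge\n' ;;
    medium) printf 'notify-async-review\n' ;;
    high)   printf 'block-until-admin-approves\n' ;;
    # Assumption: anything unrecognized is handled like high,
    # so a typo or missing value never lowers the gate.
    *)      printf 'block-until-admin-approves\n' ;;
  esac
}
```

A caller would branch on the printed label, for example `action="$(tier_action "$tier")"`.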
What exists today
The tier infrastructure is fully built. Only the enforcement is missing.
- vault/policy.toml: Maps every formula to low/medium/high. Current low tier: groom-backlog, triage, reproduce, review-pr. Medium: dev, run-planner, run-gardener, run-predictor, run-supervisor, run-architect, upgrade-dependency. High: run-publish-site, run-rent-a-human, add-rpc-method, release.
- vault/classify.sh: Shell classifier called by vault-env.sh. Returns the tier for a given formula.
- vault/SCHEMA.md: Documents the blast_radius override field (string: "low"/"medium"/"high") that vault action TOMLs can use to override policy defaults.
- vault/validate.sh: Validates vault action TOML fields, including blast_radius.
- docker/edge/dispatcher.sh: Edge dispatcher. Polls the ops repo for merged vault PRs and executes them. Currently fires ALL merged vault PRs without tier differentiation.
What's missing: the dispatcher does not read blast_radius, does not auto-approve low-tier PRs, and does not differentiate notification behavior for medium vs high tier.
Complexity
Files touched: 3
- docker/edge/dispatcher.sh: read blast_radius from the vault action TOML; for low tier, call the Forgejo API to approve and merge the PR directly (admin token); for medium, post a "pending async review" comment; for high, leave pending (existing behavior)
- lib/vault.sh vault_request(): include blast_radius in the PR body so the dispatcher can read it without re-parsing the TOML
- docs/VAULT.md: document the three-tier behavior for operators
Sub-issues: 3
Gluecode ratio: ~70% gluecode (dispatcher reads existing classify.sh output), ~30% new (auto-approve API call, comment logic)
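One way the PR-body handoff between vault_request() and the dispatcher might look is sketched below. The marker line format ("blast_radius: <tier>") and the function names render_pr_body and read_blast_radius are assumptions for illustration; the actual PR body layout is not specified in this document. The reader falls back to high when the marker is missing or malformed.

```shell
#!/usr/bin/env bash
# Hypothetical marker format: a line of the form "blast_radius: <tier>".

# Writer side (what vault_request() might append to the PR body).
render_pr_body() {
  local summary="$1" tier="$2"
  printf '%s\n\nblast_radius: %s\n' "$summary" "$tier"
}

# Reader side (what the dispatcher might do with the PR body text).
read_blast_radius() {
  local body="$1" tier
  # Take the first marker line; trailing whitespace (including CR) is tolerated.
  tier="$(printf '%s\n' "$body" \
    | sed -n 's/^blast_radius:[[:space:]]*\([a-z]*\)[[:space:]]*$/\1/p' \
    | head -n 1)"
  case "$tier" in
    low|medium|high) printf '%s\n' "$tier" ;;
    # Missing or unrecognized marker: assume the most restrictive tier.
    *) printf 'high\n' ;;
  esac
}
```

For example, `read_blast_radius "$(render_pr_body 'Groom backlog' low)"` round-trips the tier through the body text.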
Risks
- Admin token for auto-approve: the dispatcher needs an admin-level Forgejo token to approve and merge PRs. Currently FORGE_TOKEN is used; branch protection has admin_enforced: true, which means even admin bots are blocked from bypassing the approval gate. This is the core design fork: either (a) relax admin_enforced for low-tier PRs, or (b) use a separate Forgejo "auto-approver" account with admin rights, or (c) bypass the PR workflow entirely for low-tier actions (execute directly without a PR).
- Policy drift: as new formulas are added, policy.toml must be updated. If a formula is missing, classify.sh should default to "high" (fail safe). Currently the default behavior is unknown; this needs to be hardened.
- Audit trail: low-tier auto-approvals should still leave a record. Auto-approve comment ("auto-approved: low blast radius") satisfies this.
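The fail-safe default for unclassified formulas could look roughly like the sketch below. It assumes policy.toml contains flat `formula = "tier"` lines, which may not match the real file layout, and classify_fail_safe is a hypothetical name rather than the existing classify.sh interface. The point it shows is that a missing file, a missing formula, and an invalid tier value all resolve to high.

```shell
#!/usr/bin/env bash
# Hypothetical fail-safe lookup. Assumes flat lines like: triage = "low"
# The real policy.toml layout may differ.
classify_fail_safe() {
  local policy_file="$1" formula="$2" tier=""
  if [ -r "$policy_file" ]; then
    tier="$(awk -F'=' -v key="$formula" '
      {
        k = $1; v = $2
        gsub(/[ \t"]/, "", k)   # strip spaces, tabs, and quotes from the key
        gsub(/[ \t"]/, "", v)   # and from the value
        if (k == key) { print v; exit }
      }' "$policy_file")"
  fi
  case "$tier" in
    low|medium|high) printf '%s\n' "$tier" ;;
    # Unreadable file, unknown formula, or invalid value: fail safe to high.
    *) printf 'high\n' ;;
  esac
}
```

Validating the value against the three known tiers, rather than echoing whatever the file contains, means a typo in policy.toml cannot silently lower a formula's gate.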
Cost — new infra to maintain
- One new Forgejo account or token (if auto-approver route chosen) — needs rotation policy
- policy.toml maintenance: every new formula must be classified before shipping
- No new services, cron jobs, or containers
Recommendation
Worth it, but the design fork on the auto-approve mechanism must be resolved before implementation begins; that is what the questions step is for.
The cleanest approach is option (c): bypass the PR workflow for low-tier actions entirely.
The dispatcher detects blast_radius=low, executes the formula immediately without creating
a PR, and writes to vault/fired/ directly. This avoids the admin token problem, preserves
the PR workflow for medium/high, and keeps the audit trail in git. However, it changes the
blast_radius=low behavior from "PR exists but auto-merges" to "no PR, just executes" — operators
need to understand the difference.
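A minimal sketch of option (c), under stated assumptions: run_formula is a stand-in for whatever actually executes a formula, fire_direct is a hypothetical name, and the one-line record format written into the fired directory is invented for illustration. Committing the record to git is left out. The sketch refuses anything that is not explicitly low, so medium and high actions would still go through the PR workflow.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the real formula runner.
run_formula() { printf 'ran %s\n' "$1"; }

# Execute a low-tier action directly and leave an audit record.
# Arguments: action id, formula name, tier, directory for fired records.
fire_direct() {
  local action_id="$1" formula="$2" tier="$3" fired_dir="$4"
  if [ "$tier" != "low" ]; then
    printf 'refusing direct execution: tier=%s\n' "$tier" >&2
    return 1
  fi
  mkdir -p "$fired_dir" || return 1
  # Only write the record if the formula itself succeeded.
  run_formula "$formula" || return 1
  printf 'action=%s formula=%s tier=low note="auto-approved: low blast radius" at=%s\n' \
    "$action_id" "$formula" "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
    > "$fired_dir/$action_id.record"
}
```

A call such as `fire_direct a1 triage low vault/fired` would run the formula and write vault/fired/a1.record; the same call with tier high returns non-zero and writes nothing.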
The PR route (option b) is more visible but requires a dedicated account.
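For comparison, the approve-then-merge sequence for option (b) might be sketched as below. The endpoint paths and payloads follow the Gitea-compatible pull request API that Forgejo exposes, as far as I know, and should be checked against the instance's API documentation. FORGE_URL, APPROVER_TOKEN, and DRY_RUN are hypothetical variable names, and the sketch defaults to a dry run that only prints the requests it would send.

```shell
#!/usr/bin/env bash
# FORGE_URL, APPROVER_TOKEN, and DRY_RUN are hypothetical names.
# With DRY_RUN unset or set to 1, requests are printed instead of sent.
forge_post() {
  local path="$1" payload="$2"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf 'POST %s %s\n' "$path" "$payload"
    return 0
  fi
  curl -fsS -X POST \
    -H "Authorization: token ${APPROVER_TOKEN}" \
    -H "Content-Type: application/json" \
    -d "$payload" \
    "${FORGE_URL}/api/v1${path}"
}

# Approve a PR with an audit comment, then merge it.
# Arguments: "owner/repo" and the PR index.
auto_approve_pr() {
  local repo="$1" pr="$2"
  forge_post "/repos/${repo}/pulls/${pr}/reviews" \
    '{"event":"APPROVED","body":"auto-approved: low blast radius"}' &&
  forge_post "/repos/${repo}/pulls/${pr}/merge" '{"Do":"merge"}'
}
```

Running `auto_approve_pr org/ops 12` in dry-run mode prints the two requests in order, which makes the sequence easy to review before a dedicated approver token is wired in.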