2026-04-09 08:33:51 +00:00
# Sprint: vault blast-radius tiers

2026-04-08 10:28:17 +00:00

## Vision issues
- #419 — Vault: blast-radius based approval tiers
## What this enables

After this sprint, low-tier vault actions execute without waiting for a human. The dispatcher
auto-approves and merges vault PRs classified as `low` in `policy.toml`. Medium and high tiers
are unchanged: medium notifies and allows async review; high blocks until admin approves.

This removes the bottleneck on low-risk bookkeeping operations while preserving the hard gate
on production deploys, secret operations, and agent self-modification.
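The three-tier behavior described above can be sketched as a single `case` statement. This is only an illustration: `tier_action` and the action labels it prints are hypothetical names, not identifiers from the repo. Treating an unrecognized tier like `high` is an assumption made here to keep the sketch fail-safe.

```shell
#!/usr/bin/env bash
# Sketch only: tier_action and its output labels are hypothetical.

# Map a blast-radius tier to the action the dispatcher would take.
tier_action() {
  case "$1" in
    low)    echo "auto-approve-and-merge" ;;
    medium) echo "notify-async-review" ;;
    high)   echo "block-until-admin-approval" ;;
    # Assumption: anything unrecognized is treated like high (fail safe).
    *)      echo "block-until-admin-approval" ;;
  esac
}
```

A caller would branch on the printed label, for example `action="$(tier_action "$tier")"`.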
## What exists today

The tier infrastructure is fully built. Only the enforcement is missing.

- `vault/policy.toml`: maps every formula to low/medium/high. Current low tier: groom-backlog,
  triage, reproduce, review-pr. Medium: dev, run-planner, run-gardener, run-predictor,
  run-supervisor, run-architect, upgrade-dependency. High: run-publish-site, run-rent-a-human,
  add-rpc-method, release.
- `vault/classify.sh`: shell classifier called by `vault-env.sh`. Returns the tier for a given formula.
- `vault/SCHEMA.md`: documents the `blast_radius` override field (string: "low"/"medium"/"high")
  that vault action TOMLs can use to override policy defaults.
- `vault/validate.sh`: validates vault action TOML fields, including `blast_radius`.
- `docker/edge/dispatcher.sh`: edge dispatcher. Polls the ops repo for merged vault PRs and executes
  them. Currently fires ALL merged vault PRs without tier differentiation.

2026-04-08 18:48:21 +00:00

What's missing: the dispatcher does not read `blast_radius`, does not auto-approve low-tier PRs,
and does not differentiate notification behavior for medium vs high tier.
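To make the missing "read `blast_radius`" step concrete, here is a minimal sketch of resolving the effective tier: the per-action override wins, otherwise the policy tier applies. It assumes the override appears as a top-level line such as `blast_radius = "low"` in the action TOML. The function names `read_blast_radius` and `effective_tier` are hypothetical, and a real TOML parser would be more robust than `sed`.

```shell
#!/usr/bin/env bash
# Sketch only: assumes a top-level line like   blast_radius = "low"
# in the vault action TOML. Function names are hypothetical.

# Print the blast_radius override from an action file, or nothing if absent.
read_blast_radius() {
  sed -nE 's/^[[:space:]]*blast_radius[[:space:]]*=[[:space:]]*"(low|medium|high)".*$/\1/p' "$1" \
    | head -n 1
}

# Print the override if present, otherwise the policy tier passed as $2
# (for example, the value returned by the existing classifier).
effective_tier() {
  local action_file="$1" policy_tier="$2" override
  override="$(read_blast_radius "$action_file")"
  echo "${override:-$policy_tier}"
}
```

Values other than the three documented strings are ignored by the pattern, so a malformed override falls back to the policy tier rather than introducing a new one.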
## Complexity

Files touched: 3

- `docker/edge/dispatcher.sh`: read `blast_radius` from the vault action TOML; for low tier, call
  the Forgejo API to approve + merge the PR directly (admin token); for medium, post a "pending async
  review" comment; for high, leave pending (existing behavior)
- `lib/vault.sh` `vault_request()`: include `blast_radius` in the PR body so the dispatcher
  can read it without re-parsing the TOML
- `docs/VAULT.md`: document the three-tier behavior for operators

Sub-issues: 3

Gluecode ratio: ~70% gluecode (dispatcher reads existing `classify.sh` output), ~30% new (auto-approve API call, comment logic)
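The `vault_request()` change in the file list could look roughly like the sketch below: one side writes a marker line into the PR body, the other reads it back. The `blast_radius: <tier>` marker format and both function names are assumptions for illustration. Falling back to `high` when the marker is absent is also an assumption, chosen so that a malformed PR body cannot lower the tier.

```shell
#!/usr/bin/env bash
# Sketch only: the marker format and function names are hypothetical.

# Build a PR body that carries the tier as a machine-readable line.
render_pr_body() {
  local action_id="$1" tier="$2"
  printf 'Vault action: %s\nblast_radius: %s\n' "$action_id" "$tier"
}

# Read a PR body on stdin and print its tier.
# Assumption: a missing or invalid marker is treated as high (fail safe).
tier_from_pr_body() {
  local tier
  tier="$(tr -d '\r' \
    | sed -nE 's/^blast_radius:[[:space:]]*(low|medium|high)[[:space:]]*$/\1/p' \
    | head -n 1)"
  echo "${tier:-high}"
}
```

For example, `render_pr_body groom-42 low | tier_from_pr_body` prints `low`.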
## Risks

- Admin token for auto-approve: the dispatcher needs an admin-level Forgejo token to approve
  and merge PRs. Currently `FORGE_TOKEN` is used; branch protection has `admin_enforced: true`,
  which means even admin bots are blocked from bypassing the approval gate. This is the core
  design fork, with three options: (a) relax `admin_enforced` for low-tier PRs, (b) use a separate
  Forgejo "auto-approver" account with admin rights, or (c) bypass the PR workflow entirely
  for low-tier actions (execute directly without a PR).
- Policy drift: as new formulas are added, `policy.toml` must be updated. If a formula is missing,
  `classify.sh` should default to "high" (fail safe). The current default behavior is unknown,
  so it needs to be checked and hardened.
- Audit trail: low-tier auto-approvals should still leave a record. An auto-approve comment
  ("auto-approved: low blast radius") satisfies this.
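The fail-safe default from the policy drift item can be sketched as follows. This is not the existing `classify.sh`: it assumes, purely for illustration, that `policy.toml` holds flat `formula = "tier"` lines, and `classify_formula` is a hypothetical name. Anything it cannot resolve (unknown formula, unreadable file, unexpected value) comes back as `high`.

```shell
#!/usr/bin/env bash
# Sketch only: assumes flat lines such as   triage = "low"   in the policy file.
# The real policy.toml layout may differ; classify_formula is hypothetical.

classify_formula() {
  local formula="$1" policy_file="$2" tier
  tier="$(awk -F'=' -v key="$formula" '
    {
      k = $1; v = $2
      gsub(/[ \t"]/, "", k)   # strip blanks and quotes around the key
      gsub(/[ \t"]/, "", v)   # strip blanks and quotes around the value
    }
    k == key && (v == "low" || v == "medium" || v == "high") { print v; exit }
  ' "$policy_file" 2>/dev/null || true)"
  # Fail safe: no match, bad value, or unreadable file all resolve to high.
  echo "${tier:-high}"
}
```

Defaulting in the classifier rather than in each caller keeps the fail-safe in one place, so a newly added formula is gated until someone classifies it.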
## Cost — new infra to maintain

- One new Forgejo account or token (if the auto-approver route is chosen), which needs a rotation policy
- `policy.toml` maintenance: every new formula must be classified before shipping
- No new services, cron jobs, or containers
## Recommendation

Worth it, but the design fork on the auto-approve mechanism must be resolved before implementation
begins; this is the questions step.

The cleanest approach is option (c): bypass the PR workflow for low-tier actions entirely.
The dispatcher detects `blast_radius=low`, executes the formula immediately without creating
a PR, and writes to `vault/fired/` directly. This avoids the admin token problem, preserves
the PR workflow for medium/high, and keeps the audit trail in git. However, it changes the
`blast_radius=low` behavior from "PR exists but auto-merges" to "no PR, just executes", so operators
need to understand the difference.
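A rough sketch of option (c), under several assumptions: `run_formula` is a stand-in for however the dispatcher actually executes a formula, and `fire_low_tier`, the `.record` file name, and its key=value layout are hypothetical. Committing the record to git, which the audit trail depends on, is left out.

```shell
#!/usr/bin/env bash
# Sketch only: every name and the record format here are hypothetical.

# Stand-in for the dispatcher's real execution step.
run_formula() {
  echo "running formula: $1"
}

# Execute a low-tier action without a PR and leave a record under the fired dir.
fire_low_tier() {
  local action_id="$1" formula="$2" fired_dir="${3:-vault/fired}"
  mkdir -p "$fired_dir" || return 1
  run_formula "$formula" || return 1
  {
    printf 'action=%s\n' "$action_id"
    printf 'formula=%s\n' "$formula"
    printf 'blast_radius=low\n'
    printf 'note=executed without PR: low blast radius\n'
    printf 'fired_at=%s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
  } > "$fired_dir/$action_id.record"
}
```

The record is written only after the formula succeeds, so a failed run leaves nothing that looks like a completed action.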
The PR route (option b) is more visible but requires a dedicated account.
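For comparison, the PR route might look like the sketch below. It uses the Gitea-compatible review and merge endpoints as I understand them (`POST .../pulls/{index}/reviews` with `event: APPROVED`, then `POST .../pulls/{index}/merge` with `Do: merge`); these should be checked against the API reference for the Forgejo version in use. `FORGE_URL`, `FORGE_REPO`, and `AUTO_APPROVER_TOKEN` are hypothetical variable names, and `CURL` is an override hook added here so the calls can be dry-run.

```shell
#!/usr/bin/env bash
# Sketch only: FORGE_URL, FORGE_REPO, AUTO_APPROVER_TOKEN, and CURL are
# hypothetical names. Verify the endpoints against your Forgejo API docs.

# POST a JSON payload to a path under the repo's API root.
forge_post() {
  local path="$1" json="$2"
  "${CURL:-curl}" -fsS -X POST \
    -H "Authorization: token ${AUTO_APPROVER_TOKEN}" \
    -H "Content-Type: application/json" \
    -d "$json" \
    "${FORGE_URL}/api/v1/repos/${FORGE_REPO}${path}"
}

# Approve a PR with an audit comment, then merge it. Stops if approval fails.
auto_approve_and_merge() {
  local pr="$1"
  forge_post "/pulls/${pr}/reviews" \
    '{"event":"APPROVED","body":"auto-approved: low blast radius"}' || return 1
  forge_post "/pulls/${pr}/merge" '{"Do":"merge"}'
}
```

The review body doubles as the audit comment from the Risks section, so the record lives on the PR itself.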