From 7afe8bf14575f69097f80d04a372bc8549005baa Mon Sep 17 00:00:00 2001
From: architect-bot
Date: Wed, 8 Apr 2026 10:28:17 +0000
Subject: [PATCH] sprint: add vault-blast-radius-tiers.md

---
 sprints/vault-blast-radius-tiers.md | 87 +++++++++++++++++++++++++++++
 1 file changed, 87 insertions(+)
 create mode 100644 sprints/vault-blast-radius-tiers.md

diff --git a/sprints/vault-blast-radius-tiers.md b/sprints/vault-blast-radius-tiers.md
new file mode 100644
index 0000000..592b217
--- /dev/null
+++ b/sprints/vault-blast-radius-tiers.md
@@ -0,0 +1,87 @@
+# Sprint: Vault blast-radius tiers
+
+## Vision issues
+- #419 — Vault: blast-radius based approval tiers
+
+## What this enables
+After this sprint, vault operations are classified by blast radius — low-risk operations
+(docs, feature-branch edits) flow through without human gating; medium-risk operations
+(CI config, Dockerfile changes) queue for async review; high-risk operations (production
+deploys, secrets rotation, agent self-modification) hard-block as today.
+
+The practical effect: the dev loop no longer stalls waiting for human approval of routine
+operations. Agents can move autonomously through 80%+ of vault requests while preserving
+the safety contract on irreversible operations.
+
+## What exists today
+The vault redesign (#73-#77) is complete and all five issues are closed:
+- lib/vault.sh - idempotent vault PR creation via Forgejo API
+- docker/edge/dispatcher.sh - polls merged vault PRs, verifies admin approval, launches runners
+- vault/vault-env.sh - TOML validation for vault action files
+- vault/SCHEMA.md - vault action TOML schema
+- lib/branch-protection.sh - admin-only merge enforcement on ops repo
+
+Currently every vault request goes through the same hard-block path regardless of risk.
+No classification layer exists. All formulas share the same single approval tier.
+
+Note: prerequisites.md says vault redesign is incomplete - this is stale. All #73-#77
+issues are closed as of the current state.
+
+## Complexity
+Files touched: ~14 (7 new, 7 modified)
+
+New files:
+- vault/classify.sh - classification engine, ~150 lines: path glob matching, secret risk scoring, formula risk lookup
+- vault/policy.toml - human-editable policy rules mapping path patterns to tier assignments
+- vault/formula-risks.toml - formula-to-risk-level mapping (release=high, gardener=low)
+- docs/BLAST-RADIUS.md - documentation
+- 2-3 test helpers
+
+Modified files:
+- vault/SCHEMA.md - optional blast_radius override field
+- vault/vault-env.sh - call classify.sh, validate classification
+- lib/vault.sh - attach computed tier as PR label; skip PR creation for auto-approve tier
+- docker/edge/dispatcher.sh - enforce tier policy: auto-merge low, route medium, hard-block high
+- lib/branch-protection.sh - potentially vary required-approvals by tier
+
+Gluecode vs greenfield: ~60% gluecode, ~40% greenfield (classification engine and policy format).
+
+Estimated sub-issues: 4-5
+
+## Risks
+1. Classification errors on consequential operations. Classification is deterministic
+   (pattern matching, not AI judgment), but a misconfigured policy.toml could auto-approve
+   something that should hard-block. Mitigation: default-deny all unknown patterns; policy
+   changes require human review.
+
+2. Dispatcher complexity. dispatcher.sh is already 1005 lines. Adding three code paths adds
+   ~150 lines. Mitigation: extract classification to classify.sh so dispatcher delegates,
+   not decides.
+
+3. Branch-protection interaction. Auto-approve tier means the dispatcher merges without human
+   approval. branch-protection.sh currently requires 1 approval; the dispatcher must bypass
+   this for auto-approve tier. Requires admin token in vault runner, or branch protection must
+   become tier-aware. This is the primary design fork.
+
+4. Stale prerequisites.md. Should be updated as part of execution.
+
+## Cost - new infra to maintain
+- vault/policy.toml - operators must keep current as new formulas are added. Unknown formulas
+  default to HIGH (safe, forces manual approval).
+- vault/classify.sh - one shell script, shellcheck-covered, no runtime daemon.
+- No new services, cron jobs, or agent roles.
+
+Ongoing cost is low.
+
+## Recommendation
+Worth it. The vault redesign is done; blast-radius tiers are the logical next step to make
+it usable in practice. The bottleneck today forces human approval on every vault action,
+which is the primary reason agents cannot operate continuously. This sprint has clear scope
+(~14 files, 4-5 sub-issues), low new maintenance cost, and directly unblocks autonomous
+operation in the Foundation phase.
+
+The branch-protection and admin-token interaction (Risk 3) is the only design fork worth
+resolving before implementation. Everything else is straightforward.
+
+---
+Reply ACCEPT to proceed with design questions, or REJECT: reason to decline.
\ No newline at end of file