sprint: add design forks and proposed sub-issues

2026-04-08 18:48:21 +00:00 · 708901e4e2
commit 708901e4e2
parent 7afe8bf145
1 changed file with 140 additions and 50 deletions
--- a/sprints/vault-blast-radius-tiers.md
+++ b/sprints/vault-blast-radius-tiers.md
@@ -24,64 +24,154 @@ The vault redesign (#73-#77) is complete and all five issues are closed:
 Currently every vault request goes through the same hard-block path regardless of risk.
 No classification layer exists. All formulas share the same single approval tier.

-Note: prerequisites.md says vault redesign is incomplete - this is stale. All #73-#77
-issues are closed as of the current state.
-
 ## Complexity
 Files touched: ~14 (7 new, 7 modified)
-
-New files:
- vault/classify.sh - classification engine, ~150 lines: path glob matching, secret risk scoring, formula risk lookup
- vault/policy.toml - human-editable policy rules mapping path patterns to tier assignments
- vault/formula-risks.toml - formula-to-risk-level mapping (release=high, gardener=low)
- docs/BLAST-RADIUS.md - documentation
- 2-3 test helpers
-
-Modified files:
- vault/SCHEMA.md - optional blast_radius override field
- vault/vault-env.sh - call classify.sh, validate classification
- lib/vault.sh - attach computed tier as PR label; skip PR creation for auto-approve tier
- docker/edge/dispatcher.sh - enforce tier policy: auto-merge low, route medium, hard-block high
- lib/branch-protection.sh - potentially vary required-approvals by tier
-
-Gluecode vs greenfield: ~60% gluecode, ~40% greenfield (classification engine and policy format).
-
-Estimated sub-issues: 4-5
+Gluecode vs greenfield: ~60% gluecode, ~40% greenfield.
+Estimated sub-issues: 4-7 depending on fork choices.

 ## Risks
-1. Classification errors on consequential operations. Classification is deterministic
-   (pattern matching, not AI judgment), but a misconfigured policy.toml could auto-approve
-   something that should hard-block. Mitigation: default-deny all unknown patterns; policy
-   changes require human review.
-
-2. Dispatcher complexity. dispatcher.sh is already 1005 lines. Adding three code paths adds
-   ~150 lines. Mitigation: extract classification to classify.sh so dispatcher delegates,
-   not decides.
-
-3. Branch-protection interaction. Auto-approve tier means the dispatcher merges without human
-   approval. branch-protection.sh currently requires 1 approval; the dispatcher must bypass
-   this for auto-approve tier. Requires admin token in vault runner, or branch protection must
-   become tier-aware. This is the primary design fork.
-
-4. Stale prerequisites.md. Should be updated as part of execution.
+1. Classification errors on consequential operations. Mitigation: default-deny, so any unknown formula is classified `high`.
+2. Dispatcher complexity. Mitigation: extract classification into classify.sh so the dispatcher delegates rather than decides.
+3. Branch-protection interaction (primary design fork, see below).
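
The "dispatcher delegates" idea in risk 2 can be sketched as a single fail-closed `case` on the tier that classify.sh prints. This is a minimal sketch, not the planned dispatcher code: `dispatch_by_tier` and the three action names are hypothetical, chosen only to mirror "auto-merge low, route medium, hard-block high".

```shell
#!/usr/bin/env bash
# Sketch only: map a computed tier to the dispatcher's action.
# ASSUMPTION: dispatch_by_tier and the printed action names are hypothetical;
# the real dispatcher.sh would call its own merge/route/block routines here.
dispatch_by_tier() {
  case "$1" in
    low)    echo "auto-merge" ;;
    medium) echo "route" ;;
    *)      echo "hard-block" ;;   # high, empty, or unrecognized: fail closed
  esac
}
```

Because the fallback branch is the hard block, a classifier that crashes or prints garbage cannot produce an auto-merge.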

 ## Cost - new infra to maintain
- vault/policy.toml - operators must keep current as new formulas are added. Unknown formulas
-  default to HIGH (safe, forces manual approval).
- vault/classify.sh - one shell script, shellcheck-covered, no runtime daemon.
+- vault/policy.toml or per-formula blast_radius fields (depends on Fork 2): operators must update them whenever a formula is added.
+- vault/classify.sh — one shell script, shellcheck-covered, no runtime daemon.
 - No new services, cron jobs, or agent roles.

-Ongoing cost is low.
-
 ## Recommendation
-Worth it. The vault redesign is done; blast-radius tiers are the logical next step to make
-it usable in practice. The bottleneck today forces human approval on every vault action,
-which is the primary reason agents cannot operate continuously. This sprint has clear scope
-(~14 files, 4-5 sub-issues), low new maintenance cost, and directly unblocks autonomous
-operation in the Foundation phase.
-
-The branch-protection and admin-token interaction (Risk 3) is the only design fork worth
-resolving before implementation. Everything else is straightforward.
+Worth it. The vault redesign is done, and blast-radius tiers are the natural next step. The primary reason
+agents cannot operate continuously is that every vault action blocks on human availability.

 ---
-Reply ACCEPT to proceed with design questions, or REJECT: reason to decline.
+
+## Design forks
+
+Three decisions must be made before implementation begins.
+
+### Fork 1 (Critical): Auto-approve merge mechanism
+
+Branch protection on the ops repo is configured with `required_approvals: 1` and `admin_enforced: true`.
+For low-tier vault PRs, the dispatcher must merge without human approval.
+
+**A. Skip PR entirely for low-tier**
+vault-bot commits directly to `vault/actions/` on main using the admin token. No PR is created.
+The dispatcher detects a new TOML file by the absence of a `.result.json` for it.
+- Simplest dispatcher code
+- No PR audit trail for low-tier executions
+- `FORGE_ADMIN_TOKEN` already exists in vault env (used by `is_user_admin()`)
+
+**B. Dispatcher self-approves low-tier PRs**
+vault-bot creates PR as today, then immediately posts an APPROVED review using its own token,
+then merges. vault-bot needs Forgejo admin role so `admin_enforced: true` does not block it.
+- Full PR audit trail for all tiers
+- Requires granting vault-bot admin role on Forgejo
+
+**C. Tier-aware branch protection**
+Create a separate Forgejo protection rule for `vault/*` branch pattern with `required_approvals: 0`.
+Main branch protection stays unchanged. vault-bot merges low-tier PRs directly.
+- No new accounts or elevated role for vault-bot
+- Protection rules are in Forgejo admin UI, not code (harder to version)
+- Forgejo `vault/*` glob support needs verification
+
+**D. Dedicated auto-approve bot**
+Create a `vault-auto-bot` Forgejo account with admin role that auto-approves low-tier PRs.
+Cleanest trust separation; most operational overhead.
+
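
Option A's detection rule (a TOML with no `.result.json`) can be sketched as a directory scan. This is a rough sketch under one loud assumption: the result file sits next to the action and is named `<name>.result.json`. The source does not state the naming scheme, and `list_pending_actions` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Sketch only: print the vault action TOMLs that have no result file yet.
# ASSUMPTION: results are written beside the action as <name>.result.json;
# the sprint text only says detection is "by absence of .result.json".
list_pending_actions() {
  local actions_dir="$1" toml
  for toml in "$actions_dir"/*.toml; do
    [ -e "$toml" ] || continue    # glob matched nothing, skip the literal pattern
    [ -e "${toml%.toml}.result.json" ] || printf '%s\n' "$(basename "$toml")"
  done
}
```

A call such as `list_pending_actions vault/actions` would then yield one pending file name per line for the dispatcher to process.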
+---
+
+### Fork 2 (Secondary): Policy storage format
+
+Where does the formula → tier mapping live?
+
+**A. `vault/policy.toml` in disinto repo**
+Flat TOML: `formula = "tier"`. classify.sh reads it at runtime.
+Unknown formulas default to `high`. Changing policy requires a disinto PR.
+
+**B. `blast_radius` field in each `formulas/*.toml`**
+Add `blast_radius = "low"|"medium"|"high"` to each formula TOML.
+classify.sh reads the target formula TOML for its tier.
+Co-located with formula — impossible to add a formula without declaring its risk.
+
+**C. `vault/policy.toml` in ops repo**
+Same format as A but lives in the ops repo. Operators update without a disinto PR.
+Useful for per-deployment overrides.
+
+**D. Hybrid: formula TOML default + ops override**
+Formula TOML carries a default tier. Ops `vault/policy.toml` can override per-deployment.
+Most flexible; classify.sh must merge two sources.
+
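
The merge order in option D (ops override first, then the formula default, then default-deny) can be sketched as below. This is a sketch, not the planned classify.sh: `lookup_tier` and `classify_formula` are hypothetical names, and the line scanner handles only flat `key = "value"` lines rather than full TOML (it ignores tables and quoted keys). With an empty ops policy the same code behaves like option B.

```shell
#!/usr/bin/env bash
# lookup_tier FILE KEY: print the quoted value of a flat `KEY = "value"` line,
# or nothing when the file or the key is missing.
# ASSUMPTION: flat files only; this is not a real TOML parser.
lookup_tier() {
  local file="$1" key="$2"
  [ -f "$file" ] || return 0
  awk -v key="$key" '
    /^[[:space:]]*#/ { next }                 # skip comment lines
    {
      eq = index($0, "=")
      if (eq == 0) next
      k = substr($0, 1, eq - 1)
      gsub(/[[:space:]]/, "", k)              # trim whitespace around the key
      if (k != key) next
      v = substr($0, eq + 1)
      if (match(v, /"[^"]*"/)) { print substr(v, RSTART + 1, RLENGTH - 2); exit }
    }
  ' "$file"
}

# classify_formula FORMULA FORMULA_TOML OPS_POLICY: option D merge order.
classify_formula() {
  local formula="$1" formula_toml="$2" ops_policy="$3" tier
  tier=$(lookup_tier "$ops_policy" "$formula")                         # per-deployment override
  [ -n "$tier" ] || tier=$(lookup_tier "$formula_toml" blast_radius)   # formula default
  case "$tier" in
    low|medium|high) printf '%s\n' "$tier" ;;
    *) printf 'high\n' ;;                     # default-deny: unknown or invalid tier
  esac
}
```

Validating the tier after the merge means a typo in either source degrades to `high` instead of silently auto-approving.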
+---
+
+### Fork 3 (Secondary): Medium-tier dev-loop behavior
+
+When dev-agent creates a vault PR for a medium-tier action, what does it do while waiting?
+
+**A. Non-blocking: fire and continue immediately**
+Agent creates vault PR and moves to next issue without waiting.
+Maximum autonomy; sequencing becomes unpredictable.
+
+**B. Soft-block with 2-hour timeout**
+Agent waits up to 2 hours polling for vault PR merge. If no response, continues.
+Balances oversight with velocity.
+
+**C. Status-quo block (medium = high)**
+Medium-tier blocks the agent loop like high-tier today. Only low-tier actions unblocked.
+Simplest behavior change — no modification to dev-agent flow needed.
+
+**D. Label-based approval signal**
+Agent polls for a `vault-approved` label on the vault PR instead of waiting for merge.
+Decouples "approved to continue" from "PR merged and executed."
+
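
Options B and D differ only in what is polled (merge state versus a `vault-approved` label), so both can share one wait loop with a pluggable check. This is a minimal sketch assuming a hypothetical `wait_for_vault_pr` helper; the check command is supplied by the caller and is not defined here.

```shell
#!/usr/bin/env bash
# wait_for_vault_pr TIMEOUT_SECS INTERVAL_SECS CHECK_CMD [ARGS...]
# Runs CHECK_CMD until it succeeds or TIMEOUT_SECS have been slept through.
# Returns 0 when the check succeeded, 1 on timeout (the agent then continues).
# ASSUMPTION: the helper name is hypothetical; INTERVAL_SECS must be > 0
# whenever TIMEOUT_SECS > 0, otherwise the loop never reaches the timeout.
wait_for_vault_pr() {
  local timeout="$1" interval="$2" waited=0
  shift 2
  while :; do
    if "$@"; then return 0; fi
    [ "$waited" -lt "$timeout" ] || return 1
    sleep "$interval"
    waited=$((waited + interval))
  done
}
```

For the 2-hour soft block with a 15-minute poll, the call would look like `wait_for_vault_pr 7200 900 pr_is_merged "$pr_number"`, where `pr_is_merged` stands in for whatever merge or label check the chosen option needs.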
+---
+
+## Proposed sub-issues
+
+### Core (always filed regardless of fork choices)
+
+**Sub-issue 1: vault/classify.sh — classification engine**
+Implement `vault/classify.sh`: reads formula name, secrets, optional `blast_radius` override,
+applies policy rules, outputs tier (`low|medium|high`). Default-deny: unknown → `high`.
+Files: `vault/classify.sh` (new), `vault/vault-env.sh` (call classify)
+
+**Sub-issue 2: docs/BLAST-RADIUS.md and SCHEMA.md update**
+Write `docs/BLAST-RADIUS.md`. Add optional `blast_radius` field to `vault/SCHEMA.md`
+and validator.
+Files: `docs/BLAST-RADIUS.md` (new), `vault/SCHEMA.md`, `vault/vault-env.sh`
+
+**Sub-issue 3: Update prerequisites.md**
+Mark vault redesign (#73-#77) as DONE; the current entry is stale. Add blast-radius tiers to the tree.
+Files: `disinto-ops/prerequisites.md`
+
+### Fork 1 variants (pick one)
+
+**1A** — Modify `lib/vault.sh` to skip PR for low-tier, commit directly to main.
+Modify `dispatcher.sh` to skip `verify_admin_merged()` for low-tier TOMLs.
+
+**1B** — Modify `dispatcher.sh` to post APPROVED review + merge for low-tier.
+Grant vault-bot admin role in Forgejo setup scripts.
+
+**1C** — Add `setup_vault_branch_protection_tiered()` to `lib/branch-protection.sh`
+with `required_approvals: 0` for `vault/*` pattern (verify Forgejo glob support first).
+
+**1D** — Add `vault-auto-bot` account to `forge-setup.sh`. Implement approval watcher.
+
+### Fork 2 variants (pick one)
+
+**2A** — Create `vault/policy.toml` in disinto repo. classify.sh reads it.
+
+**2B** — Add `blast_radius` field to all 15 `formulas/*.toml`. classify.sh reads formula TOML.
+
+**2C** — Create `disinto-ops/vault/policy.toml`. classify.sh reads ops copy at runtime.
+
+**2D** — Two-pass classify.sh: formula TOML default, ops policy override.
+
+### Fork 3 variants (pick one)
+
+**3A** — Non-blocking: `lib/vault.sh` returns immediately after PR creation for all tiers.
+
+**3B** — Soft-block: poll medium-tier PR every 15 min for up to 2 hours.
+
+**3C** — No change: medium-tier behavior unchanged (only low-tier unblocked).
+
+**3D** — Create `vault-approved` label. Modify `lib/vault.sh` medium path to poll label.