From e11fd85c9a7829b3f8c14f6548b4d35900a18f25 Mon Sep 17 00:00:00 2001
From: architect-bot
Date: Wed, 15 Apr 2026 10:06:02 +0000
Subject: [PATCH] sprint: add process-evolution-lifecycle.md

---
 sprints/process-evolution-lifecycle.md | 62 ++++++++++++++++++++++++++
 1 file changed, 62 insertions(+)
 create mode 100644 sprints/process-evolution-lifecycle.md

diff --git a/sprints/process-evolution-lifecycle.md b/sprints/process-evolution-lifecycle.md
new file mode 100644
index 0000000..fee5a56
--- /dev/null
+++ b/sprints/process-evolution-lifecycle.md
@@ -0,0 +1,62 @@
+# Sprint: process evolution lifecycle
+
+## Vision issues
+- #418 — Process evolution: observe-propose-shadow-promote lifecycle
+
+## What this enables
+
+After this sprint, the factory can **safely mutate its own processes**. Today, agents observe project state (stuck issues, stale evidence) but not process state (how long do reviews take? how often do dev sessions fail on the same pattern? which formula steps are bottlenecks?). Process changes are manual edits to formulas with no testing path.
+
+After this sprint:
+- Agents collect structured process metrics (review latency, failure rates, escalation frequency)
+- The predictor can propose process changes as structured RFCs with evidence
+- Process changes can shadow-run alongside the current process before promotion
+- Humans gate the shadow-to-promote transition via the existing vault/PR approval pattern
+
+The factory becomes self-improving in a controlled, reversible way.
+
+## What exists today
+
+Strong foundations — most of the lifecycle has analogues already:
+
+| Stage | Existing analogue | Gap |
+|-------|------------------|-----|
+| **Observe** | Predictor scans project state; knowledge graph does structural analysis | No process-state metrics (latency, failure rates, throughput) |
+| **Propose** | Prediction-triage workflow (predictor files, planner triages) | Predictions are about project state, not process changes; no RFC format |
+| **Shadow** | Nothing | No infrastructure to run two processes in parallel and compare |
+| **Promote** | Vault PR approval; sprint ACCEPT/REJECT | Not wired to process lifecycle |
+
+Additional existing infrastructure:
+- `.profile/lessons-learned.md` captures per-agent learning (abstract patterns)
+- `ops/knowledge/planner-memory.md` persists planner observations across runs
+- `docs/EVIDENCE-ARCHITECTURE.md` defines sense vs mutation processes
+- Formulas (`formulas/*.toml`) define processes but have no versioning
+
+## Complexity
+
+- **Files touched**: ~6-8 (knowledge graph, predictor formula, planner formula, new process-metrics formula, evidence architecture docs, ops repo RFC directory)
+- **Subsystems**: predictor, planner, knowledge graph, formula-session, evidence pipeline
+- **Estimated sub-issues**: 6-8
+- **Gluecode vs greenfield**: ~60% gluecode (extending prediction-triage, adding graph nodes, wiring evidence collection) / ~40% greenfield (process metrics collector, RFC format, shadow-run comparator)
+
+## Risks
+
+1. **Compute cost**: Shadow-running doubles resource usage during shadow periods. Needs a time-bound or cycle-bound cap.
+2. **Wrong metrics**: Process metrics must be carefully chosen — optimizing for speed could sacrifice quality. The predictor's existing "evidence strength" checks provide a model.
+3. **Scope creep**: "Process evolution" could expand endlessly. This sprint should deliver the pipeline (observe, propose, shadow, promote) for ONE process as proof-of-concept, not all processes at once.
+4. **Over-engineering risk**: The factory has ~10 agents, not 1000 microservices. The mechanism should be proportional to the system's complexity. A lightweight RFC-in-ops-repo approach is better than a framework.
+
+## Cost — new infra to maintain
+
+- **Process metrics formula** (`formulas/collect-process-metrics.toml`): new formula, runs on predictor/planner schedule. Collects from git log, CI API, and issue timeline.
+- **RFC directory** (`ops/process-rfcs/`): new directory in ops repo. Low maintenance — just markdown files.
+- **Shadow-run comparator**: new step in formula-session.sh that can fork a formula step between current and candidate implementations. Needs cleanup logic for shadow artifacts.
+- **No new services or containers** — this extends existing agent capabilities, doesn't add new ones.
+
+## Recommendation
+
+**Worth it — but scope tightly to one proof-of-concept process.**
+
+The prediction-triage workflow already implements observe-to-propose. Extending it to include shadow-to-promote is a natural evolution, not a leap. The key risk is scope creep — this sprint should deliver the pipeline for ONE process mutation (e.g., "skip review for docs-only PRs" or "auto-close stale predictions after 7 days") and prove the lifecycle works end-to-end.
+
+Defer: building a generic process evolution framework. The first sprint proves the pattern; generalization comes later if the pattern holds.
-- 
2.49.1
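
The shadow-run comparator listed under "Cost", together with the cycle-bound cap from risk 1, could look roughly like the Python sketch below. This is only an illustration under stated assumptions: the patch plans the comparator as a step in `formula-session.sh`, and every name here (`shadow_run`, `ShadowReport`, `max_cycles`, the two `needs_review` functions) is hypothetical. The example candidate is the "skip review for docs-only PRs" mutation mentioned in the recommendation.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Iterable, List


@dataclass
class ShadowReport:
    """Outcome of a capped shadow run (hypothetical structure)."""
    cycles: int = 0                # cycles in which the candidate was shadowed
    matches: int = 0               # shadowed cycles where both steps agreed
    candidate_errors: int = 0      # shadowed cycles where the candidate raised
    live_outputs: List[Any] = field(default_factory=list)

    @property
    def agreement(self) -> float:
        # Fraction of shadowed cycles where the candidate matched the current step.
        return self.matches / self.cycles if self.cycles else 0.0


def shadow_run(
    current: Callable[[Any], Any],
    candidate: Callable[[Any], Any],
    inputs: Iterable[Any],
    max_cycles: int,
) -> ShadowReport:
    """Run `candidate` alongside `current` for at most `max_cycles` inputs.

    Only the output of `current` is ever acted on; the candidate's output is
    compared and then discarded, so a faulty candidate cannot change behaviour.
    """
    report = ShadowReport()
    for item in inputs:
        live = current(item)
        report.live_outputs.append(live)
        if report.cycles >= max_cycles:
            continue               # cap reached: stop paying for the shadow
        report.cycles += 1
        try:
            shadow = candidate(item)
        except Exception:
            report.candidate_errors += 1
            continue
        if shadow == live:
            report.matches += 1
    return report


# Hypothetical process step: does a PR (given its changed files) need review?
def current_needs_review(files: List[str]) -> bool:
    return True                    # assumed current process: always review


def candidate_needs_review(files: List[str]) -> bool:
    # Assumed candidate process: skip review when every file is Markdown.
    return not all(name.endswith(".md") for name in files)


prs = [["README.md"], ["src/main.py"], ["docs/a.md", "lib.py"], ["NOTES.md"]]
report = shadow_run(current_needs_review, candidate_needs_review, prs, max_cycles=3)
```

With the cap at three cycles, the fourth PR is handled by the current step alone. The `agreement` value and the list of disagreeing cycles are the kind of evidence a human could consult at the shadow-to-promote gate.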