> **Status:** Active design principle. All agents, reviewers, and planners should follow this.
## The Determinism / Judgment Split
Every agent has two kinds of work. The architecture should separate them cleanly.
### Deterministic (bash orchestrator)
Mechanical operations that always work the same way. These belong in bash scripts:
- Create and destroy tmux sessions
- Create and destroy git worktrees
- Watch phase files (the event loop)
- Manage lock files and concurrency guards
- Set up and tear down the environment
- Manage the session lifecycle (start, monitor, kill)

**Properties:** No judgment required. Behavior never varies with interpretation: the same inputs produce the same result. Easy to test. Hard to break.
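As a rough illustration of the deterministic side, the sketch below shows a lock guard, session start and stop, and one step of a phase-file poll in plain bash. The function names, the lock-directory convention, and the single-word phase file format are assumptions made for this example, not the project's actual scripts.

```bash
#!/usr/bin/env bash
# Hypothetical orchestrator helpers. Names and file layout are illustrative only.

# Take a lock by creating a directory. mkdir fails if the directory already
# exists, so two orchestrators cannot both acquire the same lock.
acquire_lock() {
  mkdir "$1" 2>/dev/null
}

# Release a lock taken by acquire_lock.
release_lock() {
  rmdir "$1" 2>/dev/null
}

# Create a worktree on a new branch and a detached tmux session rooted in it.
start_session() {
  local name="$1" repo="$2" worktree="$3"
  git -C "$repo" worktree add -b "$name" "$worktree" &&
    tmux new-session -d -s "$name" -c "$worktree"
}

# Kill the tmux session (if it is still running) and remove the worktree.
stop_session() {
  local name="$1" repo="$2" worktree="$3"
  tmux kill-session -t "$name" 2>/dev/null
  git -C "$repo" worktree remove --force "$worktree"
}

# Print the current phase, or "none" if the phase file does not exist yet.
# Assumes the file holds a single word such as "implementing" or "done".
read_phase() {
  local phase_file="$1"
  if [ -f "$phase_file" ]; then
    tr -d '[:space:]' < "$phase_file"
  else
    printf 'none'
  fi
}

# One step of the event loop: succeed only when the phase differs from the
# last phase the caller saw. The caller sleeps and calls this again.
phase_changed() {
  local phase_file="$1" last_seen="$2"
  [ "$(read_phase "$phase_file")" != "$last_seen" ]
}
```

A loop such as `until phase_changed "$file" "$last"; do sleep 5; done` could then drive the lifecycle. Nothing here needs to interpret what the agent is doing, which is what keeps it easy to test.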
### Judgment (Claude via formula)
Operations that require understanding context, making decisions, or adapting to novel situations. These belong in the formula — the prompt Claude executes inside the tmux session:
- Read and understand the task (fetch issue body + comments, parse intent)
- Assess dependencies ("does the code this depends on actually exist?")
- Implement the solution
- Create PR with meaningful title and description
- Read review feedback, decide what to address vs push back on
- Handle CI failures (read logs, decide: fix, retry, or escalate)
- Choose rebase strategy (rebase, merge, or start over)
- Decide when to refuse a task rather than implement it

**Properties:** Benefits from context. Improves when the formula is refined. Adapts to novel situations without new bash code.
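To make the division concrete, here is a minimal sketch of how an orchestrator might prepare a formula: bash fills in a task reference and leaves every decision to the prompt text. The template wording, the `@ISSUE@` placeholder, and the `render_formula` name are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical formula template. The judgment steps live in prose, not in bash.
FORMULA_TEMPLATE='You are working on issue @ISSUE@.
1. Read the issue body and comments and state the intent in one sentence.
2. Check that the code this task depends on exists. If not, refuse and explain why.
3. Implement the solution and open a PR with a meaningful title and description.
4. If CI fails, read the logs and decide whether to fix, retry, or escalate.'

# Substitute the issue reference into the template and print the result.
# Bash does no interpretation here; it only fills in a value.
render_formula() {
  local issue="$1"
  printf '%s\n' "${FORMULA_TEMPLATE//@ISSUE@/$issue}"
}
```

The rendered text would then be handed to Claude inside the tmux session. How that hand-off happens depends on the CLI in use and is left out of this sketch.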
## Why This Matters
### Today's problem
Agent scripts grow by accretion. Every new lesson becomes another `if/elif/else` in bash:
- "CI failed with this pattern → retry with this flag"
- "Review comment mentions X → rebase before addressing"
- "Merge conflict in this file → apply this strategy"

This makes agents brittle, hard to modify, and nearly impossible to generalize across projects, because each branch hard-codes one project's situation.
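The accretion pattern tends to look something like the sketch below. The log patterns and action names are invented for illustration; what matters is the shape, where each lesson adds a branch that someone must maintain.

```bash
#!/usr/bin/env bash
# Illustrative anti-pattern: judgment encoded as string matching in bash.
# Patterns and actions are hypothetical, not taken from a real agent script.
decide_ci_action() {
  local log="$1"
  if [[ "$log" == *"ETIMEDOUT"* ]]; then
    echo "retry"
  elif [[ "$log" == *"merge conflict"* ]]; then
    echo "rebase"
  elif [[ "$log" == *"out of memory"* ]]; then
    echo "retry-with-more-memory"
  else
    # Anything the author did not anticipate falls through here.
    echo "escalate"
  fi
}
```

A genuine test failure and an unfamiliar infrastructure error both land in the `else` branch, because the script can only match strings it has seen before.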
### The alternative
A thin bash orchestrator handles session lifecycle. Everything that requires judgment lives in the formula — a structured prompt that Claude interprets. Learnings become formula refinements, not bash patches.
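One way to picture "formula refinements, not bash patches" is sketched below: a lesson is appended to the formula as guidance text, and the orchestrator forwards the raw CI log without classifying it. The file layout and the function names `add_lesson` and `ci_failure_prompt` are assumptions for this example only.

```bash
#!/usr/bin/env bash
# Hypothetical helpers; the formula is assumed to be a plain text file.

# Record a lesson as one more line of guidance in the formula file.
add_lesson() {
  local formula_file="$1" lesson="$2"
  printf -- '- %s\n' "$lesson" >> "$formula_file"
}

# Build the prompt for a CI failure: the formula text followed by the raw log.
# Bash does not inspect the log; deciding what it means is left to Claude.
ci_failure_prompt() {
  local formula_file="$1" log_file="$2"
  cat "$formula_file"
  printf '\nCI log:\n'
  cat "$log_file"
}
```

Under this arrangement a new lesson changes only the formula file, and the bash code stays the same size.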

| Agent | Lines | Bash logic | Assessment |
|---|---|---|---|
| supervisor | 877 | Heavy: multi-project health checks, CI stall detection, container monitoring | Partially justified (monitoring is deterministic, but escalation decisions are judgment) |
| gardener | 1242 (agent 471 + poll 771) | Medium: backlog triage, duplicate detection, tech-debt scoring | Poll is heavy orchestration; agent is prompt-driven |
| vault | 442 (4 scripts) | Medium: approval flow, human gate decisions | Intentionally bash-heavy (security gate should be deterministic) |