From 0a5b54ff4f1c6146186090ccc8749e799370ad90 Mon Sep 17 00:00:00 2001
From: Claude <noreply@anthropic.com>
Date: Fri, 10 Apr 2026 09:03:52 +0000
Subject: [PATCH] =?UTF-8?q?fix:=20tech-debt:=20rewrite=20AD-002=20?=
 =?UTF-8?q?=E2=80=94=20concurrency=20is=20bounded=20per=20LLM=20backend,?=
 =?UTF-8?q?=20not=20per=20project=20(#550)?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 AGENTS.md                   | 5 +++--
 gardener/best-practices.md  | 1 +
 knowledge/dev-agent.md      | 2 +-
 site/docs/architecture.html | 2 +-
 4 files changed, 6 insertions(+), 4 deletions(-)
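
Note on the serialization that the rewritten AD-002 describes: a minimal sketch of two agents sharing one backend and serializing on a lock file with `flock(1)` from util-linux. The lock directory, the log file, and the `run_agent` helper are hypothetical names chosen for illustration; the real agents may take the lock differently.

```shell
#!/usr/bin/env bash
# Sketch only: paths and the run_agent helper are hypothetical, not taken from the repo.
set -euo pipefail

LOCK_DIR="$(mktemp -d)"
SESSION_LOCK="$LOCK_DIR/session.lock"   # assumed: one lock file per LLM backend
LOG="$LOCK_DIR/order.log"

run_agent() {
  local name="$1"
  # flock(1) blocks until it holds an exclusive lock on the file,
  # runs the command, and releases the lock when the command exits.
  flock "$SESSION_LOCK" sh -c \
    'echo "$1 start" >> "$2"; sleep 0.2; echo "$1 end" >> "$2"' \
    _ "$name" "$LOG"
}

# Two agents on the same backend start together but run one at a time.
run_agent review &
run_agent gardener &
wait

cat "$LOG"
```

Which agent goes first is not deterministic, but each agent's start and end lines stay adjacent in the log because the second agent waits for the lock. An agent on a disjoint backend would use a different lock file and would not wait.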

diff --git a/AGENTS.md b/AGENTS.md
index 816cdda..48aea6b 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -150,7 +150,7 @@ Issues flow: `backlog` → `in-progress` → PR → CI → review → merge →
 
 ### Dependency conventions
 
-Issues declare dependencies via `## Dependencies` / `## Depends on` sections listing `#N` refs. `lib/parse-deps.sh` extracts these; dev-poll only picks issues whose deps are all closed. See AD-002 for single-threaded pipeline rules.
+Issues declare dependencies via `## Dependencies` / `## Depends on` sections listing `#N` refs. `lib/parse-deps.sh` extracts these; dev-poll only picks issues whose deps are all closed. See AD-002 for concurrency bounds per LLM backend.
 
 ---
 
@@ -174,7 +174,7 @@ Humans write these. Agents read and enforce them.
 | ID | Decision | Rationale |
 |---|---|---|
 | AD-001 | Nervous system runs from a polling loop (`docker/agents/entrypoint.sh`), not PR-based actions. | Planner, predictor, gardener, supervisor run directly via `*-run.sh`. They create work, they don't become work. (See PR #474 revert.) |
-| AD-002 | Single-threaded pipeline per project. | One dev issue at a time. No new work while a PR awaits CI or review. Prevents merge conflicts and keeps context clear. |
+| AD-002 | **Concurrency is bounded per LLM backend, not per project.** One concurrent Claude session per OAuth credential pool; one concurrent session per llama-server instance. Containers with disjoint backends may run in parallel. | The single-thread invariant is about *backends*, not pipelines. **(a) Anthropic OAuth credentials race on token refresh**: two sessions sharing one mounted `~/.claude` can collide during token rotation and fail with 401 errors. All agents inside an OAuth-mounted container serialize on `flock session.lock`. **(b) llama-server has finite VRAM and one KV cache**: parallel inference thrashes the cache and risks OOM. All llama-backed agents serialize on the same lock. **(c) Disjoint backends are free to parallelize.** Today `disinto-agents` (Anthropic OAuth, runs `review,gardener`) runs concurrently with `disinto-agents-llama` (llama, runs `dev`) on the same project; they share neither OAuth state nor llama VRAM. **(d) Per-project work-conflict safety** (no duplicate dev work, no merge conflicts on the same branch) is enforced by `issue_claim` (assignee + `in-progress` label) and per-issue worktrees. That is a separate guard and does not depend on this AD. |
 | AD-003 | The runtime creates and destroys, the formula preserves. | Runtime manages worktrees/sessions/temp. Formulas commit knowledge to git before signaling done. |
 | AD-004 | Event-driven > polling > fixed delays. | Never `waitForTimeout` or hardcoded sleep. Use phase files, webhooks, or poll loops with backoff. |
 | AD-005 | Secrets via env var indirection, never in issue bodies. | Issue bodies become code. Agent secrets go in `.env.enc`, vault secrets in `.env.vault.enc` (SOPS-encrypted when available; plaintext `.env`/`.env.vault` fallback supported). Referenced as `$VAR_NAME`. Runner gets only vault secrets; agents get only agent secrets. |
@@ -184,6 +184,7 @@ Humans write these. Agents read and enforce them.
 - **Gardener** checks open backlog issues against ADs during grooming; closes violations with a comment referencing the AD number.
 - **Planner** plans within the architecture; does not create issues that violate ADs.
 - **Dev-agent** reads AGENTS.md before implementing; refuses work that violates ADs.
+- **AD-002 is a runtime invariant, so there is nothing for the gardener to check at issue-groom time.** Concurrency is enforced by `flock session.lock` within each container and by `issue_claim` for per-issue work. A violation shows up as a 401 or a VRAM OOM in agent logs, not as a malformed issue.
 
 ---
 
diff --git a/gardener/best-practices.md b/gardener/best-practices.md
index cc75d79..5242b26 100644
--- a/gardener/best-practices.md
+++ b/gardener/best-practices.md
@@ -51,3 +51,4 @@ Compact, decision-ready. Human should be able to reply "1a 2c 3b" and be done.
 - Dev-agent doesn't understand the product — clear acceptance criteria save 2-3 CI cycles
 - Feature issues MUST list affected e2e test files
 - Issue templates from ISSUE-TEMPLATES.md propagate via triage gate
+- **AD-002 is a runtime invariant, so there is nothing for the gardener to check at issue-groom time.** Concurrency is enforced by `flock session.lock` within each container and by `issue_claim` for per-issue work. A violation shows up as a 401 or a VRAM OOM in agent logs, not as a malformed issue.
diff --git a/knowledge/dev-agent.md b/knowledge/dev-agent.md
index c32f519..b53e1ec 100644
--- a/knowledge/dev-agent.md
+++ b/knowledge/dev-agent.md
@@ -23,6 +23,6 @@ git worktree prune 2>/dev/null || true
 - Check for unmet dependencies (issues with `Depends on` refs)
 
 ### Prevention
-- Single-threaded pipeline per project (AD-002)
+- Concurrency bounded per LLM backend (AD-002)
 - Clear lock files in EXIT traps
 - Use phase files to track agent state
diff --git a/site/docs/architecture.html b/site/docs/architecture.html
index 0d1dc9e..cd33f72 100644
--- a/site/docs/architecture.html
+++ b/site/docs/architecture.html
@@ -489,7 +489,7 @@
         </div>
         <div class="principle">
           <div class="id">AD-002</div>
-          <div class="text"><strong>Single-threaded pipeline per project.</strong> One dev issue at a time. No new work while a PR awaits CI or review.</div>
+          <div class="text"><strong>Concurrency is bounded per LLM backend, not per project.</strong> One concurrent Claude session per OAuth credential pool; one concurrent session per llama-server instance. Containers with disjoint backends may run in parallel.</div>
         </div>
         <div class="principle">
           <div class="id">AD-003</div>