From 0a5b54ff4f1c6146186090ccc8749e799370ad90 Mon Sep 17 00:00:00 2001
From: Claude
Date: Fri, 10 Apr 2026 09:03:52 +0000
Subject: [PATCH] =?UTF-8?q?fix:=20tech-debt:=20rewrite=20AD-002=20?=
 =?UTF-8?q?=E2=80=94=20concurrency=20is=20bounded=20per=20LLM=20backend,?=
 =?UTF-8?q?=20not=20per=20project=20(#550)?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 AGENTS.md                   | 5 +++--
 gardener/best-practices.md  | 1 +
 knowledge/dev-agent.md      | 2 +-
 site/docs/architecture.html | 2 +-
 4 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/AGENTS.md b/AGENTS.md
index 816cdda..48aea6b 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -150,7 +150,7 @@ Issues flow: `backlog` → `in-progress` → PR → CI → review → merge →
 
 ### Dependency conventions
 
-Issues declare dependencies via `## Dependencies` / `## Depends on` sections listing `#N` refs. `lib/parse-deps.sh` extracts these; dev-poll only picks issues whose deps are all closed. See AD-002 for single-threaded pipeline rules.
+Issues declare dependencies via `## Dependencies` / `## Depends on` sections listing `#N` refs. `lib/parse-deps.sh` extracts these; dev-poll only picks issues whose deps are all closed. See AD-002 for concurrency bounds per LLM backend.
 
 ---
 
@@ -174,7 +174,7 @@ Humans write these. Agents read and enforce them.
 | ID | Decision | Rationale |
 |---|---|---|
 | AD-001 | Nervous system runs from a polling loop (`docker/agents/entrypoint.sh`), not PR-based actions. | Planner, predictor, gardener, supervisor run directly via `*-run.sh`. They create work, they don't become work. (See PR #474 revert.) |
-| AD-002 | Single-threaded pipeline per project. | One dev issue at a time. No new work while a PR awaits CI or review. Prevents merge conflicts and keeps context clear. |
+| AD-002 | **Concurrency is bounded per LLM backend, not per project.** One concurrent Claude session per OAuth credential pool; one concurrent session per llama-server instance. Containers with disjoint backends may run in parallel. | The single-thread invariant is about *backends*, not pipelines. **(a) Anthropic OAuth credentials race on token refresh** — two sessions sharing one mounted `~/.claude` will trip over each other during rotation and 401. All agents inside an OAuth-mounted container serialize on `flock session.lock`. **(b) llama-server has finite VRAM and one KV cache** — parallel inference thrashes the cache and risks OOM. All llama-backed agents serialize on the same lock. **(c) Disjoint backends are free to parallelize.** Today `disinto-agents` (Anthropic OAuth, runs `review,gardener`) runs concurrently with `disinto-agents-llama` (llama, runs `dev`) on the same project — they share neither OAuth state nor llama VRAM. **(d) Per-project work-conflict safety** (no duplicate dev work, no merge conflicts on the same branch) is enforced by `issue_claim` (assignee + `in-progress` label) and per-issue worktrees — that's a separate guard that does NOT depend on this AD. |
 | AD-003 | The runtime creates and destroys, the formula preserves. | Runtime manages worktrees/sessions/temp. Formulas commit knowledge to git before signaling done. |
 | AD-004 | Event-driven > polling > fixed delays. | Never `waitForTimeout` or hardcoded sleep. Use phase files, webhooks, or poll loops with backoff. |
 | AD-005 | Secrets via env var indirection, never in issue bodies. | Issue bodies become code. Agent secrets go in `.env.enc`, vault secrets in `.env.vault.enc` (SOPS-encrypted when available; plaintext `.env`/`.env.vault` fallback supported). Referenced as `$VAR_NAME`. Runner gets only vault secrets; agents get only agent secrets. |
@@ -184,6 +184,7 @@ Humans write these. Agents read and enforce them.
 - **Gardener** checks open backlog issues against ADs during grooming; closes violations with a comment referencing the AD number.
 - **Planner** plans within the architecture; does not create issues that violate ADs.
 - **Dev-agent** reads AGENTS.md before implementing; refuses work that violates ADs.
+- **AD-002 is a runtime invariant; nothing for the gardener to check at issue-groom time.** Concurrency is enforced by `flock session.lock` within each container and by `issue_claim` for per-issue work. A violation manifests as a 401 or VRAM OOM in agent logs, not as a malformed issue.
 
 ---
 
diff --git a/gardener/best-practices.md b/gardener/best-practices.md
index cc75d79..5242b26 100644
--- a/gardener/best-practices.md
+++ b/gardener/best-practices.md
@@ -51,3 +51,4 @@ Compact, decision-ready. Human should be able to reply "1a 2c 3b" and be done.
 - Dev-agent doesn't understand the product — clear acceptance criteria save 2-3 CI cycles
 - Feature issues MUST list affected e2e test files
 - Issue templates from ISSUE-TEMPLATES.md propagate via triage gate
+- **AD-002 is a runtime invariant; nothing for the gardener to check at issue-groom time.** Concurrency is enforced by `flock session.lock` within each container and by `issue_claim` for per-issue work. A violation manifests as a 401 or VRAM OOM in agent logs, not as a malformed issue.
diff --git a/knowledge/dev-agent.md b/knowledge/dev-agent.md
index c32f519..b53e1ec 100644
--- a/knowledge/dev-agent.md
+++ b/knowledge/dev-agent.md
@@ -23,6 +23,6 @@ git worktree prune 2>/dev/null || true
 - Check for unmet dependencies (issues with `Depends on` refs)
 
 ### Prevention
-- Single-threaded pipeline per project (AD-002)
+- Concurrency bounded per LLM backend (AD-002)
 - Clear lock files in EXIT traps
 - Use phase files to track agent state
diff --git a/site/docs/architecture.html b/site/docs/architecture.html
index 0d1dc9e..cd33f72 100644
--- a/site/docs/architecture.html
+++ b/site/docs/architecture.html
@@ -489,7 +489,7 @@
AD-002
-Single-threaded pipeline per project. One dev issue at a time. No new work while a PR awaits CI or review.
+Concurrency is bounded per LLM backend, not per project. One concurrent Claude session per OAuth credential pool; one concurrent session per llama-server instance. Containers with disjoint backends may run in parallel.
AD-003