fix: Two parallel activation paths for llama agents (ENABLE_LLAMA_AGENT vs [agents.X] TOML) (#846)

- Remove ENABLE_LLAMA_AGENT conditional block from docker-compose generation
- Remove legacy agents-llama and agents-llama-all services from docker-compose.yml
- Remove llama bot user creation code (dev-qwen, dev-qwen-nightly) from lib/forge-setup.sh
- Remove FORGE_TOKEN_LLAMA/FORGE_PASS_LLAMA environment variables from .env.example
- Add migration error check that fails when ENABLE_LLAMA_AGENT=1 is found in .env
- Update documentation: remove agents-llama entries from AGENTS.md and lib/AGENTS.md
- Delete docs/agents-llama.md (legacy documentation)
- TOML [agents.X] sections in projects/*.toml are now the canonical activation path
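
For illustration, the migration check described in this commit message could look roughly like the sketch below. This is not the code from the commit: the function name `check_legacy_llama_flag`, the error wording, and the exact match pattern are assumptions made for the example.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a migration guard; the function name, messages, and
# pattern are assumptions, not the implementation shipped in this commit.

check_legacy_llama_flag() {
  local env_file="${1:-.env}"

  # No env file means there is nothing to migrate.
  [ -f "$env_file" ] || return 0

  # Match an uncommented ENABLE_LLAMA_AGENT=1 line, optionally prefixed
  # with "export". Commented-out lines and ENABLE_LLAMA_AGENT=0 do not match.
  if grep -Eq '^[[:space:]]*(export[[:space:]]+)?ENABLE_LLAMA_AGENT=1[[:space:]]*$' "$env_file"; then
    echo "ERROR: ENABLE_LLAMA_AGENT=1 found in $env_file; this flag is no longer supported." >&2
    echo "Remove it and declare llama agents in [agents.X] sections of projects/*.toml." >&2
    return 1
  fi

  return 0
}
```

A caller would typically run `check_legacy_llama_flag .env || exit 1` before generating the compose file, so that a stale flag stops the run instead of being silently ignored.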
2026-04-16 11:25:34 +00:00
commit 5c2d934759
parent 016d4fe8cc
8 changed files with 31 additions and 519 deletions
--- a/docs/agents-llama.md
+++ b/docs/agents-llama.md
@@ -1,59 +0,0 @@
-# agents-llama — Local-Qwen Agents
-
-The `agents-llama` service is an optional compose service that runs agents
-backed by a local llama-server instance (e.g. Qwen) instead of the Anthropic
-API. It uses the same Docker image as the main `agents` service but connects to
-a local inference endpoint via `ANTHROPIC_BASE_URL`.
-
-Two profiles are available:
-
-| Profile | Service | Roles | Use case |
-|---------|---------|-------|----------|
-| _(default)_ | `agents-llama` | `dev` only | Conservative: single-role soak test |
-| `agents-llama-all` | `agents-llama-all` | all 7 (review, dev, gardener, architect, planner, predictor, supervisor) | Pre-migration: validate every role on llama before Nomad cutover |
-
-## Enabling
-
-Set `ENABLE_LLAMA_AGENT=1` in `.env` (or `.env.enc`) and provide the required
-credentials:
-
-```env
-ENABLE_LLAMA_AGENT=1
-FORGE_TOKEN_LLAMA=<dev-qwen API token>
-FORGE_PASS_LLAMA=<dev-qwen password>
-ANTHROPIC_BASE_URL=http://host.docker.internal:8081   # llama-server endpoint
-```
-
-Then regenerate the compose file (`disinto init ...`) and bring the stack up.
-
-### Running all 7 roles (agents-llama-all)
-
-```bash
-docker compose --profile agents-llama-all up -d
-```
-
-This starts the `agents-llama-all` container with all 7 bot roles against the
-local llama endpoint. The per-role forge tokens (`FORGE_REVIEW_TOKEN`,
-`FORGE_GARDENER_TOKEN`, etc.) must be set in `.env` — they are the same tokens
-used by the Claude-backed `agents` container.
-
-## Prerequisites
-
-- **llama-server** (or compatible OpenAI-API endpoint) running on the host,
-  reachable from inside Docker at the URL set in `ANTHROPIC_BASE_URL`.
-- A Forgejo bot user (e.g. `dev-qwen`) with its own API token and password,
-  stored as `FORGE_TOKEN_LLAMA` / `FORGE_PASS_LLAMA`.
-
-## Behaviour
-
-- `agents-llama`: `AGENT_ROLES=dev` — only picks up dev work.
-- `agents-llama-all`: `AGENT_ROLES=review,dev,gardener,architect,planner,predictor,supervisor` — runs all 7 roles.
-- `CLAUDE_AUTOCOMPACT_PCT_OVERRIDE=60` — more aggressive compaction for smaller
-  context windows.
-- Serialises on the llama-server's single KV cache (AD-002).
-
-## Disabling
-
-Set `ENABLE_LLAMA_AGENT=0` (or leave it unset) and regenerate. The service
-block is omitted entirely from `docker-compose.yml`; the stack starts cleanly
-without it.