2026-04-16 12:57:28 +00:00
1 changed files with 146 additions and 59 deletions
--- a/docs/agents-llama.md
+++ b/docs/agents-llama.md
@@ -1,54 +1,94 @@
-# agents-llama — Local-Qwen Agents
+# Local-Model Agents
-The `agents-llama` service is an optional compose service that runs agents
+Local-model agents run the same agent code as the Claude-backed agents, but
-backed by a local llama-server instance (e.g. Qwen) instead of the Anthropic
+connect to a local llama-server (or compatible OpenAI-API endpoint) instead of
-API. It uses the same Docker image as the main `agents` service but connects to
+the Anthropic API. This document describes the current activation flow using
-a local inference endpoint via `ANTHROPIC_BASE_URL`.
+`disinto hire-an-agent` and `[agents.X]` TOML configuration.
-Two profiles are available:
+## Overview
-| Profile | Service | Roles | Use case |
+Local-model agents are configured via `[agents.<name>]` sections in
-|---------|---------|-------|----------|
+`projects/<project>.toml`. Each agent gets:
-| _(default)_ | `agents-llama` | `dev` only | Conservative: single-role soak test |
+- Its own Forgejo bot user with dedicated API token and password
-| `agents-llama-all` | `agents-llama-all` | all 7 (review, dev, gardener, architect, planner, predictor, supervisor) | Pre-migration: validate every role on llama before Nomad cutover |
+- A dedicated compose service `agents-<name>`
 - Isolated credentials stored as `FORGE_TOKEN_<USER_UPPER>` and `FORGE_PASS_<USER_UPPER>` in `.env`
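 The `<USER_UPPER>` suffix appears to be derived from the Forgejo username. A minimal sketch, assuming the rule is "uppercase the name and replace hyphens with underscores" (inferred only from the `dev-qwen` to `DEV_QWEN` example; `forge_env_suffix` is a hypothetical helper, not part of disinto):
 ```bash
 # Hypothetical helper: derive the <USER_UPPER> suffix from a Forgejo username.
 # Assumption: uppercase letters and map '-' to '_' (matches dev-qwen -> DEV_QWEN).
 forge_env_suffix() {
   printf '%s' "$1" | tr 'a-z-' 'A-Z_'
 }
 # Build the expected .env key names for one agent.
 echo "FORGE_TOKEN_$(forge_env_suffix dev-qwen)"   # prints FORGE_TOKEN_DEV_QWEN
 echo "FORGE_PASS_$(forge_env_suffix dev-qwen)"    # prints FORGE_PASS_DEV_QWEN
 ```
 This is only useful for predicting which keys to look for in `.env`; the authoritative names are whatever `disinto hire-an-agent` actually writes.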
-## Enabling
+## Prerequisites
-Set `ENABLE_LLAMA_AGENT=1` in `.env` (or `.env.enc`) and provide the required
+- **llama-server** (or compatible OpenAI-API endpoint) running on the host,
-credentials:
+  reachable from inside Docker at the URL you will configure.
 - A disinto factory already initialized (`disinto init` completed).
-```env
+## Hiring a local-model agent
 ENABLE_LLAMA_AGENT=1
 FORGE_TOKEN_LLAMA=<dev-qwen API token>
 FORGE_PASS_LLAMA=<dev-qwen password>
 ANTHROPIC_BASE_URL=http://host.docker.internal:8081   # llama-server endpoint
 ```
-Then regenerate the compose file (`disinto init ...`) and bring the stack up.
+Use `disinto hire-an-agent` with `--local-model` to create a bot user and
-
+configure the agent:
 ## Hiring a new agent
 Use `disinto hire-an-agent` to create a Forgejo user, API token, and password,
 and write all required credentials to `.env`:
 ```bash
-# Local model agent
+# Hire a local-model agent for the dev role
 disinto hire-an-agent dev-qwen dev \
  --local-model http://10.10.10.1:8081 \
  --model unsloth/Qwen3.5-35B-A3B
 # Anthropic backend agent (requires ANTHROPIC_API_KEY in environment)
 disinto hire-an-agent dev-qwen dev
 ```
-The command writes the following to `.env`:
+The command performs these steps:
 - `FORGE_TOKEN_<USER_UPPER>` — derived from the agent's Forgejo username (e.g., `FORGE_TOKEN_DEV_QWEN`)
 - `FORGE_PASS_<USER_UPPER>` — the agent's Forgejo password
 - `ANTHROPIC_BASE_URL` (local model) or `ANTHROPIC_API_KEY` (Anthropic backend)
-## Rotation
+1. **Creates a Forgejo user** `dev-qwen` with a random password
 2. **Generates an API token** for the user
 3. **Writes credentials to `.env`**:
   - `FORGE_TOKEN_DEV_QWEN` — the API token
   - `FORGE_PASS_DEV_QWEN` — the password
   - `ANTHROPIC_BASE_URL` — the llama endpoint (required by the agent)
 4. **Writes `[agents.dev-qwen]` to `projects/<project>.toml`** with:
   - `base_url`, `model`, `api_key`
   - `roles = ["dev"]`
   - `forge_user = "dev-qwen"`
   - `compact_pct = 60`
   - `poll_interval = 60`
 5. **Regenerates `docker-compose.yml`** to include the `agents-dev-qwen` service
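 After the steps above complete, it can help to confirm that the expected keys landed in `.env`. The following is a sketch only: `check_agent_env` is a hypothetical helper, and it checks just the three keys listed in step 3, without validating their values.
 ```bash
 # Hypothetical post-hire check: confirm the expected keys exist in an env file.
 # Usage: check_agent_env <env-file> <USER_UPPER>   e.g. check_agent_env .env DEV_QWEN
 check_agent_env() {
   env_file=$1
   suffix=$2
   missing=0
   for key in "FORGE_TOKEN_${suffix}" "FORGE_PASS_${suffix}" "ANTHROPIC_BASE_URL"; do
     # A key counts as present if some line starts with "KEY=".
     if ! grep -q "^${key}=" "$env_file"; then
       echo "missing: ${key}" >&2
       missing=1
     fi
   done
   return "$missing"
 }
 ```
 The function returns non-zero when any key is absent, so it can gate a later `docker compose up -d agents-dev-qwen` in a script.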
-Re-running `disinto hire-an-agent <same-name>` rotates credentials idempotently:
+### Anthropic backend agents
 For agents that use the Anthropic API instead of a local model, omit `--local-model`:
 ```bash
 # Anthropic backend agent (requires ANTHROPIC_API_KEY in environment)
 export ANTHROPIC_API_KEY="sk-..."
 disinto hire-an-agent dev-claude dev
 ```
 This writes `ANTHROPIC_API_KEY` to `.env` instead of `ANTHROPIC_BASE_URL`.
 ## Activation and running
 Once hired, the agent service is added to `docker-compose.yml`. Start the
 service with `docker compose up -d`:
 ```bash
 # Start all agent services
 docker compose up -d
 # Start a single named agent service
 docker compose up -d agents-dev-qwen
 # Start multiple named agent services
 docker compose up -d agents-dev-qwen agents-planner
 ```
 ### Stopping agents
 ```bash
 # Stop a specific agent service
 docker compose down agents-dev-qwen
 # Stop all agent services
 docker compose down
 ```
 ## Credential rotation
 Re-running `disinto hire-an-agent <same-name>` with the same parameters rotates
 credentials idempotently:
 ```bash
 # Re-hire the same agent to rotate token and password
@@ -66,39 +106,86 @@ disinto hire-an-agent dev-qwen dev \
 This is the recommended way to rotate agent credentials. The `.env` file is
 updated in place, so no manual editing is required.
-If you need to manually rotate credentials, you can:
+If you need to manually rotate credentials:
 1. Generate a new token in the Forgejo admin UI
 2. Edit `.env` and replace `FORGE_TOKEN_<USER_UPPER>` and `FORGE_PASS_<USER_UPPER>`
-3. Restart the agent service: `docker compose restart disinto-agents-<name>`
+3. Restart the agent service: `docker compose restart agents-<name>`
-### Running all 7 roles (agents-llama-all)
+## Configuration reference
-```bash
+### Environment variables (`.env`)
-docker compose --profile agents-llama-all up -d
+
 | Variable | Description | Example |
 |----------|-------------|---------|
 | `FORGE_TOKEN_<USER_UPPER>` | Forgejo API token for the bot user | `FORGE_TOKEN_DEV_QWEN` |
 | `FORGE_PASS_<USER_UPPER>` | Forgejo password for the bot user | `FORGE_PASS_DEV_QWEN` |
 | `ANTHROPIC_BASE_URL` | Local llama endpoint (local model agents) | `http://host.docker.internal:8081` |
 | `ANTHROPIC_API_KEY` | Anthropic API key (Anthropic backend agents) | `sk-...` |
 ### Project TOML (`[agents.<name>]` section)
 ```toml
 [agents.dev-qwen]
 base_url = "http://10.10.10.1:8081"
 model = "unsloth/Qwen3.5-35B-A3B"
 api_key = "sk-no-key-required"
 roles = ["dev"]
 forge_user = "dev-qwen"
 compact_pct = 60
 poll_interval = 60
 ```
-This starts the `agents-llama-all` container with all 7 bot roles against the
+| Field | Description |
-local llama endpoint. The per-role forge tokens (`FORGE_REVIEW_TOKEN`,
+|-------|-------------|
-`FORGE_GARDENER_TOKEN`, etc.) must be set in `.env` — they are the same tokens
+| `base_url` | llama-server endpoint |
-used by the Claude-backed `agents` container.
+| `model` | Model name (for logging/identification) |
-
+| `api_key` | Required by API; set to placeholder for llama |
-## Prerequisites
+| `roles` | Agent roles this instance handles |
-
+| `forge_user` | Forgejo bot username |
- **llama-server** (or compatible OpenAI-API endpoint) running on the host,
+| `compact_pct` | Context compaction threshold (lower = more aggressive) |
-  reachable from inside Docker at the URL set in `ANTHROPIC_BASE_URL`.
+| `poll_interval` | Seconds between polling cycles |
 - A Forgejo bot user (e.g. `dev-qwen`) with its own API token and password,
  stored as `FORGE_TOKEN_LLAMA` / `FORGE_PASS_LLAMA`.
 ## Behaviour
- `agents-llama`: `AGENT_ROLES=dev` — only picks up dev work.
+- Each agent runs with `AGENT_ROLES` set to its configured roles
 - `agents-llama-all`: `AGENT_ROLES=review,dev,gardener,architect,planner,predictor,supervisor` — runs all 7 roles.
 - `CLAUDE_AUTOCOMPACT_PCT_OVERRIDE=60` — more aggressive compaction for smaller
-  context windows.
+  context windows
- Serialises on the llama-server's single KV cache (AD-002).
+- Agents serialize on the llama-server's single KV cache (AD-002)
-## Disabling
+## Troubleshooting
-Set `ENABLE_LLAMA_AGENT=0` (or leave it unset) and regenerate. The service
+### Agent service not starting
-block is omitted entirely from `docker-compose.yml`; the stack starts cleanly
+
-without it.
+Check that the service was created by `disinto hire-an-agent`:
 ```bash
 docker compose config | grep -A5 "agents-dev-qwen"
 ```
 If the service is missing, re-run `disinto hire-an-agent dev-qwen dev` to
 regenerate `docker-compose.yml`.
 ### Model endpoint unreachable
 Verify llama-server is accessible from inside Docker:
 ```bash
 docker compose -f docker-compose.yml exec agents curl -sf http://host.docker.internal:8081/health
 ```
 If using a custom host IP, update `ANTHROPIC_BASE_URL` in `.env`:
 ```bash
 # Update the base URL
 sed -i 's|^ANTHROPIC_BASE_URL=.*|ANTHROPIC_BASE_URL=http://192.168.1.100:8081|' .env
 # Restart the agent
 docker compose restart agents-dev-qwen
 ```
 ### Invalid agent name
 Agent names must match `^[a-z]([a-z0-9]|-[a-z0-9])*$` (lowercase letters, digits,
 hyphens; starts with a letter, ends with an alphanumeric). Names such as
 `dev-qwen2` are valid (a trailing digit is OK); names such as `dev--qwen`
 (consecutive hyphens) or `dev-` (trailing hyphen) are rejected.
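 To check a candidate name before hiring, the documented pattern can be applied locally. This is a sketch, not the validation code disinto itself runs; `is_valid_agent_name` is a hypothetical helper name.
 ```bash
 # Sketch: test a candidate agent name against the documented pattern.
 # LC_ALL=C keeps the [a-z] range limited to ASCII lowercase letters.
 is_valid_agent_name() {
   printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z]([a-z0-9]|-[a-z0-9])*$'
 }
 for name in dev-qwen dev-qwen2 dev--qwen dev- Dev; do
   if is_valid_agent_name "$name"; then
     echo "$name: ok"
   else
     echo "$name: rejected"
   fi
 done
 ```
 Under this pattern `dev-qwen` and `dev-qwen2` pass, while `dev--qwen`, `dev-`, and `Dev` are rejected.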