Compare commits
1 commit
ffcadbfee0
...
e611288b80
| Author | SHA1 | Date | |
|---|---|---|---|
|
|
e611288b80 |
1 changed files with 154 additions and 59 deletions
|
|
@ -1,54 +1,101 @@
|
||||||
# agents-llama — Local-Qwen Agents
|
# Local-Model Agents
|
||||||
|
|
||||||
The `agents-llama` service is an optional compose service that runs agents
|
Local-model agents run the same agent code as the Claude-backed agents, but
|
||||||
backed by a local llama-server instance (e.g. Qwen) instead of the Anthropic
|
connect to a local llama-server (or compatible OpenAI-API endpoint) instead of
|
||||||
API. It uses the same Docker image as the main `agents` service but connects to
|
the Anthropic API. This document describes the current activation flow using
|
||||||
a local inference endpoint via `ANTHROPIC_BASE_URL`.
|
`disinto hire-an-agent` and `[agents.X]` TOML configuration.
|
||||||
|
|
||||||
Two profiles are available:
|
## Overview
|
||||||
|
|
||||||
| Profile | Service | Roles | Use case |
|
Local-model agents are configured via `[agents.<name>]` sections in
|
||||||
|---------|---------|-------|----------|
|
`projects/<project>.toml`. Each agent gets:
|
||||||
| _(default)_ | `agents-llama` | `dev` only | Conservative: single-role soak test |
|
- Its own Forgejo bot user with dedicated API token and password
|
||||||
| `agents-llama-all` | `agents-llama-all` | all 7 (review, dev, gardener, architect, planner, predictor, supervisor) | Pre-migration: validate every role on llama before Nomad cutover |
|
- A dedicated compose service `agents-<name>`
|
||||||
|
- Isolated credentials stored as `FORGE_TOKEN_<USER_UPPER>` and `FORGE_PASS_<USER_UPPER>` in `.env`
|
||||||
|
|
||||||
## Enabling
|
## Prerequisites
|
||||||
|
|
||||||
Set `ENABLE_LLAMA_AGENT=1` in `.env` (or `.env.enc`) and provide the required
|
- **llama-server** (or compatible OpenAI-API endpoint) running on the host,
|
||||||
credentials:
|
reachable from inside Docker at the URL you will configure.
|
||||||
|
- A disinto factory already initialized (`disinto init` completed).
|
||||||
|
|
||||||
```env
|
## Hiring a local-model agent
|
||||||
ENABLE_LLAMA_AGENT=1
|
|
||||||
FORGE_TOKEN_LLAMA=<dev-qwen API token>
|
|
||||||
FORGE_PASS_LLAMA=<dev-qwen password>
|
|
||||||
ANTHROPIC_BASE_URL=http://host.docker.internal:8081 # llama-server endpoint
|
|
||||||
```
|
|
||||||
|
|
||||||
Then regenerate the compose file (`disinto init ...`) and bring the stack up.
|
Use `disinto hire-an-agent` with `--local-model` to create a bot user and
|
||||||
|
configure the agent:
|
||||||
## Hiring a new agent
|
|
||||||
|
|
||||||
Use `disinto hire-an-agent` to create a Forgejo user, API token, and password,
|
|
||||||
and write all required credentials to `.env`:
|
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# Local model agent
|
# Hire a local-model agent for the dev role
|
||||||
disinto hire-an-agent dev-qwen dev \
|
disinto hire-an-agent dev-qwen dev \
|
||||||
--local-model http://10.10.10.1:8081 \
|
--local-model http://10.10.10.1:8081 \
|
||||||
--model unsloth/Qwen3.5-35B-A3B
|
--model unsloth/Qwen3.5-35B-A3B
|
||||||
|
|
||||||
# Anthropic backend agent (requires ANTHROPIC_API_KEY in environment)
|
|
||||||
disinto hire-an-agent dev-qwen dev
|
|
||||||
```
|
```
|
||||||
|
|
||||||
The command writes the following to `.env`:
|
The command performs these steps:
|
||||||
- `FORGE_TOKEN_<USER_UPPER>` — derived from the agent's Forgejo username (e.g., `FORGE_TOKEN_DEV_QWEN`)
|
|
||||||
- `FORGE_PASS_<USER_UPPER>` — the agent's Forgejo password
|
|
||||||
- `ANTHROPIC_BASE_URL` (local model) or `ANTHROPIC_API_KEY` (Anthropic backend)
|
|
||||||
|
|
||||||
## Rotation
|
1. **Creates a Forgejo user** `dev-qwen` with a random password
|
||||||
|
2. **Generates an API token** for the user
|
||||||
|
3. **Writes credentials to `.env`**:
|
||||||
|
- `FORGE_TOKEN_DEV_QWEN` — the API token
|
||||||
|
- `FORGE_PASS_DEV_QWEN` — the password
|
||||||
|
- `ANTHROPIC_BASE_URL` — the llama endpoint (required by the agent)
|
||||||
|
4. **Writes `[agents.dev-qwen]` to `projects/<project>.toml`** with:
|
||||||
|
- `base_url`, `model`, `api_key`
|
||||||
|
- `roles = ["dev"]`
|
||||||
|
- `forge_user = "dev-qwen"`
|
||||||
|
- `compact_pct = 60`
|
||||||
|
- `poll_interval = 60`
|
||||||
|
5. **Regenerates `docker-compose.yml`** to include the `agents-dev-qwen` service
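
The variable names in step 3 follow from the bot username. The sketch below assumes the mapping is "uppercase, hyphens become underscores", which matches the `dev-qwen` to `FORGE_TOKEN_DEV_QWEN` example but is not confirmed beyond it. `user_upper` and `has_agent_credentials` are hypothetical helpers, not part of disinto.

```bash
# Hypothetical helper: derive the .env suffix from a Forgejo username.
# Assumption: uppercase the name and turn hyphens into underscores.
user_upper() {
  printf '%s' "$1" | tr '[:lower:]' '[:upper:]' | tr '-' '_'
}

# Hypothetical check: succeed only if both credentials for the agent
# are present in the given env file (defaults to ./.env).
has_agent_credentials() {
  suffix=$(user_upper "$1")
  env_file=${2:-.env}
  grep -q "^FORGE_TOKEN_${suffix}=" "$env_file" &&
    grep -q "^FORGE_PASS_${suffix}=" "$env_file"
}
```

Running `has_agent_credentials dev-qwen` after hiring is a quick way to confirm that the credentials were written before starting the service.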
|
||||||
|
|
||||||
Re-running `disinto hire-an-agent <same-name>` rotates credentials idempotently:
|
### Anthropic backend agents
|
||||||
|
|
||||||
|
For agents that use the Anthropic API instead of a local model, omit `--local-model`:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Anthropic backend agent (requires ANTHROPIC_API_KEY in environment)
|
||||||
|
export ANTHROPIC_API_KEY="sk-..."
|
||||||
|
disinto hire-an-agent dev-claude dev
|
||||||
|
```
|
||||||
|
|
||||||
|
This writes `ANTHROPIC_API_KEY` to `.env` instead of `ANTHROPIC_BASE_URL`.
|
||||||
|
|
||||||
|
## Activation and running
|
||||||
|
|
||||||
|
### Default activation (single agent)
|
||||||
|
|
||||||
|
Once hired, the agent service is added to `docker-compose.yml` but not started
|
||||||
|
by default. To start a single agent:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Start just the dev-qwen agent
|
||||||
|
COMPOSE_PROFILES=agents-dev-qwen docker compose up -d
|
||||||
|
```
|
||||||
|
|
||||||
|
**Important:** Local-model agent services are profile-gated. Running `docker
|
||||||
|
compose up -d` without `COMPOSE_PROFILES` will not start them, and `--remove-orphans`
|
||||||
|
may remove them as unmanaged containers.
|
||||||
|
|
||||||
|
### Starting multiple agents
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Start multiple agents
|
||||||
|
COMPOSE_PROFILES=agents-dev-qwen,agents-planner docker compose up -d
|
||||||
|
```
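
`COMPOSE_PROFILES` takes a single comma-separated list. Assigning the variable twice on one command line keeps only the last value. A minimal sketch for building that list from agent service names follows; `join_profiles` is a hypothetical helper, not part of disinto.

```bash
# Join profile names with commas, the separator COMPOSE_PROFILES expects.
# The subshell keeps the IFS change from leaking into the caller.
join_profiles() {
  (IFS=,; printf '%s' "$*")
}

# Usage, with the two profile names used in this section:
# COMPOSE_PROFILES=$(join_profiles agents-dev-qwen agents-planner) docker compose up -d
```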
|
||||||
|
|
||||||
|
### Stopping agents
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Stop specific agents
|
||||||
|
COMPOSE_PROFILES=agents-dev-qwen docker compose stop agents-dev-qwen
|
||||||
|
|
||||||
|
# Stop all agents
|
||||||
|
docker compose down
|
||||||
|
```
|
||||||
|
|
||||||
|
## Credential rotation
|
||||||
|
|
||||||
|
Re-running `disinto hire-an-agent <same-name>` with the same parameters rotates
|
||||||
|
credentials idempotently:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# Re-hire the same agent to rotate token and password
|
# Re-hire the same agent to rotate token and password
|
||||||
|
|
@ -66,39 +113,87 @@ disinto hire-an-agent dev-qwen dev \
|
||||||
This is the recommended way to rotate agent credentials. The `.env` file is
|
This is the recommended way to rotate agent credentials. The `.env` file is
|
||||||
updated in place, so no manual editing is required.
|
updated in place, so no manual editing is required.
|
||||||
|
|
||||||
If you need to manually rotate credentials, you can:
|
If you need to manually rotate credentials:
|
||||||
1. Generate a new token in Forgejo admin UI
|
1. Generate a new token in Forgejo admin UI
|
||||||
2. Edit `.env` and replace `FORGE_TOKEN_<USER_UPPER>` and `FORGE_PASS_<USER_UPPER>`
|
2. Edit `.env` and replace `FORGE_TOKEN_<USER_UPPER>` and `FORGE_PASS_<USER_UPPER>`
|
||||||
3. Restart the agent service: `docker compose restart disinto-agents-<name>`
|
3. Restart the agent service: `docker compose restart agents-<name>`
|
||||||
|
|
||||||
### Running all 7 roles (agents-llama-all)
|
## Configuration reference
|
||||||
|
|
||||||
```bash
|
### Environment variables (`.env`)
|
||||||
docker compose --profile agents-llama-all up -d
|
|
||||||
|
| Variable | Description | Example |
|
||||||
|
|----------|-------------|---------|
|
||||||
|
| `FORGE_TOKEN_<USER_UPPER>` | Forgejo API token for the bot user | `FORGE_TOKEN_DEV_QWEN` |
|
||||||
|
| `FORGE_PASS_<USER_UPPER>` | Forgejo password for the bot user | `FORGE_PASS_DEV_QWEN` |
|
||||||
|
| `ANTHROPIC_BASE_URL` | Local llama endpoint (local model agents) | `http://host.docker.internal:8081` |
|
||||||
|
| `ANTHROPIC_API_KEY` | Anthropic API key (Anthropic backend agents) | `sk-...` |
|
||||||
|
|
||||||
|
### Project TOML (`[agents.<name>]` section)
|
||||||
|
|
||||||
|
```toml
|
||||||
|
[agents.dev-qwen]
|
||||||
|
base_url = "http://10.10.10.1:8081"
|
||||||
|
model = "unsloth/Qwen3.5-35B-A3B"
|
||||||
|
api_key = "sk-no-key-required"
|
||||||
|
roles = ["dev"]
|
||||||
|
forge_user = "dev-qwen"
|
||||||
|
compact_pct = 60
|
||||||
|
poll_interval = 60
|
||||||
```
|
```
|
||||||
|
|
||||||
This starts the `agents-llama-all` container with all 7 bot roles against the
|
| Field | Description |
|
||||||
local llama endpoint. The per-role forge tokens (`FORGE_REVIEW_TOKEN`,
|
|-------|-------------|
|
||||||
`FORGE_GARDENER_TOKEN`, etc.) must be set in `.env` — they are the same tokens
|
| `base_url` | llama-server endpoint |
|
||||||
used by the Claude-backed `agents` container.
|
| `model` | Model name (for logging/identification) |
|
||||||
|
| `api_key` | Required by API; set to placeholder for llama |
|
||||||
## Prerequisites
|
| `roles` | Agent roles this instance handles |
|
||||||
|
| `forge_user` | Forgejo bot username |
|
||||||
- **llama-server** (or compatible OpenAI-API endpoint) running on the host,
|
| `compact_pct` | Context compaction threshold (lower = more aggressive) |
|
||||||
reachable from inside Docker at the URL set in `ANTHROPIC_BASE_URL`.
|
| `poll_interval` | Seconds between polling cycles |
|
||||||
- A Forgejo bot user (e.g. `dev-qwen`) with its own API token and password,
|
|
||||||
stored as `FORGE_TOKEN_LLAMA` / `FORGE_PASS_LLAMA`.
|
|
||||||
|
|
||||||
## Behaviour
|
## Behaviour
|
||||||
|
|
||||||
- `agents-llama`: `AGENT_ROLES=dev` — only picks up dev work.
|
- Each agent runs with `AGENT_ROLES` set to its configured roles
|
||||||
- `agents-llama-all`: `AGENT_ROLES=review,dev,gardener,architect,planner,predictor,supervisor` — runs all 7 roles.
|
|
||||||
- `CLAUDE_AUTOCOMPACT_PCT_OVERRIDE=60` — more aggressive compaction for smaller
|
- `CLAUDE_AUTOCOMPACT_PCT_OVERRIDE=60` — more aggressive compaction for smaller
|
||||||
context windows.
|
context windows
|
||||||
- Serialises on the llama-server's single KV cache (AD-002).
|
- Agents serialize on the llama-server's single KV cache (AD-002)
|
||||||
|
|
||||||
## Disabling
|
## Troubleshooting
|
||||||
|
|
||||||
Set `ENABLE_LLAMA_AGENT=0` (or leave it unset) and regenerate. The service
|
### Agent service not starting
|
||||||
block is omitted entirely from `docker-compose.yml`; the stack starts cleanly
|
|
||||||
without it.
|
Check that you're using `COMPOSE_PROFILES`:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Wrong: this won't start profile-gated agent services
|
||||||
|
docker compose up -d
|
||||||
|
|
||||||
|
# Correct: explicitly specify the profile
|
||||||
|
COMPOSE_PROFILES=agents-dev-qwen docker compose up -d
|
||||||
|
```
|
||||||
|
|
||||||
|
### Model endpoint unreachable
|
||||||
|
|
||||||
|
Verify llama-server is accessible from inside Docker:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
docker compose -f docker-compose.yml exec agents curl -sf http://host.docker.internal:8081/health
|
||||||
|
```
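
A single probe can fail while llama-server is still loading the model, so a short retry loop may be more useful than a one-off check. The sketch below polls the same `/health` URL. `wait_for_endpoint` and its default attempt count and delay are illustrative assumptions, not part of disinto.

```bash
# Poll a health URL until it answers or the attempts run out.
# Arguments: URL, attempts (default 10), delay in seconds (default 3).
wait_for_endpoint() {
  url=$1
  attempts=${2:-10}
  delay=${3:-3}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -s silences progress output, -f turns HTTP errors into a non-zero exit.
    if curl -sf "$url" >/dev/null; then
      return 0
    fi
    # Wait between attempts, but not after the final one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Usage:
# wait_for_endpoint http://host.docker.internal:8081/health || echo "llama-server unreachable"
```

`host.docker.internal` normally resolves only inside a container, so the function has to run there to reproduce the agent's view of the network.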
|
||||||
|
|
||||||
|
If using a custom host IP, update `ANTHROPIC_BASE_URL` in `.env`:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Update the base URL
|
||||||
|
sed -i 's|^ANTHROPIC_BASE_URL=.*|ANTHROPIC_BASE_URL=http://192.168.1.100:8081|' .env
|
||||||
|
|
||||||
|
# Restart the agent
|
||||||
|
COMPOSE_PROFILES=agents-dev-qwen docker compose restart agents-dev-qwen
|
||||||
|
```
|
||||||
|
|
||||||
|
### Invalid agent name
|
||||||
|
|
||||||
|
Agent names must match `^[a-z]([a-z0-9]|-[a-z0-9])*$` (lowercase letters, digits,
|
||||||
|
hyphens; starts with letter, ends with alphanumeric). Names such as `dev-qwen2`
|
||||||
|
are valid (a trailing digit is OK); names such as `dev--qwen` (consecutive
|
||||||
|
hyphens) or `dev-` (trailing hyphen) will be rejected.
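
The pattern can be checked locally before hiring an agent. The sketch below applies the regular expression quoted above with `grep -E`. `is_valid_agent_name` is a hypothetical helper and may not match how disinto performs the check internally.

```bash
# Return 0 if the name matches the documented agent-name pattern.
# LC_ALL=C keeps [a-z] limited to ASCII lowercase in every locale.
is_valid_agent_name() {
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z]([a-z0-9]|-[a-z0-9])*$'
}

# Usage:
# is_valid_agent_name dev-qwen2 && echo "ok"
# is_valid_agent_name dev--qwen || echo "rejected"
```

Under this pattern `dev-qwen` and `dev-qwen2` are accepted. `dev--qwen`, `dev-`, and names that start with a digit or an uppercase letter are not.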
|
||||||
|
|
|
||||||