refactor: generate one agents-llama compose service per LLAMA_BOTS entry #832
Reference: disinto-admin/disinto#832
Problem
`docker-compose.yml` hardcodes a single `agents-llama` service backed by `FORGE_TOKEN_LLAMA`. Adding dev-qwen2 requires either a second hardcoded stanza or a compose regenerator that iterates the bot list.
`lib/generators.sh` already owns compose regeneration (per #783 `generate_caddyfile` / `disinto up` regen pattern). Extending it to loop over `LLAMA_BOTS` is the natural fit.
Dependencies
- … `LLAMA_BOTS` in forge-setup): must land first so env vars exist.
Proposed solution
In `lib/generators.sh`, for each `$bot` in `$LLAMA_BOTS`, emit:
- `agents-llama-${bot}` with:
  - `container_name: disinto-agents-${bot}`
  - `FORGE_TOKEN=${FORGE_TOKEN_<SUFFIX>}`, `FORGE_PASS=${FORGE_PASS_<SUFFIX>}`
  - `AGENT_ROLES=dev`
  - `project-repos-${bot}:/home/agent/repos`, `agent-data-${bot}:/home/agent/data`
  - … `CLAUDE_SHARED_DIR`, `AGENT_SSH_DIR`, `SOPS_AGE_DIR`, `woodpecker-data`) unchanged
- `project-repos-${bot}`, `agent-data-${bot}`
Use a YAML anchor (`&agents_llama_base`) for the shared stanza so per-bot emission is minimal.
Why per-bot volumes (not shared)
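A minimal sketch of the per-bot loop from the proposed solution, showing how each bot ends up with its own service and its own pair of volumes. The function names `bot_suffix`, `generate_llama_services`, and `generate_llama_volumes` are hypothetical and not taken from `lib/generators.sh`. The suffix rule (uppercase, `-` to `_`) is an assumption, since this issue does not define `<SUFFIX>`. The `&agents_llama_base` anchor is assumed to be defined earlier in the generated file.

```shell
#!/usr/bin/env bash

# Hypothetical helper: "dev-qwen2" -> "DEV_QWEN2".
# The real <SUFFIX> rule is not specified in this issue; this is an assumption.
bot_suffix() {
  printf '%s' "$1" | tr 'a-z-' 'A-Z_'
}

# Hypothetical generator: prints one compose service stanza per LLAMA_BOTS entry.
generate_llama_services() {
  local bot suffix
  # LLAMA_BOTS is a space-separated list, so it is left unquoted on purpose.
  for bot in ${LLAMA_BOTS:-}; do
    suffix=$(bot_suffix "$bot")
    # \$ keeps ${FORGE_TOKEN_...} literal so compose resolves it from .env.
    # YAML merge keys are shallow: the volumes list below replaces any
    # volumes list in the base stanza, so shared mounts are omitted here.
    cat <<EOF
  agents-llama-${bot}:
    <<: *agents_llama_base
    container_name: disinto-agents-${bot}
    environment:
      FORGE_TOKEN: \${FORGE_TOKEN_${suffix}}
      FORGE_PASS: \${FORGE_PASS_${suffix}}
      AGENT_ROLES: dev
    volumes:
      - project-repos-${bot}:/home/agent/repos
      - agent-data-${bot}:/home/agent/data
EOF
  done
}

# Hypothetical generator: prints the matching top-level named volumes.
generate_llama_volumes() {
  local bot
  for bot in ${LLAMA_BOTS:-}; do
    printf '  project-repos-%s:\n  agent-data-%s:\n' "$bot" "$bot"
  done
}

LLAMA_BOTS="dev-qwen dev-qwen2"
generate_llama_services
generate_llama_volumes
```

Because every emitted name is derived from `$bot`, no two services in this sketch reference the same repos or data volume.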
Two dev agents writing to the same `/home/agent/repos/_factory` worktree corrupt git state. The state dir `${DISINTO_DIR}/state` (containing `.dev-active` lock files used by `check_active` in `lib/guard.sh`) lives inside this volume, so sharing causes two containers to serialize on the same lock file. Per-bot volumes isolate both worktree and locks, which is what unlocks parallel dev throughput.
Volume migration
Current `agents-llama` uses shared volumes `project-repos` / `agent-data`. After this refactor, `dev-qwen` moves to `project-repos-dev-qwen` / `agent-data-dev-qwen`. The worktree is re-cloned by entrypoint bootstrap (it starts from the baked copy, switches to live after first clone). CI-fix tracker + logs in `agent-data` are append-only operational data, so losing them on migration is acceptable. No repo state lives only in these volumes that isn't also on Forgejo.
Acceptance criteria
- `disinto up` regenerates compose with one `agents-llama-<bot>` service per `LLAMA_BOTS` entry
- … `project-repos-<bot>` and `agent-data-<bot>` named volume
- … `LLAMA_BOTS="dev-qwen dev-qwen2"` and running `disinto up` brings up `disinto-agents-dev-qwen2` without disturbing existing running containers
- … `disinto-agents-dev-qwen`; worktree re-clones on first run; no orphan state
- `disinto up` on unchanged `LLAMA_BOTS` does not restart containers
Affected files
- `lib/generators.sh`: add per-bot loop in compose generation
- `docker-compose.yml`: remove hardcoded `agents-llama` stanza; generator is the source of truth (aligned with #785 deprecation pattern for Caddyfile)
Non-goals
- … `.env` edit + `disinto up`.
Context
Third of three issues to enable operational scaling of llama dev agents. After merge, adding dev-qwen2 is: append to `LLAMA_BOTS` in `.env`, run `disinto init` (generates user + token), run `disinto up` (brings up container). Zero further code change for dev-qwen3/4/N.
Superseded by #834. The existing `disinto hire-an-agent` + project-TOML `[agents.*]` pattern already provides the parametric structure proposed here; #834 addresses the actual narrow defects (token var naming, FORGE_PASS persistence, per-agent compose token lookup, per-agent project-repos volume). Closing unpicked.