Sprint: agent management redesign

Vision issues

#557 — redesign agent management — hire by inference backend, list by capability

What this enables

After this sprint, operators can:

Hire agents by backend (disinto hire anthropic, disinto hire llama --url ...) instead of inventing names and roles
List all agents (disinto agents list) with backend, model, roles, and status in one table
Discover what is running without grepping compose files, TOML configs, and state directories

The factory becomes self-describing: an operator who inherits a running instance can immediately see what agents exist, what backends they use, and what roles they fill.
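To make the single-table view concrete, here is a minimal sketch of how the rows could be rendered. The column names, the AgentRow type, and the render_table helper are assumptions for illustration; the sprint has not fixed the actual output format of disinto agents list.

```python
from dataclasses import dataclass


@dataclass
class AgentRow:
    # Hypothetical row shape: the fields mirror the columns named in this sprint.
    name: str
    backend: str
    model: str
    roles: list[str]
    status: str


def render_table(rows: list[AgentRow]) -> str:
    """Render agents as a fixed-width text table, one agent per line."""
    header = ("NAME", "BACKEND", "MODEL", "ROLES", "STATUS")
    cells = [header] + [
        (r.name, r.backend, r.model, ",".join(r.roles), r.status) for r in rows
    ]
    # Each column is as wide as its longest cell.
    widths = [max(len(row[i]) for row in cells) for i in range(len(header))]
    lines = [
        "  ".join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in cells
    ]
    return "\n".join(lines)
```

Calling render_table with one AgentRow per discovered agent yields a header line followed by one line per agent, which is enough for an operator to scan backends and roles at a glance.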

What exists today

The agent management system is functional but fragmented:

disinto hire-an-agent name role (lib/hire-agent.sh): Creates a Forgejo user, .profile repo, API token, and state file, and optionally writes an agents TOML section and regenerates the compose file. It works, but the mental model is backwards: the operator must invent a name and pick a role before specifying the backend.
disinto agent enable/disable/status (bin/disinto): Manages state files for 6 hardcoded core agents (dev, reviewer, gardener, architect, planner, predictor). Local-model agents are invisible to this command.
agents TOML sections (projects/*.toml): Store local-model agent config (base_url, model, roles, forge_user). Read by lib/generators.sh to generate per-agent docker-compose services.
AGENT_ROLES env var: Runtime gate in entrypoint.sh — comma-separated list of roles the container runs.
Compose profiles: Local-model agents are gated by Compose profiles and require an explicit --profile flag to start.

State lives in three disconnected places: state files (CLI), env vars (runtime), compose services (Docker). No single command unifies them.
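One way to picture the unification is a small registry that merges the three views into one record per agent. This is a sketch under stated assumptions, not the planned implementation: AgentRecord, parse_roles, and build_registry are hypothetical names, the inputs are assumed to be already collected by the caller, and the sketch assumes the agent name, role name, and compose service name match, which may not hold for local-model agents.

```python
from dataclasses import dataclass, field


@dataclass
class AgentRecord:
    name: str
    sources: set[str] = field(default_factory=set)  # which views mention the agent
    enabled: bool = False          # a state file marks the agent as enabled
    in_agent_roles: bool = False   # the name appears in AGENT_ROLES
    has_service: bool = False      # a compose service with this name exists


def parse_roles(agent_roles: str) -> set[str]:
    """Split a comma-separated AGENT_ROLES value, ignoring blanks and spaces."""
    return {role.strip() for role in agent_roles.split(",") if role.strip()}


def build_registry(
    state_enabled: set[str], agent_roles: str, compose_services: set[str]
) -> list[AgentRecord]:
    """Merge the three views into one record per agent name, sorted by name."""
    registry: dict[str, AgentRecord] = {}

    def record(name: str) -> AgentRecord:
        return registry.setdefault(name, AgentRecord(name))

    for name in state_enabled:
        rec = record(name)
        rec.enabled = True
        rec.sources.add("state")
    for name in parse_roles(agent_roles):
        rec = record(name)
        rec.in_agent_roles = True
        rec.sources.add("env")
    for name in compose_services:
        rec = record(name)
        rec.has_service = True
        rec.sources.add("compose")
    return [registry[name] for name in sorted(registry)]
```

Keeping the sources set on each record makes disagreements visible: an agent known to only one of the three views is a likely sign of drift between them.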

Complexity

Files touched: ~4 (bin/disinto, lib/hire-agent.sh, lib/generators.sh, docker/agents/entrypoint.sh)
Subsystems: CLI, compose generator, container entrypoint, project TOML schema
Estimated sub-issues: 4-5
Gluecode vs greenfield: ~80% gluecode (refactoring existing hire-agent.sh and CLI), ~20% greenfield (new agents list output, backend-first hire UX)

Risks

Breaking existing hire-an-agent: The old command must keep working during transition. Operators may have scripts that call it. Deprecation path needed.
State migration: Existing local-model agents configured via agents TOML need to work unchanged. The new system reads the same TOML — no migration required if we keep the schema.
Entrypoint.sh hardcoded list: The 6 core agents are hardcoded in multiple places (entrypoint.sh, bin/disinto). Making this dynamic requires careful testing to avoid breaking the polling loop.
TOML parsing fragility: The hire-agent.sh TOML writer uses a Python inline script. Changes to the TOML schema could break parsing if not tested.

Cost — new infra to maintain

No new services, cron jobs, or formulas. This is a refactor of existing CLI and configuration paths.
New code: disinto hire subcommand (~100 lines), disinto agents list subcommand (~80 lines), agent registry logic that unifies the three state sources (~50 lines).
Removed code: Portions of the current hire-an-agent that duplicate backend detection logic.
Ongoing: The hardcoded agent list in bin/disinto and entrypoint.sh becomes a derived list (from state files + TOML + compose). Slightly more complex discovery logic, but eliminates the need to update hardcoded lists when new agent types are added.

Recommendation

Worth it. This is a high-value, low-risk refactor that directly improves the adoption story. The current UX is the number one friction point for new operators — hire-an-agent requires knowing three things (name, role, backend) in the wrong order. The redesign makes the common case (disinto hire anthropic) a one-liner and gives operators visibility into what is running. No new infrastructure, no new dependencies, mostly gluecode over existing interfaces.

Defer only if the team wants to stabilize the current agent set first (all 4 open architect sprints are pending human review). Otherwise, this is independent work that does not conflict with any in-flight sprint.
