vision: supervisor agent running on host level with full system visibility #232
Reference: disinto-admin/disinto#232
Goal
The supervisor agent monitors factory health: system resources, Docker containers, CI pipelines, agent sessions, stale locks, and worktrees. It needs Docker socket access for container visibility, which the agents container should not have.
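As a rough illustration of the system-resource side of this monitoring, the sketch below reads memory headroom and load average from /proc-style files. It is not taken from supervisor/preflight.sh; the function names `mem_available_pct` and `load_1min` are hypothetical, and the file argument exists only so the functions can be exercised against sample data.

```shell
#!/usr/bin/env bash
# Illustrative sketch only: these function names are hypothetical,
# not part of the repo's supervisor scripts.

# Print MemAvailable as an integer percentage of MemTotal.
# $1: meminfo-style file (defaults to /proc/meminfo).
mem_available_pct() {
  awk '
    /^MemTotal:/     { total = $2 }
    /^MemAvailable:/ { avail = $2 }
    END {
      if (total <= 0) exit 1          # no usable MemTotal line found
      printf "%d\n", avail * 100 / total
    }
  ' "${1:-/proc/meminfo}"
}

# Print the 1-minute load average (first field of the file).
# $1: loadavg-style file (defaults to /proc/loadavg).
load_1min() {
  local one rest
  read -r one rest < "${1:-/proc/loadavg}"
  printf '%s\n' "$one"
}
```

Called with no arguments on a Linux host, both functions read the live /proc files; a health check could then compare the results against whatever thresholds the formula defines.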
Decision: run supervisor in the edge container
The edge container already runs infrastructure concerns (Caddy reverse proxy, vault dispatcher). Adding the supervisor here keeps the architecture clean:
This stays inside the disinto up/down lifecycle. No host-level services, no extra systemd units.
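A minimal sketch of how such a background loop could look, modeled on the while-true-with-sleep pattern this issue attributes to docker/edge/dispatcher.sh. The helper name `run_loop` and the `LOOP_MAX_ITERATIONS` variable are assumptions made for illustration (the latter only so the loop can be bounded in a test); neither exists in the repo, and the polling interval for the supervisor is not specified here.

```shell
#!/usr/bin/env bash
# Illustrative sketch only: run_loop and LOOP_MAX_ITERATIONS are
# hypothetical names, not part of the disinto repo.

# run_loop INTERVAL COMMAND [ARGS...]
# Runs COMMAND repeatedly, sleeping INTERVAL seconds between runs.
# LOOP_MAX_ITERATIONS=0 (the default) means loop forever.
run_loop() {
  local interval="$1"; shift
  local max="${LOOP_MAX_ITERATIONS:-0}"
  local n=0
  while true; do
    # A failed run is logged but must not end the loop;
    # the next iteration simply tries again.
    "$@" || echo "run_loop: command exited with status $?" >&2
    n=$((n + 1))
    if [ "$max" -gt 0 ] && [ "$n" -ge "$max" ]; then
      break
    fi
    sleep "$interval"
  done
}
```

In an entrypoint script this could be started in the background before the main process takes over, for example `run_loop 60 bash /opt/disinto/supervisor/supervisor-run.sh &`, where the 60-second interval is borrowed from the dispatcher and is only a placeholder.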
Existing code (all in the repo, ready to use)
Supervisor scripts
- supervisor/supervisor-run.sh: complete cron wrapper following the standard *-run.sh pattern. Sources lib/env.sh, lib/formula-session.sh, lib/worktree.sh, lib/guard.sh, lib/agent-sdk.sh. Uses check_active, acquire_cron_lock, check_memory, load_formula_or_profile, build_context_block, formula_prepare_profile_context, formula_worktree_setup, agent_run, profile_write_journal.
- supervisor/preflight.sh: metrics collection. Sources lib/env.sh, lib/ci-helpers.sh. Collects RAM/swap/disk/load from /proc, Docker container status, CI pipeline state, open PRs, issue status, stale worktrees.
- formulas/run-supervisor.toml: 5-step formula (preflight, health-assessment, decide-actions, report, journal). Defines P0-P4 priority thresholds and auto-fix recipes.
- supervisor/AGENTS.md: agent documentation.
Edge container (prior art for the integration pattern)
- docker/edge/entrypoint-edge.sh: starts dispatcher in background, Caddy as main process. The supervisor should follow the same pattern: start supervisor loop in background alongside dispatcher.
- docker/edge/dispatcher.sh: sources lib/env.sh, runs a while-true poll loop with sleep 60. The supervisor loop should follow this pattern (while-true with sleep, not cron, because cron doesn't inherit env vars).
- docker/edge/Dockerfile: Alpine-based, already has bash, jq, curl, git, docker-cli.
- /var/run/docker.sock:/var/run/docker.sock
Standard *-run.sh pattern (follow planner-run.sh as the cleanest reference)
Every formula agent follows the same sequence:
${DISINTO_LOG_DIR}/<agent>/<agent>.log (NOT $SCRIPT_DIR, see #210)
What needs to change
docker/edge/Dockerfile
docker-compose.yml (edge service)
docker/edge/entrypoint-edge.sh
bash /opt/disinto/supervisor/supervisor-run.sh &
supervisor/supervisor-run.sh
${DISINTO_LOG_DIR}/supervisor/supervisor.log (currently hardcoded to $SCRIPT_DIR, same bug as gardener had in #210)
supervisor/preflight.sh
Known issues to address
Acceptance criteria
Decomposed into #343 (cleanup) and #344 (edge integration). Closing.