bug: code fixes to docker/agents/ don't take effect — agent image is never rebuilt #887
Reference: disinto-admin/disinto#887
Problem
When a PR lands that modifies `docker/agents/entrypoint.sh` (or any baked-in file under `docker/agents/`), the running agent containers continue to execute the old code. The `disinto/agents:latest` image on the host was built at some past point and is never rebuilt when the repo is updated.

`docker-compose.yml` for TOML-driven agent services uses `image: ghcr.io/disinto/agents:${DISINTO_IMAGE_TAG:-latest}`, with no `build:` directive (see also #853). No registry pipeline publishes fresh images. No local rebuild is triggered by `disinto up` or `hire-an-agent`.

Result: landed fixes never reach agent containers unless an operator manually runs `docker compose build` or equivalent.

Repro
- [...] Parsed PROJECT_NAME=...).
- `disinto hire-an-agent dev-qwen2 dev ...` regenerates compose, which still references `ghcr.io/disinto/agents:latest`.
- `docker compose --profile agents-dev-qwen2 up -d agents-dev-qwen2`: container starts from the pre-fix image.
- `docker exec disinto-agents-dev-qwen2 grep 'Parsed PROJECT_NAME' /entrypoint.sh` → not present.
- `md5sum /home/johba/disinto/docker/agents/entrypoint.sh` and `docker exec ... md5sum /entrypoint.sh` return DIFFERENT hashes.

The fix for #861 is thus a no-op in production until someone remembers to rebuild. Agent containers run indefinitely on stale code.
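The hash comparison in the repro can be wrapped in a small check. This is a minimal sketch and not code from the repo: the function name `is_stale` is hypothetical, and the commented invocation assumes the container name and paths used in the repro steps.

```shell
#!/usr/bin/env bash
# Hypothetical helper: takes two md5sum output lines ("<hash>  <path>")
# and succeeds (exit 0) when the hashes differ, i.e. the container is stale.
is_stale() {
  local host_hash container_hash
  host_hash="${1%% *}"        # keep only the hash, drop the file name
  container_hash="${2%% *}"
  [ "$host_hash" != "$container_hash" ]
}

# Example use (needs docker and a running container; names taken from the repro):
#   host=$(md5sum docker/agents/entrypoint.sh)
#   ctr=$(docker exec disinto-agents-dev-qwen2 md5sum /entrypoint.sh)
#   if is_stale "$host" "$ctr"; then
#     echo "container is running a stale entrypoint.sh"
#   fi
```

Comparing only the hash field matters because the two `md5sum` lines carry different file paths and would never match as whole strings.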
Fix options
Option A: auto-build in generator
`lib/generators.sh` should emit `build: { context: ., dockerfile: docker/agents/Dockerfile }` alongside `image:`. With both directives present, `docker compose up --build` rebuilds when local files differ. Plain `docker compose up` builds only when the image is missing locally, so either the callers pass `--build` or the generated service also sets `pull_policy: build`. Covers both:
- `disinto hire-an-agent` → `docker compose --profile X up` rebuilds if needed.
- `docker compose up` on any change rebuilds.

Option B: CI pipeline publishes to ghcr
A Woodpecker job on merge to main rebuilds `ghcr.io/disinto/agents:latest` and pushes. `disinto up` pulls fresh. Requires ghcr auth configured on disinto-dev-box. Heavier lift; touches #853 (currently the registry is inaccessible, causing pull failures; workaround: `docker tag disinto/agents:latest ghcr.io/disinto/agents:latest`).

Option C: `disinto up` runs rebuild
`bin/disinto up` performs `docker compose build agents agents-<name>` before `up`. Simple but slow.

Recommend Option A as the minimum useful step; it sidesteps #853 and matches the pattern of the legacy hardcoded `agents` service (which does have `build:`).

Acceptance
- `docker/agents/entrypoint.sh` is reflected in running agent containers after an operator runs `docker compose --profile agents-<name> up -d --force-recreate agents-<name>` (no manual `docker build` needed)
- [...] hire → up)

Affected files
- `lib/generators.sh`: emit `build:` directive alongside `image:` for `_generate_local_model_services`
- `docker-compose.yml` template (if any): ensure both directives present
- `bin/disinto`: if Option C path chosen, add `docker compose build` step to `up`

Context
Caught today while chasing why #861 didn't unblock dev-qwen2 despite being merged. Host code had the fix; container didn't. Two hours of debugging a non-existent bug in code that was actually fixed but not deployed. This also explains why other landed fixes (e.g. #856 collaborator auto-add) may only work for NEW operations — any existing code paths baked into an older image are still broken until rebuild.
Related: #853 (ghcr image ref with no pull auth) — these two together mean that the only way a fix reaches agent containers today is an undocumented manual rebuild + retag sequence.
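For reference, Option A could look roughly like the following in the generator. This is a sketch under assumptions: `emit_agent_service` is a hypothetical function, not the actual code in `lib/generators.sh`, and the `pull_policy: build` line is an addition taken from the Compose specification so that `up` rebuilds even when an image with that tag already exists on the host.

```shell
#!/usr/bin/env bash
# Hypothetical generator helper: print a compose service block for one agent.
# $1 = agent name (e.g. dev-qwen2). Illustrative only, not the real generator.
emit_agent_service() {
  local name="$1"
  # Unquoted heredoc: ${name} is expanded now, while the escaped
  # \${DISINTO_IMAGE_TAG:-latest} is written out literally for compose to resolve.
  cat <<EOF
  agents-${name}:
    image: ghcr.io/disinto/agents:\${DISINTO_IMAGE_TAG:-latest}
    build:
      context: .
      dockerfile: docker/agents/Dockerfile
    pull_policy: build
    profiles: ["agents-${name}"]
EOF
}

emit_agent_service dev-qwen2
```

Because the service keeps its `image:` name, the locally built image is tagged as `ghcr.io/disinto/agents:<tag>`, which would also make the manual `docker tag` workaround from #853 unnecessary for this service.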