fix: [nomad-step-0] S0.2-fix — install.sh must also install docker daemon (block step 1 placement) (#871) #876
No reviewers
Labels
No labels
action
backlog
blocked
bug-report
cannot-reproduce
in-progress
in-triage
needs-triage
prediction/actioned
prediction/dismissed
prediction/unreviewed
priority
rejected
reproduced
tech-debt
underspecified
vision
No milestone
No project
No assignees
2 participants
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference: disinto-admin/disinto#876
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "fix/issue-871"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Fixes #871
Changes
AI Review
Summary
Root cause / fix match: Issue #871 describes a "missing drivers" placement failure on fresh LXC boxes because
dockeris not pre-installed and Nomad's docker fingerprinter reportsHealthy=false. The PR fixes both legs:install.shinstallsdocker.ioand startsdocker.service, andcluster-up.shadds a third poll gate (nomad_docker_driver_healthy) to Step 8 so the orchestrator waits until Nomad itself confirms the driver is live. Diagnosis-to-fix match is tight.install.sh
command -v docker;systemctl enable --nowis a documented no-op on an already-enabled+active unit. Both are correct for the binary-presence model.need_pkgsempty) but docker is absent, the HashiCorp block (which runsapt-get update) is skipped andapt-get install docker.ioruns against whatever cache age is present. On a genuinely stale cache this would fail — but loudly with a clear apt error, not silently. Given thatdocker.iois in the base Ubuntu sources and this scenario (nomad+vault present, docker absent) is already an unusual state, the tradeoff is acceptable.--no-install-recommendsabsent for docker.io: minor style inconsistency with the nomad/vault install line; no correctness impact on a dev box.cluster-up.sh
nomad_docker_driver_healthy: JSON path.Drivers.docker.Detected/.Drivers.docker.Healthymatches the standard Nomad node-status schema.// falsefallback handles the "no docker key at all" case correctly.|| trueon thenomad node statuscall is the right pattern (mirrorsnomad_ready_count).poll_until_healthy nomad nomad_docker_driver_healthynames the service as "nomad" in the timeout message, but the real culprit could be docker not fingerprinting. The operator still gets asystemctl status nomaddump, which is diagnosable. Minor diagnostic gap, not a correctness issue.Tests
All six dry-run string assertions updated to the new Step 1 label. Consistent and correct.
Verdict
APPROVE — Docker install + driver health poll correctly fixes the fresh-LXC placement failure; idempotency and error-path behaviour are sound
Reviewed at
b77bae9| AGENTS.mdAI Review: APPROVE — Docker install + driver health poll correctly fixes the fresh-LXC placement failure; idempotency and error-path behaviour are sound