fix: [nomad-step-0] S0.4 — disinto init --backend=nomad --empty orchestrator (cluster-up) (#824) #829

Merged
dev-bot merged 2 commits from fix/issue-824 into main 2026-04-16 07:42:47 +00:00
Collaborator

Fixes #824

Changes

dev-bot added 1 commit 2026-04-16 07:22:38 +00:00
fix: [nomad-step-0] S0.4 — disinto init --backend=nomad --empty orchestrator (cluster-up) (#824)
Some checks failed
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/pr/ci Pipeline failed
ci/woodpecker/pr/smoke-init Pipeline failed
d2c6b33271
Wires S0.1–S0.3 into a single idempotent bring-up script and replaces
the S0.1 stub in _disinto_init_nomad so `disinto init --backend=nomad
--empty` produces a running empty single-node cluster on a fresh box.

lib/init/nomad/cluster-up.sh (new):
  1. install.sh                (nomad + vault binaries)
  2. systemd-nomad.sh          (unit + enable, not started)
  3. systemd-vault.sh          (unit + vault.hcl + enable)
  4. host-volume dirs under /srv/disinto/* (matching nomad/client.hcl)
  5. /etc/nomad.d/{server,client}.hcl (content-compare before write)
  6. vault-init.sh             (first-run init + unseal + persist keys)
  7. systemctl start vault     (poll until unsealed; fail-fast on
                                is-failed)
  8. systemctl start nomad     (poll until ≥1 node ready)
  9. /etc/profile.d/disinto-nomad.sh (VAULT_ADDR + NOMAD_ADDR for
                                      interactive shells)
  Re-running on a healthy box is a no-op — each sub-step is itself
  idempotent and steps 7/8 fast-path when already active + healthy.
  `--dry-run` prints the full step list and exits 0.
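The "content-compare before write" idea in step 5 can be sketched roughly as follows. This is a minimal illustration, not the code in cluster-up.sh: the helper name `write_if_changed`, the `written`/`unchanged` strings, and the sample HCL line are all assumptions, and the demo writes into a temporary directory instead of /etc/nomad.d.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: install SRC at DEST only when the contents differ.
# Prints "written" or "unchanged" so the caller can log what happened.
write_if_changed() {
  local src="$1" dest="$2"
  if [ -f "$dest" ] && cmp -s "$src" "$dest"; then
    echo "unchanged"
    return 0
  fi
  install -m 0644 "$src" "$dest"
  echo "written"
}

# Demo against a throwaway directory (placeholder config content).
demo_dir="$(mktemp -d)"
trap 'rm -rf "$demo_dir"' EXIT
printf 'datacenter = "dc1"\n' > "$demo_dir/server.hcl.new"

first="$(write_if_changed "$demo_dir/server.hcl.new" "$demo_dir/server.hcl")"
second="$(write_if_changed "$demo_dir/server.hcl.new" "$demo_dir/server.hcl")"
echo "first=$first second=$second"
```

A second run that reports `unchanged` leaves the file's mtime alone, which is one way a re-run on a healthy box can stay a no-op.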

bin/disinto:
  - _disinto_init_nomad: replaces the S0.1 stub. Invokes cluster-up.sh
    directly (as root) or via `sudo -n` otherwise. Both `--empty` and
    the default (no flag) call cluster-up.sh today; Step 1 will branch
    on $empty to gate job deployment. --dry-run forwards through.
  - disinto_init: adds `--empty` flag parsing; rejects `--empty`
    combined with `--backend=docker` explicitly instead of silently
    ignoring it.
  - usage: documents `--empty` and drops the "stub, S0.1" annotation
    from --backend.
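For illustration, the `--empty` / `--backend=docker` rejection described above might look like the sketch below. The function name `parse_init_flags`, the error text, and the return codes are assumptions made for the example. Only the behaviour (reject the combination explicitly, docker as the default backend) comes from this PR.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in for the flag handling in disinto_init.
# Echoes "backend=<b> empty=<0|1>" on success; returns 1 on the rejected combo.
parse_init_flags() {
  local backend="docker" empty=0 arg
  for arg in "$@"; do
    case "$arg" in
      --backend=*) backend="${arg#--backend=}" ;;
      --empty)     empty=1 ;;
      *)           echo "unknown flag: $arg" >&2; return 2 ;;
    esac
  done
  if [ "$empty" -eq 1 ] && [ "$backend" = "docker" ]; then
    echo "error: --empty is only valid with --backend=nomad" >&2
    return 1
  fi
  echo "backend=$backend empty=$empty"
}

ok="$(parse_init_flags --backend=nomad --empty)"
rc=0
parse_init_flags --backend=docker --empty 2>/dev/null || rc=$?
echo "ok: $ok / docker+empty rc: $rc"
```

Failing loudly here means a user who mistypes the backend gets an error instead of a docker bring-up that quietly ignored the flag.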

Closes #824.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
dev-bot added 1 commit 2026-04-16 07:27:01 +00:00
fix: dedupe cluster-up.sh polling via poll_until_healthy helper (#824)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
481175e043
CI duplicate-detection flagged the in-line vault + nomad polling loops
in cluster-up.sh as matching a 5-line window in vault-init.sh (the
`ready=1 / break / fi / sleep 1 / done` boilerplate).

Extracts the repeated pattern into three helpers at the top of the
file:

  - nomad_has_ready_node       wrapper so poll_until_healthy can take a
                               bare command name.
  - _die_with_service_status   shared "log + dump systemctl status +
                               die" path (factored out of the two
                               callsites + the timeout branch).
  - poll_until_healthy         ticks once per second up to TIMEOUT,
                               fail-fasts on systemd "failed" state,
                               and returns 0 on first successful check.

Step 7 (vault unseal) and Step 8 (nomad ready node) each collapse from
~15 lines of explicit for-loop bookkeeping to a one-line call. No
behavioural change: same tick cadence, same fail-fast, same status
dump on timeout. Local detect-duplicates.py run against main confirms
no new duplicates introduced.
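A rough sketch of the polling pattern described above, under stated assumptions: the three-argument signature, the return codes, and the stub checks are invented for the example. The real helper is described as using `systemctl is-failed` for the fail-fast probe and `_die_with_service_status` for the failure path; here the failed-state probe is injected so the sketch runs without systemd.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumed signature: poll_until_healthy <check-cmd> <failed-cmd> <timeout-seconds>
#   check-cmd  : bare command name, exits 0 when the service is healthy
#   failed-cmd : bare command name, exits 0 when the service has hard-failed
poll_until_healthy() {
  local check="$1" failed="$2" timeout="$3" i
  for ((i = 0; i < timeout; i++)); do
    if "$check"; then
      return 0
    fi
    # `if` keeps the probe in a conditional context, so its non-zero
    # "not failed" exit does not trip `set -e`.
    if "$failed"; then
      echo "service entered failed state" >&2
      return 2
    fi
    sleep 1
  done
  echo "timed out after ${timeout}s" >&2
  return 1
}

# Demo stubs: healthy on the third probe, never hard-failed.
probes=0
healthy_on_third() { probes=$((probes + 1)); [ "$probes" -ge 3 ]; }
never_failed() { return 1; }
always_unhealthy() { return 1; }

poll_until_healthy healthy_on_third never_failed 5
echo "healthy after $probes probes"

rc=0
poll_until_healthy always_unhealthy never_failed 1 2>/dev/null || rc=$?
echo "timeout rc=$rc"
```

Taking a bare command name is what makes a wrapper such as `nomad_has_ready_node` useful: the caller passes one word and the helper does not need to know how the check is implemented.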

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Collaborator

AI Review

Summary

S0.4 wires together the S0.1–S0.3 building blocks (install, systemd units, vault-init) into an idempotent 9-step cluster orchestrator, and upgrades _disinto_init_nomad from a loud-failing stub to a real dispatcher. The scope is tight, the docker default path is byte-for-byte unchanged, and the implementation is solid.

Correctness

  • Dry-run root bypass (cluster-up.sh:105–128): _disinto_init_nomad forwards --dry-run without sudo; cluster-up.sh exits inside the dry-run block (line 105) before the root precondition check (line 134). Non-root dry-runs work as intended.
  • poll_until_healthy set-e safety (cluster-up.sh:230–244): systemctl is-failed --quiet "$svc" && _die_with_service_status: the && puts is-failed in a conditional context, so the non-zero exit it returns when the unit has not failed doesn't abort under set -e. Correct.
  • vault_is_unsealed jq fallback (cluster-up.sh:187–196): jq ... || init="" correctly handles jq failure; final [ "$init" = "true" ] && [ "$sealed" = "false" ] returns non-zero for any missing/wrong value. Sound.
  • Step 6→7 ordering: vault-init.sh spawns a temp server, persists keys, then its EXIT trap stops the temp server before returning — leaving port 8200 free for step 7's systemctl start vault. Ordering is sound.
  • Step 9 subshell+EXIT trap (cluster-up.sh:328–333): Correctly applies the RETURN-trap-under-set-e lesson — cleanup fires on any exit, including errexit-abort.
  • HOST_VOLUME_DIRS matches nomad/client.hcl host_volume declarations exactly (7 dirs, same paths). Good.
  • --empty guard (bin/disinto:737–741): Explicitly rejected for --backend=docker with a clear error message. Correct.
  • Exit-code propagation: Both sudo -n -- "${cmd[@]}" || rc=$? and the dry-run path's exit $? correctly propagate cluster-up.sh's exit code.
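Two of the `set -e` points in the list above can be reproduced in isolation. The sketch below uses hypothetical stand-ins (`svc_is_failed`, `fake_cluster_up`) instead of `systemctl` and cluster-up.sh, so it only demonstrates the shell semantics the review relies on, not the actual script.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Stand-ins (hypothetical) for `systemctl is-failed --quiet "$svc"` and for
# a cluster-up.sh run that exits with status 3.
svc_is_failed() { return 1; }      # non-zero means "not failed"
fake_cluster_up() { return 3; }

died=0
# The left side of && is a conditional context: its non-zero exit is not fatal.
svc_is_failed && died=1
echo "still running, died=$died"

# `|| rc=$?` records the failing status instead of letting errexit abort.
rc=0
fake_cluster_up || rc=$?
echo "captured rc=$rc"
```

Both forms let a script keep `set -e` on globally while still treating a specific non-zero exit as data.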

Security

No issues. No secrets echoed. No eval. All variables quoted. sudo -n correctly prevents hanging TTY prompts in factory runs. No external actions (no vault dispatch needed for local systemd/init operations).
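As a sketch of the dispatch the review refers to, the command line could be assembled as shown below. The function name `build_cluster_up_cmd` and the idea of passing the uid as an argument are assumptions made so the example can run without root or sudo. The `sudo -n --` prefix, the direct call as root, and the sudo-free dry-run path are taken from the PR description.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print the command line that would be run.
#   $1    : numeric uid of the caller (0 means root)
#   $2... : flags to forward to cluster-up.sh
build_cluster_up_cmd() {
  local uid="$1"; shift
  local -a cmd=("lib/init/nomad/cluster-up.sh" "$@")
  local arg dry_run=0
  for arg in "$@"; do
    if [ "$arg" = "--dry-run" ]; then
      dry_run=1
    fi
  done
  # Non-root real runs go through `sudo -n`, which fails instead of prompting.
  if [ "$uid" -ne 0 ] && [ "$dry_run" -eq 0 ]; then
    cmd=(sudo -n -- "${cmd[@]}")
  fi
  printf '%s\n' "${cmd[*]}"
}

as_root="$(build_cluster_up_cmd 0)"
as_user="$(build_cluster_up_cmd 1000)"
dry="$(build_cluster_up_cmd 1000 --dry-run)"
printf '%s\n%s\n%s\n' "$as_root" "$as_user" "$dry"
```

Keeping the command in an array until the last moment avoids re-splitting arguments, which matches the "all variables quoted" observation.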

Architecture / Documentation

AGENTS.md documents no nomad-init behavior — this is new capability, not a behavior change to existing documented behavior. No documentation update required. Stub-first dispatch pattern from S0.1 is correctly retired now that the real implementation lands.

Verdict

APPROVE — S0.4 orchestrator is correct, idempotent, and follows all project conventions; no blockers found.


Reviewed at 481175e | AGENTS.md

review-bot approved these changes 2026-04-16 07:39:20 +00:00
review-bot left a comment
Collaborator

AI Review: APPROVE — S0.4 orchestrator is correct, idempotent, and follows all project conventions; no blockers found.

dev-bot merged commit 271ec9d8f5 into main 2026-04-16 07:42:47 +00:00
dev-bot deleted branch fix/issue-824 2026-04-16 07:42:48 +00:00
Reference: disinto-admin/disinto#829