[nomad-step-2] S2.5 — bin/disinto init --import-env / --import-sops / --age-key wire-up #883

Closed
opened 2026-04-16 15:26:34 +00:00 by dev-bot · 3 comments
Collaborator

Part of the Nomad+Vault migration. Step 2 — Vault policies + workload identity + secrets import.

Previously blocked by: #880 (S2.2), #881 (S2.3). Both dependencies are closed; this issue is unblocked.

Goal

Wire the Step-2 building blocks (import, auth, policies) into bin/disinto init --backend=nomad so that a single command on a fresh LXC brings up the cluster, applies policies, configures auth, imports secrets, and deploys services.

Scope

Add flags to disinto init --backend=nomad:

  • --import-env PATH — points at an existing .env (from old stack).
  • --import-sops PATH — points at the sops-encrypted .env.vault.enc.
  • --age-key PATH — points at the sops age keyfile (required if --import-sops is set).
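A minimal sketch of how the three flags might be collected, assuming each flag takes its PATH as a separate argument. The function name parse_import_flags and the variable names are hypothetical and are not taken from bin/disinto; other init flags are simply skipped here.

```bash
#!/usr/bin/env bash
# Hypothetical parser sketch; parse_import_flags is not a function from bin/disinto.
# It fills three globals and ignores every flag it does not know about.
parse_import_flags() {
  import_env=""
  import_sops=""
  age_key=""
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --import-env|--import-sops|--age-key)
        # Each of the three flags needs a PATH after it.
        if [ "$#" -lt 2 ]; then
          echo "error: $1 requires a PATH argument" >&2
          return 2
        fi
        case "$1" in
          --import-env)  import_env="$2" ;;
          --import-sops) import_sops="$2" ;;
          --age-key)     age_key="$2" ;;
        esac
        shift 2
        ;;
      *)
        # Other init flags (--backend, --with, --dry-run) are out of scope for this sketch.
        shift
        ;;
    esac
  done
}

parse_import_flags --backend=nomad --import-env /tmp/.env --import-sops /tmp/.enc --age-key /tmp/keys.txt
echo "env=$import_env sops=$import_sops key=$age_key"
```

Keeping parsing separate from validation means the cross-flag rules can be checked once, after all arguments have been read.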

Flow when any of --import-* is set:

  1. cluster-up.sh (Step 0, unchanged).
  2. tools/vault-apply-policies.sh (S2.1, idempotent).
  3. lib/init/nomad/vault-nomad-auth.sh (S2.3, idempotent).
  4. tools/vault-import.sh --env PATH --sops PATH --age-key PATH (S2.2).
  5. If --with <service> was also passed, lib/init/nomad/deploy.sh <service> (Step 1, unchanged).
  6. Final summary: cluster + policies + auth + imported secrets count + deployed services + ports.

Flow when no import flags are set:

  • Skip step 4; still apply policies + auth.
  • Log: [import] no --import-env/--import-sops — skipping; set them or seed kv/disinto/* manually before deploying secret-dependent services.
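The two flows above can be sketched as a single plan function that only prints the ordered steps. This is an illustration under stated assumptions: plan_nomad_init is a hypothetical name, the skip message is abbreviated from the log line quoted above, and whether tools/vault-import.sh accepts --env without --sops is assumed rather than confirmed by this issue.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: prints the step order, never executes anything.
# Arguments: import_env import_sops age_key service (empty string = not set).
plan_nomad_init() {
  local import_env="$1" import_sops="$2" age_key="$3" service="$4"

  echo "cluster-up.sh"                        # step 1 (Step 0, unchanged)
  echo "tools/vault-apply-policies.sh"        # step 2 (S2.1)
  echo "lib/init/nomad/vault-nomad-auth.sh"   # step 3 (S2.3)

  if [ -n "$import_env" ] || [ -n "$import_sops" ]; then
    # Step 4: pass only the inputs that were actually provided (assumption).
    local import_cmd="tools/vault-import.sh"
    if [ -n "$import_env" ]; then
      import_cmd="$import_cmd --env $import_env"
    fi
    if [ -n "$import_sops" ]; then
      import_cmd="$import_cmd --sops $import_sops --age-key $age_key"
    fi
    echo "$import_cmd"
  else
    # Abbreviated form of the skip log line from the issue text.
    echo "[import] skipping: no --import-env/--import-sops"
  fi

  if [ -n "$service" ]; then
    echo "lib/init/nomad/deploy.sh $service"  # step 5 (Step 1, unchanged)
  fi
}

plan_nomad_init /tmp/.env /tmp/.enc /tmp/keys.txt forgejo
```

Policies and auth appear in both flows; only step 4 depends on the import flags.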

Flag validation:

  • --import-sops without --age-key → error.
  • --age-key without --import-sops → error.
  • --import-env alone (no sops) → OK.
  • --backend=docker + any --import-* → error.
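The validation rules above could be expressed roughly as follows. This is a sketch, not the code in bin/disinto: the function name validate_import_flags and the error wording are assumptions, and an empty string stands for "flag not passed".

```bash
#!/usr/bin/env bash
# Hypothetical validator sketch. Arguments: backend import_env import_sops age_key.
# Returns 0 when the combination is allowed, 1 (with a message on stderr) otherwise.
validate_import_flags() {
  local backend="$1" import_env="$2" import_sops="$3" age_key="$4"

  # --backend=docker + any --import-* is an error.
  if [ "$backend" = "docker" ] && { [ -n "$import_env" ] || [ -n "$import_sops" ]; }; then
    echo "error: --import-env/--import-sops require --backend=nomad" >&2
    return 1
  fi
  # --import-sops without --age-key is an error.
  if [ -n "$import_sops" ] && [ -z "$age_key" ]; then
    echo "error: --import-sops requires --age-key" >&2
    return 1
  fi
  # --age-key without --import-sops is an error.
  if [ -n "$age_key" ] && [ -z "$import_sops" ]; then
    echo "error: --age-key requires --import-sops" >&2
    return 1
  fi
  # --import-env alone (no sops) falls through as OK.
  return 0
}

validate_import_flags nomad /tmp/.env "" "" && echo "env-only: ok"
validate_import_flags nomad "" /tmp/.enc "" || echo "sops without age key: rejected"
```

Running the checks before step 1 means an invalid combination fails before anything on the host is touched.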

Affected files

  • bin/disinto — add --import-env, --import-sops, --age-key flags to init --backend=nomad
  • docs/nomad-migration.md (new) — cutover-day invocation shape
  • lib/init/nomad/vault-nomad-auth.sh (S2.3) — called as step 3
  • tools/vault-import.sh (S2.2) — called as step 4
  • tools/vault-apply-policies.sh (S2.1) — called as step 2

Acceptance criteria

  • disinto init --backend=nomad --import-env /tmp/.env --import-sops /tmp/.enc --age-key /tmp/keys.txt --with forgejo completes: cluster up, policies applied, JWT auth configured, KV populated, Forgejo deployed reading Vault secrets
  • Re-running is a no-op at every layer
  • --import-sops without --age-key exits with a clear error
  • --backend=docker with --import-env exits with a clear error
  • --dry-run prints the full plan, touches nothing
  • Never logs a secret value
  • shellcheck clean
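One way to reconcile "imported secrets count" in the summary with "never logs a secret value" is to report only key counts and key names. The helpers below are a hedged illustration: count_env_keys and list_env_keys are hypothetical names, and the KEY=VALUE (optionally export-prefixed) file format is an assumption about the old stack's .env.

```bash
#!/usr/bin/env bash
# Hypothetical helpers; not taken from tools/vault-import.sh.

# Print the number of KEY=VALUE assignments in a dotenv-style file.
# Comment lines and blank lines do not match and are not counted.
count_env_keys() {
  grep -cE '^(export[[:space:]]+)?[A-Za-z_][A-Za-z0-9_]*=' "$1" || true
}

# Print only the key names, one per line; everything after "=" is dropped by sed.
list_env_keys() {
  sed -nE 's/^(export[[:space:]]+)?([A-Za-z_][A-Za-z0-9_]*)=.*/\2/p' "$1"
}
```

A summary line built from these, for example "[import] N keys", carries no secret material even if the output is captured by CI logs.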

Prior art round 2 — abandoned PR #902 (closed, branch fix/issue-883 kept at a8d18aa3)

dev-qwen2 took this after the round-1 unblock. Fixed tests 18 + 22 but regressed test 23. 24/25 passing.

Remaining failing test (from pipeline #1072):

ok 18 --import-env only is accepted
...
ok 22 --import-sops --age-key --dry-run shows import plan
not ok 23 --import-env --import-sops --age-key --dry-run shows full import plan
  # line 233: [[ "$output" == *"env file: /tmp/.env"* ]] failed

Test 23 expects the dry-run output to contain the literal string env file: /tmp/.env when both --import-env and --import-sops are passed. The dry-run print needs to echo each import input's path, not just one. Likely the current code has an if/elif that only prints one branch.

Two consecutive llama attempts (dev-qwen, then dev-qwen2) each hit CI exhaustion on text-matching tests. Subtle string-output matching seems to be a weak spot for the llama agent on this codebase — likely because partial credit (some tests pass, one doesn't) doesn't provide a clear signal in the retry loop. A Claude-backed dev-bot would likely fix all three tests (18, 22, 23) in a single pass — they're the same class of bug.

Minimal fix: in _disinto_init_nomad's dry-run branch, print one line per import flag independently when its value is set, not in if/elif. Something like:

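if [ "$dry_run" = true ]; then
  echo "[nomad] --backend=nomad"
  # Independent checks (no if/elif): every import input that is set gets its own line.
  # Plain if statements also avoid a non-zero status from a trailing "[ ... ] && echo" under set -e.
  if [ -n "${import_env:-}" ];  then echo "  env file: $import_env"; fi
  if [ -n "${import_sops:-}" ]; then echo "  sops file: $import_sops"; fi
  if [ -n "${age_key:-}" ];     then echo "  age key: $age_key"; fi
  # ... rest of plan ...
fi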
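The difference between the two shapes can be checked in isolation against the test 23 assertion. In this sketch print_plan_elif is a guess at the failing code's structure (the actual code on branch fix/issue-883 is not quoted in this issue), and both function names are hypothetical.

```bash
#!/usr/bin/env bash
# Guessed buggy shape: with elif, only the first matching input is printed.
print_plan_elif() {
  local import_env="$1" import_sops="$2"
  if [ -n "$import_sops" ]; then
    echo "  sops file: $import_sops"
  elif [ -n "$import_env" ]; then
    echo "  env file: $import_env"
  fi
}

# Fixed shape: each input is tested on its own, so both lines appear.
print_plan_independent() {
  local import_env="$1" import_sops="$2"
  if [ -n "$import_env" ];  then echo "  env file: $import_env"; fi
  if [ -n "$import_sops" ]; then echo "  sops file: $import_sops"; fi
}

# Same substring match as the failing assertion in test 23.
output="$(print_plan_elif /tmp/.env /tmp/.enc)"
[[ "$output" == *"env file: /tmp/.env"* ]] || echo "elif: env file line missing"

output="$(print_plan_independent /tmp/.env /tmp/.enc)"
[[ "$output" == *"env file: /tmp/.env"* ]] && echo "independent: env file line present"
```

With only one input set, both shapes print the same line, which would be consistent with tests 18 and 22 passing while test 23 fails.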

Two llama rounds have now been lost to the same class of bug. If a third claim happens via llama, expect to unblock again; consider manually closing any fresh llama PR on this issue and waiting for dev-bot to pick it up.

dev-bot added the backlog label 2026-04-16 15:26:34 +00:00
dev-qwen self-assigned this 2026-04-16 17:39:17 +00:00
dev-qwen added in-progress and removed backlog labels 2026-04-16 17:39:17 +00:00
Collaborator

Blocked — issue #883

| Field | Value |
|---|---|
| Exit reason | ci_exhausted_poll (3 attempts, PR #899) |
| Timestamp | 2026-04-16T17:51:07Z |
dev-qwen2 added blocked and removed in-progress labels 2026-04-16 17:51:08 +00:00
dev-qwen was unassigned by dev-bot 2026-04-16 18:03:49 +00:00
dev-bot added backlog and removed blocked labels 2026-04-16 18:03:49 +00:00
dev-qwen2 self-assigned this 2026-04-16 18:04:16 +00:00
dev-qwen2 added in-progress and removed backlog labels 2026-04-16 18:04:17 +00:00
Collaborator

Blocked — issue #883

| Field | Value |
|---|---|
| Exit reason | ci_exhausted |
| Timestamp | 2026-04-16T18:11:14Z |
dev-qwen added blocked and removed in-progress labels 2026-04-16 18:11:15 +00:00
gardener-bot added backlog and removed blocked labels 2026-04-16 18:17:55 +00:00
Collaborator

Blocked — issue #883

| Field | Value |
|---|---|
| Exit reason | ci_exhausted |
| Timestamp | 2026-04-16T18:26:46Z |
dev-qwen2 added the blocked label 2026-04-16 18:26:46 +00:00
dev-qwen2 was unassigned by dev-bot 2026-04-16 18:34:53 +00:00
dev-bot removed the blocked label 2026-04-16 18:34:53 +00:00
dev-bot self-assigned this 2026-04-16 18:51:13 +00:00
dev-bot added in-progress and removed backlog labels 2026-04-16 18:51:14 +00:00
dev-bot removed their assignment 2026-04-16 19:44:46 +00:00
Reference: disinto-admin/disinto#883