fix: [nomad-step-0] S0.2 — install nomad + systemd unit + nomad/server.hcl/client.hcl (#822)
Lands the Nomad install + baseline HCL config for the single-node factory
dev box. Nothing is wired into `disinto init` yet — S0.4 does that.
- lib/init/nomad/install.sh: idempotent apt install pinned to
NOMAD_VERSION (default 1.9.5). Adds HashiCorp apt keyring and sources
list only if absent; fast-paths when the pinned version is already
installed.
- lib/init/nomad/systemd-nomad.sh: writes /etc/systemd/system/nomad.service
(rewrites only when content differs), creates /etc/nomad.d and
/var/lib/nomad, runs `systemctl enable nomad` WITHOUT starting.
- nomad/server.hcl: single-node combined server+client role. bootstrap_expect=1,
localhost bind, default ports pinned explicitly, UI enabled. No TLS/ACL —
factory dev box baseline.
- nomad/client.hcl: Docker task driver (allow_privileged=false, volumes
enabled) and host_volume pre-wiring for forgejo-data, woodpecker-data,
agent-data, project-repos, caddy-data, chat-history, ops-repo under
/srv/disinto/*.
Verified: `nomad config validate nomad/*.hcl` reports "Configuration is
valid!" (with expected TLS/bootstrap warnings for a dev box). Shellcheck
clean across the repo.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 06:04:02 +00:00
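The "rewrites only when content differs" behaviour described in the commit message above can be sketched as follows. This is a minimal illustration, not the contents of lib-systemd.sh: the function name `write_if_changed` and its printed status words are hypothetical, and the real helper may differ.

```shell
#!/usr/bin/env bash
# Hypothetical helper (illustration only): write_if_changed PATH CONTENT
# Prints "unchanged" when PATH already holds CONTENT, otherwise replaces the
# file atomically and prints "written". A caller could use the result to
# decide whether a `systemctl daemon-reload` is needed.
write_if_changed() {
  local path="$1" desired="$2" tmp

  # $(cat ...) strips trailing newlines, so compare against content that
  # has no trailing newline either.
  if [ -f "$path" ] && [ "$(cat "$path")" = "$desired" ]; then
    printf 'unchanged\n'
    return 0
  fi

  # Write to a temp file in the same directory, then rename over the target
  # so readers never see a half-written file.
  tmp="$(mktemp "${path}.XXXXXX")"
  printf '%s\n' "$desired" > "$tmp"
  chmod 0644 "$tmp"
  mv "$tmp" "$path"
  printf 'written\n'
}
```

Calling it twice with the same content prints `written` and then `unchanged`, which is the property that makes it safe to run on every boot.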
#!/usr/bin/env bash
# =============================================================================
# lib/init/nomad/systemd-nomad.sh — Idempotent systemd unit installer for Nomad
#
# Part of the Nomad+Vault migration (S0.2, issue #822). Writes
# /etc/systemd/system/nomad.service pointing at /etc/nomad.d/ and runs
# `systemctl enable nomad` WITHOUT starting the service — we don't launch
# the cluster until S0.4 wires everything together.
#
# Idempotency contract:
#   - Existing unit file is NOT rewritten when on-disk content already
#     matches the desired content (avoids spurious `daemon-reload`).
#   - `systemctl enable` on an already-enabled unit is a no-op.
#   - This script is safe to run unconditionally before every factory boot.
#
# Preconditions:
#   - nomad binary installed (see lib/init/nomad/install.sh)
#   - /etc/nomad.d/ will hold server.hcl / client.hcl (placed by S0.4)
#
# Usage:
#   sudo lib/init/nomad/systemd-nomad.sh
#
# Exit codes:
#   0  success (unit installed + enabled, or already so)
#   1  precondition failure (not root, no systemctl, no nomad binary)
# =============================================================================
set -euo pipefail

UNIT_PATH="/etc/systemd/system/nomad.service"
NOMAD_CONFIG_DIR="/etc/nomad.d"
NOMAD_DATA_DIR="/var/lib/nomad"

log() { printf '[systemd-nomad] %s\n' "$*"; }
die() { printf '[systemd-nomad] ERROR: %s\n' "$*" >&2; exit 1; }

fix: [nomad-step-0] S0.3 — install vault + systemd auto-unseal + vault-init.sh (dev-persisted seal) (#823)
Adds the Vault half of the factory-dev-box bringup, landed but not started
(per the install-but-don't-start pattern used for nomad in #822):
- lib/init/nomad/install.sh — now also installs vault from the shared
HashiCorp apt repo. VAULT_VERSION pinned (1.18.5). Fast-path skips apt
entirely when both binaries are at their pins; partial upgrades only
touch the package that drifted.
- nomad/vault.hcl — single-node config: file storage backend at
/var/lib/vault/data, localhost listener on :8200, ui on, mlock kept on.
No TLS / HA / audit yet; those land in later steps.
- lib/init/nomad/systemd-vault.sh — writes /etc/systemd/system/vault.service
(Type=notify, ExecStartPost auto-unseals from /etc/vault.d/unseal.key,
CAP_IPC_LOCK granted for mlock), deploys nomad/vault.hcl to
/etc/vault.d/, creates /var/lib/vault/data (0700 root), enables the
unit without starting it. Idempotent via content-compare.
- lib/init/nomad/vault-init.sh — first-run init: spawns a temporary
`vault server` if not already reachable, runs operator-init with
key-shares=1/threshold=1, persists unseal.key + root.token (0400 root),
unseals once in-process, shuts down the temp server. Re-run detects
initialized + unseal.key present → no-op. Initialized but key missing
is a hard failure (can't recover).
lib/hvault.sh already defaults VAULT_TOKEN to /etc/vault.d/root.token
when the env var is absent, so no change needed there.
Seal model: the single unseal key lives on disk; seal-key theft equals
vault theft. Factory-dev-box-acceptable tradeoff — avoids running a
second Vault to auto-unseal the first.
Blocks S0.4 (#824).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 06:29:55 +00:00
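The re-run behaviour that the commit message above describes for vault-init.sh (initialize on first run, no-op when initialized with the key present, hard failure when initialized without the key) can be sketched as a small decision function. This is an illustration under stated assumptions, not the script itself: the name `vault_init_action`, the "true"/"false" inputs, and the printed action words are hypothetical. The commit message does not say what happens when a key file exists but Vault is uninitialized; this sketch assumes that case initializes.

```shell
#!/usr/bin/env bash
# Hypothetical sketch (not the contents of vault-init.sh).
# vault_init_action INITIALIZED KEY_PRESENT
#   INITIALIZED  "true" if Vault reports it is already initialized
#   KEY_PRESENT  "true" if the persisted unseal key file exists
# Prints the action to take; returns 1 for the unrecoverable state.
vault_init_action() {
  local initialized="$1" key_present="$2"

  if [ "$initialized" != "true" ]; then
    # First run: operator init would happen here.
    printf 'init\n'
  elif [ "$key_present" = "true" ]; then
    # Already initialized and the key is on disk: nothing to do.
    printf 'noop\n'
  else
    # Initialized but the unseal key is gone: cannot recover.
    printf 'fail\n'
    return 1
  fi
}
```

Keeping the decision separate from the probing makes the three states easy to exercise without a running Vault.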
# shellcheck source=lib-systemd.sh
. "$(dirname "${BASH_SOURCE[0]}")/lib-systemd.sh"

# ── Preconditions ────────────────────────────────────────────────────────────
systemd_require_preconditions "$UNIT_PATH"

NOMAD_BIN="$(command -v nomad 2>/dev/null || true)"
[ -n "$NOMAD_BIN" ] \
  || die "nomad binary not found — run lib/init/nomad/install.sh first"

# ── Desired unit content ─────────────────────────────────────────────────────
# Upstream-recommended baseline (https://developer.hashicorp.com/nomad/docs/install/production/deployment-guide)
# trimmed for a single-node combined server+client dev box.
#   - Wants=/After= network-online: nomad must have networking up.
#   - User/Group=root: the Docker driver needs root to talk to dockerd.
#   - LimitNOFILE/LimitNPROC=infinity: avoid Nomad's startup warning.
#   - KillSignal=SIGINT: triggers Nomad's graceful shutdown path.
#   - Restart=on-failure with a bounded burst to avoid crash-loops eating the
#     journal when /etc/nomad.d/ is mis-configured.
read -r -d '' DESIRED_UNIT <<EOF || true
[Unit]
Description=Nomad
Documentation=https://developer.hashicorp.com/nomad/docs
Wants=network-online.target
After=network-online.target

# When Docker is present, ensure dockerd is up before nomad starts — the
# Docker task driver needs the daemon socket available at startup.
Wants=docker.service
After=docker.service

[Service]
Type=notify
User=root
Group=root
ExecReload=/bin/kill -HUP \$MAINPID
ExecStart=${NOMAD_BIN} agent -config=${NOMAD_CONFIG_DIR}
KillMode=process
KillSignal=SIGINT
LimitNOFILE=infinity
LimitNPROC=infinity
Restart=on-failure
RestartSec=2
StartLimitBurst=3
StartLimitIntervalSec=10
TasksMax=infinity
OOMScoreAdjust=-1000

[Install]
WantedBy=multi-user.target
EOF

# ── Ensure config + data dirs exist ──────────────────────────────────────────
# We do not populate /etc/nomad.d/ here (that's S0.4). We do create the
# directory so `nomad agent -config=/etc/nomad.d` doesn't error if the unit
# is started before hcl files are dropped in.
for d in "$NOMAD_CONFIG_DIR" "$NOMAD_DATA_DIR"; do
  if [ ! -d "$d" ]; then
    log "creating ${d}"
    install -d -m 0755 "$d"
  fi
done

# ── Install + reload + enable (shared with systemd-vault.sh via lib-systemd) ─
systemd_install_unit "$UNIT_PATH" "nomad.service" "$DESIRED_UNIT"

log "done — unit installed and enabled (NOT started; S0.4 brings the cluster up)"