[nomad-step-2] S2.3 — vault-nomad-auth.sh (enable JWT auth + roles + nomad workload identity) #881

Closed
opened 2026-04-16 15:26:33 +00:00 by dev-bot · 0 comments

Part of the Nomad+Vault migration. Step 2 — Vault policies + workload identity + secrets import.

Goal

Enable Vault's JWT auth method and configure Nomad workload identity so Nomad jobs can exchange their short-lived identity tokens for Vault tokens with the right policies attached — no shared VAULT_TOKEN in job env.

Scope

Create lib/init/nomad/vault-nomad-auth.sh:

  • Enable Vault JWT auth at path jwt-nomad:
    vault auth enable -path=jwt-nomad jwt
    
  • Configure the JWKS endpoint + issuer (Nomad's default workload-identity signer):
    vault write auth/jwt-nomad/config \
      jwks_url="http://127.0.0.1:4646/.well-known/jwks.json" \
      jwt_supported_algs="RS256" \
      default_role="default"
    
  • Create one Vault role per policy that a Nomad job will attach, e.g.:
    • vault write auth/jwt-nomad/role/service-forgejo ... with bound_audiences=["vault.io"], bound_claims filtering on nomad_namespace/nomad_job_id, token_policies=["service-forgejo"], token_ttl=1h, token_max_ttl=24h.
    • Same shape for service-woodpecker, bot-*, dispatcher, and each runner-<SECRET>.
  • Update nomad/server.hcl (from S0.2) to add a vault stanza:
    vault {
      enabled               = true
      address               = "http://127.0.0.1:8200"
      default_identity {
        aud      = ["vault.io"]
        ttl      = "1h"
      }
    }
    
    Reload nomad once after this change lands (idempotent via systemctl kill -s SIGHUP).
  • Idempotent: if jwt-nomad auth is already enabled and config hash matches, no-op with [vault-auth] jwt-nomad already configured. Role diffs are per-role, same pattern.
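
The enable-and-configure steps and the no-op behaviour above could be sketched roughly as follows. This is a minimal illustration, not the final script: the function names are hypothetical, and comparing only jwks_url stands in for the "config hash" check described in the bullet.

```shell
#!/usr/bin/env bash
set -euo pipefail

AUTH_PATH="jwt-nomad"                                     # mount path from this issue
JWKS_URL="http://127.0.0.1:4646/.well-known/jwks.json"    # Nomad JWKS endpoint from this issue

# True if an auth method is already mounted at $AUTH_PATH.
auth_mounted() {
  # Column 1 of `vault auth list` is the mount path with a trailing slash.
  vault auth list | awk -v p="${AUTH_PATH}/" '$1 == p { found = 1 } END { exit !found }'
}

# True if the stored jwks_url matches the desired one.
# Assumption: a single-field comparison replaces the full config hash.
config_matches() {
  local current
  current="$(vault read -field=jwks_url "auth/${AUTH_PATH}/config" 2>/dev/null || true)"
  [ "$current" = "$JWKS_URL" ]
}

ensure_jwt_auth() {
  if auth_mounted && config_matches; then
    echo "[vault-auth] ${AUTH_PATH} already configured"
    return 0
  fi
  # Enable the mount only when it is missing, then (re)write the config.
  auth_mounted || vault auth enable -path="${AUTH_PATH}" jwt
  vault write "auth/${AUTH_PATH}/config" \
    jwks_url="$JWKS_URL" \
    jwt_supported_algs="RS256" \
    default_role="default"
  echo "[vault-auth] ${AUTH_PATH} configured"
}
```

Calling ensure_jwt_auth twice in a row should write on the first call and only print the "already configured" line on the second, provided VAULT_ADDR and a suitable token are set in the environment.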

Create companion script tools/vault-apply-roles.sh that reads the role list from a single declarative file vault/roles.yaml (or HCL equivalent) so the role-to-policy bindings aren't buried in shell. That file lists each Vault-role name + associated policy. This keeps S2.1 policies and S2.3 role bindings in sync at review time.
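
One possible shape for tools/vault-apply-roles.sh is sketched below. Several things here are assumptions rather than decisions from this issue: the flat roles.yaml layout shown in the comment, the extra job_id key (the issue only calls for role name + policy), the "default" namespace, and the choice of nomad_job_id as user_claim. Per-role diffing is left out for brevity.

```shell
#!/usr/bin/env bash
set -euo pipefail

AUTH_PATH="jwt-nomad"

# Print "role policy job_id" triples from a roles file.
# Assumed (hypothetical) layout, keys in exactly this order:
#   roles:
#     - name: service-forgejo
#       policy: service-forgejo
#       job_id: forgejo
list_roles() {
  awk '
    $1 == "-" && $2 == "name:" { name = $3 }
    $1 == "policy:"            { policy = $2 }
    $1 == "job_id:"            { print name, policy, $2 }
  ' "$1"
}

# Render the role body as JSON; bound_claims is a map, so JSON is simpler
# than key=value arguments on the command line.
role_json() {
  local policy="$1" job_id="$2"
  cat <<EOF
{
  "role_type": "jwt",
  "user_claim": "nomad_job_id",
  "bound_audiences": ["vault.io"],
  "bound_claims": { "nomad_namespace": "default", "nomad_job_id": "${job_id}" },
  "token_policies": ["${policy}"],
  "token_ttl": "1h",
  "token_max_ttl": "24h"
}
EOF
}

# Write one role; `vault write <path> -` reads the JSON body from stdin.
apply_role() {
  local role="$1" policy="$2" job_id="$3"
  role_json "$policy" "$job_id" | vault write "auth/${AUTH_PATH}/role/${role}" -
  echo "[vault-auth] role ${role} written"
}

apply_roles() {
  local file="$1" role policy job_id
  while read -r role policy job_id; do
    apply_role "$role" "$policy" "$job_id"
  done < <(list_roles "$file")
}
```

With this layout, apply_roles vault/roles.yaml would write every listed role in one pass, and a reviewer can compare the policy column against the S2.1 policy files without reading any shell.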

Acceptance criteria

  • On a Step-0-initialized cluster plus S2.1 policies applied:
    • vault-nomad-auth.sh enables JWT, writes roles, updates nomad config, reloads nomad.
    • nomad node status -self -verbose | grep -i vault shows vault integration reporting Connected.
    • A throwaway test job with vault { role = "service-forgejo" } can render a template stanza reading kv/disinto/shared/forgejo/* successfully (use a fixture KV entry; no real secret).
  • Second run: no-op (auth already enabled, config unchanged, roles match).
  • shellcheck clean.
  • Documented in vault/policies/AGENTS.md (extend from S2.1): how role names map to policy names, how to add a new service.
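
The "second run: no-op" criterion could be checked with a small helper along these lines. The helper name is hypothetical, and it only assumes the log line quoted in the Scope section.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Run the given command and succeed only if its output contains the no-op message.
assert_noop_rerun() {
  local out
  out="$("$@" 2>&1)"
  case "$out" in
    *"[vault-auth] jwt-nomad already configured"*)
      echo "ok: second run was a no-op"
      ;;
    *)
      echo "FAIL: second run did not report a no-op:" >&2
      echo "$out" >&2
      return 1
      ;;
  esac
}
```

A typical use would be to run lib/init/nomad/vault-nomad-auth.sh once to converge, then call assert_noop_rerun lib/init/nomad/vault-nomad-auth.sh for the second pass.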

Non-goals

  • Not modifying existing service jobspecs to use Vault templates (S2.4 does forgejo; later steps do the rest).
  • Not rotating or revoking the root.token yet — root stays available for ops until Step 6 cutover.

Labels / meta

  • [nomad-step-2] S2.3 — no dependencies.
dev-bot added the backlog label 2026-04-16 15:26:33 +00:00
dev-bot self-assigned this 2026-04-16 16:29:45 +00:00
dev-bot added in-progress and removed backlog labels 2026-04-16 16:29:46 +00:00
dev-bot removed their assignment 2026-04-16 17:10:19 +00:00
Reference: disinto-admin/disinto#881