

# vault/policies/ — Agent Instructions

HashiCorp Vault ACL policies for the disinto factory. One `.hcl` file per policy; the basename (minus `.hcl`) is the name the policy is registered under in Vault. Synced into Vault by `tools/vault-apply-policies.sh` (idempotent — see the script header for the contract).

This directory is part of the Nomad+Vault migration (Step 2) — see issues #879–#884. Policies attach to Nomad jobs via workload identity in S2.4; this PR lands only the files and the apply script.

## Naming convention

| Prefix | Audience | KV scope |
| --- | --- | --- |
| `service-<name>.hcl` | Long-running platform services (forgejo, woodpecker) | `kv/data/disinto/shared/<name>/*` |
| `bot-<name>.hcl` | Per-agent jobs (dev, review, gardener, …) | `kv/data/disinto/bots/<name>/*` + shared forge URL |
| `runner-<TOKEN>.hcl` | Per-secret policy for vault-runner ephemeral dispatch | exactly one `kv/data/disinto/runner/<TOKEN>` path |
| `dispatcher.hcl` | Long-running edge dispatcher | `kv/data/disinto/runner/*` + `kv/data/disinto/shared/ops-repo/*` |

The KV mount name `kv/` is the convention this migration uses (mounted as KV v2). Vault addresses KV v2 data at `kv/data/<path>` and metadata at `kv/metadata/<path>` — policies that need `list` always target the metadata path; reads target `data`.
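As an illustration of the data/metadata split, a bot-family policy might look like the sketch below (capability lists here are illustrative, not the canonical template — mirror an existing `bot-*.hcl` in this directory when adding one):

```hcl
# Sketch of a bot-family policy. Reads go through the KV v2 data path…
path "kv/data/disinto/bots/dev/*" {
  capabilities = ["read"]
}

# …while listing keys requires the metadata path.
path "kv/metadata/disinto/bots/dev/*" {
  capabilities = ["list"]
}
```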

## Policy → KV path summary

| Policy | Reads |
| --- | --- |
| `service-forgejo` | `kv/data/disinto/shared/forgejo/*` |
| `service-woodpecker` | `kv/data/disinto/shared/woodpecker/*` |
| `bot-<role>` (dev, review, gardener, architect, planner, predictor, supervisor, vault, dev-qwen) | `kv/data/disinto/bots/<role>/*` + `kv/data/disinto/shared/forge/*` |
| `runner-<TOKEN>` (GITHUB_TOKEN, CODEBERG_TOKEN, CLAWHUB_TOKEN, DEPLOY_KEY, NPM_TOKEN, DOCKER_HUB_TOKEN) | `kv/data/disinto/runner/<TOKEN>` (exactly one) |
| `dispatcher` | `kv/data/disinto/runner/*` + `kv/data/disinto/shared/ops-repo/*` |

## Why one policy per runner secret

vault-runner (Step 5) reads each action TOML's `secrets = [...]` list and composes only those `runner-<NAME>` policies onto the per-dispatch ephemeral token. Wildcards or batched policies would hand the runner more secrets than the action declared, defeating AD-006 (least-privilege per external action). Adding a new declarable secret means adding one new `runner-<NAME>.hcl` here plus extending the SECRETS allow-list in vault-action validation.
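Concretely, a runner policy in this scheme is a single-path grant; a sketch of what a `runner-NPM_TOKEN.hcl` would contain (illustrative — copy the comment header and layout from an existing runner file):

```hcl
# Sketch: one policy == one secret. No wildcards, no sibling paths.
path "kv/data/disinto/runner/NPM_TOKEN" {
  capabilities = ["read"]
}
```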

## Adding a new policy

1. Drop a file matching one of the four naming patterns above. Use an existing file in the same family as the template — comment header, capability list, and KV path layout should match the family.
2. Run `tools/vault-apply-policies.sh --dry-run` to confirm the new basename appears in the planned-work list with the expected SHA.
3. Run `tools/vault-apply-policies.sh` against a Vault instance to create it; re-run to confirm it reports unchanged.
4. The CI fmt + validate step lands in S2.6 (#884). Until then, `vault policy fmt <file>` locally is the fastest sanity check.
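The steps above as a shell sketch (the per-run output wording is illustrative — the script header states the actual contract):

```sh
tools/vault-apply-policies.sh --dry-run     # plan: new basename + expected SHA listed
tools/vault-apply-policies.sh               # first run creates the policy
tools/vault-apply-policies.sh               # second run should report it unchanged
vault policy fmt vault/policies/<name>.hcl  # local formatting check until S2.6 lands
```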

## JWT-auth roles (S2.3)

Policies are inert until a Vault token carrying them is minted. In this migration that mint path is JWT auth: Nomad jobs exchange their workload-identity JWT for a Vault token via `auth/jwt-nomad/role/<name>`, whose `token_policies = ["<policy>"]` attaches the policy. The role bindings live in `../roles.yaml`; the script that enables the auth method, writes the config, and applies roles is `lib/init/nomad/vault-nomad-auth.sh`. The standalone applier is `tools/vault-apply-roles.sh`.
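For debugging, the same exchange can be driven by hand with the Vault CLI, assuming you have a task's workload-identity JWT in `$JWT`:

```sh
# Manual version of what Nomad does at task start: trade the JWT for a
# Vault token carrying the role's token_policies.
vault write auth/jwt-nomad/login role=service-forgejo jwt="$JWT"
```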

### Role → policy naming convention

Role name == policy name, 1:1. `vault/roles.yaml` carries one entry per `vault/policies/*.hcl` file:

```yaml
roles:
  - name:      service-forgejo      # Vault role
    policy:    service-forgejo      # ACL policy attached to minted tokens
    namespace: default              # bound_claims.nomad_namespace
    job_id:    forgejo              # bound_claims.nomad_job_id
```

The role name is what jobspecs reference via `vault { role = "..." }` — keep it identical to the policy basename so S2.1↔S2.3 drift (a new policy without a role, or vice versa) shows up in one directory review, not as a runtime "permission denied" at job placement.
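The 1:1 convention also makes drift mechanically checkable. A hypothetical one-liner — not part of the repo's tooling, and it assumes the `- name:` layout shown above:

```bash
# Print names present on only one side: column 1 = policy without a role,
# column 2 (tab-indented) = role without a policy. Empty output = no drift.
check_drift() {
  comm -3 \
    <(for f in "$1"/*.hcl; do basename "$f" .hcl; done | sort) \
    <(awk '$1 == "-" && $2 == "name:" {print $3}' "$2" | sort)
}
```

Run it from the repo root as `check_drift vault/policies vault/roles.yaml`.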

`bound_claims.nomad_job_id` is the actual `job "..."` name in the jobspec, which may differ from the policy name (e.g. policy `service-forgejo` binds to job `forgejo`). Update it when each bot's or runner's jobspec lands.

### Adding a new service

1. Write `vault/policies/<name>.hcl` using the naming-table family that fits (`service-`, `bot-`, `runner-`, or standalone).
2. Add a matching entry to `vault/roles.yaml` with all four fields (`name`, `policy`, `namespace`, `job_id`).
3. Apply both — either in one shot via `lib/init/nomad/vault-nomad-auth.sh` (policies → roles → nomad SIGHUP), or granularly via `tools/vault-apply-policies.sh` + `tools/vault-apply-roles.sh`.
4. Reference the role in the consuming jobspec's `vault { role = "<name>" }`.
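Step 4 in a minimal jobspec sketch (group/task names are illustrative; the `vault` block shape assumes Nomad's workload-identity flow):

```hcl
job "forgejo" {
  group "app" {
    task "server" {
      # Role name == policy basename; Vault mints a token carrying the
      # service-forgejo policy when the task starts.
      vault {
        role = "service-forgejo"
      }
      # driver, config, resources omitted…
    }
  }
}
```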

### Token shape

All roles share the same token shape, hardcoded in `tools/vault-apply-roles.sh`:

| Field | Value |
| --- | --- |
| `bound_audiences` | `["vault.io"]` — matches `default_identity.aud` in `nomad/server.hcl` |
| `token_type` | `service` — auto-revoked when the task exits |
| `token_ttl` | `1h` |
| `token_max_ttl` | `24h` |

Bumping any of these is a deliberate, repo-wide change. Per-role overrides would let one service's tokens outlive the others — if that ever becomes necessary, add a field to `vault/roles.yaml` and extend the applier at the same time.

## What this directory does NOT own

- Attaching policies to Nomad jobs. That's S2.4 (#882), wired in the jobspec via the `vault { role = "..." }` stanza — the role name is what binds the policy at placement time.
- Writing the secret values themselves. That's S2.2 (#880) via `tools/vault-import.sh`.
- CI policy fmt + validate + roles.yaml check. That's S2.6 (#884).