fix: [nomad-step-2] S2.6 — CI: vault policy fmt + validate + roles.yaml check (#884) #903
Reference: disinto-admin/disinto#903
Fixes #884
Changes
Extend .woodpecker/nomad-validate.yml with three new fail-closed steps that guard every artifact under vault/policies/ and vault/roles.yaml before it can land:
4. vault-policy-fmt: cp+fmt+diff idempotence check (vault 1.18.5 has no `policy fmt -check` flag, so we build the non-destructive check out of `vault policy fmt` on a /tmp copy + diff against the original)
5. vault-policy-validate: HCL syntax + capability validation via `vault policy write` against an inline dev-mode Vault server (no offline `policy validate` subcommand exists; dev-mode writes are ephemeral, so this is a validator, not a deploy)
6. vault-roles-validate: yamllint + PyYAML-based role→policy reference check (every role's `policy:` field must match a vault/policies/*.hcl basename; also checks the four required fields name/policy/namespace/job_id)
Secret-scan coverage for vault/policies/*.hcl is already provided by the P11 gate (.woodpecker/secret-scan.yml) via its `vault/**/*` trigger path. This pipeline intentionally does NOT duplicate that gate, to avoid the inline-heredoc / YAML-parse failure mode that sank the prior attempt at this issue (PR #896).
Trigger paths extended: `vault/policies/**` and `vault/roles.yaml`. `lib/init/nomad/vault-*.sh` is already covered by the existing `lib/init/nomad/**` glob.
Docs: nomad/AGENTS.md and vault/policies/AGENTS.md updated with the policy lifecycle, the CI enforcement table, and the common failure modes authors will see.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
AI Review
Summary
CI pipeline extended from 5 to 8 steps: vault-policy-fmt (format idempotence), vault-policy-validate (HCL syntax via inline dev Vault), and vault-roles-validate (yamllint + role→policy reference check). Documentation in nomad/AGENTS.md and vault/policies/AGENTS.md updated consistently throughout.
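For readers unfamiliar with the role→policy reference check, the following is a minimal sketch of the idea, not the pipeline's actual script. The function names are hypothetical, and the shape of the parsed roles (a list of mappings, for example under a top-level `roles:` key loaded with `yaml.safe_load`) is an assumption; only the four required fields and the "policy must match a `.hcl` basename" rule come from the PR description.

```python
from pathlib import Path

# The four required fields named in the PR description.
REQUIRED_FIELDS = ("name", "policy", "namespace", "job_id")


def policy_basenames(policy_dir):
    """Return the basenames (without .hcl) of the policy files in policy_dir."""
    return {p.stem for p in Path(policy_dir).glob("*.hcl")}


def find_role_errors(roles, known_policies):
    """Return a list of human-readable errors for a list of role mappings.

    Assumption: each role is a dict, as produced by parsing roles.yaml.
    """
    errors = []
    for index, role in enumerate(roles):
        label = role.get("name", f"<role #{index}>")
        # Every role must carry all four required fields.
        for field in REQUIRED_FIELDS:
            if not role.get(field):
                errors.append(f"{label}: missing required field '{field}'")
        # The policy field must name an existing vault/policies/*.hcl file.
        policy = role.get("policy")
        if policy and policy not in known_policies:
            errors.append(f"{label}: policy '{policy}' has no matching .hcl file")
    return errors


# Hypothetical usage in a CI step (requires PyYAML; the `roles` key is assumed):
#   import sys, yaml
#   roles = yaml.safe_load(open("vault/roles.yaml"))["roles"]
#   errors = find_role_errors(roles, policy_basenames("vault/policies"))
#   sys.exit("\n".join(errors) if errors else 0)
```

Collecting every error before exiting, rather than stopping at the first, lets an author fix all broken references in one push.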
Findings
`.woodpecker/nomad-validate.yml:172`: `vault policy fmt "$tmp" >/dev/null 2>&1` silences stderr. On a completely malformed HCL file, `vault policy fmt` will exit non-zero and `set -e` aborts the loop, but with no diagnostic visible in CI logs. The author sees a failed step with no message. Step 5 would have caught it with a clear Vault parser error, but step 5 never runs. Fix: drop the `2>&1` so only stdout is suppressed (no noise on success) while stderr remains visible for failure cases. Non-blocking because CI still blocks merge, and the scenario requires a fully-unparseable policy file.
Verification
- `[ -f "$f" ] || continue` guard handles the no-match POSIX glob case properly.
- `trap ... EXIT INT TERM` fires on `set -e` abort (EXIT trap is correct here; the RETURN-trap gotcha only applies inside functions). Server is reliably killed on any step exit.
- `vault status` inside an `if` condition is exempt from `set -e`, so the readiness poll does not abort early on transient non-zero returns during startup.
- `os.listdir('vault/policies')` would raise `FileNotFoundError` if the directory is absent, but the trigger paths (`vault/policies/**`) guarantee the directory exists whenever this pipeline runs. Edge case only if `vault/roles.yaml` exists with no sibling `vault/policies/` directory, which is not a real scenario given the current repo layout.
- Policy files in `vault/policies/` have matching entries in `roles.yaml` and vice versa.
- Trigger paths (`vault/policies/**`, `vault/roles.yaml`) are correct; secret-scan delegation to P11 is properly noted and not duplicated.
Verdict
APPROVE — Three new CI steps are logically sound and well-documented; one minor stderr-suppression DX issue noted but non-blocking.
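To illustrate the stderr finding, here is a rough Python sketch of a copy + format + compare idempotence check that discards the formatter's stdout but keeps its stderr for the failure message. It is a stand-in for discussion only: the pipeline's real step is shell, and the function name and the `fmt_cmd` parameter are hypothetical. The only CLI call assumed is `vault policy fmt <path>`, as quoted in the finding.

```python
import filecmp
import shutil
import subprocess
import tempfile
from pathlib import Path


def check_fmt_idempotent(policy_path, fmt_cmd=("vault", "policy", "fmt")):
    """Return (ok, message); ok is True when formatting a copy changes nothing.

    fmt_cmd is a hypothetical parameter: the command that formats a file in
    place when the file path is appended as its last argument.
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        # Work on a copy so the checked-in file is never modified.
        copy = Path(tmp_dir) / Path(policy_path).name
        shutil.copyfile(policy_path, copy)
        result = subprocess.run(
            [*fmt_cmd, str(copy)],
            stdout=subprocess.DEVNULL,  # success output is noise
            stderr=subprocess.PIPE,     # keep the diagnostic for failures
            text=True,
        )
        if result.returncode != 0:
            # Surface the formatter's own error instead of failing silently.
            return False, f"{policy_path}: formatter failed: {result.stderr.strip()}"
        if not filecmp.cmp(policy_path, copy, shallow=False):
            return False, f"{policy_path}: not formatted"
    return True, f"{policy_path}: ok"
```

With this shape, a completely unparseable policy file produces a failure message that carries the formatter's own error text, which is the behavior the finding asks for.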
Reviewed at 6e73c6d | AGENTS.md