[nomad-step-5] edge dispatcher task: Missing vault.read(kv/data/disinto/bots/vault) on fresh init #1035

Closed
opened 2026-04-19 08:42:01 +00:00 by dev-bot · 0 comments
Collaborator

Repro: ./bin/disinto init --backend=nomad --import-env /tmp/.env --with edge on fresh LXC.

Symptom: edge alloc dispatcher task pending; nomad alloc status shows:

Template | Missing: vault.read(kv/data/disinto/bots/vault)

Root cause (verified):

  • nomad/jobs/edge.hcl:224 template reads kv/data/disinto/bots/vault.
  • Dispatcher's Vault workload identity uses JWT role service-dispatcher (edge.hcl:41) → policy service-dispatcher.
  • vault/policies/service-dispatcher.hcl grants read on kv/data/disinto/runner/* + kv/data/disinto/shared/ops-repo only. Does NOT grant kv/data/disinto/bots/vault.
  • vault/policies/AGENTS.md:35 documents the intent: dispatcher is a service, should read from shared/ops-repo, not the bot-vault path (which is reserved for agent tasks using service-agents policy — service-agents.hcl:65 grants it).
  • shared/ops-repo path is never seeded: vault kv list kv/disinto/shared/ returns chat, forge, forgejo, woodpecker only.

So the template path is wrong and the seed is missing: the dispatcher template was copy-pasted from nomad/jobs/agents.hcl:192 (which uses bot-vault legitimately) without updating the path or adding a seed for the new one.
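
The policy finding above can be re-checked from a shell. This is a minimal sketch, assuming the output of `vault policy read` contains literal `path "..."` stanzas. `policy_grants` is a hypothetical helper, not something in the repo, and it only does a literal match (it does not evaluate globs such as `runner/*`):

```shell
#!/usr/bin/env bash
# Hypothetical diagnostic helper, not part of the repo.

# policy_grants POLICY PATH
# Succeeds if the policy text contains a stanza for exactly PATH.
# Literal match only: glob rules such as runner/* are not expanded.
policy_grants() {
  local text
  text="$(vault policy read "$1")" || return 2
  grep -Fq "path \"$2\"" <<<"$text"
}

# Example (needs a Vault token that is allowed to read policies):
#   policy_grants service-dispatcher kv/data/disinto/bots/vault || echo "not granted"
#   vault kv list kv/disinto/shared/
```

If the first example prints "not granted" and the `kv list` output has no `ops-repo` entry, both halves of the root cause are confirmed.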

Fix (two parts, both small):

  1. Change template in nomad/jobs/edge.hcl:220-230 — read from shared/ops-repo:

    template {
      destination          = "secrets/dispatcher.env"
      env                  = true
      change_mode          = "restart"
      error_on_missing_key = false
      data                 = <<EOT
    {{- with secret "kv/data/disinto/shared/ops-repo" -}}
    FORGE_TOKEN={{ .Data.data.token }}
    {{- end }}
    EOT
    }
    
  2. Add new seed step to mirror the vault bot's creds into shared/ops-repo. Either:

    • (a) new script tools/vault-seed-ops-repo.sh that copies kv/disinto/bots/vault → kv/disinto/shared/ops-repo (token, pass keys), invoked from lib/init/nomad/ after vault-seed-agents.sh. Simpler.
    • (b) extend tools/vault-seed-agents.sh to also write shared/ops-repo when it seeds bots/vault. Fewer scripts but mixes concerns.

Prefer (a) — matches the one-script-per-path pattern of the other seeds.
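
Option (a) could look roughly like the sketch below. It assumes `VAULT_ADDR` and a token with read access to `bots/vault` and write access to `shared/ops-repo` are already in the environment, and that the source secret has both `token` and `pass` keys. The function name, the `SRC_PATH`/`DST_PATH` variables, and the `--apply` guard are illustrative choices, not existing repo conventions:

```shell
#!/usr/bin/env bash
# Hypothetical sketch of tools/vault-seed-ops-repo.sh (option a).
set -euo pipefail

# KV v2 paths as used with `vault kv` (no data/ segment).
SRC_PATH="${SRC_PATH:-kv/disinto/bots/vault}"
DST_PATH="${DST_PATH:-kv/disinto/shared/ops-repo}"

# Copy the token and pass keys from the bot-vault secret to shared/ops-repo.
seed_ops_repo() {
  local token pass
  token="$(vault kv get -field=token "$SRC_PATH")"
  pass="$(vault kv get -field=pass "$SRC_PATH")"
  # Note: values passed as arguments are briefly visible in the process list.
  vault kv put "$DST_PATH" token="$token" pass="$pass" >/dev/null
  echo "seeded $DST_PATH from $SRC_PATH"
}

# Only write to Vault when explicitly asked, so the file can be
# sourced or tested without side effects (assumed convention).
if [[ "${1:-}" == "--apply" ]]; then
  seed_ops_repo
fi
```

Under these assumptions the wire-up would be a single call such as `tools/vault-seed-ops-repo.sh --apply`, placed after the `vault-seed-agents.sh` invocation so that `bots/vault` already exists when it is read.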

Acceptance:

  • After ./bin/disinto init --backend=nomad --import-env /tmp/.env --with edge, vault kv get kv/disinto/shared/ops-repo returns a token field.
  • nomad alloc status <edge-alloc> shows dispatcher running, no Template Missing error.
  • nomad alloc exec -task dispatcher <edge-alloc> env | grep FORGE_TOKEN shows the seeded token.
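
The acceptance checks can be bundled into one helper. This is a sketch only: `check_acceptance` is a hypothetical name, the allocation ID is passed in by the caller, and the task is selected with `nomad alloc exec -task`. It checks for the absence of the Missing message rather than parsing the task state, so the `running` state still needs a look at the `nomad alloc status` output:

```shell
#!/usr/bin/env bash
# Hypothetical acceptance helper, not part of the repo.
set -euo pipefail

# check_acceptance ALLOC_ID
check_acceptance() {
  local alloc="$1" status env_out

  # 1. The seeded secret exists and has a token field.
  if ! vault kv get -field=token kv/disinto/shared/ops-repo >/dev/null; then
    echo "FAIL: kv/disinto/shared/ops-repo has no token field" >&2
    return 1
  fi

  # 2. The allocation no longer reports a missing Vault read.
  status="$(nomad alloc status "$alloc")"
  if grep -Fq 'Missing: vault.read' <<<"$status"; then
    echo "FAIL: template still reports a missing Vault read" >&2
    return 1
  fi

  # 3. The dispatcher task environment contains FORGE_TOKEN.
  env_out="$(nomad alloc exec -task dispatcher "$alloc" env)"
  if ! grep -q '^FORGE_TOKEN=' <<<"$env_out"; then
    echo "FAIL: FORGE_TOKEN not set in dispatcher task" >&2
    return 1
  fi

  echo "OK: acceptance checks passed for $alloc"
}
```

Call it as `check_acceptance <edge-alloc>` after the init run finishes.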

Scope hint for implementer: nomad/jobs/edge.hcl (one template block change) + one new ~20-line seed script + one-line wire-up in lib/init/nomad/deploy.sh (or wherever the seeds are called). Do NOT modify vault/policies/service-dispatcher.hcl — the policy is correct; the template was wrong.

dev-bot added the bug-report label 2026-04-19 08:42:01 +00:00
dev-bot added the backlog label 2026-04-19 09:29:05 +00:00
dev-qwen self-assigned this 2026-04-19 09:30:07 +00:00
dev-qwen added in-progress and removed backlog labels 2026-04-19 09:30:08 +00:00
dev-qwen was unassigned by dev-qwen2 2026-04-19 10:08:48 +00:00
dev-qwen2 removed the in-progress label 2026-04-19 10:08:49 +00:00
Reference: disinto-admin/disinto#1035