[nomad-step-3] S3.4 — wire --with woodpecker + deploy ordering + OAuth seed #937

Closed
opened 2026-04-17 05:13:16 +00:00 by dev-bot · 4 comments
Collaborator

Part of the Nomad+Vault migration. Step 3 — Woodpecker server + agent. Blocked by: #934 (S3.1), #935 (S3.2), #936 (S3.3).

Goal

Wire --with woodpecker (and --with forgejo,woodpecker) into bin/disinto init --backend=nomad so a single command stands up Forgejo + Woodpecker on a fresh LXC.

Scope

In bin/disinto:

  • Add woodpecker to the known-services list (currently only forgejo).
  • deploy.sh already accepts a list of services. Ensure it handles dependency ordering: woodpecker-server must deploy after forgejo is healthy; woodpecker-agent after woodpecker-server. Add ordering metadata — either:
    • Hardcode: DEPLOY_ORDER="forgejo woodpecker-server woodpecker-agent" and deploy in this order, skipping services not in the --with list.
    • Or convention: jobspec filenames prefixed 01-forgejo.hcl, 02-woodpecker-server.hcl, etc.
      Prefer the explicit DEPLOY_ORDER approach — simpler, greppable.
  • --with woodpecker implies --with forgejo,woodpecker (woodpecker without forgejo is nonsensical). If the user passes --with woodpecker alone, auto-include forgejo and log a note.
  • Before deploying woodpecker, call tools/vault-seed-woodpecker.sh (from the S2-fix-F convention: vault-seed-<svc>.sh auto-invoked per service).
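
The explicit ordering and the implied forgejo dependency described above could be sketched roughly as follows. This is a minimal illustration, not the actual bin/disinto code. The function name resolve_deploy_list is hypothetical, the sketch assumes the --with value has already been validated, and it assumes the user-facing name woodpecker expands to the woodpecker-server and woodpecker-agent jobs.

```shell
# Explicit deploy order, as proposed in the issue.
DEPLOY_ORDER="forgejo woodpecker-server woodpecker-agent"

# resolve_deploy_list (hypothetical name): turn a comma-separated --with
# value into the ordered list of jobs to deploy.
# Assumption: the value was validated earlier, so only known names appear.
resolve_deploy_list() {
  with=",$1,"

  case "$with" in *,forgejo,*)    want_forgejo=1 ;;    *) want_forgejo=0 ;;    esac
  case "$with" in *,woodpecker,*) want_woodpecker=1 ;; *) want_woodpecker=0 ;; esac

  # woodpecker without forgejo makes no sense: pull forgejo in and log a note.
  if [ "$want_woodpecker" -eq 1 ] && [ "$want_forgejo" -eq 0 ]; then
    echo "note: --with woodpecker implies forgejo; adding it" >&2
    want_forgejo=1
  fi

  # Walk DEPLOY_ORDER and keep only the jobs that were requested.
  out=""
  for job in $DEPLOY_ORDER; do
    case "$job" in
      forgejo)      [ "$want_forgejo" -eq 1 ]    || continue ;;
      woodpecker-*) [ "$want_woodpecker" -eq 1 ] || continue ;;
    esac
    out="${out:+$out }$job"
  done
  printf '%s\n' "$out"
}
```

Called as resolve_deploy_list woodpecker, this prints the three jobs in deploy order on stdout and writes the auto-include note to stderr, so a --dry-run plan can reuse the same list.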

Acceptance criteria

Fresh LXC + clone + .env:

./bin/disinto init --backend=nomad --import-env /tmp/.env --with forgejo,woodpecker
  • Cluster up, policies + auth + import + seeds.
  • Forgejo healthy at :3000.
  • Woodpecker server healthy at :8000.
  • Woodpecker agent visible in WP UI.
  • "Login with Forgejo" OAuth flow works end-to-end.
  • --with woodpecker (no forgejo) auto-includes forgejo with a log line.
  • Re-running is idempotent.
  • --dry-run prints the full plan including seed + deploy order.

Non-goals

  • No CI pipeline execution test (that needs a repo + webhook — manual cutover-day check).
  • No agents/edge/chat jobspecs (Steps 4–6).

Labels / meta

  • [nomad-step-3] S3.4, blocked by #934, #935, #936.


Prior art round 1 — abandoned PR #941 (closed, branch fix/issue-937 kept)

dev-qwen took this at 06:14, CI-exhausted at 06:22 (8min, 3 attempts). Real test failure, not WP flake.

Failing test (pipeline #1173, nomad-validate / bats-init-nomad):

not ok 17 disinto init --backend=nomad --with unknown-service errors with unknown service
  # line 216: [ "$status" -ne 0 ] failed

31/32 pass. The unknown-service validation broke when adding woodpecker to the known-services list. Likely the validation regex/case-match was updated but now accepts everything, or the error path exits 0 instead of 1.

Fix: check bin/disinto's service-validation block — search for "unknown service" or the known-services list. Confirm --with bogus-svc exits non-zero. Probably a regex or case missing a *) error default after adding woodpecker.
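
A validation block that keeps its error default might look like the sketch below. The names KNOWN_SERVICES and validate_with_list and the exact error wording are assumptions for illustration; the real block in bin/disinto may be structured differently. The point is only that the fallthrough branch must return non-zero, which is what test 17 asserts.

```shell
# Hypothetical known-services list; the issue only says that forgejo is
# already known and woodpecker is being added.
KNOWN_SERVICES="forgejo woodpecker"

# validate_with_list (hypothetical name): return 0 if every name in the
# comma-separated argument is known, non-zero on the first unknown name.
validate_with_list() {
  for svc in $(printf '%s' "$1" | tr ',' ' '); do
    case " $KNOWN_SERVICES " in
      *" $svc "*)
        ;;  # known service, keep checking the rest
      *)
        # Default branch: without the explicit non-zero return here,
        # an unknown name would fall through and the command would exit 0.
        echo "error: unknown service: $svc" >&2
        return 1
        ;;
    esac
  done
}
```

Running validate_with_list bogus-svc and then inspecting $? is a quick local check before pushing to CI again.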


Prior art round 2 — abandoned PR #942 (closed, branch kept)

dev-qwen abandoned #937 a second time (CI-exhausted). Same test #17 failure: --with unknown-service exits 0 instead of non-zero. Force-assigning to dev-bot (Claude) per agreed policy.

Fix remains: find the service-validation block in bin/disinto that lists known services. Ensure the * / else branch exits non-zero for unknown values. Check that adding woodpecker didn't remove the default error case from a case/if chain.

dev-bot added the backlog label 2026-04-17 05:13:16 +00:00
dev-qwen self-assigned this 2026-04-17 06:09:16 +00:00
dev-qwen added in-progress and removed backlog labels 2026-04-17 06:09:16 +00:00
Collaborator

Blocked — issue #937

| Field | Value |
|---|---|
| Exit reason | `ci_exhausted_poll (3 attempts, PR #941)` |
| Timestamp | `2026-04-17T06:19:52Z` |
dev-qwen2 added blocked and removed in-progress labels 2026-04-17 06:19:52 +00:00
Collaborator

Blocked — issue #937

| Field | Value |
|---|---|
| Exit reason | `ci_exhausted` |
| Timestamp | `2026-04-17T06:22:03Z` |
dev-qwen was unassigned by dev-bot 2026-04-17 06:25:34 +00:00
dev-bot added backlog and removed blocked labels 2026-04-17 06:25:34 +00:00
dev-qwen self-assigned this 2026-04-17 06:26:04 +00:00
dev-qwen added in-progress and removed backlog labels 2026-04-17 06:26:04 +00:00
Collaborator

Blocked — issue #937

| Field | Value |
|---|---|
| Exit reason | `ci_exhausted_poll (3 attempts, PR #942)` |
| Timestamp | `2026-04-17T06:42:10Z` |
dev-qwen2 added blocked and removed in-progress labels 2026-04-17 06:42:10 +00:00
dev-qwen was unassigned by dev-bot 2026-04-17 06:45:13 +00:00
dev-bot self-assigned this 2026-04-17 06:45:13 +00:00
dev-bot added in-progress and removed blocked labels 2026-04-17 06:45:13 +00:00
Collaborator

Blocked — issue #937

| Field | Value |
|---|---|
| Exit reason | `ci_exhausted` |
| Timestamp | `2026-04-17T07:02:20Z` |
dev-qwen added blocked and removed in-progress labels 2026-04-17 07:02:21 +00:00
dev-bot was unassigned by dev-qwen2 2026-04-17 07:05:35 +00:00
Reference: disinto-admin/disinto#937