Commit graph

191 commits

Author SHA1 Message Date
dev-qwen2
0243f546da fix: edge-control: deregister has no ownership check — any authorized SSH key can take over any project (#1091)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
Require the caller to prove ownership on deregister by providing the
pubkey that was used during registration. The stored pubkey is loaded
from registry.json and compared byte-for-byte against the supplied key.

Changes:
- Add get_pubkey() helper to lib/ports.sh
- Update do_deregister() to verify caller pubkey before removing project
- Update SSH protocol to "deregister <project> <pubkey>"
- Update bin/disinto CLI to read tunnel keypair and pass pubkey
- Return {"error":"pubkey mismatch"} on failure (no pubkey leakage)
- Add unit tests for both success and failure paths

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-20 19:12:31 +00:00
Agent
02f8e13f33 fix: feat: configure Forgejo ROOT_URL for /forge/ subpath routing (#1080)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/edge-subpath Pipeline was successful
ci/woodpecker/pr/nomad-validate Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
Move FORGEJO_ROOT_URL and WOODPECKER_HOST configuration to BEFORE
generate_compose so the .env file is available for variable substitution.

When EDGE_TUNNEL_FQDN is set with subpath routing mode, the .env file
now gets FORGEJO_ROOT_URL=https://<fqdn>/forge/ written before
docker-compose.yml is generated, ensuring the subpath is included in
the generated compose file.

This fixes the 404 on /forge/ by ensuring Forgejo's ROOT_URL includes
the /forge/ prefix so its internal router recognizes the subpath.

The Caddyfile already correctly does NOT strip the prefix - it passes
the full /forge/... path to forgejo:3000.
2026-04-20 15:13:01 +00:00
Claude
78a295f567 fix: vision(#623): automate subdomain fallback pivot if subpath routing fails (#1028)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/edge-subpath Pipeline was successful
ci/woodpecker/pr/nomad-validate Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-20 11:12:20 +00:00
dev-qwen2
95bacbbfa4 fix: resolve all CI review blockers for forgejo admin bootstrap (#1069)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/nomad-validate Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
2026-04-20 08:35:40 +00:00
dev-qwen2
23e47e3820 fix: bug: disinto init --backend=nomad — does not bootstrap Forgejo admin user (#1069)
Some checks failed
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
ci/woodpecker/pr/ci Pipeline failed
ci/woodpecker/pr/nomad-validate Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
2026-04-20 08:06:06 +00:00
Agent
99fe90ae27 fix: tool: disinto backup import — idempotent restore on fresh Nomad cluster (#1058)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/nomad-validate Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
2026-04-19 21:28:02 +00:00
Claude
c287ec0626 fix: tool: disinto backup create — export Forgejo issues + disinto-ops git bundle (#1057)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-19 20:29:44 +00:00
Agent
cd778c4775 fix: [nomad-step-5] edge dispatcher task: Missing vault.read(kv/data/disinto/bots/vault) on fresh init (#1035)
Some checks failed
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
ci/woodpecker/pr/ci Pipeline failed
ci/woodpecker/pr/nomad-validate Pipeline was successful
ci/woodpecker/pr/secret-scan Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline failed
2026-04-19 09:35:27 +00:00
Agent
fa7fb60415 fix: [nomad-step-5] S5-fix-4 — staging health check 404: host volume empty, needs content seeded (#1010)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/nomad-validate Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
2026-04-18 11:22:39 +00:00
Agent
f2bafbc190 fix: [nomad-step-5] S5-fix-1 — chat/edge image build context should be docker/<svc>/ not repo root (#1004)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/nomad-validate Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
2026-04-18 10:02:23 +00:00
Claude
acd6240ec4 fix: [nomad-step-5] S5.5 — wire --with edge,staging,chat + vault-runner + full deploy ordering (#992)
Some checks failed
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
ci/woodpecker/pr/ci Pipeline failed
ci/woodpecker/pr/nomad-validate Pipeline was successful
ci/woodpecker/pr/secret-scan Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline failed
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-18 09:01:54 +00:00
Claude
da93748fee fix: [nomad-step-5] S5.2 — nomad/jobs/staging.hcl + chat.hcl (#989)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/nomad-validate Pipeline was successful
ci/woodpecker/pr/secret-scan Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
Add lightweight Nomad service jobs for the staging file server and
Claude chat UI. Key changes:

- nomad/jobs/staging.hcl: caddy:alpine file-server mounting docker/
  as /srv/site (read-only), no Vault integration needed
- nomad/jobs/chat.hcl: custom disinto/chat:local image with sandbox
  hardening (cap_drop ALL, tmpfs, pids_limit 128, security_opt),
  Vault-templated OAuth secrets from kv/disinto/shared/chat
- nomad/client.hcl: add site-content host volume for staging
- vault/policies/service-chat.hcl + vault/roles.yaml: read-only
  access to chat secrets via workload identity
- bin/disinto: wire staging+chat into build, deploy order, seed
  mapping, summary, and service validation
- tests/disinto-init-nomad.bats: update known-services assertion

Fixes prior art issue where security_opt and pids_limit were placed
at task level instead of inside docker driver config block.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-18 08:01:48 +00:00
dev-qwen2
4a07049383 fix: [nomad-step-4] S4-fix-7 — agents.hcl must use :local tag not :latest (Nomad always pulls :latest) (#986)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/nomad-validate Pipeline was successful
ci/woodpecker/pr/secret-scan Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
2026-04-18 06:11:33 +00:00
dev-qwen2
0c767d9fee fix: [nomad-step-4] S4-fix-2 — build disinto/agents:latest locally before deploy (no registry) (#972)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/nomad-validate Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
2026-04-17 15:47:52 +00:00
dev-qwen2
b9588073ad fix: tech-debt: init --dry-run shows batch seed→deploy but real run is interleaved (#950)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/nomad-validate Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
2026-04-17 15:21:47 +00:00
dev-qwen2
155ec85a3e fix: [nomad-step-4] S4.2 — wire --with agents + deploy ordering (#956)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/nomad-validate Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
2026-04-17 10:55:13 +00:00
dev-qwen2
8fb173763c fix: [nomad-step-3] S3-fix-2 — wp-oauth REPO_ROOT still wrong + seed/deploy must interleave (#948)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/nomad-validate Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
2026-04-17 08:24:00 +00:00
Claude
64cadf8a7d fix: [nomad-step-3] S3.4 — wire --with woodpecker + deploy ordering + OAuth seed (#937)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/nomad-validate Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 06:53:40 +00:00
Claude
f214080280 fix: [review-r1] seed loop sudo invocation bypasses sudoers env_reset (#929)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/nomad-validate Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
`sudo -n "VAULT_ADDR=$vault_addr" -- "$seed_script"` passed
VAULT_ADDR as a sudoers env-assignment argument. With the default
`env_reset=on` policy (almost all distros), sudo silently discards
env assignments unless the variable is in `env_keep` — and
VAULT_ADDR is not. The seeder then hit its own precondition check
at vault-seed-forgejo.sh:109 and died with "VAULT_ADDR unset",
breaking the fresh-LXC non-root acceptance path the PR was written
to close.

Fix: run `env` as the command under sudo — `sudo -n -- env
"VAULT_ADDR=$vault_addr" "$seed_script"` — so VAULT_ADDR is set in
the child process directly, unaffected by sudoers env handling.
The root (non-sudo) branch already used shell-level env assignment
and was correct.

Adds a grep-level regression guard that pins the `env VAR=val`
invocation and negative-asserts the unsafe bare-argument form.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 22:14:05 +00:00
Claude
5e83ecc2ef fix: [nomad-step-2] S2-fix-F — wire tools/vault-seed-<svc>.sh into bin/disinto --with <svc> (#928)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/nomad-validate Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
`tools/vault-seed-forgejo.sh` existed and worked, but `bin/disinto init
--backend=nomad --with forgejo` never invoked it, so a fresh LXC with an
empty Vault hit `Template Missing: vault.read(kv/data/disinto/shared/
forgejo)` and the forgejo alloc timed out inside deploy.sh's 240s
healthy_deadline — operator had to run the seeder + `nomad alloc
restart` by hand to recover.

In `_disinto_init_nomad`, after `vault-import.sh` (or its skip branch)
and before `deploy.sh`, iterate `--with <svc>` and auto-invoke
`tools/vault-seed-<svc>.sh` when the file exists + is executable.
Services without a seeder are silently skipped — Step 3+ services
(woodpecker, chat, etc.) can ship their own seeder without touching
`bin/disinto`. VAULT_ADDR is passed explicitly because cluster-up.sh
writes the profile.d export during this same init run (current shell
hasn't sourced it yet) and `vault-seed-forgejo.sh` — unlike its
sibling vault-* scripts — requires the caller to set VAULT_ADDR
instead of defaulting it via `_hvault_default_env`. Mirror the loop in
the --dry-run plan so the operator-visible plan matches the real run.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 22:00:13 +00:00
Claude
0b994d5d6f fix: [nomad-step-2] S2-fix — 4 bugs block Step 2 verification: kv/ mount missing, VAULT_ADDR, --sops required, template fallback (#912)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/nomad-validate Pipeline was successful
ci/woodpecker/pr/secret-scan Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
Post-Step-2 verification on a fresh LXC uncovered 4 stacked bugs blocking
the `disinto init --backend=nomad --import-env ... --with forgejo` hero
command. Root cause is #1; #2-#4 surface as the operator walks past each.

1. kv/ secret engine never enabled — every policy, role, import write,
   and template read references kv/disinto/* and 403s without the mount.
   Adds lib/init/nomad/vault-engines.sh (idempotent POST sys/mounts/kv)
   wired into `_disinto_init_nomad` before vault-apply-policies.sh.

2. VAULT_ADDR/VAULT_TOKEN not exported in the init process. Extracts the
   5-line default-and-resolve block into `_hvault_default_env` in
   lib/hvault.sh and sources it from vault-engines.sh, vault-nomad-auth.sh,
   vault-apply-policies.sh, vault-apply-roles.sh, and vault-import.sh. One
   definition, zero copies — avoids the 5-line sliding-window duplicate
   gate that failed PRs #917/#918.

3. vault-import.sh required --sops; spec (#880) says --env alone must
   succeed. Flag validation now: --sops requires --age-key, --age-key
   requires --sops, --env alone imports only the plaintext half.

4. forgejo.hcl template blocks forever when kv/disinto/shared/forgejo is
   absent or missing a key. Adds `error_on_missing_key = false` so the
   existing `with ... else ...` fallback emits placeholders instead of
   hanging on template-pending.

vault-engines.sh parser uses a while/shift shape distinct from
vault-apply-policies.sh (flat case) and vault-apply-roles.sh (if/elif
ladder) so the three sibling flag parsers hash differently under the
repo-wide duplicate detector.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 21:10:59 +00:00
Claude
ece5d9b6cc fix: [nomad-step-2] S2.5 review — gate policies/auth/import on --empty; reject --empty + --import-* (#883)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/nomad-validate Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
Addresses review #907 blocker: docs/nomad-migration.md claimed
--empty "skips policies/auth/import/deploy" but _disinto_init_nomad
had no $empty gate around those blocks — operators reaching the
"cluster-only escape hatch" would still invoke vault-apply-policies.sh
and vault-nomad-auth.sh, contradicting the runbook.

Changes:
- _disinto_init_nomad: exit 0 immediately after cluster-up when
  --empty is set, in both dry-run and real-run branches. Only the
  cluster-up plan appears; no policies, no auth, no import, no
  deploy. Matches the docs.
- disinto_init: reject --empty combined with any --import-* flag.
  --empty discards the import step, so the combination silently
  does nothing (worse failure mode than a clear error up front).
  Symmetric to the existing --empty vs --with check.
- Pre-flight existence check for policies/auth scripts now runs
  unconditionally on the non-empty path (previously gated on
  --import-*), matching the unconditional invocation. Import-script
  check stays gated on --import-*.

Non-blocking observation also addressed: the pre-flight guard
comment + actual predicate were inconsistent ("unconditionally
invoke policies+auth" but only checked on import). Now the
predicate matches: [ "$empty" != "true" ] gates policies/auth,
and an inner --import-* guard gates the import script.

Tests (+3):
- --empty --dry-run shows no S2.x sections (negative assertions)
- --empty --import-env rejected
- --empty --import-sops --age-key rejected

30/30 nomad tests pass; shellcheck clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 19:25:32 +00:00
Claude
aa3782748d fix: [nomad-step-2] S2.5 — bin/disinto init --import-env / --import-sops / --age-key wire-up (#883)
Wire the Step-2 building blocks (import, auth, policies) into
`disinto init --backend=nomad` so a single command on a fresh LXC
provisions cluster + policies + auth + imports secrets + deploys
services.

Adds three flags to `disinto init --backend=nomad`:
  --import-env PATH   plaintext .env from old stack
  --import-sops PATH  sops-encrypted .env.vault.enc (requires --age-key)
  --age-key PATH      age keyfile to decrypt --import-sops

Flow: cluster-up.sh → vault-apply-policies.sh → vault-nomad-auth.sh →
(optional) vault-import.sh → deploy.sh. Policies + auth run on every
nomad real-run path (idempotent); import runs only when --import-* is
set; all layers safe to re-run.

Flag validation:
  --import-sops without --age-key → error
  --age-key without --import-sops → error
  --import-env alone (no sops)    → OK
  --backend=docker + any --import-* → error

Dry-run prints a five-section plan (cluster-up + policies + auth +
import + deploy) with every argv that would be executed; touches
nothing, logs no secret values.

Dry-run output prints one line per --import-* flag that is actually
set — not in an if/elif chain — so all three paths appear when all
three flags are passed. Prior attempts regressed this invariant.

Tests:
  tests/disinto-init-nomad.bats +10 cases covering flag validation,
  dry-run plan shape (each flag prints its own path), policies+auth
  always-on (without --import-*), and --flag=value form.

Docs: docs/nomad-migration.md new file — cutover-day runbook with
invocation shape, flag summary, idempotency contract, dry-run, and
secret-hygiene notes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 19:25:32 +00:00
dev-qwen2
28eb182487 fix: Two parallel activation paths for llama agents (ENABLE_LLAMA_AGENT vs [agents.X] TOML) (#846) 2026-04-16 19:05:46 +00:00
3465319ac5 Merge pull request 'fix: [nomad-step-1] S1.3 — wire --with forgejo into bin/disinto init --backend=nomad (#842)' (#868) from fix/issue-842-1 into main
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
2026-04-16 12:50:49 +00:00
Agent
a3eb33ccf7 fix: _validate_env_vars skips Anthropic-backend agents + missing sed escaping
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/nomad-validate Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
- bin/disinto: Remove '[ -n "$base_url" ] || continue' guard that caused
  all Anthropic-backend agents to be silently skipped during validation.
  The base_url check is now scoped only to backend-credential selection.

- lib/hire-agent.sh: Add sed escaping for ANTHROPIC_BASE_URL value before
  sed substitution (same pattern as ANTHROPIC_API_KEY at line 256).

Fixes AI review BLOCKER and MINOR issues on PR #866.
2026-04-16 12:29:00 +00:00
Agent
53a1fe397b fix: hire-an-agent does not persist per-agent secrets to .env (#847) 2026-04-16 12:29:00 +00:00
Claude
a835517aea fix: [nomad-step-1] S1.3 — restore --empty guard + drop hardcoded deploy --dry-run (#842)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/nomad-validate Pipeline was successful
ci/woodpecker/pr/secret-scan Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
Picks up from abandoned PR #859 (branch fix/issue-842 @ 6408023). Two
bugs in the prior art:

1. The `--empty is only valid with --backend=nomad` guard was removed
   when the `--with`/mutually-exclusive guards were added. This regressed
   test #6 in tests/disinto-init-nomad.bats:102 — `disinto init
   --backend=docker --empty --dry-run` was exiting 0 instead of failing.
   Restored alongside the new guards.

2. `_disinto_init_nomad` unconditionally appended `--dry-run` to the
   real-run deploy_cmd, so even `disinto init --backend=nomad --with
   forgejo` (no --dry-run) would only echo the deploy plan instead of
   actually running nomad job run. That violates the issue's acceptance
   criteria ("Forgejo job deploys", "curl http://localhost:3000/api/v1/version
   returns 200"). Removed.

All 17 tests in tests/disinto-init-nomad.bats now pass; shellcheck clean.
2026-04-16 12:21:28 +00:00
Agent
719fdaeac4 fix: [nomad-step-1] S1.3 — wire --with forgejo into bin/disinto init --backend=nomad (#842) 2026-04-16 12:19:51 +00:00
Claude
72ed1f112d fix: [nomad-step-0] S0.1-fix — bin/disinto swallows --backend=nomad as repo_url positional (#835)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/nomad-validate Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
Why: disinto_init() consumed $1 as repo_url before the argparse loop ran,
so `disinto init --backend=nomad --empty` had --backend=nomad swallowed
into repo_url, backend stayed at its "docker" default, and the --empty
validation then produced the nonsense "--empty is only valid with
--backend=nomad" error — flagged during S0.1 end-to-end verification on
a fresh LXC. nomad backend takes no positional anyway; the LXC already
has the repo cloned by the operator.

Change: only consume $1 as repo_url if it doesn't start with "--", then
defer the "repo URL required" check to after argparse (so the docker
path still errors with a helpful message on a missing positional, not
"Unknown option: --backend=docker").

Verified acceptance criteria:
  1. init --backend=nomad --empty             → dispatches to nomad
  2. init --backend=nomad --empty --dry-run   → 9-step plan, exit 0
  3. init <repo-url>                          → docker path unchanged
  4. init                                     → "repo URL required"
  5. init --backend=docker                    → "repo URL required"
                                                (not "Unknown option")
  6. shellcheck clean

Tests: 4 new regression cases in tests/disinto-init-nomad.bats covering
flag-first nomad invocation (both --flag=value and --flag value forms),
no-args docker default, and --backend=docker missing-positional error
path. Full suite: 10/10 pass.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 09:19:36 +00:00
Claude
5150f8c486 fix: [nomad-step-0] S0.5 — Woodpecker CI validation for nomad/vault artifacts (#825)
Some checks failed
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline failed
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/nomad-validate Pipeline failed
ci/woodpecker/pr/secret-scan Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline failed
Locks in static validation for every Nomad+Vault artifact before it can
merge. Four fail-closed steps in .woodpecker/nomad-validate.yml, gated
to PRs touching nomad/, lib/init/nomad/, or bin/disinto:

  1. nomad config validate nomad/server.hcl nomad/client.hcl
  2. vault operator diagnose -config=nomad/vault.hcl -skip=storage -skip=listener
  3. shellcheck --severity=warning lib/init/nomad/*.sh bin/disinto
  4. bats tests/disinto-init-nomad.bats — dispatcher smoke tests

bin/disinto picks up pre-existing SC2120 warnings on three passthrough
wrappers (generate_agent_docker, generate_caddyfile, generate_staging_index);
annotated with shellcheck disable=SC2120 so the new pipeline is clean
without narrowing the warning for future code.

Pinned image versions (hashicorp/nomad:1.9.5, hashicorp/vault:1.18.5)
match lib/init/nomad/install.sh — bump both or neither.

nomad/AGENTS.md documents the stack layout, how to add a jobspec in
Step 1, how CI validates it, and the two-place version pinning rule.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 07:54:06 +00:00
Claude
d2c6b33271 fix: [nomad-step-0] S0.4 — disinto init --backend=nomad --empty orchestrator (cluster-up) (#824)
Some checks failed
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/pr/ci Pipeline failed
ci/woodpecker/pr/smoke-init Pipeline failed
Wires S0.1–S0.3 into a single idempotent bring-up script and replaces
the S0.1 stub in _disinto_init_nomad so `disinto init --backend=nomad
--empty` produces a running empty single-node cluster on a fresh box.

lib/init/nomad/cluster-up.sh (new):
  1. install.sh                (nomad + vault binaries)
  2. systemd-nomad.sh          (unit + enable, not started)
  3. systemd-vault.sh          (unit + vault.hcl + enable)
  4. host-volume dirs under /srv/disinto/* (matching nomad/client.hcl)
  5. /etc/nomad.d/{server,client}.hcl (content-compare before write)
  6. vault-init.sh             (first-run init + unseal + persist keys)
  7. systemctl start vault     (poll until unsealed; fail-fast on
                                is-failed)
  8. systemctl start nomad     (poll until ≥1 node ready)
  9. /etc/profile.d/disinto-nomad.sh (VAULT_ADDR + NOMAD_ADDR for
                                      interactive shells)
  Re-running on a healthy box is a no-op — each sub-step is itself
  idempotent and steps 7/8 fast-path when already active + healthy.
  `--dry-run` prints the full step list and exits 0.

bin/disinto:
  - _disinto_init_nomad: replaces the S0.1 stub. Invokes cluster-up.sh
    directly (as root) or via `sudo -n` otherwise. Both `--empty` and
    the default (no flag) call cluster-up.sh today; Step 1 will branch
    on $empty to gate job deployment. --dry-run forwards through.
  - disinto_init: adds `--empty` flag parsing; rejects `--empty`
    combined with `--backend=docker` explicitly instead of silently
    ignoring it.
  - usage: documents `--empty` and drops the "stub, S0.1" annotation
    from --backend.

Closes #824.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 07:22:15 +00:00
Claude
de00400bc4 fix: [nomad-step-0] S0.1 — add --backend=nomad flag + stub to bin/disinto init (#821)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
Lands the dispatch entry point for the Nomad+Vault migration. The docker
path remains the default and is byte-for-byte unchanged. The new
`--backend=nomad` value routes to a `_disinto_init_nomad` stub that fails
loud (exit 99) so no silent misrouting can happen while S0.2–S0.5 fill in
the real implementation. With `--dry-run --backend=nomad` the stub reports
status and exits 0 so dry-run callers (P7) don't see a hard failure.

- New `--backend <value>` flag (accepts `docker` | `nomad`); supports
  both `--backend nomad` and `--backend=nomad` forms.
- Invalid backend values are rejected with a clear error.
- `_disinto_init_nomad` lives next to `disinto_init` so future S0.x
  issues only need to fill in this function — flag parsing and dispatch
  stay frozen.
- `--help` lists the flag and both values.
- `shellcheck bin/disinto` introduces no new findings beyond the
  pre-existing baseline.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 05:43:35 +00:00
Claude
9d8f322005 fix: [nomad-prep] P7 — make disinto init idempotent + add --dry-run (#800)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
Make `disinto init` safe to re-run on the same box:

- Store admin token as FORGE_ADMIN_TOKEN in .env; preserve on re-run
  (previously deleted and recreated every run, churning DB state)
- Fix human token creation: use admin_pass for basic-auth since
  human_user == admin_user (previously used a random password that
  never matched the actual user password, so HUMAN_TOKEN was never
  created successfully)
- Preserve HUMAN_TOKEN in .env on re-run (same pattern as bot tokens)
- Bot tokens were already idempotent (preserved unless --rotate-tokens)

Add --dry-run flag that reports every intended action (file writes,
API calls, docker commands) based on current state, then exits 0
without touching state. Useful for CI gating and cutover confidence.

Update smoke test:
- Add dry-run test (verifies exit 0 and no .env modification)
- Add idempotency state diff (verifies .env is unchanged on re-run)
- Verify FORGE_ADMIN_TOKEN and HUMAN_TOKEN are stored in .env

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 22:37:22 +00:00
Claude
f90702f930 fix: infra: _regen_file does not restore stash if generator fails — compose file lost at temp path (#784)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 18:55:51 +00:00
Claude
88676e65ae fix: feat: consolidate secret stores — single granular secrets/*.enc, deprecate .env.vault.enc (#777)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 18:35:03 +00:00
Claude
5dda6dc8e9 fix: feat: disinto secrets add — accept piped stdin for non-interactive imports (#776)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 14:08:28 +00:00
Claude
53ce7ad475 fix: infra: disinto up should regenerate compose/Caddyfile from lib/generators.sh and reconcile orphans before docker compose up -d (#770)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
- Add `_regen_file` helper that idempotently regenerates a file: moves
  existing file aside, runs the generator, compares output byte-for-byte,
  and either restores the original (preserving mtime) or keeps the new
  version with a `.prev` backup.
- `disinto_up` now calls `generate_compose` and `generate_caddyfile`
  before bringing the stack up, ensuring generator changes are applied.
- Pass `--build --remove-orphans` to `docker compose up -d` so image
  rebuilds and orphan container cleanup happen automatically.
- Add `--no-regen` escape hatch that skips regeneration and prints a
  warning for operators debugging generators or testing hand-edits.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 11:12:38 +00:00
d0c0ef724a Merge pull request 'fix: infra: agents-llama (local-Qwen dev agent) is hand-added to docker-compose.yml — move into lib/generators.sh as a flagged service (#769)' (#780) from fix/issue-769 into main
Some checks failed
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/ops-filer Pipeline failed
2026-04-15 10:09:43 +00:00
Claude
0104ac06a8 fix: infra: agents-llama (local-Qwen dev agent) is hand-added to docker-compose.yml — move into lib/generators.sh as a flagged service (#769)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 09:58:44 +00:00
Claude
92f19cb2b3 feat: publish versioned agent images — compose should use image: not build: (#429)
- Generated compose now uses `image: ghcr.io/disinto/{agents,edge}` instead
  of `build:` directives; `disinto init --build` restores local-build mode
- Add VOLUME declarations to agents, reproduce, and edge Dockerfiles
- Add CI pipeline (.woodpecker/publish-images.yml) to build and push images
  to ghcr.io/disinto on tag events
- Mount projects/, .env, and state/ into agents container for runtime config
- Skip pre-build binary download when compose uses registry images

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 09:24:05 +00:00
Claude
30e19f71e2 fix: vision(#623): Forgejo OAuth gate for disinto-chat (#708)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
Gate /chat/* behind Forgejo OAuth2 authorization-code flow.

- Extract generic _create_forgejo_oauth_app() helper in lib/ci-setup.sh;
  Woodpecker OAuth becomes a thin wrapper, chat gets its own app.
- bin/disinto init now creates TWO OAuth apps (woodpecker-ci + disinto-chat)
  and writes CHAT_OAUTH_CLIENT_ID / CHAT_OAUTH_CLIENT_SECRET to .env.
- docker/chat/server.py: new routes /chat/login (→ Forgejo authorize),
  /chat/oauth/callback (code→token exchange, user allowlist check, session
  cookie). All other /chat/* routes require a valid session or redirect to
  /chat/login. Session store is in-memory with 24h TTL.
- lib/generators.sh: pass FORGE_URL, CHAT_OAUTH_CLIENT_ID,
  CHAT_OAUTH_CLIENT_SECRET, EDGE_TUNNEL_FQDN, DISINTO_CHAT_ALLOWED_USERS
  to the chat container environment.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 01:52:16 +00:00
Claude
bfdf252239 fix: vision(#623): Caddy subpath routing skeleton + Forgejo/Woodpecker host reconfig (#704)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
2026-04-11 23:48:54 +00:00
Claude
6589c761ba fix: refactor: lib/env.sh — split into a defined-surface shared lib; entrypoints own context-specific paths (#674)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 13:21:30 +00:00
Claude
59e71a285b fix: disinto init: bootstrap shared CLAUDE_CONFIG_DIR for OAuth lock coherence (#641)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 20:15:35 +00:00
Claude
cd115a51a3 fix: edge control critical bugs - .env dedup, authorized_keys, Caddy routes
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
- Fix .env write in edge register to use single grep -Ev + mv pattern (not three-pass append)
- Fix register.sh to source authorized_keys.sh and call rebuild_authorized_keys directly
- Fix caddy.sh remove_route to use jq to find route index by host match
- Fix authorized_keys.sh operator precedence: { [ -z ] || [ -z ]; } && continue
- Fix install.sh Caddyfile to use { admin localhost:2019 } global options
- Fix deregister and status SSH to use StrictHostKeyChecking=accept-new
2026-04-10 19:26:41 +00:00
Claude
cf3c63bf68 fix: SSH accept-new and DOMAIN_SUFFIX configuration for edge control
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
- Changed SSH StrictHostKeyChecking from 'no' to 'accept-new' for better security
- Fixed .env write logic with proper deduplication before appending
- Fixed deregister .env cleanup to use single grep pattern
- Added --domain-suffix option to install.sh
- Removed no-op DOMAIN_SUFFIX sed from install.sh
- Changed cp -n to cp for idempotent script updates
- Fixed authorized_keys.sh SCRIPT_DIR to point to lib/
- Fixed Caddy route management to use POST /routes instead of /load
- Fixed Caddy remove_route to find route by host match, not hardcoded index
2026-04-10 19:09:43 +00:00
Claude
637ea66a5a fix: feat: disinto edge command + SSH-forced-command control plane in tools/edge-control/ (#621)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
2026-04-10 18:45:06 +00:00
Claude
fd67a6afc6 fix: feat: disinto init — prompt for disinto-admin password instead of hardcoding it (#620)
Some checks failed
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline failed
2026-04-10 18:19:16 +00:00
Claude
c3074e83fc fix: fix: agents container should clone project repo on first startup; treat init's host clone as operator-side only (#605)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
2026-04-10 17:19:33 +00:00