hire-an-agent now adds the new Forgejo user as a `write` collaborator on
`$FORGE_REPO` right after the token step, mirroring the collaborator setup
lib/forge-setup.sh applies to the canonical bot users. Without this, a
freshly hired agent's PATCH to assign itself an issue returned 403 Forbidden
and the dev-agent polled forever logging "claim lost to <none>".
issue_claim() now captures the PATCH HTTP status via `-w '%{http_code}'`
instead of swallowing failures with `curl -sf ... || return 1`. A 403 (or
any non-2xx) now surfaces a distinct log line naming the code — the missing
collaborator root cause would have been diagnosable in seconds instead of
minutes.
Also updates the lib-issue-claim bats mock to handle the new `-w` flag and
adds a regression test covering the HTTP-error log surfacing path.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Why: disinto_init() consumed $1 as repo_url before the argparse loop ran,
so `disinto init --backend=nomad --empty` had --backend=nomad swallowed
into repo_url, backend stayed at its "docker" default, and the --empty
validation then produced the nonsense "--empty is only valid with
--backend=nomad" error — flagged during S0.1 end-to-end verification on
a fresh LXC. nomad backend takes no positional anyway; the LXC already
has the repo cloned by the operator.
Change: only consume $1 as repo_url if it doesn't start with "--", then
defer the "repo URL required" check to after argparse (so the docker
path still errors with a helpful message on a missing positional, not
"Unknown option: --backend=docker").
Verified acceptance criteria:
1. init --backend=nomad --empty → dispatches to nomad
2. init --backend=nomad --empty --dry-run → 9-step plan, exit 0
3. init <repo-url> → docker path unchanged
4. init → "repo URL required"
5. init --backend=docker → "repo URL required"
(not "Unknown option")
6. shellcheck clean
Tests: 4 new regression cases in tests/disinto-init-nomad.bats covering
flag-first nomad invocation (both --flag=value and --flag value forms),
no-args docker default, and --backend=docker missing-positional error
path. Full suite: 10/10 pass.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Forgejo's assignees PATCH is last-write-wins, so two dev agents polling
concurrently could both observe .assignee == null at the pre-check, both
PATCH, and the loser would silently "succeed" and proceed to implement
the same issue — colliding at the PR/branch stage.
Re-read the assignee after the PATCH and bail out if it isn't self.
Label writes are moved AFTER this verification so a losing claim leaves
no stray in-progress label to roll back.
Adds tests/lib-issue-claim.bats covering the three paths:
- happy path (single agent, re-read confirms self)
- lost race (re-read shows another agent — returns 1, no labels added)
- pre-check skip (initial GET already shows another agent)
Prerequisite for the LLAMA_BOTS parametric refactor that will run N
dev containers against the same project.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The bin/disinto flag loop has separate cases for `--backend value`
(space-separated) and `--backend=value`; a regression in either would
silently route to the docker default path. Per the "stub-first dispatch"
lesson, silent misrouting during a migration is the worst failure mode —
covering both forms closes that gap.
Also triggers a retry of the smoke-init pipeline step, which hit a known
Forgejo branch-indexing flake on pipeline #913 (same flake cleared on
retry for PR #829 pipelines #906 → #908); unrelated to the nomad-validate
changes, which went all-green in #913.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Locks in static validation for every Nomad+Vault artifact before it can
merge. Four fail-closed steps in .woodpecker/nomad-validate.yml, gated
to PRs touching nomad/, lib/init/nomad/, or bin/disinto:
1. nomad config validate nomad/server.hcl nomad/client.hcl
2. vault operator diagnose -config=nomad/vault.hcl -skip=storage -skip=listener
3. shellcheck --severity=warning lib/init/nomad/*.sh bin/disinto
4. bats tests/disinto-init-nomad.bats — dispatcher smoke tests
bin/disinto picks up pre-existing SC2120 warnings on three passthrough
wrappers (generate_agent_docker, generate_caddyfile, generate_staging_index);
annotated with shellcheck disable=SC2120 so the new pipeline is clean
without narrowing the warning for future code.
Pinned image versions (hashicorp/nomad:1.9.5, hashicorp/vault:1.18.5)
match lib/init/nomad/install.sh — bump both or neither.
nomad/AGENTS.md documents the stack layout, how to add a jobspec in
Step 1, how CI validates it, and the two-place version pinning rule.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Make `disinto init` safe to re-run on the same box:
- Store admin token as FORGE_ADMIN_TOKEN in .env; preserve on re-run
(previously deleted and recreated every run, churning DB state)
- Fix human token creation: use admin_pass for basic-auth since
human_user == admin_user (previously used a random password that
never matched the actual user password, so HUMAN_TOKEN was never
created successfully)
- Preserve HUMAN_TOKEN in .env on re-run (same pattern as bot tokens)
- Bot tokens were already idempotent (preserved unless --rotate-tokens)
Add --dry-run flag that reports every intended action (file writes,
API calls, docker commands) based on current state, then exits 0
without touching state. Useful for CI gating and cutover confidence.
Update smoke test:
- Add dry-run test (verifies exit 0 and no .env modification)
- Add idempotency state diff (verifies .env is unchanged on re-run)
- Verify FORGE_ADMIN_TOKEN and HUMAN_TOKEN are stored in .env
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Fix off-by-one in mock admin/users/{username}/repos path extraction
(parts[4] was 'users', not the username — should be parts[5])
- Change _install_cron_impl to return 1 instead of exit 1 when crontab
is missing, so cron failure doesn't abort entire init
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add cleanup trap to smoke-init.sh that kills all mock-forgejo.py processes
on exit (success or failure). Also ensure cleanup at test start removes
any leftover processes from prior runs.
In .woodpecker/smoke-init.yml:
- Store the PID of the mock-forgejo.py background process
- Kill the process after smoke test completes
This prevents accumulation of orphaned Python processes that caused
OOM issues (2881 processes consuming 7.45GB RAM).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fix collaborator PUT: use parts[7] instead of parts[6]
- Fix user/repos: store username in token object and use it for lookup
- Fix /mock/shutdown: strip leading slash unconditionally
- Fix SIGTERM: call server.shutdown() in a thread
- Use socket module constants for setsockopt
- Remove duplicate pattern
The mock docker in smoke-init.sh only handled 'admin user create' and
'admin user list'. Add a 'change-password' handler that PATCHes the
user via the Forgejo admin API to clear must_change_password.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The mock crontab file was not being created despite PATH precedence
working correctly. Replace the mock with the real BusyBox crontab
already available in the Forgejo Alpine image. Verify cron entries
via 'crontab -l' output instead of checking a mock state file.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Forgejo's admin API POST /admin/users may not honor
must_change_password:false in the request body. Previously only admin
users got a PATCH (to set admin:true), which incidentally cleared
must_change_password. Bot users had no PATCH, so basic auth for token
creation returned 401.
Now every mock-created user gets a PATCH to explicitly set
must_change_password:false, fixing bot token creation.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The install endpoint POST returned 404 because FORGEJO__database__DB_TYPE
env var auto-configured Forgejo, bypassing install mode.
Fix: run the Forgejo image as the step container instead of a service.
This gives CLI access to `forgejo admin user create` for bootstrap
admin setup — no install endpoint needed.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add tests/smoke-init.sh — an end-to-end smoke test that runs
disinto init --bare --yes against a real Forgejo instance
(started as a Woodpecker service container).
The test validates:
- Forgejo API responds after init
- Admin and bot users created with tokens
- Repo created with labels on Forgejo
- Project TOML generated correctly
- .env written with FORGE_TOKEN and FORGE_REVIEW_TOKEN
- Cron entries installed (dev-poll, review-poll, gardener)
Uses mock binaries for docker (routes user creation to Forgejo
admin API), claude, tmux, and crontab to run in CI without
Docker-in-Docker.
Wired into CI via .woodpecker/smoke-init.yml (separate pipeline
with Forgejo service, runs on push and pull_request).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>