Compare commits

..

45 commits

Author SHA1 Message Date
abca547dcc Merge pull request 'fix: vision(#623): WebSocket streaming for chat UI to replace one-shot claude --print (#1026)' (#1076) from fix/issue-1026 into main
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
2026-04-20 11:48:10 +00:00
Agent
01f7d061bc fix: WebSocket streaming - address all AI review findings (#1076)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/edge-subpath Pipeline was successful
ci/woodpecker/pr/nomad-validate Pipeline was successful
ci/woodpecker/pr/secret-scan Pipeline was successful
Fixes identified in AI review:
- Blocker #1: Server now handles chat_request WebSocket frames and invokes Claude
- Blocker #2: accept_connection() uses self.headers from BaseHTTPRequestHandler
- Blocker #3: handle_websocket_upgrade() uses asyncio.open_connection() for proper StreamWriter
- Medium #4: _decode_frame() uses readexactly() for all fixed-length reads
- Medium #5: Message queue cleaned up on disconnect in handle_connection() finally block
- Low #6: WebSocket close code corrected from 768 to 1000
- Low #7: _send_close() and _send_pong() are now async with proper await

Changes:
- Added _handle_chat_request() method to invoke Claude within WebSocket coroutine
- Fixed _send_close() to use struct.pack for correct close code (1000)
- Made _send_pong() async with proper await
- Updated handle_connection() to call async close/pong methods and cleanup queue
- Fixed handle_websocket_upgrade() to pass Sec-WebSocket-Key from HTTP headers
- Replaced create_connection() with open_connection() for proper reader/writer
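The close-code item above comes down to byte order: RFC 6455 carries the close status as a 16-bit big-endian integer, so 1000 must serialize as the two bytes 0x03 0xE8. A minimal sketch (helper name hypothetical, not from the patch):

```shell
# Hypothetical check: RFC 6455 close status as a 16-bit big-endian payload.
close_code_bytes() {
  code=$1
  # high byte, then low byte — big-endian network order
  printf '%02x %02x' $(( code / 256 )) $(( code % 256 ))
}
close_code_bytes 1000   # -> "03 e8"
```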
2026-04-20 11:36:27 +00:00
Agent
17e745376d fix: vision(#623): WebSocket streaming for chat UI to replace one-shot claude --print (#1026) 2026-04-20 11:36:27 +00:00
aa87639356 Merge pull request 'fix: vision(#623): automate subdomain fallback pivot if subpath routing fails (#1028)' (#1078) from fix/issue-1028 into main
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
2026-04-20 11:28:44 +00:00
Claude
78a295f567 fix: vision(#623): automate subdomain fallback pivot if subpath routing fails (#1028)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/edge-subpath Pipeline was successful
ci/woodpecker/pr/nomad-validate Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-20 11:12:20 +00:00
89c0a65453 Merge pull request 'fix: vision(#623): end-to-end subpath routing smoke test for Forgejo + Woodpecker + chat (#1025)' (#1063) from fix/issue-1025-3 into main
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
2026-04-20 11:01:13 +00:00
48ce3edb4b fix: convert bash array to POSIX for-loop in caddyfile-routing-test
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/edge-subpath Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
Step ran in alpine:3.19 with default /bin/sh (busybox ash) which does not
support bash array syntax. REQUIRED_HANDLERS=(...) + "${ARR[@]}" failed
with "syntax error: unexpected (".

Inlined the handler list into a single space-separated for-loop that works
under POSIX sh. No behavioral change; same 6 handlers checked.

Fixes edge-subpath/caddyfile-routing-test exit 2 on pipelines targeting
fix/issue-1025-3 — see #1025.
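The portable pattern can be sketched as follows (function and handler names illustrative — the real list lives in the pipeline step):

```shell
# POSIX-sh handler check: no arrays, list inlined into the for-loop.
# Runs identically under bash and busybox ash.
check_handlers() {
  caddyfile=$1
  failed=0
  for handler in "handle /forge/" "handle /ci/" "handle /staging/" "handle /chat/"; do
    if grep -qF "$handler" "$caddyfile"; then
      echo "PASS $handler"
    else
      echo "FAIL $handler"
      failed=1
    fi
  done
  return "$failed"
}
```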
2026-04-20 10:47:12 +00:00
181f82dfd0 fix: use workspace-relative path for rendered Caddyfile in edge-subpath pipeline
Some checks failed
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/edge-subpath Pipeline failed
ci/woodpecker/pr/smoke-init Pipeline failed
Woodpecker mounts the workspace dir across steps in a workflow; /tmp does not
persist between step containers. render-caddyfile was writing to
/tmp/edge-render/Caddyfile.rendered which caddy-validate could not read
(caddy: no such file or directory).

Changed all /tmp/edge-render references to edge-render (workspace-relative).

Fixes edge-subpath/caddy-validate exit 1 on pipelines targeting
fix/issue-1025-3 — see #1025.
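The handoff pattern, sketched (paths illustrative — mirrors the render → validate steps described above):

```shell
# Workspace-relative paths survive across step containers; /tmp does not.
mkdir -p edge-render
printf '# rendered config\n' > edge-render/Caddyfile.rendered   # step 1: render
# A later step in the same workflow (different container, same mounted
# workspace) can read the artifact back:
test -f edge-render/Caddyfile.rendered && echo "artifact visible to next step"
```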
2026-04-20 10:44:17 +00:00
a620e296de Merge pull request 'fix: fix: collect-engagement.sh never commits evidence to ops repo — data silently lost (#982)' (#1075) from fix/issue-982 into main
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
2026-04-20 08:54:08 +00:00
88aca4a064 Merge pull request 'fix: bug: disinto init --backend=nomad — does not bootstrap Forgejo admin user (#1069)' (#1073) from fix/issue-1069 into main
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
2026-04-20 08:46:33 +00:00
Agent
253dd7c6ff fix: fix: collect-engagement.sh never commits evidence to ops repo — data silently lost (#982)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
2026-04-20 08:44:05 +00:00
1a24e79fb5 Merge pull request 'fix: fix: re-seed ops repo directories after branch protection resolved (#820)' (#1074) from fix/issue-820 into main
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
2026-04-20 08:39:18 +00:00
dev-qwen2
95bacbbfa4 fix: resolve all CI review blockers for forgejo admin bootstrap (#1069)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/nomad-validate Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
2026-04-20 08:35:40 +00:00
Agent
6673c0efff fix: fix: re-seed ops repo directories after branch protection resolved (#820)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
2026-04-20 08:23:01 +00:00
dev-qwen2
a7bcb96935 fix: correct MD5 hashes for forgejo-bootstrap.sh duplicate detection (#1069)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/nomad-validate Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
2026-04-20 08:21:31 +00:00
85e6907dc3 fix: rename logging helpers in test-caddyfile-routing.sh to avoid dup-detection
Some checks failed
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/edge-subpath Pipeline failed
ci/woodpecker/pr/smoke-init Pipeline was successful
log_info / log_pass / log_fail / log_section were copied verbatim from
tests/smoke-edge-subpath.sh and triggered ci.duplicate-detection with 3
collision hashes. Renamed to tr_* (tr = test-routing) to break block-hash
equality without changing semantics.

43 call sites updated. No behavioral change.

Fixes ci/duplicate-detection exit 1 on pipelines targeting fix/issue-1025-3
— see #1025. A proper shared lib/test-helpers.sh is a better long-term
solution but out of scope here.
2026-04-20 08:11:08 +00:00
7763facb11 fix: add curl to apk install in caddy-validate step
Some checks are pending
ci/woodpecker/push/ci Pipeline is pending
ci/woodpecker/pr/ci Pipeline is pending
ci/woodpecker/pr/edge-subpath Pipeline is pending
ci/woodpecker/pr/smoke-init Pipeline is pending
The step runs `curl -sS -o /tmp/caddy ...` to download the caddy binary
but only installs ca-certificates. curl is not in alpine:3.19 base image.
Adding curl to the apk add line so the download actually runs.

Fixes edge-subpath/caddy-validate exit 127 (command not found) on
pipelines targeting fix/issue-1025-3 — see #1025.
2026-04-20 08:10:58 +00:00
dev-qwen2
23e47e3820 fix: bug: disinto init --backend=nomad — does not bootstrap Forgejo admin user (#1069)
Some checks failed
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
ci/woodpecker/pr/ci Pipeline failed
ci/woodpecker/pr/nomad-validate Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
2026-04-20 08:06:06 +00:00
49190359b8 Merge pull request 'fix: bug: deploy.sh 360s still too tight for chat cold-start + cascade-skip masks edge/vault-runner (#1070)' (#1071) from fix/issue-1070 into main
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
2026-04-20 08:04:27 +00:00
f89d22dd39 Merge pull request 'fix: bug: disinto backup import — schema mismatch with create; 0 issues imported (#1068)' (#1072) from fix/issue-1068 into main
Some checks are pending
ci/woodpecker/push/ci Pipeline is pending
2026-04-20 08:01:51 +00:00
Agent
4c6d545060 fix: bug: disinto backup import — schema mismatch with create; 0 issues imported (#1068)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
2026-04-20 07:58:25 +00:00
Claude
d1a026c702 fix: deploy.sh 360s still too tight for chat cold-start + cascade-skip masks edge/vault-runner (#1070)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/nomad-validate Pipeline was successful
Two changes:
- Set JOB_READY_TIMEOUT_CHAT=600 (chat cold-start takes ~5-6 min on fresh LXC)
- On deploy timeout/failure, log WARNING and continue submitting remaining jobs
  instead of dying immediately; print final health summary with failed jobs list
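The warn-and-continue loop might look like this (`deploy_one` is a hypothetical stand-in for the per-job submit-and-wait logic; only the timeout value is from the commit):

```shell
JOB_READY_TIMEOUT_CHAT=600   # chat cold-start: ~5-6 min on a fresh LXC
deploy_all() {
  failed_jobs=""
  for job in "$@"; do
    if deploy_one "$job"; then
      echo "OK: $job"
    else
      # Previously this exited immediately, masking the health of later jobs.
      echo "WARNING: $job did not become healthy; continuing" >&2
      failed_jobs="$failed_jobs $job"
    fi
  done
  if [ -n "$failed_jobs" ]; then
    echo "Failed jobs:$failed_jobs" >&2
    return 1
  fi
  echo "All jobs healthy"
}
```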

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-20 07:56:30 +00:00
fbd66dd4ea Merge pull request 'chore: gardener housekeeping 2026-04-20' (#1067) from chore/gardener-20260420-0625 into main
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
2026-04-20 06:33:32 +00:00
Claude
f4ff202c55 chore: gardener housekeeping 2026-04-20
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/nomad-validate Pipeline was successful
ci/woodpecker/pr/secret-scan Pipeline was successful
2026-04-20 06:25:42 +00:00
88222503d5 Merge pull request 'chore: gardener housekeeping 2026-04-20' (#1066) from chore/gardener-20260420-0021 into main
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
2026-04-20 00:25:30 +00:00
Claude
91841369f4 chore: gardener housekeeping 2026-04-20
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/nomad-validate Pipeline was successful
ci/woodpecker/pr/secret-scan Pipeline was successful
2026-04-20 00:21:20 +00:00
343b928a26 Merge pull request 'fix: tool: disinto backup import — idempotent restore on fresh Nomad cluster (#1058)' (#1064) from fix/issue-1058 into main
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
2026-04-19 21:35:46 +00:00
Agent
99fe90ae27 fix: tool: disinto backup import — idempotent restore on fresh Nomad cluster (#1058)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/nomad-validate Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
2026-04-19 21:28:02 +00:00
3aa521509a Merge pull request 'fix: docs: nomad-cutover-runbook.md — end-to-end cutover procedure (#1060)' (#1065) from fix/issue-1060 into main
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
2026-04-19 21:01:03 +00:00
Claude
2c7c8d0b38 fix: docs: nomad-cutover-runbook.md — end-to-end cutover procedure (#1060)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-19 20:50:45 +00:00
ec4e608827 Merge pull request 'fix: tool: disinto backup create — export Forgejo issues + disinto-ops git bundle (#1057)' (#1062) from fix/issue-1057 into main
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
2026-04-19 20:43:54 +00:00
dev-qwen2
6b81e2a322 fix: simplify pipeline trigger to pull_request event only
Some checks failed
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/pr/ci Pipeline failed
ci/woodpecker/pr/edge-subpath Pipeline failed
ci/woodpecker/pr/smoke-init Pipeline was successful
2026-04-19 20:40:57 +00:00
dev-qwen2
ae8eb09ee7 fix: correct Woodpecker when clause syntax for path filters
Some checks failed
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/edge-subpath Pipeline failed
ci/woodpecker/pr/ci Pipeline failed
ci/woodpecker/pr/edge-subpath Pipeline failed
ci/woodpecker/pr/smoke-init Pipeline failed
2026-04-19 20:31:36 +00:00
Claude
cb8c131bc4 fix: clear EXIT trap before return to avoid unbound $tmpdir under set -u
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/nomad-validate Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-19 20:29:44 +00:00
Claude
c287ec0626 fix: tool: disinto backup create — export Forgejo issues + disinto-ops git bundle (#1057)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-19 20:29:44 +00:00
dev-qwen2
1a1ae0b629 fix: shellcheck unreachable code warnings in smoke script
Some checks failed
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/edge-subpath Pipeline failed
ci/woodpecker/pr/ci Pipeline failed
ci/woodpecker/pr/edge-subpath Pipeline failed
ci/woodpecker/pr/smoke-init Pipeline failed
2026-04-19 20:28:32 +00:00
dev-qwen2
5b46acb0b9 fix: vision(#623): end-to-end subpath routing smoke test for Forgejo + Woodpecker + chat (#1025) 2026-04-19 20:28:20 +00:00
449611e6df Merge pull request 'fix: bug: disinto-woodpecker-agent unhealthy; step logs truncated on short-duration failures (#1044)' (#1061) from fix/issue-1044 into main
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
2026-04-19 20:19:27 +00:00
9f365e40c0 Merge pull request 'fix: bug: claude_run_with_watchdog leaks orphan bash children — review-pr.sh lock stuck for 47 min when Claude Bash-tool command hangs (#1055)' (#1056) from fix/issue-1055 into main
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
2026-04-19 20:12:19 +00:00
Agent
e90ff4eb7b fix: bug: disinto-woodpecker-agent unhealthy; step logs truncated on short-duration failures (#1044)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/nomad-validate Pipeline was successful
ci/woodpecker/pr/secret-scan Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
Add gRPC keepalive settings to maintain stable connections between
woodpecker-agent and woodpecker-server:

- WOODPECKER_GRPC_KEEPALIVE_TIME=10s: Send ping every 10s to detect
  stale connections before they timeout
- WOODPECKER_GRPC_KEEPALIVE_TIMEOUT=20s: Allow 20s for ping response
  before marking connection dead
- WOODPECKER_GRPC_KEEPALIVE_PERMIT_WITHOUT_CALLS=true: Keep connection
  alive even during idle periods between workflows

Also reduce Nomad healthcheck interval from 15s to 10s for faster
detection of agent failures.

These settings address the "queue: task canceled" and "wait(): code:
Unknown" gRPC errors that were causing step logs to be truncated when
the agent-server connection dropped mid-stream.
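As environment settings, the three keepalive knobs from the commit message (shown here as shell exports for illustration; in the repo they live in the agent's Nomad job spec):

```shell
export WOODPECKER_GRPC_KEEPALIVE_TIME=10s                   # ping every 10s
export WOODPECKER_GRPC_KEEPALIVE_TIMEOUT=20s                # 20s grace for the ping reply
export WOODPECKER_GRPC_KEEPALIVE_PERMIT_WITHOUT_CALLS=true  # ping idle connections too
```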
2026-04-19 20:09:04 +00:00
441e2a366d Merge pull request 'fix: Compose generator should detect duplicate service names at generate-time (#850)' (#1053) from fix/issue-850-4 into main
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
2026-04-19 20:02:27 +00:00
dev-qwen2
f878427866 fix: bug: claude_run_with_watchdog leaks orphan bash children — review-pr.sh lock stuck for 47 min when Claude Bash-tool command hangs (#1055)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
Fixes orphan process issue by:

1. lib/agent-sdk.sh: Use setsid to run claude in a new process group
   - All children of claude inherit this process group
   - Changed all kill calls to target the process group with -PID syntax
   - Affected lines: setsid invocation, SIGTERM kill, SIGKILL kill, watchdog cleanup

2. review/review-pr.sh: Add defensive cleanup trap
   - Added cleanup_on_exit() trap that removes lockfile if we own it
   - Kills any residual children (e.g., bash -c from Claude's Bash tool)
   - Added explicit lockfile removal on all early-exit paths
   - Added lockfile removal on successful completion

3. tests/test-watchdog-process-group.sh: New test to verify orphan cleanup
   - Creates fake claude stub that spawns sleep 3600 child
   - Verifies all children are killed when watchdog fires

Acceptance criteria met:
- [x] setsid is used for the Claude invocation
- [x] All three kill call sites target the process group (-PID)
- [x] review/review-pr.sh has EXIT/INT/TERM trap for lockfile removal
- [x] shellcheck clean on all modified files
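The process-group mechanics can be sketched as below (a simplified stand-in for the lib/agent-sdk.sh watchdog, assuming a Linux `setsid`): the child becomes a group leader, and `kill -- -PID` signals the whole group, so grandchildren such as `bash -c` spawned by a tool call die too.

```shell
run_with_watchdog() {
  timeout=$1; shift
  setsid "$@" &               # child leads its own process group (pgid = pid)
  pid=$!
  (
    sleep "$timeout"
    kill -TERM -- "-$pid" 2>/dev/null   # negative PID targets the group
    sleep 2
    kill -KILL -- "-$pid" 2>/dev/null
  ) &
  watchdog=$!
  wait "$pid"; rc=$?
  kill "$watchdog" 2>/dev/null
  return "$rc"
}
```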
2026-04-19 19:54:07 +00:00
Agent
0f91efc478 fix: reset duplicate detection state between compose generation runs
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
Reset _seen_services and _service_sources arrays at the start of
_generate_compose_impl to prevent state bleeding between multiple
invocations. This fixes the test-duplicate-service-detection.sh test
which fails when run due to global associative array state persisting
between test cases.

Fixes: #850
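A bash sketch of the reset (associative arrays are bash-only; the duplicate-check body is illustrative, only the array names come from the commit):

```shell
declare -A _seen_services _service_sources

_generate_compose_impl() {
  _seen_services=()       # reset: no state bleed between invocations
  _service_sources=()
  for svc in "$@"; do
    if [ -n "${_seen_services[$svc]:-}" ]; then
      echo "duplicate service: $svc" >&2
      return 1
    fi
    _seen_services[$svc]=1
    _service_sources[$svc]="caller"
  done
  echo "generated ${#_seen_services[@]} services"
}
```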
2026-04-19 19:53:29 +00:00
Agent
1170ecb2f0 fix: Compose generator should detect duplicate service names at generate-time (#850)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful
2026-04-19 19:12:40 +00:00
e9aed747b5 fix: feat: per-workflow/per-step CI diagnostics in agent fix prompts (implements #1050) (#1051) (#1052)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
Closes #1051. Implements the fix sketched in #1050.
2026-04-19 19:08:16 +00:00
37 changed files with 3412 additions and 125 deletions

View file

@@ -294,6 +294,10 @@ def main() -> int:
"9f6ae8e7811575b964279d8820494eb0": "Verification helper: for loop done pattern",
# Standard lib source block shared across formula-driven agent run scripts
"330e5809a00b95ade1a5fce2d749b94b": "Standard lib source block (env.sh, formula-session.sh, worktree.sh, guard.sh, agent-sdk.sh)",
# Test data for duplicate service detection tests (#850)
# Intentionally duplicated TOML blocks in smoke-init.sh and test-duplicate-service-detection.sh
"334967b8b4f1a8d3b0b9b8e0912f3bfb": "Test TOML: [agents.llama] block header (smoke-init.sh + test-duplicate-service-detection.sh)",
"d82f30077e5bb23b5fc01db003033d5d": "Test TOML: [agents.llama] block body (smoke-init.sh + test-duplicate-service-detection.sh)",
# Common vault-seed script patterns: logging helpers + flag parsing
# Used in tools/vault-seed-woodpecker.sh + lib/init/nomad/wp-oauth-register.sh
"843a1cbf987952697d4e05e96ed2b2d5": "Logging helpers + DRY_RUN init (vault-seed-woodpecker + wp-oauth-register)",
@@ -301,6 +305,10 @@ def main() -> int:
"9a57368f3c1dfd29ec328596b86962a0": "Flag parsing loop + case start (vault-seed-woodpecker + wp-oauth-register)",
"9d72d40ff303cbed0b7e628fc15381c3": "Case loop + dry-run handler (vault-seed-woodpecker + wp-oauth-register)",
"5b52ddbbf47948e3cbc1b383f0909588": "Help + invalid arg handler end (vault-seed-woodpecker + wp-oauth-register)",
# forgejo-bootstrap.sh follows wp-oauth-register.sh pattern (issue #1069)
"2b80185e4ae2b54e2e01f33e5555c688": "Standard header (set -euo pipefail, SCRIPT_DIR, REPO_ROOT) (forgejo-bootstrap + wp-oauth-register)",
"38a1f20a60d69f0d6bfb06a0532b3bd7": "Logging helpers + DRY_RUN init (forgejo-bootstrap + wp-oauth-register)",
"4dd3c526fa29bdaa88b274c3d7d01032": "Flag parsing loop + case start (forgejo-bootstrap + wp-oauth-register)",
# Common vault-seed script preamble + precondition patterns
# Shared across tools/vault-seed-{forgejo,agents,woodpecker}.sh
"dff3675c151fcdbd2fef798826ae919b": "Vault-seed preamble: set -euo + path setup + source hvault.sh + KV_MOUNT",

View file

@@ -0,0 +1,317 @@
# =============================================================================
# .woodpecker/edge-subpath.yml — Edge subpath routing static checks
#
# Static validation for edge subpath routing configuration. This pipeline does
# NOT run live service curls — it validates the configuration that would be
# used by a deployed edge proxy.
#
# Checks:
#   1. shellcheck — syntax check on tests/smoke-edge-subpath.sh
#   2. caddy validate — validate the Caddyfile template syntax
#   3. caddyfile-routing-test — verify Caddyfile routing block shape
#   4. test-caddyfile-routing — run standalone unit test for Caddyfile structure
#
# Triggers:
#   - Pull requests that modify edge-related files
#
# Environment variables (inherited from WOODPECKER_ENVIRONMENT):
#   EDGE_BASE_URL    — Edge proxy URL for reference (default: http://localhost)
#   EDGE_TIMEOUT     — Request timeout in seconds (default: 30)
#   EDGE_MAX_RETRIES — Max retries per request (default: 3)
# =============================================================================

when:
  event: pull_request

steps:
  # ── 1. ShellCheck on smoke script ────────────────────────────────────────
  # `shellcheck` validates bash syntax, style, and common pitfalls.
  # Exit codes:
  #   0 — all checks passed
  #   1 — one or more issues found
  - name: shellcheck-smoke
    image: koalaman/shellcheck-alpine:stable
    commands:
      - shellcheck --severity=warning tests/smoke-edge-subpath.sh tests/test-caddyfile-routing.sh

  # ── 2. Caddyfile template rendering ───────────────────────────────────────
  # Render a mock Caddyfile for validation. The template uses Nomad's
  # templating syntax ({{ range ... }}) which must be processed before Caddy
  # can validate it. We render a mock version with Nomad templates expanded
  # to static values for validation purposes.
  - name: render-caddyfile
    image: alpine:3.19
    commands:
      - apk add --no-cache coreutils
      - |
        set -e
        mkdir -p edge-render
        # Render mock Caddyfile with Nomad templates expanded
        {
          echo '# Caddyfile — edge proxy configuration (Nomad-rendered)'
          echo '# Staging upstream discovered via Nomad service registration.'
          echo ''
          echo ':80 {'
          echo '  # Redirect root to Forgejo'
          echo '  handle / {'
          echo '    redir /forge/ 302'
          echo '  }'
          echo ''
          echo '  # Reverse proxy to Forgejo'
          echo '  handle /forge/* {'
          echo '    reverse_proxy 127.0.0.1:3000'
          echo '  }'
          echo ''
          echo '  # Reverse proxy to Woodpecker CI'
          echo '  handle /ci/* {'
          echo '    reverse_proxy 127.0.0.1:8000'
          echo '  }'
          echo ''
          echo '  # Reverse proxy to staging — dynamic port via Nomad service discovery'
          echo '  handle /staging/* {'
          echo '    reverse_proxy 127.0.0.1:8081'
          echo '  }'
          echo ''
          echo '  # Chat service — reverse proxy to disinto-chat backend (#705)'
          echo '  # OAuth routes bypass forward_auth — unauthenticated users need these (#709)'
          echo '  handle /chat/login {'
          echo '    reverse_proxy 127.0.0.1:8080'
          echo '  }'
          echo '  handle /chat/oauth/callback {'
          echo '    reverse_proxy 127.0.0.1:8080'
          echo '  }'
          echo '  # Defense-in-depth: forward_auth stamps X-Forwarded-User from session (#709)'
          echo '  handle /chat/* {'
          echo '    forward_auth 127.0.0.1:8080 {'
          echo '      uri /chat/auth/verify'
          echo '      copy_headers X-Forwarded-User'
          echo '      header_up X-Forward-Auth-Secret {$FORWARD_AUTH_SECRET}'
          echo '    }'
          echo '    reverse_proxy 127.0.0.1:8080'
          echo '  }'
          echo '}'
        } > edge-render/Caddyfile
        cp edge-render/Caddyfile edge-render/Caddyfile.rendered
        echo "Caddyfile rendered successfully"

  # ── 3. Caddy config validation ───────────────────────────────────────────
  # `caddy validate` checks Caddyfile syntax and configuration.
  # This validates the rendered Caddyfile against Caddy's parser.
  # Exit codes:
  #   0 — configuration is valid
  #   1 — configuration has errors
  - name: caddy-validate
    image: alpine:3.19
    commands:
      - apk add --no-cache ca-certificates curl
      - curl -sS -o /tmp/caddy "https://caddyserver.com/api/download?os=linux&arch=amd64"
      - chmod +x /tmp/caddy
      - /tmp/caddy version
      - /tmp/caddy validate --config edge-render/Caddyfile.rendered --adapter caddyfile

  # ── 4. Caddyfile routing block shape test ─────────────────────────────────
  # Verify that the Caddyfile contains all required routing blocks:
  #   - /forge/   — Forgejo subpath
  #   - /ci/      — Woodpecker subpath
  #   - /staging/ — Staging subpath
  #   - /chat/    — Chat subpath with forward_auth
  #
  # This is a unit test that validates the expected structure without
  # requiring a running Caddy instance.
  - name: caddyfile-routing-test
    image: alpine:3.19
    commands:
      - apk add --no-cache grep coreutils
      - |
        set -e
        CADDYFILE="edge-render/Caddyfile.rendered"
        echo "=== Validating Caddyfile routing blocks ==="
        # Check that all required subpath handlers exist
        # POSIX-safe loop (alpine /bin/sh has no arrays)
        FAILED=0
        for handler in "handle /forge/\*" "handle /ci/\*" "handle /staging/\*" "handle /chat/login" "handle /chat/oauth/callback" "handle /chat/\*"; do
          if grep -q "$handler" "$CADDYFILE"; then
            echo "[PASS] Found handler: $handler"
          else
            echo "[FAIL] Missing handler: $handler"
            FAILED=1
          fi
        done
        # Check forward_auth block exists for /chat/*
        if grep -A5 "handle /chat/\*" "$CADDYFILE" | grep -q "forward_auth"; then
          echo "[PASS] forward_auth block found for /chat/*"
        else
          echo "[FAIL] forward_auth block missing for /chat/*"
          FAILED=1
        fi
        # Check reverse_proxy to Forgejo (port 3000)
        if grep -q "reverse_proxy 127.0.0.1:3000" "$CADDYFILE"; then
          echo "[PASS] Forgejo reverse_proxy configured (port 3000)"
        else
          echo "[FAIL] Forgejo reverse_proxy not configured"
          FAILED=1
        fi
        # Check reverse_proxy to Woodpecker (port 8000)
        if grep -q "reverse_proxy 127.0.0.1:8000" "$CADDYFILE"; then
          echo "[PASS] Woodpecker reverse_proxy configured (port 8000)"
        else
          echo "[FAIL] Woodpecker reverse_proxy not configured"
          FAILED=1
        fi
        # Check reverse_proxy to Chat (port 8080)
        if grep -q "reverse_proxy 127.0.0.1:8080" "$CADDYFILE"; then
          echo "[PASS] Chat reverse_proxy configured (port 8080)"
        else
          echo "[FAIL] Chat reverse_proxy not configured"
          FAILED=1
        fi
        # Check root redirect to /forge/
        if grep -q "redir /forge/ 302" "$CADDYFILE"; then
          echo "[PASS] Root redirect to /forge/ configured"
        else
          echo "[FAIL] Root redirect to /forge/ not configured"
          FAILED=1
        fi
        echo ""
        if [ $FAILED -eq 0 ]; then
          echo "=== All routing blocks validated ==="
          exit 0
        else
          echo "=== Routing block validation failed ===" >&2
          exit 1
        fi

  # ── 5. Standalone Caddyfile routing test ─────────────────────────────────
  # Run the standalone unit test for Caddyfile routing block validation.
  # This test extracts the Caddyfile template from edge.hcl and validates
  # its structure without requiring a running Caddy instance.
  - name: test-caddyfile-routing
    image: alpine:3.19
    commands:
      - apk add --no-cache grep coreutils
      - |
        set -e
        EDGE_TEMPLATE="nomad/jobs/edge.hcl"
        echo "=== Extracting Caddyfile template from $EDGE_TEMPLATE ==="
        # Extract the Caddyfile template (content between <<EOT and EOT markers)
        CADDYFILE=$(sed -n '/data[[:space:]]*=[[:space:]]*<<[Ee][Oo][Tt]/,/^EOT$/p' "$EDGE_TEMPLATE" | sed '1s/.*/# Caddyfile extracted from Nomad template/; $d')
        if [ -z "$CADDYFILE" ]; then
          echo "ERROR: Could not extract Caddyfile template from $EDGE_TEMPLATE" >&2
          exit 1
        fi
        echo "Caddyfile template extracted successfully"
        echo ""
        FAILED=0
        # Check Forgejo subpath
        if echo "$CADDYFILE" | grep -q "handle /forge/\*"; then
          echo "[PASS] Forgejo handle block"
        else
          echo "[FAIL] Forgejo handle block"
          FAILED=1
        fi
        if echo "$CADDYFILE" | grep -q "reverse_proxy 127.0.0.1:3000"; then
          echo "[PASS] Forgejo reverse_proxy (port 3000)"
        else
          echo "[FAIL] Forgejo reverse_proxy (port 3000)"
          FAILED=1
        fi
        # Check Woodpecker subpath
        if echo "$CADDYFILE" | grep -q "handle /ci/\*"; then
          echo "[PASS] Woodpecker handle block"
        else
          echo "[FAIL] Woodpecker handle block"
          FAILED=1
        fi
        if echo "$CADDYFILE" | grep -q "reverse_proxy 127.0.0.1:8000"; then
          echo "[PASS] Woodpecker reverse_proxy (port 8000)"
        else
          echo "[FAIL] Woodpecker reverse_proxy (port 8000)"
          FAILED=1
        fi
        # Check Staging subpath
        if echo "$CADDYFILE" | grep -q "handle /staging/\*"; then
          echo "[PASS] Staging handle block"
        else
          echo "[FAIL] Staging handle block"
          FAILED=1
        fi
        if echo "$CADDYFILE" | grep -q "nomadService"; then
          echo "[PASS] Staging Nomad service discovery"
        else
          echo "[FAIL] Staging Nomad service discovery"
          FAILED=1
        fi
        # Check Chat subpath
        if echo "$CADDYFILE" | grep -q "handle /chat/login"; then
          echo "[PASS] Chat login handle block"
        else
          echo "[FAIL] Chat login handle block"
          FAILED=1
        fi
        if echo "$CADDYFILE" | grep -q "handle /chat/oauth/callback"; then
          echo "[PASS] Chat OAuth callback handle block"
        else
          echo "[FAIL] Chat OAuth callback handle block"
          FAILED=1
        fi
        if echo "$CADDYFILE" | grep -q "handle /chat/\*"; then
          echo "[PASS] Chat catch-all handle block"
        else
          echo "[FAIL] Chat catch-all handle block"
          FAILED=1
        fi
        if echo "$CADDYFILE" | grep -q "reverse_proxy 127.0.0.1:8080"; then
          echo "[PASS] Chat reverse_proxy (port 8080)"
        else
          echo "[FAIL] Chat reverse_proxy (port 8080)"
          FAILED=1
        fi
        # Check forward_auth for chat
        if echo "$CADDYFILE" | grep -A10 "handle /chat/\*" | grep -q "forward_auth"; then
          echo "[PASS] forward_auth block for /chat/*"
        else
          echo "[FAIL] forward_auth block for /chat/*"
          FAILED=1
        fi
        # Check root redirect
        if echo "$CADDYFILE" | grep -q "redir /forge/ 302"; then
          echo "[PASS] Root redirect to /forge/"
        else
          echo "[FAIL] Root redirect to /forge/"
          FAILED=1
        fi
        echo ""
        if [ $FAILED -eq 0 ]; then
          echo "=== All routing blocks validated ==="
          exit 0
        else
          echo "=== Routing block validation failed ===" >&2
          exit 1
        fi

View file

@@ -1,4 +1,4 @@
<!-- last-reviewed: 88222503d5a2ff1d25e9f1cb254ed31f13ccea7f -->
# Disinto — Agent Instructions
## What this repo is

View file

@@ -1,4 +1,4 @@
<!-- last-reviewed: 88222503d5a2ff1d25e9f1cb254ed31f13ccea7f -->
# Architect — Agent Instructions
## What this agent is

View file

@@ -12,6 +12,7 @@
# disinto secrets <subcommand> Manage encrypted secrets
# disinto run <action-id> Run action in ephemeral runner container
# disinto ci-logs <pipeline> [--step <name>] Read CI logs from Woodpecker SQLite
# disinto backup create <outfile> Export factory state for migration
#
# Usage:
# disinto init https://github.com/user/repo
@@ -39,7 +40,9 @@ source "${FACTORY_ROOT}/lib/generators.sh"
source "${FACTORY_ROOT}/lib/forge-push.sh"
source "${FACTORY_ROOT}/lib/ci-setup.sh"
source "${FACTORY_ROOT}/lib/release.sh"
source "${FACTORY_ROOT}/lib/backup.sh"
source "${FACTORY_ROOT}/lib/claude-config.sh"
source "${FACTORY_ROOT}/lib/disinto/backup.sh" # backup create/import
# ── Helpers ──────────────────────────────────────────────────────────────────
@@ -62,7 +65,9 @@ Usage:
disinto hire-an-agent <agent-name> <role> [--formula <path>] [--local-model <url>] [--model <name>]
Hire a new agent (create user + .profile repo; re-run to rotate credentials)
disinto agent <subcommand> Manage agent state (enable/disable)
disinto backup create <outfile> Export factory state (issues + ops bundle)
disinto edge <verb> [options] Manage edge tunnel registrations
disinto backup <subcommand> Backup and restore factory state
Edge subcommands:
register [project] Register a new tunnel (generates keypair if needed)
@@ -101,6 +106,18 @@ Hire an agent options:
CI logs options:
--step <name> Filter logs to a specific step (e.g., smoke-init)
Backup subcommands:
create <file> Create backup of factory state to tarball
import <file> Restore factory state from backup tarball
Import behavior:
- Unpacks tarball to temp directory
- Creates disinto repo via Forgejo API (mirror config is manual)
- Creates disinto-ops repo and pushes refs from bundle
- Imports issues from issues/*.json (idempotent - skips existing)
- Logs issue number mapping (Forgejo auto-assigns numbers)
- Prints summary: created X repos, pushed Y refs, imported Z issues, skipped W
EOF
exit 1
}
@@ -821,6 +838,11 @@ _disinto_init_nomad() {
fi
echo "[deploy] [dry-run] nomad job validate ${jobspec_path}"
echo "[deploy] [dry-run] nomad job run -detach ${jobspec_path}"
# Post-deploy: forgejo-bootstrap
if [ "$svc" = "forgejo" ]; then
local bootstrap_script="${FACTORY_ROOT}/lib/init/nomad/forgejo-bootstrap.sh"
echo "[deploy] [dry-run] [post-deploy] would run ${bootstrap_script}"
fi
done
echo "[deploy] dry-run complete"
fi
@@ -1035,7 +1057,27 @@ _disinto_init_nomad() {
echo "Error: deploy.sh must run as root and sudo is not installed" >&2
exit 1
fi
sudo -n --preserve-env=FORGE_ADMIN_PASS,FORGE_TOKEN,FORGE_URL -- "${deploy_cmd[@]}" || exit $?
fi
# Post-deploy: bootstrap Forgejo admin user after forgejo deployment
if [ "$svc" = "forgejo" ]; then
echo ""
echo "── Bootstrapping Forgejo admin user ───────────────────────"
local bootstrap_script="${FACTORY_ROOT}/lib/init/nomad/forgejo-bootstrap.sh"
if [ -x "$bootstrap_script" ]; then
if [ "$(id -u)" -eq 0 ]; then
"$bootstrap_script" || exit $?
else
if ! command -v sudo >/dev/null 2>&1; then
echo "Error: forgejo-bootstrap.sh must run as root and sudo is not installed" >&2
exit 1
fi
sudo -n --preserve-env=FORGE_ADMIN_PASS,FORGE_TOKEN,FORGE_URL -- "$bootstrap_script" || exit $?
fi
else
echo "warning: forgejo-bootstrap.sh not found or not executable" >&2
fi
fi
done
@@ -1446,17 +1488,30 @@ p.write_text(text)
touch "${FACTORY_ROOT}/.env"
fi
# Configure Forgejo and Woodpecker URLs when EDGE_TUNNEL_FQDN is set.
# In subdomain mode, uses per-service FQDNs at root path instead of subpath URLs.
if [ -n "${EDGE_TUNNEL_FQDN:-}" ]; then
local routing_mode="${EDGE_ROUTING_MODE:-subpath}"
if [ "$routing_mode" = "subdomain" ]; then
# Subdomain mode: Forgejo at forge.<project>.disinto.ai (root path)
if ! grep -q '^FORGEJO_ROOT_URL=' "${FACTORY_ROOT}/.env" 2>/dev/null; then
echo "FORGEJO_ROOT_URL=https://${EDGE_TUNNEL_FQDN_FORGE:-forge.${EDGE_TUNNEL_FQDN}}/" >> "${FACTORY_ROOT}/.env"
fi
# Subdomain mode: Woodpecker at ci.<project>.disinto.ai (root path)
if ! grep -q '^WOODPECKER_HOST=' "${FACTORY_ROOT}/.env" 2>/dev/null; then
echo "WOODPECKER_HOST=https://${EDGE_TUNNEL_FQDN_CI:-ci.${EDGE_TUNNEL_FQDN}}" >> "${FACTORY_ROOT}/.env"
fi
else
# Subpath mode: Forgejo ROOT_URL with /forge/ subpath (trailing slash required)
if ! grep -q '^FORGEJO_ROOT_URL=' "${FACTORY_ROOT}/.env" 2>/dev/null; then
echo "FORGEJO_ROOT_URL=https://${EDGE_TUNNEL_FQDN}/forge/" >> "${FACTORY_ROOT}/.env"
fi
# Subpath mode: Woodpecker WOODPECKER_HOST with /ci subpath (no trailing slash for v3)
if ! grep -q '^WOODPECKER_HOST=' "${FACTORY_ROOT}/.env" 2>/dev/null; then
echo "WOODPECKER_HOST=https://${EDGE_TUNNEL_FQDN}/ci" >> "${FACTORY_ROOT}/.env"
fi
fi
fi
# Prompt for FORGE_ADMIN_PASS before setup_forge
# This ensures the password is set before Forgejo user creation
@@ -1561,9 +1616,15 @@ p.write_text(text)
create_woodpecker_oauth "$forge_url" "$forge_repo"
# Create OAuth2 app on Forgejo for disinto-chat (#708)
# In subdomain mode, callback is at chat.<project> root instead of /chat/ subpath.
local chat_redirect_uri
if [ -n "${EDGE_TUNNEL_FQDN:-}" ]; then
local chat_routing_mode="${EDGE_ROUTING_MODE:-subpath}"
if [ "$chat_routing_mode" = "subdomain" ]; then
chat_redirect_uri="https://${EDGE_TUNNEL_FQDN_CHAT:-chat.${EDGE_TUNNEL_FQDN}}/oauth/callback"
else
chat_redirect_uri="https://${EDGE_TUNNEL_FQDN}/chat/oauth/callback"
fi
else
chat_redirect_uri="http://localhost/chat/oauth/callback"
fi
@@ -2763,15 +2824,29 @@ disinto_edge() {
# Write to .env (replace existing entries to avoid duplicates)
local tmp_env
tmp_env=$(mktemp)
grep -Ev "^EDGE_TUNNEL_(HOST|PORT|FQDN|FQDN_FORGE|FQDN_CI|FQDN_CHAT)=" "$env_file" > "$tmp_env" 2>/dev/null || true
mv "$tmp_env" "$env_file"
echo "EDGE_TUNNEL_HOST=${edge_host}" >> "$env_file"
echo "EDGE_TUNNEL_PORT=${port}" >> "$env_file"
echo "EDGE_TUNNEL_FQDN=${fqdn}" >> "$env_file"
# Subdomain mode: write per-service FQDNs (#1028)
local reg_routing_mode="${EDGE_ROUTING_MODE:-subpath}"
if [ "$reg_routing_mode" = "subdomain" ]; then
echo "EDGE_TUNNEL_FQDN_FORGE=forge.${fqdn}" >> "$env_file"
echo "EDGE_TUNNEL_FQDN_CI=ci.${fqdn}" >> "$env_file"
echo "EDGE_TUNNEL_FQDN_CHAT=chat.${fqdn}" >> "$env_file"
fi
echo "Registered: ${project}"
echo " Port: ${port}"
echo " FQDN: ${fqdn}"
if [ "$reg_routing_mode" = "subdomain" ]; then
echo " Mode: subdomain"
echo " Forge: forge.${fqdn}"
echo " CI: ci.${fqdn}"
echo " Chat: chat.${fqdn}"
fi
echo " Saved to: ${env_file}"
;;
@@ -2893,6 +2968,33 @@ EOF
esac
}
# ── backup command ────────────────────────────────────────────────────────────
# Usage: disinto backup <subcommand> [args]
# Subcommands:
# create <outfile.tar.gz> Create backup of factory state
# import <infile.tar.gz> Restore factory state from backup
disinto_backup() {
local subcmd="${1:-}"
shift || true
case "$subcmd" in
create)
backup_create "$@"
;;
import)
backup_import "$@"
;;
*)
echo "Usage: disinto backup <subcommand> [args]" >&2
echo "" >&2
echo "Subcommands:" >&2
echo " create <outfile.tar.gz> Create backup of factory state" >&2
echo " import <infile.tar.gz> Restore factory state from backup" >&2
exit 1
;;
esac
}
# ── Main dispatch ────────────────────────────────────────────────────────────
case "${1:-}" in
@@ -2909,6 +3011,7 @@ case "${1:-}" in
hire-an-agent) shift; disinto_hire_an_agent "$@" ;;
agent) shift; disinto_agent "$@" ;;
edge) shift; disinto_edge "$@" ;;
backup) shift; disinto_backup "$@" ;;
-h|--help) usage ;;
*) usage ;;
esac

View file

@@ -1,4 +1,4 @@
<!-- last-reviewed: 88222503d5a2ff1d25e9f1cb254ed31f13ccea7f -->
# Dev Agent
**Role**: Implement issues autonomously — write code, push branches, address
View file

@@ -22,6 +22,7 @@ OAuth flow:
The claude binary is expected to be mounted from the host at /usr/local/bin/claude.
"""
import asyncio
import datetime
import json
import os
@@ -30,8 +31,14 @@ import secrets
import subprocess
import sys
import time
import threading
from http.server import HTTPServer, BaseHTTPRequestHandler
from socketserver import ThreadingMixIn
from urllib.parse import urlparse, parse_qs, urlencode
import socket
import struct
import base64
import hashlib
# Configuration
HOST = os.environ.get("CHAT_HOST", "0.0.0.0")
@@ -45,6 +52,8 @@ FORGE_URL = os.environ.get("FORGE_URL", "http://localhost:3000")
CHAT_OAUTH_CLIENT_ID = os.environ.get("CHAT_OAUTH_CLIENT_ID", "")
CHAT_OAUTH_CLIENT_SECRET = os.environ.get("CHAT_OAUTH_CLIENT_SECRET", "")
EDGE_TUNNEL_FQDN = os.environ.get("EDGE_TUNNEL_FQDN", "")
EDGE_TUNNEL_FQDN_CHAT = os.environ.get("EDGE_TUNNEL_FQDN_CHAT", "")
EDGE_ROUTING_MODE = os.environ.get("EDGE_ROUTING_MODE", "subpath")
# Shared secret for Caddy forward_auth verify endpoint (#709).
# When set, only requests carrying this value in X-Forward-Auth-Secret are
@@ -87,6 +96,10 @@ _request_log = {}
# user -> {"tokens": int, "date": "YYYY-MM-DD"}
_daily_tokens = {}
# WebSocket message queues per user
# user -> asyncio.Queue (for streaming messages to connected clients)
_websocket_queues = {}
# MIME types for static files
MIME_TYPES = {
".html": "text/html; charset=utf-8",
@@ -99,9 +112,22 @@ MIME_TYPES = {
".ico": "image/x-icon",
}
# WebSocket subprotocol for chat streaming
WEBSOCKET_SUBPROTOCOL = "chat-stream-v1"
# WebSocket opcodes
OPCODE_CONTINUATION = 0x0
OPCODE_TEXT = 0x1
OPCODE_BINARY = 0x2
OPCODE_CLOSE = 0x8
OPCODE_PING = 0x9
OPCODE_PONG = 0xA
def _build_callback_uri():
"""Build the OAuth callback URI based on tunnel configuration."""
if EDGE_ROUTING_MODE == "subdomain" and EDGE_TUNNEL_FQDN_CHAT:
return f"https://{EDGE_TUNNEL_FQDN_CHAT}/oauth/callback"
if EDGE_TUNNEL_FQDN:
return f"https://{EDGE_TUNNEL_FQDN}/chat/oauth/callback"
return "http://localhost/chat/oauth/callback"
@@ -295,6 +321,307 @@ def _parse_stream_json(output):
return "".join(text_parts), total_tokens
# =============================================================================
# WebSocket Handler Class
# =============================================================================
class _WebSocketHandler:
"""Handle WebSocket connections for chat streaming."""
def __init__(self, reader, writer, user, message_queue):
self.reader = reader
self.writer = writer
self.user = user
self.message_queue = message_queue
self.closed = False
async def accept_connection(self, sec_websocket_key, sec_websocket_protocol=None):
"""Accept the WebSocket handshake.
The HTTP request has already been parsed by BaseHTTPRequestHandler,
so we use the provided key and protocol instead of re-reading from socket.
"""
# Validate subprotocol
if sec_websocket_protocol and sec_websocket_protocol != WEBSOCKET_SUBPROTOCOL:
self._send_http_error(
400,
"Bad Request",
f"Unsupported subprotocol. Expected: {WEBSOCKET_SUBPROTOCOL}",
)
self._close_connection()
return False
# Generate accept key
accept_key = self._generate_accept_key(sec_websocket_key)
# Send handshake response
response = (
"HTTP/1.1 101 Switching Protocols\r\n"
"Upgrade: websocket\r\n"
"Connection: Upgrade\r\n"
f"Sec-WebSocket-Accept: {accept_key}\r\n"
)
if sec_websocket_protocol:
response += f"Sec-WebSocket-Protocol: {sec_websocket_protocol}\r\n"
response += "\r\n"
self.writer.write(response.encode("utf-8"))
await self.writer.drain()
return True
def _generate_accept_key(self, sec_key):
"""Generate the Sec-WebSocket-Accept key."""
GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"
combined = sec_key + GUID
sha1 = hashlib.sha1(combined.encode("utf-8"))
return base64.b64encode(sha1.digest()).decode("utf-8")
async def _read_line(self):
"""Read a line from the socket."""
data = await self.reader.read(1)
line = ""
while data:
if data == b"\r":
data = await self.reader.read(1)
continue
if data == b"\n":
return line
line += data.decode("utf-8", errors="replace")
data = await self.reader.read(1)
return line
def _send_http_error(self, code, title, message):
"""Send an HTTP error response."""
response = (
f"HTTP/1.1 {code} {title}\r\n"
"Content-Type: text/plain; charset=utf-8\r\n"
"Content-Length: " + str(len(message)) + "\r\n"
"\r\n"
+ message
)
try:
self.writer.write(response.encode("utf-8"))
# writer.drain() is a coroutine and cannot be awaited from this sync
# helper; rely on the transport to flush the buffered write.
except Exception:
pass
def _close_connection(self):
"""Close the connection."""
try:
self.writer.close()
except Exception:
pass
async def send_text(self, data):
"""Send a text frame."""
if self.closed:
return
try:
frame = self._encode_frame(OPCODE_TEXT, data.encode("utf-8"))
self.writer.write(frame)
await self.writer.drain()
except Exception as e:
print(f"WebSocket send error: {e}", file=sys.stderr)
async def send_binary(self, data):
"""Send a binary frame."""
if self.closed:
return
try:
if isinstance(data, str):
data = data.encode("utf-8")
frame = self._encode_frame(OPCODE_BINARY, data)
self.writer.write(frame)
await self.writer.drain()
except Exception as e:
print(f"WebSocket send error: {e}", file=sys.stderr)
def _encode_frame(self, opcode, payload):
"""Encode a WebSocket frame."""
frame = bytearray()
frame.append(0x80 | opcode) # FIN + opcode
length = len(payload)
if length < 126:
frame.append(length)
elif length < 65536:
frame.append(126)
frame.extend(struct.pack(">H", length))
else:
frame.append(127)
frame.extend(struct.pack(">Q", length))
frame.extend(payload)
return bytes(frame)
async def _decode_frame(self):
"""Decode a WebSocket frame. Returns (opcode, payload)."""
try:
# Read first two bytes (use readexactly for guaranteed length)
header = await self.reader.readexactly(2)
fin = (header[0] >> 7) & 1
opcode = header[0] & 0x0F
masked = (header[1] >> 7) & 1
length = header[1] & 0x7F
# Extended payload length
if length == 126:
ext = await self.reader.readexactly(2)
length = struct.unpack(">H", ext)[0]
elif length == 127:
ext = await self.reader.readexactly(8)
length = struct.unpack(">Q", ext)[0]
# Masking key
if masked:
mask_key = await self.reader.readexactly(4)
# Payload
payload = await self.reader.readexactly(length)
# Unmask if needed
if masked:
payload = bytes(b ^ mask_key[i % 4] for i, b in enumerate(payload))
return opcode, payload
except Exception as e:
print(f"WebSocket decode error: {e}", file=sys.stderr)
return None, None
async def handle_connection(self):
"""Handle the WebSocket connection loop."""
try:
while not self.closed:
opcode, payload = await self._decode_frame()
if opcode is None:
break
if opcode == OPCODE_CLOSE:
await self._send_close()
break
elif opcode == OPCODE_PING:
await self._send_pong(payload)
elif opcode == OPCODE_PONG:
pass # Ignore pong
elif opcode in (OPCODE_TEXT, OPCODE_BINARY):
# Handle text messages from client (e.g., chat_request)
try:
msg = payload.decode("utf-8")
data = json.loads(msg)
if data.get("type") == "chat_request":
# Invoke Claude with the message
await self._handle_chat_request(data.get("message", ""))
except (json.JSONDecodeError, UnicodeDecodeError):
pass
# Check if we should stop waiting for messages
if self.closed:
break
except Exception as e:
print(f"WebSocket connection error: {e}", file=sys.stderr)
finally:
self._close_connection()
# Clean up the message queue on disconnect
if self.user in _websocket_queues:
del _websocket_queues[self.user]
async def _send_close(self):
"""Send a close frame."""
try:
# Close code 1000 = normal closure
frame = self._encode_frame(OPCODE_CLOSE, struct.pack(">H", 1000))
self.writer.write(frame)
await self.writer.drain()
except Exception:
pass
async def _send_pong(self, payload):
"""Send a pong frame."""
try:
frame = self._encode_frame(OPCODE_PONG, payload)
self.writer.write(frame)
await self.writer.drain()
except Exception:
pass
async def _handle_chat_request(self, message):
"""Handle a chat_request WebSocket frame by invoking Claude."""
if not message:
return
# Validate Claude binary exists
if not os.path.exists(CLAUDE_BIN):
await self.send_text(json.dumps({
"type": "error",
"message": "Claude CLI not found",
}))
return
try:
# Spawn claude --print with stream-json for streaming output
proc = subprocess.Popen(
[CLAUDE_BIN, "--print", "--output-format", "stream-json", message],
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
text=True,
bufsize=1,
)
# Stream output line by line
for line in iter(proc.stdout.readline, ""):
line = line.strip()
if not line:
continue
try:
event = json.loads(line)
etype = event.get("type", "")
# Extract text content from content_block_delta events
if etype == "content_block_delta":
delta = event.get("delta", {})
if delta.get("type") == "text_delta":
text = delta.get("text", "")
if text:
# Send tokens to client
await self.send_text(text)
# Check for usage event to know when complete
if etype == "result":
pass # Will send complete after loop
except json.JSONDecodeError:
pass
# Wait for process to complete
proc.wait()
if proc.returncode != 0:
await self.send_text(json.dumps({
"type": "error",
"message": f"Claude CLI failed with exit code {proc.returncode}",
}))
return
# Send complete signal
await self.send_text(json.dumps({
"type": "complete",
}))
except FileNotFoundError:
await self.send_text(json.dumps({
"type": "error",
"message": "Claude CLI not found",
}))
except Exception as e:
await self.send_text(json.dumps({
"type": "error",
"message": str(e),
}))
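The handshake and framing logic in the class above can be checked against RFC 6455's worked examples: the spec's sample `Sec-WebSocket-Key` must produce a known accept value, and an unmasked text frame carrying "Hello" must encode to a known byte sequence. A standalone sketch mirroring `_generate_accept_key` and `_encode_frame` (the sample key, accept value, and frame bytes are taken from RFC 6455 itself):

```python
import base64
import hashlib
import struct

GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

def accept_key(sec_key: str) -> str:
    # SHA-1 of key+GUID, base64-encoded, per RFC 6455 section 4.2.2
    return base64.b64encode(hashlib.sha1((sec_key + GUID).encode()).digest()).decode()

def encode_text_frame(payload: bytes) -> bytes:
    # FIN=1, opcode=0x1 (text), unmasked server-to-client frame
    frame = bytearray([0x80 | 0x1])
    n = len(payload)
    if n < 126:
        frame.append(n)
    elif n < 65536:
        frame.append(126)
        frame.extend(struct.pack(">H", n))
    else:
        frame.append(127)
        frame.extend(struct.pack(">Q", n))
    frame.extend(payload)
    return bytes(frame)

# RFC 6455's sample nonce and its expected accept value
print(accept_key("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
print(encode_text_frame(b"Hello").hex())  # 810548656c6c6f
```

If either value drifts after an edit to the framing code, the browser will silently refuse the upgrade or drop the frame, so these two constants make a cheap regression check.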
# =============================================================================
# Conversation History Functions (#710)
# =============================================================================
@@ -544,9 +871,9 @@ class ChatHandler(BaseHTTPRequestHandler):
self.serve_static(path)
return
# WebSocket upgrade endpoint
if path == "/chat/ws" or path == "/ws" or path.startswith("/ws"):
self.handle_websocket_upgrade()
return
# 404 for unknown paths
@@ -755,6 +1082,7 @@
"""
Handle chat requests by spawning `claude --print` with the user message.
Enforces per-user rate limits and tracks token usage (#711).
Streams tokens over WebSocket if connected.
"""
# Check rate limits before processing (#711)
@@ -812,10 +1140,47 @@ class ChatHandler(BaseHTTPRequestHandler):
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
text=True,
bufsize=1, # Line buffered
)
# Stream output line by line
response_parts = []
total_tokens = 0
for line in iter(proc.stdout.readline, ""):
line = line.strip()
if not line:
continue
try:
event = json.loads(line)
etype = event.get("type", "")
# Extract text content from content_block_delta events
if etype == "content_block_delta":
delta = event.get("delta", {})
if delta.get("type") == "text_delta":
text = delta.get("text", "")
if text:
response_parts.append(text)
# Stream to WebSocket if connected
if user in _websocket_queues:
try:
_websocket_queues[user].put_nowait(text)
except Exception:
pass # Client disconnected
# Parse usage from result event
if etype == "result":
usage = event.get("usage", {})
total_tokens = usage.get("input_tokens", 0) + usage.get("output_tokens", 0)
elif "usage" in event:
usage = event["usage"]
if isinstance(usage, dict):
total_tokens = usage.get("input_tokens", 0) + usage.get("output_tokens", 0)
except json.JSONDecodeError:
pass
# Wait for process to complete
error_output = proc.stderr.read()
if error_output:
print(f"Claude stderr: {error_output}", file=sys.stderr)
@@ -826,8 +1191,8 @@ class ChatHandler(BaseHTTPRequestHandler):
self.send_error_page(500, f"Claude CLI failed with exit code {proc.returncode}")
return
# Combine response parts
response = "".join(response_parts)
# Track token usage - does not block *this* request (#711)
if total_tokens > 0:
@@ -839,7 +1204,7 @@ class ChatHandler(BaseHTTPRequestHandler):
# Raw output was consumed during streaming; there is nothing to fall back to
if not response:
response = ""
# Save assistant response to history
_write_message(user, conv_id, "assistant", response)
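The per-line parsing above assumes `claude --print --output-format stream-json` emits `content_block_delta` events carrying `text_delta` payloads; that event shape is mirrored from the handler, not taken from documented CLI output. The extraction logic can be exercised standalone:

```python
import json

def extract_text(line: str) -> str:
    # Pull streamed text out of one stream-json line; empty string for
    # non-text events and unparseable lines (matching the handler's behavior)
    try:
        event = json.loads(line)
    except json.JSONDecodeError:
        return ""
    if event.get("type") == "content_block_delta":
        delta = event.get("delta", {})
        if delta.get("type") == "text_delta":
            return delta.get("text", "")
    return ""

lines = [
    '{"type": "content_block_delta", "delta": {"type": "text_delta", "text": "Hel"}}',
    '{"type": "content_block_delta", "delta": {"type": "text_delta", "text": "lo"}}',
    '{"type": "result", "usage": {"input_tokens": 10, "output_tokens": 2}}',
]
print("".join(extract_text(l) for l in lines))  # Hello
```

Keeping this function pure makes it easy to unit-test the parser separately from the subprocess and WebSocket plumbing.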
@@ -909,6 +1274,118 @@ class ChatHandler(BaseHTTPRequestHandler):
self.end_headers()
self.wfile.write(json.dumps({"conversation_id": conv_id}, ensure_ascii=False).encode("utf-8"))
@staticmethod
def push_to_websocket(user, message):
"""Push a message to a WebSocket connection for a user.
This is called from the chat handler to stream tokens to connected clients.
The message is added to the user's WebSocket message queue.
"""
# Get the message queue from the WebSocket handler's queue
# We store the queue in a global dict keyed by user
if user in _websocket_queues:
_websocket_queues[user].put_nowait(message)
def handle_websocket_upgrade(self):
"""Handle WebSocket upgrade request for chat streaming."""
# Check session cookie
user = _validate_session(self.headers.get("Cookie"))
if not user:
self.send_error_page(401, "Unauthorized: no valid session")
return
# Check rate limits before allowing WebSocket connection
allowed, retry_after, reason = _check_rate_limit(user)
if not allowed:
self.send_error_page(
429,
f"Rate limit exceeded: {reason}. Retry after {retry_after}s",
)
return
# Record request for rate limiting
_record_request(user)
# Create message queue for this user
_websocket_queues[user] = asyncio.Queue()
# Get WebSocket upgrade headers from the HTTP request
sec_websocket_key = self.headers.get("Sec-WebSocket-Key", "")
sec_websocket_protocol = self.headers.get("Sec-WebSocket-Protocol", "")
# Validate Sec-WebSocket-Key
if not sec_websocket_key:
self.send_error_page(400, "Bad Request: missing Sec-WebSocket-Key")
return
# Get the socket from the connection
sock = self.connection
sock.setblocking(False)
# Create async server to handle the connection
async def handle_ws():
try:
# Wrap the socket in asyncio streams using open_connection
reader, writer = await asyncio.open_connection(sock=sock)
# Create WebSocket handler
ws_handler = _WebSocketHandler(reader, writer, user, _websocket_queues[user])
# Accept the connection (pass headers from HTTP request)
if not await ws_handler.accept_connection(sec_websocket_key, sec_websocket_protocol):
return
# Start a task to read from the queue and send to client
async def send_stream():
while not ws_handler.closed:
try:
data = await asyncio.wait_for(ws_handler.message_queue.get(), timeout=1.0)
await ws_handler.send_text(data)
except asyncio.TimeoutError:
# Send ping to keep connection alive
try:
frame = ws_handler._encode_frame(OPCODE_PING, b"")
writer.write(frame)
await writer.drain()
except Exception:
break
except Exception as e:
print(f"Send stream error: {e}", file=sys.stderr)
break
# Start sending task
send_task = asyncio.create_task(send_stream())
# Handle incoming WebSocket frames
await ws_handler.handle_connection()
# Cancel send task
send_task.cancel()
try:
await send_task
except asyncio.CancelledError:
pass
except Exception as e:
print(f"WebSocket handler error: {e}", file=sys.stderr)
finally:
try:
writer.close()
await writer.wait_closed()
except Exception:
pass
# Run the async handler in a thread
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
try:
loop.run_until_complete(handle_ws())
except Exception as e:
print(f"WebSocket error: {e}", file=sys.stderr)
finally:
loop.close()
sock.close()
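Client-to-server frames arrive masked; the XOR unmasking in `_decode_frame` can be sanity-checked against RFC 6455's worked example (a standalone sketch; only the mask key, payload, and expected bytes come from the spec):

```python
import os

def mask_payload(payload: bytes, mask_key: bytes) -> bytes:
    # XOR each payload byte with the repeating 4-byte mask (RFC 6455 section 5.3);
    # the same operation both masks and unmasks.
    return bytes(b ^ mask_key[i % 4] for i, b in enumerate(payload))

# RFC 6455's worked example: "Hello" masked with key 37 fa 21 3d
masked = mask_payload(b"Hello", bytes([0x37, 0xFA, 0x21, 0x3D]))
print(masked.hex())  # 7f9f4d5158

# Masking is an involution: any key round-trips the payload
key = os.urandom(4)
assert mask_payload(mask_payload(b"chat_request", key), key) == b"chat_request"
```

Because masking is its own inverse, the server-side decoder needs no separate unmask routine; it reuses the same XOR loop on the received payload.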
def do_DELETE(self):
"""Handle DELETE requests."""
parsed = urlparse(self.path)

View file

@@ -430,6 +430,10 @@
return div.innerHTML.replace(/\n/g, '<br>');
}
// WebSocket connection for streaming
let ws = null;
let wsMessageId = null;
let wsFailed = false; // once a WebSocket attempt fails, retries go straight to fetch
// Send message handler
async function sendMessage() {
const message = textarea.value.trim();
@@ -449,6 +453,14 @@
await createNewConversation();
}
// Try WebSocket streaming first, fall back to fetch
if (!wsFailed && (window.location.protocol === 'https:' || window.location.hostname === 'localhost')) {
if (tryWebSocketSend(message)) {
return;
}
}
// Fallback to fetch
try {
// Use fetch with URLSearchParams for application/x-www-form-urlencoded
const params = new URLSearchParams();
@@ -485,6 +497,111 @@
}
}
// Try to send message via WebSocket streaming
function tryWebSocketSend(message) {
try {
// Generate a unique message ID for this request
wsMessageId = Date.now().toString(36) + Math.random().toString(36).slice(2);
// Connect to WebSocket
const wsUrl = window.location.protocol === 'https:'
? `wss://${window.location.host}/chat/ws`
: `ws://${window.location.host}/chat/ws`;
ws = new WebSocket(wsUrl);
ws.onopen = function() {
// Send the message as JSON with message ID
const data = {
type: 'chat_request',
message_id: wsMessageId,
message: message,
conversation_id: currentConversationId
};
ws.send(JSON.stringify(data));
};
ws.onmessage = function(event) {
try {
const data = JSON.parse(event.data);
if (data.type === 'token') {
// Stream a token to the UI
addTokenToLastMessage(data.token);
} else if (data.type === 'complete') {
// Streaming complete
closeWebSocket();
textarea.disabled = false;
sendBtn.disabled = false;
sendBtn.textContent = 'Send';
textarea.focus();
messagesDiv.scrollTop = messagesDiv.scrollHeight;
loadConversations();
} else if (data.type === 'error') {
addSystemMessage(`Error: ${data.message}`);
closeWebSocket();
textarea.disabled = false;
sendBtn.disabled = false;
sendBtn.textContent = 'Send';
textarea.focus();
}
} catch (e) {
console.error('Failed to parse WebSocket message:', e);
}
};
ws.onerror = function(error) {
console.error('WebSocket error:', error);
addSystemMessage('WebSocket connection error. Falling back to regular chat.');
closeWebSocket();
wsFailed = true; // prevent the retry from reopening the WebSocket in a loop
sendMessage(); // Retry with fetch
};
ws.onclose = function() {
wsMessageId = null;
};
return true; // WebSocket attempt started
} catch (error) {
console.error('Failed to create WebSocket:', error);
return false; // Fall back to fetch
}
}
// Add a token to the last assistant message (for streaming)
function addTokenToLastMessage(token) {
const messages = messagesDiv.querySelectorAll('.message.assistant');
if (messages.length === 0) {
// No assistant message yet, create one
const msgDiv = document.createElement('div');
msgDiv.className = 'message assistant';
msgDiv.innerHTML = `
<div class="role">assistant</div>
<div class="content streaming"></div>
`;
messagesDiv.appendChild(msgDiv);
}
const lastMsg = messagesDiv.querySelector('.message.assistant .content.streaming');
if (lastMsg) {
lastMsg.textContent += token;
messagesDiv.scrollTop = messagesDiv.scrollHeight;
}
}
// Close WebSocket connection
function closeWebSocket() {
if (ws) {
ws.onopen = null;
ws.onmessage = null;
ws.onerror = null;
ws.onclose = null;
ws.close();
ws = null;
}
}
// Event listeners
sendBtn.addEventListener('click', sendMessage);


@@ -0,0 +1,183 @@
# Nomad Cutover Runbook
End-to-end procedure to cut over the disinto factory from docker-compose on
disinto-dev-box to Nomad on disinto-nomad-box.
**Target**: disinto-nomad-box (10.10.10.216) becomes production; disinto-dev-box
stays warm for rollback.
**Downtime budget**: <5 min blue-green flip.
**Data scope**: Forgejo issues + disinto-ops git bundle only. Everything else is
regenerated or discarded. OAuth secrets are regenerated on fresh init (all
sessions invalidated).
---
## 1. Pre-cutover readiness checklist
- [ ] Nomad + Vault stack healthy on a fresh wipe+init (step 5 verified)
- [ ] Codeberg mirror current — `git log` parity between dev-box Forgejo and
Codeberg
- [ ] SSH key pair generated for nomad-box, registered on DO edge (see §4.6)
- [ ] Companion tools landed:
- `disinto backup create` (#1057)
- `disinto backup import` (#1058)
- [ ] Backup tarball produced and tested against a scratch LXC (see §3)
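The mirror-parity item in the checklist can be scripted; a minimal sketch, assuming `DEV_REMOTE` and `MIRROR_REMOTE` are set to the two clone URLs (the variable names and `main` branch are placeholders, not part of the factory tooling):

```shell
#!/bin/sh
# Compare branch heads between dev-box Forgejo and the Codeberg mirror.
heads_match() {
    # 0 when both SHAs are non-empty and equal, 1 otherwise
    [ -n "$1" ] && [ "$1" = "$2" ]
}

if [ -n "${DEV_REMOTE:-}" ] && [ -n "${MIRROR_REMOTE:-}" ]; then
    dev_head=$(git ls-remote "$DEV_REMOTE" main | cut -f1)
    mirror_head=$(git ls-remote "$MIRROR_REMOTE" main | cut -f1)
    if heads_match "$dev_head" "$mirror_head"; then
        echo "mirror current: $dev_head"
    else
        echo "MIRROR BEHIND: dev=$dev_head codeberg=$mirror_head" >&2
        exit 1
    fi
fi
```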
---
## 2. Pre-cutover artifact: backup
On disinto-dev-box:
```bash
./bin/disinto backup create /tmp/disinto-backup-$(date +%Y%m%d).tar.gz
```
Copy the tarball to nomad-box (and optionally to a local workstation for
safekeeping):
```bash
scp /tmp/disinto-backup-*.tar.gz nomad-box:/tmp/
```
---
## 3. Pre-cutover dry-run
On a throwaway LXC:
```bash
lxc launch ubuntu:24.04 cutover-dryrun
# inside the container:
disinto init --backend=nomad --import-env .env --with edge
./bin/disinto backup import /tmp/disinto-backup-*.tar.gz
```
Verify:
- Issue count matches source Forgejo
- disinto-ops repo refs match source bundle
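The refs check can be scripted as well; a sketch under the assumption that the backup unpacks an ops git bundle (`OPS_BUNDLE` and `OPS_REPO` are illustrative variables, not the actual backup layout):

```shell
#!/bin/sh
# Compare heads between the source bundle and the imported ops repo.
refs_match() {
    # 0 when the two "sha ref" listings are identical ignoring order
    [ "$(printf '%s\n' "$1" | sort)" = "$(printf '%s\n' "$2" | sort)" ]
}

if [ -n "${OPS_BUNDLE:-}" ] && [ -n "${OPS_REPO:-}" ]; then
    src=$(git bundle list-heads "$OPS_BUNDLE")      # "sha ref" per line
    dst=$(git -C "$OPS_REPO" show-ref --heads)      # same shape
    refs_match "$src" "$dst" || { echo "ref mismatch" >&2; exit 1; }
    echo "ops refs match"
fi
```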
Destroy the LXC once satisfied:
```bash
lxc delete cutover-dryrun --force
```
---
## 4. Cutover T-0 (operator executes; <5 min target)
### 4.1 Stop dev-box services
```bash
# On disinto-dev-box — stop, do NOT remove volumes (rollback needs them)
docker-compose stop
```
### 4.2 Provision nomad-box (if not already done)
```bash
# On disinto-nomad-box
disinto init --backend=nomad --import-env .env --with edge
```
### 4.3 Import backup
```bash
# On disinto-nomad-box
./bin/disinto backup import /tmp/disinto-backup-*.tar.gz
```
### 4.4 Configure Codeberg pull mirror
Manual, one-time step in the new Forgejo UI:
1. Create a mirror repository pointing at the Codeberg upstream
2. Confirm initial sync completes
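If this step is ever automated, the Gitea/Forgejo migrate endpoint accepts a mirror flag; a hedged sketch (owner/repo names, interval, and the local API URL are assumptions; verify the field names against your Forgejo version before relying on this):

```shell
#!/bin/sh
# Build the JSON body for POST /api/v1/repos/migrate (pull mirror).
migrate_payload() {
    # $1 = clone_addr, $2 = repo_owner, $3 = repo_name
    printf '{"clone_addr":"%s","repo_owner":"%s","repo_name":"%s","mirror":true,"mirror_interval":"10m"}' \
        "$1" "$2" "$3"
}

if [ -n "${FORGE_TOKEN:-}" ]; then
    curl -sf -X POST "http://localhost:3000/api/v1/repos/migrate" \
        -H "Authorization: token $FORGE_TOKEN" \
        -H "Content-Type: application/json" \
        -d "$(migrate_payload https://codeberg.org/disinto/disinto.git disinto-admin disinto)"
fi
```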
### 4.5 Claude login
```bash
# On disinto-nomad-box
claude login
```
Set up Anthropic OAuth so agents can authenticate.
### 4.6 Autossh tunnel swap
> **Operator step** — cross-host, no dev-agent involvement. Do NOT automate.
1. Stop the tunnel on dev-box:
```bash
# On disinto-dev-box
systemctl stop reverse-tunnel
```
2. Copy or regenerate the tunnel unit on nomad-box:
```bash
# Copy from dev-box, or let init regenerate it
scp dev-box:/etc/systemd/system/reverse-tunnel.service \
nomad-box:/etc/systemd/system/
```
3. Register nomad-box's public key on DO edge:
```bash
# On DO edge box — same restricted-command as the dev-box key
echo "<nomad-box-pubkey>" >> /home/johba/.ssh/authorized_keys
```
4. Start the tunnel on nomad-box:
```bash
# On disinto-nomad-box
systemctl enable --now reverse-tunnel
```
5. Verify end-to-end:
```bash
curl https://self.disinto.ai/api/v1/version
# Should return the new box's Forgejo version
```
---
## 5. Post-cutover smoke
- [ ] `curl https://self.disinto.ai` → Forgejo welcome page
- [ ] Create a test PR → Woodpecker pipeline runs → agents assign and work
- [ ] Claude chat login via Forgejo OAuth succeeds
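The first two checks lend themselves to a tiny probe script; a sketch (the expected body markers are assumptions about Forgejo's responses, and `RUN_SMOKE=1` is only a guard for running it deliberately):

```shell
#!/bin/sh
# Probe a URL and check the body for an expected marker string.
smoke() {
    url=$1 expect=$2
    body=$(curl -sf --max-time 10 "$url") || { echo "FAIL $url (unreachable)" >&2; return 1; }
    case $body in
        *"$expect"*) echo "OK   $url" ;;
        *) echo "FAIL $url (no '$expect' in body)" >&2; return 1 ;;
    esac
}

if [ "${RUN_SMOKE:-0}" = 1 ]; then
    smoke https://self.disinto.ai/api/v1/version version   # Forgejo API reachable
    smoke https://self.disinto.ai Forgejo                  # welcome page served
fi
```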
---
## 6. Rollback (if any step 4 gate fails)
1. Stop the tunnel on nomad-box:
```bash
systemctl stop reverse-tunnel # on nomad-box
```
2. Restore the tunnel on dev-box:
```bash
systemctl start reverse-tunnel # on dev-box
```
3. Bring dev-box services back up:
```bash
docker-compose up -d # on dev-box
```
4. DO Caddy config is unchanged — traffic restores in <5 min.
5. File a post-mortem issue. Keep nomad-box state intact for debugging.
---
## 7. Post-stable cleanup (T+1 week)
- `docker-compose down -v` on dev-box
- Archive `/var/lib/docker/volumes/disinto_*` to cold storage
- Delete disinto-dev-box LXC or keep as permanent rollback reserve (operator
decision)
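The volume-archive step can be sketched as follows (the cold-storage mount point is an assumption; the volume path is the one named above):

```shell
#!/bin/sh
# Tar one Docker volume directory into a dated archive under a destination dir.
archive_volume() {
    src=$1 dest_dir=$2
    name=$(basename "$src")
    tar czf "$dest_dir/$name-$(date +%Y%m%d).tar.gz" -C "$(dirname "$src")" "$name"
}

for v in /var/lib/docker/volumes/disinto_*; do
    if [ -d "$v" ]; then
        archive_volume "$v" /mnt/cold-storage
    fi
done
```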


@@ -1,4 +1,4 @@
<!-- last-reviewed: 88222503d5a2ff1d25e9f1cb254ed31f13ccea7f -->
# Gardener Agent
**Role**: Backlog grooming — detect duplicate issues, missing acceptance


@@ -1,47 +1 @@
[
{
"action": "add_label",
"issue": 1047,
"label": "backlog"
},
{
"action": "add_label",
"issue": 1047,
"label": "priority"
},
{
"action": "add_label",
"issue": 1044,
"label": "backlog"
},
{
"action": "remove_label",
"issue": 1025,
"label": "blocked"
},
{
"action": "add_label",
"issue": 1025,
"label": "backlog"
},
{
"action": "comment",
"issue": 1025,
"body": "Gardener: removing `blocked` — fix path is well-defined (Option 1: static-checks-only pipeline). Promoting to backlog for next dev pick-up. Dev must follow the acceptance criteria literally — no live service curls, static checks only."
},
{
"action": "remove_label",
"issue": 850,
"label": "blocked"
},
{
"action": "add_label",
"issue": 850,
"label": "backlog"
},
{
"action": "comment",
"issue": 850,
"body": "Gardener: removing `blocked` — 5th attempt recipe is at the top of this issue. Dev must follow the recipe exactly (call `_generate_compose_impl` directly in isolated FACTORY_ROOT, do NOT use `bin/disinto init`). Do not copy patterns from prior PRs."
}
]


@@ -1,4 +1,4 @@
<!-- last-reviewed: 88222503d5a2ff1d25e9f1cb254ed31f13ccea7f -->
# Shared Helpers (`lib/`)
All agents source `lib/env.sh` as their first action. Additional helpers are
@@ -7,7 +7,7 @@ sourced as needed.
| File | What it provides | Sourced by |
|---|---|---|
| `lib/env.sh` | Loads `.env`, sets `FACTORY_ROOT`, exports project config (`FORGE_REPO`, `PROJECT_NAME`, etc.), defines `log()`, `forge_api()`, `forge_api_all()` (paginates all pages; accepts optional second TOKEN parameter, defaults to `$FORGE_TOKEN`; handles invalid/empty JSON responses gracefully — returns empty on parse error instead of crashing), `woodpecker_api()`, `wpdb()`, `memory_guard()` (skips agent if RAM < threshold), `load_secret()` (secret-source abstraction; see below). Auto-loads project TOML if `PROJECT_TOML` is set. Exports per-agent tokens (`FORGE_PLANNER_TOKEN`, `FORGE_GARDENER_TOKEN`, `FORGE_VAULT_TOKEN`, `FORGE_SUPERVISOR_TOKEN`, `FORGE_PREDICTOR_TOKEN`); each falls back to `$FORGE_TOKEN` if not set. **Vault-only token guard (AD-006)**: `unset GITHUB_TOKEN CLAWHUB_TOKEN` so agents never hold external-action tokens; only the runner container receives them. **Container note**: when `DISINTO_CONTAINER=1`, `.env` is NOT re-sourced; compose already injects env vars (including `FORGE_URL=http://forgejo:3000`) and re-sourcing would clobber them. **Save/restore scope (#364)**: only `FORGE_URL` is preserved across `.env` re-sourcing (compose injects `http://forgejo:3000`, `.env` has `http://localhost:3000`). `FORGE_TOKEN` is NOT preserved so refreshed tokens in `.env` take effect immediately. **Per-agent token override (#762)**: agent run scripts export `FORGE_TOKEN_OVERRIDE=<agent-specific-token>` BEFORE sourcing `env.sh`; `env.sh` applies this override at lines 98-100, ensuring the correct identity survives any re-sourcing of `env.sh` by nested shells or `claude -p` invocations. **Required env var**: `FORGE_PASS` bot password for git HTTP push (Forgejo 11.x rejects API tokens for `git push`, #361). **Hard preconditions (#674)**: `USER` and `HOME` must be exported by the entrypoint before sourcing. When `PROJECT_TOML` is set, `PROJECT_REPO_ROOT`, `PRIMARY_BRANCH`, and `OPS_REPO_ROOT` must also be set (by entrypoint or TOML). **`load_secret NAME [DEFAULT]` (#793)**: backend-agnostic secret resolution. Precedence: (1) `/secrets/<NAME>.env` Nomad-rendered template, (2) current environment already set by `.env.enc` / compose, (3) `secrets/<NAME>.enc` age-encrypted per-key file (decrypted on demand, cached in process env), (4) DEFAULT or empty. Consumers call `$(load_secret GITHUB_TOKEN)` instead of `${GITHUB_TOKEN}`; identical behavior whether secrets come from Docker compose injection or Nomad Vault templates. | Every agent |
| `lib/ci-helpers.sh` | `ci_passed()` — returns 0 if CI state is "success" (or no CI configured). `ci_required_for_pr()` — returns 0 if PR has code files (CI required), 1 if non-code only (CI not required). `is_infra_step()` — returns 0 if a single CI step failure matches infra heuristics (clone/git exit 128, any exit 137, log timeout patterns). `classify_pipeline_failure()` — returns "infra \<reason>" if any failed Woodpecker step matches infra heuristics via `is_infra_step()`, else "code". `ensure_priority_label()` — looks up (or creates) the `priority` label and returns its ID; caches in `_PRIORITY_LABEL_ID`. `ci_commit_status <sha>` — queries Woodpecker directly for CI state, falls back to forge commit status API. `ci_pipeline_number <sha>` — returns the Woodpecker pipeline number for a commit, falls back to parsing forge status `target_url`. `ci_promote <repo_id> <pipeline_num> <environment>` — promotes a pipeline to a named Woodpecker environment (vault-gated deployment: vault approves, vault-fire calls this — vault redesign in progress, see #73-#77). `ci_get_logs <pipeline_number> [--step <name>]` — reads CI logs from Woodpecker SQLite database via `lib/ci-log-reader.py`; outputs last 200 lines to stdout. Requires mounted woodpecker-data volume at /woodpecker-data. `ci_get_step_logs <pipeline_num> <step_id>` — fetches per-step logs via Woodpecker REST API (`/repos/{id}/logs/{pipeline}/{step_id}`); returns raw log data for a single step. Used by `pr_poll_ci()` to build per-workflow/per-step CI diagnostics (#1051). | dev-poll, review-poll, review-pr |
| `lib/ci-debug.sh` | CLI tool for Woodpecker CI: `list`, `status`, `logs`, `failures` subcommands. Not sourced — run directly. | Humans / dev-agent (tool access) |
| `lib/ci-log-reader.py` | Python tool: reads CI logs from Woodpecker SQLite database. `<pipeline_number> [--step <name>]` — returns last 200 lines from failed steps (or specified step). Used by `ci_get_logs()` in ci-helpers.sh. Requires `WOODPECKER_DATA_DIR` (default: /woodpecker-data). | ci-helpers.sh |
| `lib/load-project.sh` | Parses a `projects/*.toml` file into env vars (`PROJECT_NAME`, `FORGE_REPO`, `WOODPECKER_REPO_ID`, monitoring toggles, mirror config, etc.). Also exports `FORGE_REPO_OWNER` (the owner component of `FORGE_REPO`, e.g. `disinto-admin` from `disinto-admin/disinto`). Reads `repo_root` and `ops_repo_root` from the TOML for host-CLI callers. **Container path handling (#674)**: no longer derives `PROJECT_REPO_ROOT` or `OPS_REPO_ROOT` inside the script — container entrypoints export the correct paths before agent scripts source `env.sh`, and the `DISINTO_CONTAINER` guard (line 90) skips TOML overrides when those vars are already set. | env.sh (when `PROJECT_TOML` is set) |
@@ -20,7 +20,7 @@ sourced as needed.
| `lib/stack-lock.sh` | File-based lock protocol for singleton project stack access. `stack_lock_acquire(holder, project)` — polls until free, breaks stale heartbeats (>10 min old), claims lock. `stack_lock_release(project)` — deletes lock file. `stack_lock_check(project)` — inspect current lock state. `stack_lock_heartbeat(project)` — update heartbeat timestamp (callers must call every 2 min while holding). Lock files at `~/data/locks/<project>-stack.lock`. | docker/edge/dispatcher.sh, reproduce formula |
| `lib/tea-helpers.sh` | `tea_file_issue(title, body, labels...)` — create issue via tea CLI with secret scanning; sets `FILED_ISSUE_NUM`. `tea_relabel(issue_num, labels...)` — replace labels using tea's `edit` subcommand (not `label`). `tea_comment(issue_num, body)` — add comment with secret scanning. `tea_close(issue_num)` — close issue. All use `TEA_LOGIN` and `FORGE_REPO` from env.sh. Labels by name (no ID lookup). Tea binary download verified via sha256 checksum. Sourced by env.sh when `tea` binary is available. | env.sh (conditional) |
| `lib/worktree.sh` | Reusable git worktree management: `worktree_create(path, branch, [base_ref])` — create worktree, checkout base, fetch submodules. `worktree_recover(path, branch, [remote])` — detect existing worktree, reuse if on correct branch (sets `_WORKTREE_REUSED`), otherwise clean and recreate. `worktree_cleanup(path)` — `git worktree remove --force`, clear Claude Code project cache (`~/.claude/projects/` matching path). `worktree_cleanup_stale([max_age_hours])` — scan `/tmp` for orphaned worktrees older than threshold, skip preserved and active tmux worktrees, prune. `worktree_preserve(path, reason)` — mark worktree as preserved for debugging (writes `.worktree-preserved` marker, skipped by stale cleanup). | dev-agent.sh, supervisor-run.sh, planner-run.sh, predictor-run.sh, gardener-run.sh |
| `lib/pr-lifecycle.sh` | Reusable PR lifecycle library: `pr_create()`, `pr_find_by_branch()`, `pr_poll_ci()`, `pr_poll_review()`, `pr_merge()`, `pr_is_merged()`, `pr_walk_to_merge()`, `build_phase_protocol_prompt()`. `pr_poll_ci()` builds a **per-workflow/per-step CI diagnostics prompt** (#1051): on failure, each failed workflow gets its own section with step name, exit code (annotated with standard meanings for 126/127/128), and step-local log tail (via `ci_get_step_logs`); passing workflows are listed explicitly so agents don't waste fix attempts on them. Falls back to legacy combined-log fetch if per-step API is unavailable. Requires `lib/ci-helpers.sh`. | dev-agent.sh (future) |
| `lib/issue-lifecycle.sh` | Reusable issue lifecycle library: `issue_claim()` (add in-progress, remove backlog), `issue_release()` (remove in-progress, add backlog), `issue_block()` (post diagnostic comment with secret redaction, add blocked label), `issue_close()`, `issue_check_deps()` (parse deps, check transitive closure; sets `_ISSUE_BLOCKED_BY`, `_ISSUE_SUGGESTION`), `issue_suggest_next()` (find next unblocked backlog issue; sets `_ISSUE_NEXT`), `issue_post_refusal()` (structured refusal comment with dedup). Label IDs cached in globals on first lookup. Sources `lib/secret-scan.sh`. | dev-agent.sh (future) |
| `lib/action-vault.sh` | **Vault PR helper** — create vault action PRs on ops repo via Forgejo API (works from containers without SSH). `vault_request <action_id> <toml_content>` validates TOML (using `validate_vault_action` from `action-vault/vault-env.sh`), creates branch `vault/<action-id>`, writes `vault/actions/<action-id>.toml`, creates PR targeting `main` with title `vault: <action-id>` and body from context field, returns PR number. Idempotent: if PR exists, returns existing number. **Low-tier bypass**: if the action's `blast_radius` classifies as `low` (via `action-vault/classify.sh`), `vault_request` calls `_vault_commit_direct()` which commits directly to ops `main` using `FORGE_ADMIN_TOKEN` — no PR, no approval wait. Returns `0` (not a PR number) for direct commits. Requires `FORGE_TOKEN`, `FORGE_ADMIN_TOKEN` (low-tier only), `FORGE_URL`, `FORGE_REPO`, `FORGE_OPS_REPO`. Uses the calling agent's own token (saves/restores `FORGE_TOKEN` around sourcing `vault-env.sh`), so approval workflow respects individual agent identities. | dev-agent (vault actions), future vault dispatcher |
| `lib/branch-protection.sh` | Branch protection helpers for Forgejo repos. `setup_vault_branch_protection()` — configures admin-only merge protection on main (require 1 approval, restrict merge to admin role, block direct pushes). `setup_profile_branch_protection()` — same protection for `.profile` repos. `verify_branch_protection()` — checks protection is correctly configured. `remove_branch_protection()` — removes protection (cleanup/testing). Handles race condition after initial push: retries with backoff if Forgejo hasn't processed the branch yet. Requires `FORGE_TOKEN`, `FORGE_URL`, `FORGE_OPS_REPO`. | bin/disinto (hire-an-agent) |
@@ -30,7 +30,9 @@ sourced as needed.
| `lib/git-creds.sh` | Shared git credential helper configuration. `configure_git_creds([HOME_DIR] [RUN_AS_CMD])` — writes a static credential helper script and configures git globally to use password-based HTTP auth (Forgejo 11.x rejects API tokens for `git push`, #361). **Retry on cold boot (#741)**: resolves bot username from `FORGE_TOKEN` with 5 retries (exponential backoff 1-5s); fails loudly and returns 1 if Forgejo is unreachable — never falls back to a wrong hardcoded default (exports `BOT_USER` on success). `repair_baked_cred_urls([--as RUN_AS_CMD] DIR ...)` — rewrites any git remote URLs that have credentials baked in to use clean URLs instead; uses `safe.directory` bypass for root-owned repos (#671). Requires `FORGE_PASS`, `FORGE_URL`, `FORGE_TOKEN`. | entrypoints (agents, edge) |
| `lib/ops-setup.sh` | `setup_ops_repo()` — creates ops repo on Forgejo if it doesn't exist, configures bot collaborators, clones/initializes ops repo locally, seeds directory structure (vault, knowledge, evidence, sprints). Evidence subdirectories seeded: engagement/, red-team/, holdout/, evolution/, user-test/. Also seeds sprints/ for architect output. Exports `_ACTUAL_OPS_SLUG`. `migrate_ops_repo(ops_root, [primary_branch])` — idempotent migration helper that seeds missing directories and .gitkeep files on existing ops repos (pre-#407 deployments). | bin/disinto (init) |
| `lib/ci-setup.sh` | `_install_cron_impl()` — installs crontab entries for bare-metal deployments (compose mode uses polling loop instead). `_create_forgejo_oauth_app()` — generic helper to create an OAuth2 app on Forgejo (shared by Woodpecker and chat). `_create_woodpecker_oauth_impl()` — creates Woodpecker OAuth2 app (thin wrapper). `_create_chat_oauth_impl()` — creates disinto-chat OAuth2 app, writes `CHAT_OAUTH_CLIENT_ID`/`CHAT_OAUTH_CLIENT_SECRET` to `.env` (#708). `_generate_woodpecker_token_impl()` — auto-generates WOODPECKER_TOKEN via OAuth2 flow. `_activate_woodpecker_repo_impl()` — activates repo in Woodpecker. All gated by `_load_ci_context()` which validates required env vars. | bin/disinto (init) |
| `lib/generators.sh` | Template generation for `disinto init`: `generate_compose()` — docker-compose.yml (**duplicate service detection**: tracks service names during generation, aborts with `ERROR: Duplicate service name '$name' detected` on conflict; detection state is reset between calls so idempotent reinvocation is safe, #850) (uses `codeberg.org/forgejo/forgejo:11.0` tag; `CLAUDE_BIN_DIR` volume mount removed from agents/llama services — only `reproduce` and `edge` still use the host-mounted CLI (#992); adds `security_opt: [apparmor:unconfined]` to all services for rootless container compatibility; Forgejo includes a healthcheck so dependent services use `condition: service_healthy` — fixes cold-start races, #665; adds `chat` service block with isolated `chat-config` named volume and `CHAT_HISTORY_DIR` bind-mount for per-user NDJSON history persistence (#710); injects `FORWARD_AUTH_SECRET` for Caddy↔chat defense-in-depth auth (#709); cost-cap env vars `CHAT_MAX_REQUESTS_PER_HOUR`, `CHAT_MAX_REQUESTS_PER_DAY`, `CHAT_MAX_TOKENS_PER_DAY` (#711); subdomain fallback comment for `EDGE_TUNNEL_FQDN_*` vars (#713); all `depends_on` now use `condition: service_healthy/started` instead of bare service names; all services now include `restart: unless-stopped` including the edge service — #768; agents service now uses `image: ghcr.io/disinto/agents:${DISINTO_IMAGE_TAG:-latest}` instead of `build:` (#429); `WOODPECKER_PLUGINS_PRIVILEGED` env var added to woodpecker service (#779); agents-llama conditional block gated on `ENABLE_LLAMA_AGENT=1` (#769); `agents-llama-all` compose service (profile `agents-llama-all`, all 7 roles: review,dev,gardener,architect,planner,predictor,supervisor) added by #801; agents service gains volume mounts for `./projects`, `./.env`, `./state`), `generate_caddyfile()` — Caddyfile (routes: `/forge/*` → forgejo:3000, `/woodpecker/*` → woodpecker:8000, `/staging/*` → staging:80; `/chat/login` and `/chat/oauth/callback` bypass `forward_auth` so unauthenticated users can reach the OAuth flow; `/chat/*` gated by `forward_auth` on `chat:8080/chat/auth/verify` which stamps `X-Forwarded-User` (#709); root `/` redirects to `/forge/`), `generate_staging_index()` — staging index, `generate_deploy_pipelines()` — Woodpecker deployment pipeline configs. Requires `FACTORY_ROOT`, `PROJECT_NAME`, `PRIMARY_BRANCH`. | bin/disinto (init) |
| `lib/backup.sh` | Factory backup creation. `backup_create <outfile.tar.gz>` — exports factory state: fetches all issues (open+closed) from the project and ops repos via Forgejo API, bundles the ops repo as a git bundle, and writes a tarball. Requires `FORGE_URL`, `FORGE_TOKEN`, `FORGE_REPO`, `FORGE_OPS_REPO`, `OPS_REPO_ROOT`. Sourced by `bin/disinto backup create` (#1057). | bin/disinto (backup create) |
| `lib/disinto/backup.sh` | Factory backup restore. `backup_import <infile.tar.gz>` — restores from a backup tarball: creates missing repos via Forgejo API, imports issues (idempotent — skips by number if present), unpacks ops repo git bundle. Idempotent: running twice produces same end state with no errors. Requires `FORGE_URL`, `FORGE_TOKEN`. Sourced by `bin/disinto backup import` (#1058). | bin/disinto (backup import) |
| `lib/sprint-filer.sh` | Post-merge sub-issue filer for sprint PRs. Invoked by the `.woodpecker/ops-filer.yml` pipeline after a sprint PR merges to ops repo `main`. Parses `<!-- filer:begin --> ... <!-- filer:end -->` blocks from sprint PR bodies to extract sub-issue definitions, creates them on the project repo using `FORGE_FILER_TOKEN` (narrow-scope `filer-bot` identity with `issues:write` only), adds `in-progress` label to the parent vision issue, and handles vision lifecycle closure when all sub-issues are closed. Uses `filer_api_all()` for paginated fetches. Idempotent: uses `<!-- decomposed-from: #<vision>, sprint: <slug>, id: <id> -->` markers to skip already-filed issues. Requires `FORGE_FILER_TOKEN`, `FORGE_API`, `FORGE_API_BASE`, `FORGE_OPS_REPO`. | `.woodpecker/ops-filer.yml` (CI pipeline on ops repo) |
| `lib/hire-agent.sh` | `disinto_hire_an_agent()` — user creation, `.profile` repo setup, formula copying, branch protection, and state marker creation for hiring a new agent. Requires `FORGE_URL`, `FORGE_TOKEN`, `FACTORY_ROOT`, `PROJECT_NAME`. Extracted from `bin/disinto`. | bin/disinto (hire) |
| `lib/release.sh` | `disinto_release()` — vault TOML creation, branch setup on ops repo, PR creation, and auto-merge request for a versioned release. `_assert_release_globals()` validates required env vars. Requires `FORGE_URL`, `FORGE_TOKEN`, `FORGE_OPS_REPO`, `FACTORY_ROOT`, `PRIMARY_BRANCH`. Extracted from `bin/disinto`. | bin/disinto (release) |


@@ -52,8 +52,9 @@ claude_run_with_watchdog() {
   out_file=$(mktemp) || return 1
   trap 'rm -f "$out_file"' RETURN
-  # Start claude in background, capturing stdout to temp file
-  "${cmd[@]}" > "$out_file" 2>>"$LOGFILE" &
+  # Start claude in new process group (setsid creates new session, $pid is PGID leader)
+  # All children of claude will inherit this process group
+  setsid "${cmd[@]}" > "$out_file" 2>>"$LOGFILE" &
   pid=$!
   # Background watchdog: poll for final result marker
@@ -84,12 +85,12 @@ claude_run_with_watchdog() {
       sleep "$grace"
       if kill -0 "$pid" 2>/dev/null; then
         log "watchdog: claude -p idle for ${grace}s after final result; SIGTERM"
-        kill -TERM "$pid" 2>/dev/null || true
+        kill -TERM -- "-$pid" 2>/dev/null || true
         # Give it a moment to clean up
         sleep 5
         if kill -0 "$pid" 2>/dev/null; then
           log "watchdog: force kill after SIGTERM timeout"
-          kill -KILL "$pid" 2>/dev/null || true
+          kill -KILL -- "-$pid" 2>/dev/null || true
         fi
       fi
     fi
@@ -100,16 +101,16 @@ claude_run_with_watchdog() {
   timeout --foreground "${CLAUDE_TIMEOUT:-7200}" tail --pid="$pid" -f /dev/null 2>/dev/null
   rc=$?
-  # Clean up the watchdog
-  kill "$grace_pid" 2>/dev/null || true
+  # Clean up the watchdog (target process group if it spawned children)
+  kill -- "-$grace_pid" 2>/dev/null || true
   wait "$grace_pid" 2>/dev/null || true
-  # When timeout fires (rc=124), explicitly kill the orphaned claude process
+  # When timeout fires (rc=124), explicitly kill the orphaned claude process group
   # tail --pid is a passive waiter, not a supervisor
   if [ "$rc" -eq 124 ]; then
-    kill "$pid" 2>/dev/null || true
+    kill -TERM -- "-$pid" 2>/dev/null || true
     sleep 1
-    kill -KILL "$pid" 2>/dev/null || true
+    kill -KILL -- "-$pid" 2>/dev/null || true
   fi
   # Output the captured stdout

lib/backup.sh (new file)

@@ -0,0 +1,136 @@
#!/usr/bin/env bash
# =============================================================================
# disinto backup — export factory state for migration
#
# Usage: source this file, then call backup_create <outfile.tar.gz>
# Requires: FORGE_URL, FORGE_TOKEN, FORGE_REPO, FORGE_OPS_REPO, OPS_REPO_ROOT
# =============================================================================
set -euo pipefail
# Fetch all issues (open + closed) for a repo slug and emit the normalized JSON array.
# Usage: _backup_fetch_issues <org/repo>
_backup_fetch_issues() {
local repo_slug="$1"
# Default FORGE_API_BASE from FORGE_URL so set -u does not trip when only FORGE_URL is exported
local api_url="${FORGE_API_BASE:-${FORGE_URL}/api/v1}/repos/${repo_slug}"
local all_issues="[]"
for state in open closed; do
local page=1
while true; do
local page_items
page_items=$(curl -sf -X GET \
-H "Authorization: token ${FORGE_TOKEN}" \
-H "Content-Type: application/json" \
"${api_url}/issues?state=${state}&type=issues&limit=50&page=${page}") || {
echo "ERROR: failed to fetch ${state} issues from ${repo_slug} (page ${page})" >&2
return 1
}
local count
count=$(printf '%s' "$page_items" | jq 'length' 2>/dev/null) || count=0
[ -z "$count" ] && count=0
[ "$count" -eq 0 ] && break
all_issues=$(printf '%s\n%s' "$all_issues" "$page_items" | jq -s 'add')
[ "$count" -lt 50 ] && break
page=$((page + 1))
done
done
# Normalize to the schema: number, title, body, labels, state
printf '%s' "$all_issues" | jq '[.[] | {
number: .number,
title: .title,
body: .body,
labels: [.labels[]?.name],
state: .state
}] | sort_by(.number)'
}
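The final `jq` filter in `_backup_fetch_issues` can be exercised on canned input to show the exported schema; the two sample issues below are invented for illustration:

```shell
# Raw Forgejo-style issues, deliberately out of order; labels arrive as objects.
raw='[{"number":2,"title":"b","body":"x","labels":[{"name":"bug"}],"state":"open"},
      {"number":1,"title":"a","body":"y","labels":[],"state":"closed"}]'
normalized=$(printf '%s' "$raw" | jq -c '[.[] | {
  number: .number,
  title: .title,
  body: .body,
  labels: [.labels[]?.name],
  state: .state
}] | sort_by(.number)')
echo "$normalized"
```

Label objects collapse to plain name strings and the array comes back sorted by issue number, which is what makes the exported JSON diffable between backups.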
# Create a backup tarball of factory state.
# Usage: backup_create <outfile.tar.gz>
backup_create() {
local outfile="${1:-}"
if [ -z "$outfile" ]; then
echo "Error: output file required" >&2
echo "Usage: disinto backup create <outfile.tar.gz>" >&2
return 1
fi
# Resolve to absolute path before cd-ing into tmpdir
case "$outfile" in
/*) ;;
*) outfile="$(pwd)/${outfile}" ;;
esac
# Validate required env
: "${FORGE_URL:?FORGE_URL must be set}"
: "${FORGE_TOKEN:?FORGE_TOKEN must be set}"
: "${FORGE_REPO:?FORGE_REPO must be set}"
local forge_ops_repo="${FORGE_OPS_REPO:-${FORGE_REPO}-ops}"
local ops_repo_root="${OPS_REPO_ROOT:-}"
if [ -z "$ops_repo_root" ] || [ ! -d "$ops_repo_root/.git" ]; then
echo "Error: OPS_REPO_ROOT (${ops_repo_root:-<unset>}) is not a valid git repo" >&2
return 1
fi
local tmpdir
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
local project_name="${FORGE_REPO##*/}"
echo "=== disinto backup create ==="
echo "Forge: ${FORGE_URL}"
echo "Repos: ${FORGE_REPO}, ${forge_ops_repo}"
# ── 1. Export issues ──────────────────────────────────────────────────────
mkdir -p "${tmpdir}/issues"
echo "Fetching issues for ${FORGE_REPO}..."
_backup_fetch_issues "$FORGE_REPO" > "${tmpdir}/issues/${project_name}.json"
local main_count
main_count=$(jq 'length' "${tmpdir}/issues/${project_name}.json")
echo " ${main_count} issues exported"
echo "Fetching issues for ${forge_ops_repo}..."
_backup_fetch_issues "$forge_ops_repo" > "${tmpdir}/issues/${project_name}-ops.json"
local ops_count
ops_count=$(jq 'length' "${tmpdir}/issues/${project_name}-ops.json")
echo " ${ops_count} issues exported"
# ── 2. Git bundle of ops repo ────────────────────────────────────────────
mkdir -p "${tmpdir}/repos"
echo "Creating git bundle for ${forge_ops_repo}..."
git -C "$ops_repo_root" bundle create "${tmpdir}/repos/${project_name}-ops.bundle" --all 2>&1
echo " bundle created ($(du -h "${tmpdir}/repos/${project_name}-ops.bundle" | cut -f1))"
# ── 3. Metadata ──────────────────────────────────────────────────────────
local created_at
created_at=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
jq -n \
--arg created_at "$created_at" \
--arg source_host "$(hostname)" \
--argjson schema_version 1 \
--arg forgejo_url "$FORGE_URL" \
'{
created_at: $created_at,
source_host: $source_host,
schema_version: $schema_version,
forgejo_url: $forgejo_url
}' > "${tmpdir}/metadata.json"
# ── 4. Pack tarball ──────────────────────────────────────────────────────
echo "Creating tarball: ${outfile}"
tar -czf "$outfile" -C "$tmpdir" metadata.json issues repos
local size
size=$(du -h "$outfile" | cut -f1)
echo "=== Backup complete: ${outfile} (${size}) ==="
# Clean up before returning — the EXIT trap references the local $tmpdir
# which goes out of scope after return, causing 'unbound variable' under set -u.
trap - EXIT
rm -rf "$tmpdir"
}
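The closing comment is worth a standalone repro: a function-local variable referenced from an `EXIT` trap has gone out of scope by the time the trap fires, which is fatal under `set -u`. A small demonstration (the `make_tmp` helper is hypothetical, and the temp dir is deliberately leaked to show the failure):

```shell
# The trap string is evaluated at shell exit, in global scope, after the
# function's local has been discarded; set -u turns that into a hard error.
err=$(bash -uc '
  make_tmp() {
    local workdir
    workdir=$(mktemp -d)
    trap "rm -rf \"\$workdir\"" EXIT
  }
  make_tmp
' 2>&1) || true
echo "$err"
```

This is why `backup_create` clears the trap and removes the directory explicitly before returning.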


@@ -142,6 +142,7 @@ _create_forgejo_oauth_app() {
 # Set up Woodpecker CI to use Forgejo as its forge backend.
 # Creates an OAuth2 app on Forgejo for Woodpecker, activates the repo.
+# Respects EDGE_ROUTING_MODE: in subdomain mode, uses EDGE_TUNNEL_FQDN_CI for redirect URI.
 # Usage: create_woodpecker_oauth <forge_url> <repo_slug>
 _create_woodpecker_oauth_impl() {
   local forge_url="$1"
@@ -150,7 +151,13 @@ _create_woodpecker_oauth_impl() {
   echo ""
   echo "── Woodpecker OAuth2 setup ────────────────────────────"
-  _create_forgejo_oauth_app "woodpecker-ci" "http://localhost:8000/authorize" || return 0
+  local wp_redirect_uri="http://localhost:8000/authorize"
+  local routing_mode="${EDGE_ROUTING_MODE:-subpath}"
+  if [ "$routing_mode" = "subdomain" ] && [ -n "${EDGE_TUNNEL_FQDN_CI:-}" ]; then
+    wp_redirect_uri="https://${EDGE_TUNNEL_FQDN_CI}/authorize"
+  fi
+  _create_forgejo_oauth_app "woodpecker-ci" "$wp_redirect_uri" || return 0
   local client_id="${_OAUTH_CLIENT_ID}"
   local client_secret="${_OAUTH_CLIENT_SECRET}"
@@ -158,10 +165,15 @@ _create_woodpecker_oauth_impl() {
   # WP_FORGEJO_CLIENT/SECRET match the docker-compose.yml variable references
   # WOODPECKER_HOST must be host-accessible URL to match OAuth2 redirect_uri
   local env_file="${FACTORY_ROOT}/.env"
+  local wp_host="http://localhost:8000"
+  if [ "$routing_mode" = "subdomain" ] && [ -n "${EDGE_TUNNEL_FQDN_CI:-}" ]; then
+    wp_host="https://${EDGE_TUNNEL_FQDN_CI}"
+  fi
   local wp_vars=(
     "WOODPECKER_FORGEJO=true"
     "WOODPECKER_FORGEJO_URL=${forge_url}"
-    "WOODPECKER_HOST=http://localhost:8000"
+    "WOODPECKER_HOST=${wp_host}"
   )
   if [ -n "${client_id:-}" ]; then
     wp_vars+=("WP_FORGEJO_CLIENT=${client_id}")
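The routing-mode branch above is pure string logic, so it can be mirrored in a tiny helper and spot-checked; `resolve_wp_redirect` is an illustrative name, not a function from the patch:

```shell
# Mirrors the hunk's branch: subpath mode keeps the localhost redirect,
# subdomain mode substitutes the CI FQDN when one is configured.
resolve_wp_redirect() {
  local mode="${1:-subpath}" fqdn="${2:-}"
  local uri="http://localhost:8000/authorize"
  if [ "$mode" = "subdomain" ] && [ -n "$fqdn" ]; then
    uri="https://${fqdn}/authorize"
  fi
  printf '%s\n' "$uri"
}
resolve_wp_redirect subpath ""
resolve_wp_redirect subdomain "ci.example.com"
resolve_wp_redirect subdomain ""   # FQDN unset: falls back to localhost
```

Note the guard on an empty FQDN: setting `EDGE_ROUTING_MODE=subdomain` without `EDGE_TUNNEL_FQDN_CI` leaves the localhost redirect intact rather than producing a broken `https:///authorize` URI.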

lib/disinto/backup.sh (new file)

@@ -0,0 +1,391 @@
#!/usr/bin/env bash
# =============================================================================
# backup.sh — backup/restore utilities for disinto factory state
#
# Subcommands:
# create <outfile.tar.gz> Create backup of factory state
# import <infile.tar.gz> Restore factory state from backup
#
# Usage:
# source "${FACTORY_ROOT}/lib/disinto/backup.sh"
# backup_import <tarball>
#
# Environment:
# FORGE_URL - Forgejo instance URL (target)
# FORGE_TOKEN - Admin token for target Forgejo
#
# Idempotency:
# - Repos: created via API if missing
# - Issues: check if exists by number, skip if present
# - Runs twice = same end state, no errors
# =============================================================================
set -euo pipefail
# ── Helper: log with timestamp ───────────────────────────────────────────────
backup_log() {
local msg="$1"
echo "[$(date '+%Y-%m-%d %H:%M:%S')] $msg"
}
# ── Helper: create repo if it doesn't exist ─────────────────────────────────
# Usage: backup_create_repo_if_missing <slug>
# Returns: 0 if repo exists or was created, 1 on error
backup_create_repo_if_missing() {
local slug="$1"
local org_name="${slug%%/*}"
local repo_name="${slug##*/}"
# Check if repo exists
if curl -sf --max-time 5 \
-H "Authorization: token ${FORGE_TOKEN}" \
"${FORGE_URL}/api/v1/repos/${slug}" >/dev/null 2>&1; then
backup_log "Repo ${slug} already exists"
return 0
fi
backup_log "Creating repo ${slug}..."
# Create org if needed
curl -sf -X POST \
-H "Authorization: token ${FORGE_TOKEN}" \
-H "Content-Type: application/json" \
"${FORGE_URL}/api/v1/orgs" \
-d "{\"username\":\"${org_name}\",\"visibility\":\"public\"}" >/dev/null 2>&1 || true
# Create repo
local response
response=$(curl -sf -X POST \
-H "Authorization: token ${FORGE_TOKEN}" \
-H "Content-Type: application/json" \
"${FORGE_URL}/api/v1/orgs/${org_name}/repos" \
-d "{\"name\":\"${repo_name}\",\"auto_init\":false,\"default_branch\":\"main\"}" 2>/dev/null) \
|| response=""
# Match only a real "id" field; the old pattern's [0-9] alternative matched any digit
if [ -n "$response" ] && echo "$response" | grep -q '"id":'; then
backup_log "Created repo ${slug}"
BACKUP_CREATED_REPOS=$((BACKUP_CREATED_REPOS + 1))
return 0
fi
# Fallback: admin endpoint
response=$(curl -sf -X POST \
-H "Authorization: token ${FORGE_TOKEN}" \
-H "Content-Type: application/json" \
"${FORGE_URL}/api/v1/admin/users/${org_name}/repos" \
-d "{\"name\":\"${repo_name}\",\"auto_init\":false,\"default_branch\":\"main\"}" 2>/dev/null) \
|| response=""
if [ -n "$response" ] && echo "$response" | grep -q '"id":'; then
backup_log "Created repo ${slug} (via admin API)"
BACKUP_CREATED_REPOS=$((BACKUP_CREATED_REPOS + 1))
return 0
fi
backup_log "ERROR: failed to create repo ${slug}" >&2
return 1
}
# ── Helper: check if issue exists by number ──────────────────────────────────
# Usage: backup_issue_exists <slug> <issue_number>
# Returns: 0 if exists, 1 if not
backup_issue_exists() {
local slug="$1"
local issue_num="$2"
curl -sf --max-time 5 \
-H "Authorization: token ${FORGE_TOKEN}" \
"${FORGE_URL}/api/v1/repos/${slug}/issues/${issue_num}" >/dev/null 2>&1
}
# ── Helper: create issue with specific number (if Forgejo supports it) ───────
# Note: Forgejo API auto-assigns next integer; we accept renumbering and log mapping
# Usage: backup_create_issue <slug> <original_number> <title> <body> [labels...]
# Returns: new_issue_number on success, 0 on failure
backup_create_issue() {
local slug="$1"
local original_num="$2"
local title="$3"
local body="$4"
shift 4
# Build labels array
local -a labels=()
for label in "$@"; do
# Resolve label name to ID
local label_id
label_id=$(curl -sf --max-time 5 \
-H "Authorization: token ${FORGE_TOKEN}" \
"${FORGE_URL}/api/v1/repos/${slug}/labels" 2>/dev/null \
| jq -r ".[] | select(.name == \"${label}\") | .id" 2>/dev/null) || label_id=""
if [ -n "$label_id" ] && [ "$label_id" != "null" ]; then
labels+=("$label_id")
fi
done
# Build payload
local payload
if [ ${#labels[@]} -gt 0 ]; then
payload=$(jq -n \
--arg title "$title" \
--arg body "$body" \
--argjson labels "$(printf '%s\n' "${labels[@]}" | jq -s .)" \
'{title: $title, body: $body, labels: $labels}')
else
payload=$(jq -n --arg title "$title" --arg body "$body" '{title: $title, body: $body, labels: []}')
fi
local response
response=$(curl -sf -X POST \
-H "Authorization: token ${FORGE_TOKEN}" \
-H "Content-Type: application/json" \
"${FORGE_URL}/api/v1/repos/${slug}/issues" \
-d "$payload" 2>/dev/null) || {
backup_log "ERROR: failed to create issue '${title}'" >&2
return 1
}
local new_num
new_num=$(printf '%s' "$response" | jq -r '.number // empty')
# Record the mapping; log to stderr so callers capturing stdout receive only the number
echo "${original_num}:${new_num}" >> "${BACKUP_MAPPING_FILE}"
backup_log "Created issue '${title}' as #${new_num} (original: #${original_num})" >&2
echo "$new_num"
}
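The payload construction above leans on the `--arg` vs `--argjson` distinction: strings are passed with `--arg`, while the label-ID array must arrive as real JSON. A minimal sketch with made-up values:

```shell
# --arg passes strings; --argjson passes pre-validated JSON (the label-ID array).
payload=$(jq -nc \
  --arg title "Example title" \
  --arg body "Example body" \
  --argjson labels '[3,5]' \
  '{title: $title, body: $body, labels: $labels}')
echo "$payload"
```

Building the body with `jq -n` instead of string interpolation also gets quoting and escaping of arbitrary issue titles and bodies right for free.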
# ── Step 1: Unpack tarball to temp dir ───────────────────────────────────────
# Usage: backup_unpack_tarball <tarball>
# Returns: temp dir path via BACKUP_TEMP_DIR
backup_unpack_tarball() {
local tarball="$1"
if [ ! -f "$tarball" ]; then
backup_log "ERROR: tarball not found: ${tarball}" >&2
return 1
fi
BACKUP_TEMP_DIR=$(mktemp -d -t disinto-backup.XXXXXX)
backup_log "Unpacking ${tarball} to ${BACKUP_TEMP_DIR}"
if ! tar -xzf "$tarball" -C "$BACKUP_TEMP_DIR"; then
backup_log "ERROR: failed to unpack tarball" >&2
rm -rf "$BACKUP_TEMP_DIR"
return 1
fi
# Verify expected structure
if [ ! -d "${BACKUP_TEMP_DIR}/repos" ]; then
backup_log "ERROR: tarball missing 'repos/' directory" >&2
rm -rf "$BACKUP_TEMP_DIR"
return 1
fi
backup_log "Tarball unpacked successfully"
}
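The pack/unpack pair can be round-tripped locally to confirm the layout contract between `backup_create` and `backup_unpack_tarball` (a `metadata.json` plus `issues/` and `repos/` directories):

```shell
# Round-trip the tarball layout that backup_create writes and
# backup_unpack_tarball expects; contents here are throwaway stand-ins.
src=$(mktemp -d); out=$(mktemp -d); work=$(mktemp -d)
mkdir -p "$src/issues" "$src/repos"
echo '{"schema_version":1}' > "$src/metadata.json"
tar -czf "$out/backup.tar.gz" -C "$src" metadata.json issues repos
tar -xzf "$out/backup.tar.gz" -C "$work"
if [ -d "$work/repos" ] && [ -d "$work/issues" ] && [ -f "$work/metadata.json" ]; then
  layout="ok"
else
  layout="missing pieces"
fi
echo "layout: $layout"
rm -rf "$src" "$out" "$work"
```

The import side only hard-fails on a missing `repos/` directory, so a tarball from an older `backup_create` without issues still unpacks.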
# ── Step 2: disinto repo — create via Forgejo API, trigger sync (manual) ─────
# Usage: backup_import_disinto_repo
# Returns: 0 on success, 1 on failure
backup_import_disinto_repo() {
backup_log "Step 2: Configuring disinto repo..."
# Create disinto repo if missing
backup_create_repo_if_missing "disinto-admin/disinto"
# Note: Manual mirror configuration recommended (avoids SSH deploy-key handling)
backup_log "Note: Configure Codeberg → Forgejo pull mirror manually"
backup_log " Run on Forgejo admin panel: Repository Settings → Repository Mirroring"
backup_log " Source: ssh://git@codeberg.org/johba/disinto.git"
backup_log " Mirror: disinto-admin/disinto"
backup_log " Or use: git clone --mirror ssh://git@codeberg.org/johba/disinto.git"
backup_log " cd disinto.git && git push --mirror ${FORGE_URL}/disinto-admin/disinto.git"
return 0
}
# ── Step 3: disinto-ops repo — create empty, push from bundle ────────────────
# Usage: backup_import_disinto_ops_repo
# Returns: 0 on success, 1 on failure
backup_import_disinto_ops_repo() {
backup_log "Step 3: Configuring disinto-ops repo from bundle..."
local bundle_path="${BACKUP_TEMP_DIR}/repos/disinto-ops.bundle"
if [ ! -f "$bundle_path" ]; then
backup_log "WARNING: Bundle not found at ${bundle_path}, skipping"
return 0
fi
# Create ops repo if missing
backup_create_repo_if_missing "disinto-admin/disinto-ops"
# Clone bundle and push to Forgejo
local clone_dir
clone_dir=$(mktemp -d -t disinto-ops-clone.XXXXXX)
backup_log "Cloning bundle to ${clone_dir}"
if ! git clone --bare "$bundle_path" "$clone_dir/disinto-ops.git"; then
backup_log "ERROR: failed to clone bundle"
rm -rf "$clone_dir"
return 1
fi
# Push all refs to Forgejo
backup_log "Pushing refs to Forgejo..."
# Use git -C so the '!' negation applies to the push itself (with 'cd &&' it bound to cd alone)
if ! git -C "$clone_dir/disinto-ops.git" push --mirror "${FORGE_URL}/disinto-admin/disinto-ops.git" 2>&1; then
backup_log "ERROR: failed to push refs"
rm -rf "$clone_dir"
return 1
fi
local ref_count
ref_count=$(cd "$clone_dir/disinto-ops.git" && git show-ref | wc -l)
BACKUP_PUSHED_REFS=$((BACKUP_PUSHED_REFS + ref_count))
backup_log "Pushed ${ref_count} refs to disinto-ops"
rm -rf "$clone_dir"
return 0
}
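The bundle path can be rehearsed end to end with a throwaway repository: bundle `--all`, clone bare from the bundle file, and count the refs that a `push --mirror` would transfer. Names below are placeholders:

```shell
# Throwaway repo -> bundle -> bare clone: the same roundtrip the import performs.
work=$(mktemp -d)
git -C "$work" init -q repo
git -C "$work/repo" -c user.email=demo@example.invalid -c user.name=demo \
  commit -q --allow-empty -m "init"
# Bundle every ref, then clone bare straight from the bundle file
git -C "$work/repo" bundle create "$work/ops.bundle" --all 2>/dev/null
git clone -q --bare "$work/ops.bundle" "$work/ops.git" 2>/dev/null
refs=$(git -C "$work/ops.git" show-ref | wc -l)
echo "refs recoverable from bundle: $refs"
rm -rf "$work"
```

A bundle is a single offline file yet behaves as a clone/fetch source, which is what lets the backup tarball carry the whole ops repo history without network access.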
# ── Step 4: Import issues from backup ────────────────────────────────────────
# Usage: backup_import_issues <slug> <issues_file>
# issues_file is a JSON array of issues (per create schema)
# Returns: 0 on success
backup_import_issues() {
local slug="$1"
local issues_file="$2"
if [ ! -f "$issues_file" ]; then
backup_log "No issues file found, skipping"
return 0
fi
local count
count=$(jq 'length' "$issues_file")
backup_log "Importing ${count} issues from ${issues_file}"
local created=0
local skipped=0
for i in $(seq 0 $((count - 1))); do
local issue_num title body
issue_num=$(jq -r ".[${i}].number" "$issues_file")
title=$(jq -r ".[${i}].title" "$issues_file")
body=$(jq -r ".[${i}].body" "$issues_file")
if [ -z "$issue_num" ] || [ "$issue_num" = "null" ]; then
backup_log "WARNING: skipping issue without number at index ${i}"
continue
fi
# Check if issue already exists
if backup_issue_exists "$slug" "$issue_num"; then
backup_log "Issue #${issue_num} already exists, skipping"
skipped=$((skipped + 1))
continue
fi
# Extract labels
local -a labels=()
while IFS= read -r label; do
[ -n "$label" ] && labels+=("$label")
done < <(jq -r ".[${i}].labels[]? // empty" "$issues_file")
# Create issue
local new_num
if new_num=$(backup_create_issue "$slug" "$issue_num" "$title" "$body" "${labels[@]}"); then
created=$((created + 1))
fi
done
BACKUP_CREATED_ISSUES=$((BACKUP_CREATED_ISSUES + created))
BACKUP_SKIPPED_ISSUES=$((BACKUP_SKIPPED_ISSUES + skipped))
backup_log "Created ${created} issues, skipped ${skipped}"
}
# ── Main: import subcommand ──────────────────────────────────────────────────
# Usage: backup_import <tarball>
backup_import() {
local tarball="$1"
# Validate required environment (return, not exit: this file is also sourced)
[ -n "${FORGE_URL:-}" ] || { echo "Error: FORGE_URL not set" >&2; return 1; }
[ -n "${FORGE_TOKEN:-}" ] || { echo "Error: FORGE_TOKEN not set" >&2; return 1; }
backup_log "=== Backup Import Started ==="
backup_log "Target: ${FORGE_URL}"
backup_log "Tarball: ${tarball}"
# Initialize counters
BACKUP_CREATED_REPOS=0
BACKUP_PUSHED_REFS=0
BACKUP_CREATED_ISSUES=0
BACKUP_SKIPPED_ISSUES=0
# Mapping file: one "original:new" line per created issue (see backup_create_issue)
BACKUP_MAPPING_FILE=$(mktemp -t disinto-mapping.XXXXXX)
: > "$BACKUP_MAPPING_FILE"
# Step 1: Unpack tarball
if ! backup_unpack_tarball "$tarball"; then
return 1
fi
# Step 2: disinto repo
if ! backup_import_disinto_repo; then
return 1
fi
# Step 3: disinto-ops repo
if ! backup_import_disinto_ops_repo; then
return 1
fi
# Step 4: Import issues — iterate issues/<slug>.json files, each is a JSON array
for issues_file in "${BACKUP_TEMP_DIR}/issues"/*.json; do
[ -f "$issues_file" ] || continue
local slug_filename
slug_filename=$(basename "$issues_file" .json)
# Map slug-filename → forgejo-slug: "disinto" → "disinto-admin/disinto",
# "disinto-ops" → "disinto-admin/disinto-ops"
local slug
case "$slug_filename" in
"disinto") slug="${FORGE_REPO:-disinto-admin/disinto}" ;;
"disinto-ops") slug="${FORGE_OPS_REPO:-disinto-admin/disinto-ops}" ;;
*) slug="disinto-admin/${slug_filename}" ;;
esac
backup_log "Processing issues from ${slug_filename}.json (${slug})"
backup_import_issues "$slug" "$issues_file"
done
# Summary
backup_log "=== Backup Import Complete ==="
backup_log "Created ${BACKUP_CREATED_REPOS} repos"
backup_log "Pushed ${BACKUP_PUSHED_REFS} refs"
backup_log "Imported ${BACKUP_CREATED_ISSUES} issues"
backup_log "Skipped ${BACKUP_SKIPPED_ISSUES} (already present)"
backup_log "Issue mapping saved to: ${BACKUP_MAPPING_FILE}"
# Cleanup
rm -rf "$BACKUP_TEMP_DIR"
return 0
}
# ── Entry point: if sourced, don't run; if executed directly, run import ────
if [[ "${BASH_SOURCE[0]}" == "${0}" ]]; then
if [ $# -lt 1 ]; then
echo "Usage: $0 <tarball>" >&2
exit 1
fi
backup_import "$1"
fi
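The `BASH_SOURCE` guard at the bottom is the standard source-versus-execute check; a throwaway script makes the behavior concrete:

```shell
# When executed, BASH_SOURCE[0] and $0 are both the script path and the guard
# fires; when sourced, $0 is the outer shell's name and nothing runs.
lib=$(mktemp)
cat > "$lib" <<'EOF'
greet() { echo "hello"; }
if [[ "${BASH_SOURCE[0]}" == "${0}" ]]; then
  greet
fi
EOF
direct=$(bash "$lib")               # executed: the guard fires
sourced=$(bash -c "source '$lib'")  # sourced: only the function is defined
echo "direct=${direct} sourced=${sourced:-<empty>}"
rm -f "$lib"
```

This lets `bin/disinto backup import` source the file for its functions while `./backup.sh <tarball>` still works as a standalone command.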

lib/generators.sh

@@ -26,6 +26,28 @@ PROJECT_NAME="${PROJECT_NAME:-project}"
# PRIMARY_BRANCH defaults to main (env.sh may have set it to 'master')
PRIMARY_BRANCH="${PRIMARY_BRANCH:-main}"
# Track service names for duplicate detection
declare -A _seen_services
declare -A _service_sources
# Record a service name and its source; return 0 if unique, 1 if duplicate
_record_service() {
local service_name="$1"
local source="$2"
if [ -n "${_seen_services[$service_name]:-}" ]; then
local original_source="${_service_sources[$service_name]}"
echo "ERROR: Duplicate service name '$service_name' detected —" >&2
echo " '$service_name' emitted twice — from $original_source and from $source" >&2
echo " Remove one of the conflicting activations to proceed." >&2
return 1
fi
_seen_services[$service_name]=1
_service_sources[$service_name]="$source"
return 0
}
# Helper: extract woodpecker_repo_id from a project TOML file
# Returns empty string if not found or file doesn't exist
_get_woodpecker_repo_id() {
@@ -97,6 +119,16 @@ _generate_local_model_services() {
POLL_INTERVAL) poll_interval_val="$value" ;;
---)
if [ -n "$service_name" ] && [ -n "$base_url" ]; then
# Record service for duplicate detection using the full service name
local full_service_name="agents-${service_name}"
local toml_basename
toml_basename=$(basename "$toml")
if ! _record_service "$full_service_name" "[agents.$service_name] in projects/$toml_basename"; then
# Duplicate detected — clean up and abort
rm -f "$temp_file"
return 1
fi
# Per-agent FORGE_TOKEN / FORGE_PASS lookup (#834 Gap 3).
# Two hired llama agents must not share the same Forgejo identity,
# so we key the env-var lookup by forge_user (which hire-agent.sh
@@ -281,6 +313,21 @@ _generate_compose_impl() {
return 0
fi
# Reset duplicate detection state for fresh run
_seen_services=()
_service_sources=()
# Initialize duplicate detection with base services defined in the template
_record_service "forgejo" "base compose template" || return 1
_record_service "woodpecker" "base compose template" || return 1
_record_service "woodpecker-agent" "base compose template" || return 1
_record_service "agents" "base compose template" || return 1
_record_service "runner" "base compose template" || return 1
_record_service "edge" "base compose template" || return 1
_record_service "staging" "base compose template" || return 1
_record_service "staging-deploy" "base compose template" || return 1
_record_service "chat" "base compose template" || return 1
# Extract primary woodpecker_repo_id from project TOML files
local wp_repo_id
wp_repo_id=$(_get_primary_woodpecker_repo_id)
@@ -358,6 +405,9 @@ services:
WOODPECKER_SERVER: localhost:9000
WOODPECKER_AGENT_SECRET: ${WOODPECKER_AGENT_SECRET:-}
WOODPECKER_GRPC_SECURE: "false"
WOODPECKER_GRPC_KEEPALIVE_TIME: "10s"
WOODPECKER_GRPC_KEEPALIVE_TIMEOUT: "20s"
WOODPECKER_GRPC_KEEPALIVE_PERMIT_WITHOUT_CALLS: "true"
WOODPECKER_HEALTHCHECK_ADDR: ":3333"
WOODPECKER_BACKEND_DOCKER_NETWORK: ${WOODPECKER_CI_NETWORK:-disinto_disinto-net}
WOODPECKER_MAX_WORKFLOWS: 1
@@ -436,6 +486,76 @@ services:
COMPOSEEOF COMPOSEEOF
# ── Conditional agents-llama block (ENABLE_LLAMA_AGENT=1) ──────────────
# This legacy flag was removed in #846 but kept for duplicate detection testing
if [ "${ENABLE_LLAMA_AGENT:-0}" = "1" ]; then
if ! _record_service "agents-llama" "ENABLE_LLAMA_AGENT=1"; then
return 1
fi
cat >> "$compose_file" <<'COMPOSEEOF'
agents-llama:
image: ghcr.io/disinto/agents:${DISINTO_IMAGE_TAG:-latest}
container_name: disinto-agents-llama
restart: unless-stopped
security_opt:
- apparmor=unconfined
volumes:
- agent-data:/home/agent/data
- project-repos:/home/agent/repos
- ${CLAUDE_SHARED_DIR:-/var/lib/disinto/claude-shared}:${CLAUDE_SHARED_DIR:-/var/lib/disinto/claude-shared}
- ${CLAUDE_CONFIG_FILE:-${HOME}/.claude.json}:/home/agent/.claude.json:ro
- ${AGENT_SSH_DIR:-${HOME}/.ssh}:/home/agent/.ssh:ro
- woodpecker-data:/woodpecker-data:ro
- ./projects:/home/agent/disinto/projects:ro
- ./.env:/home/agent/disinto/.env:ro
- ./state:/home/agent/disinto/state
environment:
FORGE_URL: http://forgejo:3000
FORGE_REPO: ${FORGE_REPO:-disinto-admin/disinto}
FORGE_TOKEN: ${FORGE_TOKEN:-}
FORGE_REVIEW_TOKEN: ${FORGE_REVIEW_TOKEN:-}
FORGE_PLANNER_TOKEN: ${FORGE_PLANNER_TOKEN:-}
FORGE_GARDENER_TOKEN: ${FORGE_GARDENER_TOKEN:-}
FORGE_VAULT_TOKEN: ${FORGE_VAULT_TOKEN:-}
FORGE_SUPERVISOR_TOKEN: ${FORGE_SUPERVISOR_TOKEN:-}
FORGE_PREDICTOR_TOKEN: ${FORGE_PREDICTOR_TOKEN:-}
FORGE_ARCHITECT_TOKEN: ${FORGE_ARCHITECT_TOKEN:-}
FORGE_BOT_USERNAMES: ${FORGE_BOT_USERNAMES:-}
WOODPECKER_TOKEN: ${WOODPECKER_TOKEN:-}
CLAUDE_TIMEOUT: ${CLAUDE_TIMEOUT:-7200}
CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC: ${CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC:-1}
ANTHROPIC_API_KEY: ${ANTHROPIC_API_KEY:-}
FORGE_PASS: ${FORGE_PASS:-}
FORGE_ADMIN_PASS: ${FORGE_ADMIN_PASS:-}
FACTORY_REPO: ${FORGE_REPO:-disinto-admin/disinto}
DISINTO_CONTAINER: "1"
PROJECT_NAME: ${PROJECT_NAME:-project}
PROJECT_REPO_ROOT: /home/agent/repos/${PROJECT_NAME:-project}
WOODPECKER_DATA_DIR: /woodpecker-data
WOODPECKER_REPO_ID: "PLACEHOLDER_WP_REPO_ID"
CLAUDE_CONFIG_DIR: ${CLAUDE_CONFIG_DIR:-/var/lib/disinto/claude-shared/config}
POLL_INTERVAL: ${POLL_INTERVAL:-300}
GARDENER_INTERVAL: ${GARDENER_INTERVAL:-21600}
ARCHITECT_INTERVAL: ${ARCHITECT_INTERVAL:-21600}
PLANNER_INTERVAL: ${PLANNER_INTERVAL:-43200}
healthcheck:
test: ["CMD", "pgrep", "-f", "entrypoint.sh"]
interval: 60s
timeout: 5s
retries: 3
start_period: 30s
depends_on:
forgejo:
condition: service_healthy
woodpecker:
condition: service_started
networks:
- disinto-net
COMPOSEEOF
fi
# Resume the rest of the compose file (runner onward)
cat >> "$compose_file" <<'COMPOSEEOF'
@@ -487,9 +607,12 @@ COMPOSEEOF
- EDGE_TUNNEL_USER=${EDGE_TUNNEL_USER:-tunnel}
- EDGE_TUNNEL_PORT=${EDGE_TUNNEL_PORT:-}
- EDGE_TUNNEL_FQDN=${EDGE_TUNNEL_FQDN:-}
# Subdomain fallback (#1028): per-service FQDNs for subdomain routing mode.
# Set EDGE_ROUTING_MODE=subdomain to activate. See docs/edge-routing-fallback.md.
- EDGE_ROUTING_MODE=${EDGE_ROUTING_MODE:-subpath}
- EDGE_TUNNEL_FQDN_FORGE=${EDGE_TUNNEL_FQDN_FORGE:-}
- EDGE_TUNNEL_FQDN_CI=${EDGE_TUNNEL_FQDN_CI:-}
- EDGE_TUNNEL_FQDN_CHAT=${EDGE_TUNNEL_FQDN_CHAT:-}
# Shared secret for Caddy ↔ chat forward_auth (#709)
- FORWARD_AUTH_SECRET=${FORWARD_AUTH_SECRET:-}
volumes:
@@ -580,6 +703,8 @@ COMPOSEEOF
CHAT_OAUTH_CLIENT_ID: ${CHAT_OAUTH_CLIENT_ID:-}
CHAT_OAUTH_CLIENT_SECRET: ${CHAT_OAUTH_CLIENT_SECRET:-}
EDGE_TUNNEL_FQDN: ${EDGE_TUNNEL_FQDN:-}
EDGE_TUNNEL_FQDN_CHAT: ${EDGE_TUNNEL_FQDN_CHAT:-}
EDGE_ROUTING_MODE: ${EDGE_ROUTING_MODE:-subpath}
DISINTO_CHAT_ALLOWED_USERS: ${DISINTO_CHAT_ALLOWED_USERS:-}
# Shared secret for Caddy forward_auth verify endpoint (#709)
FORWARD_AUTH_SECRET: ${FORWARD_AUTH_SECRET:-}
@@ -631,7 +756,10 @@ COMPOSEEOF
fi
# Append local-model agent services if any are configured
if ! _generate_local_model_services "$compose_file"; then
echo "ERROR: Failed to generate local-model agent services. See errors above." >&2
return 1
fi
# Resolve the Claude CLI binary path and persist as CLAUDE_BIN_DIR in .env.
# Only used by reproduce and edge services which still use host-mounted CLI.
@@ -682,6 +810,11 @@ _generate_agent_docker_impl() {
# Output path: ${FACTORY_ROOT}/docker/Caddyfile (gitignored — generated artifact).
# The edge compose service mounts this path as /etc/caddy/Caddyfile.
# On a fresh clone, `disinto init` calls generate_caddyfile before first `disinto up`.
#
# Routing mode (EDGE_ROUTING_MODE env var):
# subpath — (default) all services under <project>.disinto.ai/{forge,ci,chat,staging}
# subdomain — per-service subdomains: forge.<project>, ci.<project>, chat.<project>
# See docs/edge-routing-fallback.md for the full pivot plan.
_generate_caddyfile_impl() {
local docker_dir="${FACTORY_ROOT}/docker"
local caddyfile="${docker_dir}/Caddyfile"
@@ -691,8 +824,22 @@ _generate_caddyfile_impl() {
return
fi
local routing_mode="${EDGE_ROUTING_MODE:-subpath}"
if [ "$routing_mode" = "subdomain" ]; then
_generate_caddyfile_subdomain "$caddyfile"
else
_generate_caddyfile_subpath "$caddyfile"
fi
echo "Created: ${caddyfile} (routing_mode=${routing_mode})"
}
# Subpath Caddyfile: all services under a single :80 block with path-based routing.
_generate_caddyfile_subpath() {
local caddyfile="$1"
cat > "$caddyfile" <<'CADDYFILEEOF'
# Caddyfile — edge proxy configuration (subpath mode)
# IP-only binding at bootstrap; domain + TLS added later via vault resource request
:80 {
@@ -735,8 +882,50 @@ _generate_caddyfile_impl() {
}
}
CADDYFILEEOF
}
# Subdomain Caddyfile: four host blocks per docs/edge-routing-fallback.md.
# Uses env vars EDGE_TUNNEL_FQDN_FORGE, EDGE_TUNNEL_FQDN_CI, EDGE_TUNNEL_FQDN_CHAT,
# and EDGE_TUNNEL_FQDN (main project domain → staging).
_generate_caddyfile_subdomain() {
local caddyfile="$1"
cat > "$caddyfile" <<'CADDYFILEEOF'
# Caddyfile — edge proxy configuration (subdomain mode)
# Per-service subdomains; see docs/edge-routing-fallback.md
# Main project domain — staging / landing
{$EDGE_TUNNEL_FQDN} {
reverse_proxy staging:80
}
# Forgejo — root path, no subpath rewrite needed
{$EDGE_TUNNEL_FQDN_FORGE} {
reverse_proxy forgejo:3000
}
# Woodpecker CI — root path
{$EDGE_TUNNEL_FQDN_CI} {
reverse_proxy woodpecker:8000
}
# Chat — with forward_auth (#709, on its own host)
{$EDGE_TUNNEL_FQDN_CHAT} {
handle /login {
reverse_proxy chat:8080
}
handle /oauth/callback {
reverse_proxy chat:8080
}
handle /* {
forward_auth chat:8080 {
uri /auth/verify
copy_headers X-Forwarded-User
header_up X-Forward-Auth-Secret {$FORWARD_AUTH_SECRET}
}
reverse_proxy chat:8080
}
}
CADDYFILEEOF
}
# Generate docker/index.html default page.


@@ -19,10 +19,12 @@
# JOB_READY_TIMEOUT_SECS — poll timeout in seconds (default: 360)
# JOB_READY_TIMEOUT_<JOBNAME> — per-job timeout override (e.g.,
# JOB_READY_TIMEOUT_FORGEJO=300)
# Built-in: JOB_READY_TIMEOUT_CHAT=600
#
# Exit codes:
# 0 success (all jobs deployed and healthy, or dry-run completed)
# 1 failure (validation error, or one or more jobs unhealthy after all
# jobs submitted — deploy does NOT cascade-skip on timeout)
#
# Idempotency:
# Running twice back-to-back on a healthy cluster is a no-op. Jobs that are
@@ -35,7 +37,11 @@ SCRIPT_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
REPO_ROOT="${REPO_ROOT:-$(cd "${SCRIPT_ROOT}/../../.." && pwd)}"
JOB_READY_TIMEOUT_SECS="${JOB_READY_TIMEOUT_SECS:-360}"
# Per-job built-in defaults (override with JOB_READY_TIMEOUT_<JOBNAME> env var)
JOB_READY_TIMEOUT_CHAT="${JOB_READY_TIMEOUT_CHAT:-600}"
DRY_RUN=0
FAILED_JOBS=() # jobs that timed out or failed deployment
log() { printf '[deploy] %s\n' "$*" >&2; }
die() { printf '[deploy] ERROR: %s\n' "$*" >&2; exit 1; }
@@ -168,6 +174,43 @@ _wait_job_running() {
return 1
}
# ── Helper: _run_post_deploy <job_name> ─────────────────────────────────────
# Runs post-deploy scripts for a job after it becomes healthy.
# Currently supports: forgejo → run forgejo-bootstrap.sh
#
# Args:
# job_name — name of the deployed job
#
# Returns:
# 0 on success (script ran or not applicable)
# 1 on failure
# ─────────────────────────────────────────────────────────────────────────────
_run_post_deploy() {
local job_name="$1"
local post_deploy_script
case "$job_name" in
forgejo)
post_deploy_script="${SCRIPT_ROOT}/forgejo-bootstrap.sh"
if [ -x "$post_deploy_script" ]; then
log "running post-deploy script for ${job_name}"
if ! "$post_deploy_script"; then
log "ERROR: post-deploy script failed for ${job_name}"
return 1
fi
log "post-deploy script completed for ${job_name}"
else
log "no post-deploy script found for ${job_name}, skipping"
fi
;;
*)
log "no post-deploy script for ${job_name}, skipping"
;;
esac
return 0
}
# ── Main: deploy each job in order ───────────────────────────────────────────
for job_name in "${JOBS[@]}"; do
jobspec_path="${REPO_ROOT}/nomad/jobs/${job_name}.hcl"
@@ -186,6 +229,9 @@ for job_name in "${JOBS[@]}"; do
log "[dry-run] nomad job validate ${jobspec_path}"
log "[dry-run] nomad job run -detach ${jobspec_path}"
log "[dry-run] (would wait for '${job_name}' to become healthy for ${job_timeout}s)"
case "$job_name" in
forgejo) log "[dry-run] [post-deploy] would run forgejo-bootstrap.sh" ;;
esac
continue
fi
@@ -215,7 +261,13 @@ for job_name in "${JOBS[@]}"; do
# 4. Wait for healthy state
if ! _wait_job_running "$job_name" "$job_timeout"; then
log "WARNING: deployment for job '${job_name}' did not reach successful state — continuing with remaining jobs"
FAILED_JOBS+=("$job_name")
else
# 5. Run post-deploy scripts (only if job reached healthy state)
if ! _run_post_deploy "$job_name"; then
die "post-deploy script failed for job '${job_name}'"
fi
fi
done
@@ -223,4 +275,17 @@ if [ "$DRY_RUN" -eq 1 ]; then
log "dry-run complete"
fi
# ── Final health summary ─────────────────────────────────────────────────────
if [ "${#FAILED_JOBS[@]}" -gt 0 ]; then
log ""
log "=== DEPLOY SUMMARY ==="
log "The following jobs did NOT reach healthy state:"
for failed in "${FAILED_JOBS[@]}"; do
log " - ${failed}"
done
log "All other jobs were submitted and healthy."
log "======================"
exit 1
fi
exit 0
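The `JOB_READY_TIMEOUT_<JOBNAME>` override convention described in the deploy script's header can be resolved with bash indirect expansion; here is a minimal sketch under that assumption (the helper name `resolve_job_timeout` is illustrative, not part of the script):

```shell
#!/usr/bin/env bash
# Sketch: resolve a per-job timeout, preferring a JOB_READY_TIMEOUT_<JOBNAME>
# override over the global default. Helper name is illustrative.
resolve_job_timeout() {
  local job_name="$1" default_secs="${2:-360}"
  # Build the override variable name: uppercase, '-' replaced by '_'
  local var="JOB_READY_TIMEOUT_$(printf '%s' "$job_name" | tr 'a-z-' 'A-Z_')"
  # ${!var} is indirect expansion: the value of the variable named by $var
  printf '%s\n' "${!var:-$default_secs}"
}

JOB_READY_TIMEOUT_CHAT=600
resolve_job_timeout chat              # → 600 (override wins)
resolve_job_timeout forgejo           # → 360 (global default)
resolve_job_timeout woodpecker-agent 120  # → 120 (caller-supplied default)
```

This matches the pattern of the built-in `JOB_READY_TIMEOUT_CHAT=600` default: an environment variable, if set, always takes precedence over the compiled-in value.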


@@ -0,0 +1,215 @@
#!/usr/bin/env bash
# =============================================================================
# lib/init/nomad/forgejo-bootstrap.sh — Bootstrap Forgejo admin user
#
# Part of the Nomad+Vault migration (S2.4, issue #1069). Creates the
# disinto-admin user in Forgejo if it doesn't exist, enabling:
# - First-login success without manual intervention
# - PAT generation via API (required for disinto backup import #1058)
#
# The script is idempotent — re-running after success is a no-op.
#
# Scope:
# - Checks if user 'disinto-admin' exists via GET /api/v1/users/search
# - If not: POST /api/v1/admin/users to create admin user
# - Uses FORGE_ADMIN_PASS from environment (required)
#
# Idempotency contract:
# - User 'disinto-admin' exists → skip creation, log
# "[forgejo-bootstrap] admin user already exists"
# - User creation fails with "user already exists" → treat as success
#
# Preconditions:
# - Forgejo reachable at $FORGE_URL (default: http://127.0.0.1:3000)
# - Forgejo admin token at $FORGE_TOKEN (from Vault or env)
# - FORGE_ADMIN_PASS set (env var with admin password)
#
# Requires:
# - curl, jq
#
# Usage:
# lib/init/nomad/forgejo-bootstrap.sh
# lib/init/nomad/forgejo-bootstrap.sh --dry-run
#
# Exit codes:
# 0 success (user created + ready, or already exists)
# 1 precondition / API failure
# =============================================================================
set -euo pipefail
# ── Configuration ────────────────────────────────────────────────────────────
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
REPO_ROOT="$(cd "${SCRIPT_DIR}/../../.." && pwd)"
# shellcheck source=../../../lib/hvault.sh
source "${REPO_ROOT}/lib/hvault.sh"
# Configuration
FORGE_URL="${FORGE_URL:-http://127.0.0.1:3000}"
FORGE_TOKEN="${FORGE_TOKEN:-}"
FORGE_ADMIN_USER="${DISINTO_ADMIN_USER:-disinto-admin}"
FORGE_ADMIN_EMAIL="${DISINTO_ADMIN_EMAIL:-admin@disinto.local}"
# Derive FORGE_ADMIN_PASS from common env var patterns
# Priority: explicit FORGE_ADMIN_PASS > DISINTO_FORGE_ADMIN_PASS > FORGEJO_ADMIN_PASS
FORGE_ADMIN_PASS="${FORGE_ADMIN_PASS:-${DISINTO_FORGE_ADMIN_PASS:-${FORGEJO_ADMIN_PASS:-}}}"
LOG_TAG="[forgejo-bootstrap]"
log() { printf '%s %s\n' "$LOG_TAG" "$*" >&2; }
die() { printf '%s ERROR: %s\n' "$LOG_TAG" "$*" >&2; exit 1; }
# ── Flag parsing ─────────────────────────────────────────────────────────────
DRY_RUN="${DRY_RUN:-0}"
for arg in "$@"; do
case "$arg" in
--dry-run) DRY_RUN=1 ;;
-h|--help)
printf 'Usage: %s [--dry-run]\n\n' "$(basename "$0")"
printf 'Bootstrap Forgejo admin user if it does not exist.\n'
printf 'Idempotent: re-running is a no-op.\n\n'
printf 'Environment:\n'
printf ' FORGE_URL Forgejo base URL (default: http://127.0.0.1:3000)\n'
printf ' FORGE_TOKEN Forgejo admin token (from Vault or env)\n'
printf ' FORGE_ADMIN_PASS Admin password (required)\n'
printf ' DISINTO_ADMIN_USER Username for admin account (default: disinto-admin)\n'
printf ' DISINTO_ADMIN_EMAIL Admin email (default: admin@disinto.local)\n\n'
printf ' --dry-run Print planned actions without modifying Forgejo.\n'
exit 0
;;
*) die "invalid argument: ${arg} (try --help)" ;;
esac
done
# ── Precondition checks ──────────────────────────────────────────────────────
log "── Precondition check ──"
if [ -z "$FORGE_URL" ]; then
die "FORGE_URL is not set"
fi
if [ -z "$FORGE_ADMIN_PASS" ]; then
die "FORGE_ADMIN_PASS is not set (required for admin user creation)"
fi
# Resolve FORGE_TOKEN from Vault if not set in env
if [ -z "$FORGE_TOKEN" ]; then
log "reading FORGE_TOKEN from Vault at kv/disinto/shared/forge/token"
_hvault_default_env
token_raw="$(hvault_get_or_empty "kv/data/disinto/shared/forge/token" 2>/dev/null)" || true
if [ -n "$token_raw" ]; then
FORGE_TOKEN="$(printf '%s' "$token_raw" | jq -r '.data.data.token // empty' 2>/dev/null)" || true
fi
if [ -z "$FORGE_TOKEN" ]; then
die "FORGE_TOKEN not set and not found in Vault"
fi
log "forge token loaded from Vault"
fi
# ── Step 1/3: Check if admin user already exists ─────────────────────────────
log "── Step 1/3: check if admin user '${FORGE_ADMIN_USER}' exists ──"
# Use exact match via GET /api/v1/users/{username} (returns 404 if absent)
user_lookup_raw=$(curl -sf --max-time 10 \
"${FORGE_URL}/api/v1/users/${FORGE_ADMIN_USER}" 2>/dev/null) || {
# curl -f maps HTTP errors >= 400 (e.g. 404 — user doesn't exist) to exit code 22
if [ $? -eq 22 ]; then
log "admin user '${FORGE_ADMIN_USER}' not found"
admin_user_exists=false
user_id=""
else
# Other curl errors (e.g., network, Forgejo down)
log "warning: failed to lookup user (Forgejo may not be ready yet)"
admin_user_exists=false
user_id=""
fi
}
if [ -n "$user_lookup_raw" ]; then
admin_user_exists=true
user_id=$(printf '%s' "$user_lookup_raw" | jq -r '.id // empty' 2>/dev/null) || true
if [ -n "$user_id" ]; then
log "admin user '${FORGE_ADMIN_USER}' already exists (user_id: ${user_id})"
fi
fi
# ── Step 2/3: Create admin user if needed ────────────────────────────────────
if [ "$admin_user_exists" = false ]; then
log "creating admin user '${FORGE_ADMIN_USER}'"
if [ "$DRY_RUN" -eq 1 ]; then
log "[dry-run] would create admin user with:"
log "[dry-run] username: ${FORGE_ADMIN_USER}"
log "[dry-run] email: ${FORGE_ADMIN_EMAIL}"
log "[dry-run] admin: true"
log "[dry-run] must_change_password: false"
else
# Create the admin user via the admin API
create_response=$(curl -sf --max-time 30 -X POST \
-H "Authorization: token ${FORGE_TOKEN}" \
-H "Content-Type: application/json" \
"${FORGE_URL}/api/v1/admin/users" \
-d "{
\"username\": \"${FORGE_ADMIN_USER}\",
\"email\": \"${FORGE_ADMIN_EMAIL}\",
\"password\": \"${FORGE_ADMIN_PASS}\",
\"admin\": true,
\"must_change_password\": false
}" 2>/dev/null) || {
# Check if the error is "user already exists" (race condition on re-run)
error_body=$(curl -s --max-time 30 -X POST \
-H "Authorization: token ${FORGE_TOKEN}" \
-H "Content-Type: application/json" \
"${FORGE_URL}/api/v1/admin/users" \
-d "{\"username\": \"${FORGE_ADMIN_USER}\", \"email\": \"${FORGE_ADMIN_EMAIL}\", \"password\": \"${FORGE_ADMIN_PASS}\", \"admin\": true, \"must_change_password\": false}" 2>/dev/null) || error_body=""
if echo "$error_body" | grep -q '"message".*"user already exists"'; then
log "admin user '${FORGE_ADMIN_USER}' already exists (race condition handled)"
admin_user_exists=true
else
die "failed to create admin user in Forgejo: ${error_body:-unknown error}"
fi
}
# Extract user_id from response
user_id=$(printf '%s' "$create_response" | jq -r '.id // empty' 2>/dev/null) || true
if [ -n "$user_id" ]; then
admin_user_exists=true
log "admin user '${FORGE_ADMIN_USER}' created (user_id: ${user_id})"
else
die "failed to extract user_id from Forgejo response"
fi
fi
else
log "admin user '${FORGE_ADMIN_USER}' already exists — skipping creation"
fi
# ── Step 3/3: Verify user was created and is admin ───────────────────────────
log "── Step 3/3: verify admin user is properly configured ──"
if [ "$DRY_RUN" -eq 1 ]; then
log "[dry-run] would verify admin user configuration"
log "done — [dry-run] complete"
else
# Verify the user exists and is admin
verify_response=$(curl -sf --max-time 10 \
-u "${FORGE_ADMIN_USER}:${FORGE_ADMIN_PASS}" \
"${FORGE_URL}/api/v1/user" 2>/dev/null) || {
die "failed to verify admin user credentials"
}
is_admin=$(printf '%s' "$verify_response" | jq -r '.is_admin // false' 2>/dev/null) || true
login=$(printf '%s' "$verify_response" | jq -r '.login // empty' 2>/dev/null) || true
if [ "$is_admin" != "true" ]; then
die "admin user '${FORGE_ADMIN_USER}' is not marked as admin"
fi
if [ "$login" != "$FORGE_ADMIN_USER" ]; then
die "admin user login mismatch: expected '${FORGE_ADMIN_USER}', got '${login}'"
fi
log "admin user verified: login=${login}, is_admin=${is_admin}"
log "done — Forgejo admin user is ready"
fi
exit 0
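The existence probe above hinges on distinguishing curl's exit codes: with `-f`, an HTTP error such as 404 yields exit 22, while a connection failure yields exit 7. A minimal self-contained sketch of that mapping (the helper name and URL are illustrative, and it assumes `curl` is installed):

```shell
#!/usr/bin/env bash
# Sketch: probe a Forgejo user and classify the result by curl exit code.
# Helper name and URL are illustrative, not part of the bootstrap script.
user_exists() {
  local base_url="$1" user="$2" rc=0
  curl -sf --max-time 5 "${base_url}/api/v1/users/${user}" >/dev/null 2>&1 || rc=$?
  case "$rc" in
    0)  echo present ;;      # HTTP 2xx — user found
    22) echo absent ;;       # curl -f turns HTTP >= 400 (e.g. 404) into exit 22
    *)  echo unreachable ;;  # e.g. 7 = connection refused, 6 = DNS failure
  esac
}

user_exists "http://127.0.0.1:1" nobody   # → unreachable (nothing listens on port 1)
```

Treating "absent" and "unreachable" differently matters for idempotency: an unreachable Forgejo should be retried, while a confirmed 404 is the signal to create the user.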


@@ -198,6 +198,7 @@ setup_ops_repo() {
[ -f "${ops_root}/evidence/holdout/.gitkeep" ] || { touch "${ops_root}/evidence/holdout/.gitkeep"; seeded=true; }
[ -f "${ops_root}/evidence/evolution/.gitkeep" ] || { touch "${ops_root}/evidence/evolution/.gitkeep"; seeded=true; }
[ -f "${ops_root}/evidence/user-test/.gitkeep" ] || { touch "${ops_root}/evidence/user-test/.gitkeep"; seeded=true; }
[ -f "${ops_root}/knowledge/.gitkeep" ] || { touch "${ops_root}/knowledge/.gitkeep"; seeded=true; }
if [ ! -f "${ops_root}/README.md" ]; then
cat > "${ops_root}/README.md" <<OPSEOF
@@ -362,6 +363,45 @@ migrate_ops_repo() {
if [ ! -f "$tfile" ]; then
local title
title=$(basename "$tfile" | sed 's/\.md$//; s/_/ /g' | sed 's/\b\(.\)/\u\1/g')
case "$tfile" in
portfolio.md)
{
echo "# ${title}"
echo ""
echo "## Addressables"
echo ""
echo "<!-- Add addressables here -->"
echo ""
echo "## Observables"
echo ""
echo "<!-- Add observables here -->"
} > "$tfile"
;;
RESOURCES.md)
{
echo "# ${title}"
echo ""
echo "## Accounts"
echo ""
echo "<!-- Add account references here -->"
echo ""
echo "## Tokens"
echo ""
echo "<!-- Add token references here -->"
echo ""
echo "## Infrastructure"
echo ""
echo "<!-- Add infrastructure inventory here -->"
} > "$tfile"
;;
prerequisites.md)
{
echo "# ${title}"
echo ""
echo "<!-- Add dependency graph here -->"
} > "$tfile"
;;
*)
{
echo "# ${title}"
echo ""
@@ -369,6 +409,8 @@ migrate_ops_repo() {
echo ""
echo "<!-- Add content here -->"
} > "$tfile"
;;
esac
echo " + Created: ${tfile}"
migrated=true
fi


@@ -1,4 +1,4 @@
<!-- last-reviewed: 88222503d5a2ff1d25e9f1cb254ed31f13ccea7f -->
# nomad/ — Agent Instructions
Nomad + Vault HCL for the factory's single-node cluster. These files are


@@ -172,6 +172,12 @@ EOT
handle /chat/oauth/callback {
reverse_proxy 127.0.0.1:8080
}
# WebSocket endpoint for streaming (#1026)
# Caddy's reverse_proxy performs the WebSocket Upgrade handshake natively,
# so no header_up overrides are needed (header_up is only valid inside a
# reverse_proxy block, and $http.* is nginx syntax, not a Caddy placeholder)
handle /chat/ws {
reverse_proxy 127.0.0.1:8080
}
# Defense-in-depth: forward_auth stamps X-Forwarded-User from session (#709)
handle /chat/* {
forward_auth 127.0.0.1:8080 {


@@ -57,7 +57,7 @@ job "woodpecker-agent" {
check {
type = "http"
path = "/healthz"
interval = "10s"
timeout = "3s"
}
}
@@ -91,6 +91,9 @@ job "woodpecker-agent" {
env {
WOODPECKER_SERVER = "${attr.unique.network.ip-address}:9000"
WOODPECKER_GRPC_SECURE = "false"
WOODPECKER_GRPC_KEEPALIVE_TIME = "10s"
WOODPECKER_GRPC_KEEPALIVE_TIMEOUT = "20s"
WOODPECKER_GRPC_KEEPALIVE_PERMIT_WITHOUT_CALLS = "true"
WOODPECKER_MAX_WORKFLOWS = "1"
WOODPECKER_HEALTHCHECK_ADDR = ":3333"
}


@@ -1,4 +1,4 @@
<!-- last-reviewed: 88222503d5a2ff1d25e9f1cb254ed31f13ccea7f -->
# Planner Agent
**Role**: Strategic planning using a Prerequisite Tree (Theory of Constraints),


@@ -1,4 +1,4 @@
<!-- last-reviewed: 88222503d5a2ff1d25e9f1cb254ed31f13ccea7f -->
# Predictor Agent
**Role**: Abstract adversary (the "goblin"). Runs a 2-step formula


@@ -59,6 +59,23 @@ check_pipeline_stall = false
# compact_pct = 60
# poll_interval = 60
# Edge routing mode (default: subpath)
#
# Controls how services are exposed through the edge proxy.
# subpath — all services under <project>.disinto.ai/{forge,ci,chat,staging}
# subdomain — per-service subdomains: forge.<project>, ci.<project>, chat.<project>
#
# Set to "subdomain" if subpath routing causes unfixable issues (redirect loops,
# OAuth callback mismatches, cookie collisions). See docs/edge-routing-fallback.md.
#
# Set in .env (not TOML) since it's consumed by docker-compose and shell scripts:
# EDGE_ROUTING_MODE=subdomain
#
# In subdomain mode, `disinto edge register` also writes:
# EDGE_TUNNEL_FQDN_FORGE=forge.<project>.disinto.ai
# EDGE_TUNNEL_FQDN_CI=ci.<project>.disinto.ai
# EDGE_TUNNEL_FQDN_CHAT=chat.<project>.disinto.ai
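The subdomain naming scheme written by `disinto edge register` can be sketched as a tiny derivation from the main project domain (the helper name below is illustrative and assumes the `forge.`/`ci.`/`chat.` prefixes documented above):

```shell
#!/usr/bin/env bash
# Sketch: derive the per-service FQDNs from the main project domain,
# following the forge./ci./chat. prefix convention. Helper name is
# illustrative, not an actual disinto function.
derive_subdomain_fqdns() {
  local base_fqdn="$1"   # e.g. myproj.disinto.ai
  printf 'EDGE_TUNNEL_FQDN_FORGE=forge.%s\n' "$base_fqdn"
  printf 'EDGE_TUNNEL_FQDN_CI=ci.%s\n' "$base_fqdn"
  printf 'EDGE_TUNNEL_FQDN_CHAT=chat.%s\n' "$base_fqdn"
}

derive_subdomain_fqdns "myproj.disinto.ai"
```

The output is exactly the `.env` lines the docs describe, which keeps the derivation consumable by both docker-compose and shell scripts.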
# [mirrors]
# github = "git@github.com:johba/disinto.git"
# codeberg = "git@codeberg.org:johba/disinto.git"


@@ -1,4 +1,4 @@
<!-- last-reviewed: 88222503d5a2ff1d25e9f1cb254ed31f13ccea7f -->
# Review Agent
**Role**: AI-powered PR review — post structured findings and formal


@@ -52,8 +52,35 @@ REVIEW_TMPDIR=$(mktemp -d)
log() { printf '[%s] PR#%s %s\n' "$(date -u '+%Y-%m-%d %H:%M:%S UTC')" "$PR_NUMBER" "$*" >> "$LOGFILE"; }
status() { printf '[%s] PR #%s: %s\n' "$(date -u '+%Y-%m-%d %H:%M:%S UTC')" "$PR_NUMBER" "$*" > "$STATUSFILE"; log "$*"; }
# cleanup — remove temp files (NOT lockfile — cleanup_on_exit handles that)
cleanup() {
rm -rf "$REVIEW_TMPDIR" "$STATUSFILE" "/tmp/${PROJECT_NAME}-review-graph-${PR_NUMBER}.json"
}
# cleanup_on_exit — defensive cleanup: remove lockfile if we own it, kill residual children
# This handles the case where review-pr.sh is terminated unexpectedly (e.g., watchdog SIGTERM)
cleanup_on_exit() {
local ec=$?
# Remove lockfile only if we own it (PID matches $$)
if [ -f "$LOCKFILE" ] && [ -n "$(cat "$LOCKFILE" 2>/dev/null)" ]; then
if [ "$(cat "$LOCKFILE" 2>/dev/null)" = "$$" ]; then
rm -f "$LOCKFILE"
log "cleanup_on_exit: removed lockfile (we owned it)"
fi
fi
# Kill any direct children that may have been spawned by this process
# (e.g., bash -c commands from Claude's Bash tool that didn't get reaped)
pkill -P $$ 2>/dev/null || true
# Call the main cleanup function to remove temp files
cleanup
exit "$ec"
}
trap cleanup_on_exit EXIT INT TERM
# Note: cleanup_on_exit handles EXIT, INT, and TERM. On normal completion the
# script removes the lockfile explicitly before exit; the PID-ownership check
# above makes the trap's second removal attempt a harmless no-op.
# =============================================================================
# LOG ROTATION
@@ -104,6 +131,7 @@ if [ "$PR_STATE" != "open" ]; then
log "SKIP: state=${PR_STATE}"
worktree_cleanup "$WORKTREE"
rm -f "$OUTPUT_FILE" "$SID_FILE" 2>/dev/null || true
rm -f "$LOCKFILE"
exit 0
fi
@@ -113,7 +141,7 @@ fi
CI_STATE=$(ci_commit_status "$PR_SHA")
CI_NOTE=""
if ! ci_passed "$CI_STATE"; then
ci_required_for_pr "$PR_NUMBER" && { log "SKIP: CI=${CI_STATE}"; rm -f "$LOCKFILE"; exit 0; }
CI_NOTE=" (not required — non-code PR)"
fi
@@ -123,10 +151,10 @@ fi
ALL_COMMENTS=$(forge_api_all "/issues/${PR_NUMBER}/comments")
HAS_CMT=$(printf '%s' "$ALL_COMMENTS" | jq --arg s "$PR_SHA" \
'[.[]|select(.body|contains("<!-- reviewed: "+$s+" -->"))]|length')
[ "${HAS_CMT:-0}" -gt 0 ] && [ "$FORCE" != "--force" ] && { log "SKIP: reviewed ${PR_SHA:0:7}"; rm -f "$LOCKFILE"; exit 0; }
HAS_FML=$(forge_api_all "/pulls/${PR_NUMBER}/reviews" | jq --arg s "$PR_SHA" \
'[.[]|select(.commit_id==$s)|select(.state!="COMMENT")]|length')
[ "${HAS_FML:-0}" -gt 0 ] && [ "$FORCE" != "--force" ] && { log "SKIP: formal review"; rm -f "$LOCKFILE"; exit 0; }
# =============================================================================
# RE-REVIEW DETECTION
@@ -324,3 +352,7 @@ esac
profile_write_journal "review-${PR_NUMBER}" "Review PR #${PR_NUMBER} (${VERDICT})" "${VERDICT,,}" "" || true
log "DONE: ${VERDICT} (re-review: ${IS_RE_REVIEW})"
# Remove lockfile on successful completion (cleanup_on_exit will also do this,
# but we do it here to avoid the trap running twice)
rm -f "$LOCKFILE"
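The lockfile discipline in cleanup_on_exit — only the process whose PID is recorded in the lockfile may remove it — can be isolated into a small sketch (the helper name is illustrative):

```shell
#!/usr/bin/env bash
# Sketch: PID-ownership lockfile release, mirroring the cleanup_on_exit
# check above — a process removes the lock only if it wrote its own PID.
# Helper name is illustrative.
release_lock_if_owner() {
  local lockfile="$1"
  if [ "$(cat "$lockfile" 2>/dev/null)" = "$$" ]; then
    rm -f "$lockfile"
  fi
}

lock=$(mktemp)
echo "$$" > "$lock"
release_lock_if_owner "$lock"    # removed — we own it

stale=$(mktemp)
echo "999999" > "$stale"
release_lock_if_owner "$stale"   # kept — another PID owns it
rm -f "$stale"
```

This prevents the race where a watchdog restarts a second reviewer while the first is still holding the lock: the newcomer's trap cannot steal or delete a lock it does not own.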


@@ -209,3 +209,72 @@ jq -nc \
log "Engagement report written to ${OUTPUT}: ${UNIQUE_VISITORS} visitors, ${PAGE_VIEWS} page views"
echo "Engagement report: ${UNIQUE_VISITORS} unique visitors, ${PAGE_VIEWS} page views → ${OUTPUT}"
# ── Commit evidence to ops repo via Forgejo API ─────────────────────────────
commit_evidence_via_forgejo() {
local evidence_file="$1"
local report_date
report_date=$(basename "$evidence_file" .json)
local file_path="evidence/engagement/${report_date}.json"
# Check if ops repo is available
if [ -z "${OPS_REPO_ROOT:-}" ] || [ ! -d "${OPS_REPO_ROOT}/.git" ]; then
log "SKIP: OPS_REPO_ROOT not set or not a git repo — evidence file not committed"
return 0
fi
# Check if Forgejo credentials are available
if [ -z "${FORGE_TOKEN:-}" ] || [ -z "${FORGE_URL:-}" ] || [ -z "${FORGE_OPS_REPO:-}" ]; then
log "SKIP: Forgejo credentials not available (FORGE_TOKEN/FORGE_URL/FORGE_OPS_REPO) — evidence file not committed"
return 0
fi
# Read and encode the file content
local content
content=$(base64 < "$evidence_file" | tr -d '\n')  # contents API expects unwrapped base64
local ops_owner="${OPS_FORGE_OWNER:-${FORGE_REPO%%/*}}"
local ops_repo="${OPS_FORGE_REPO:-${PROJECT_NAME:-disinto}-ops}"
# Check if file already exists in the ops repo
local existing
existing=$(curl -sf \
-H "Authorization: token ${FORGE_TOKEN}" \
"${FORGE_URL}/api/v1/repos/${ops_owner}/${ops_repo}/contents/${file_path}" \
2>/dev/null || echo "")
if [ -n "$existing" ] && printf '%s' "$existing" | jq -e '.sha' >/dev/null 2>&1; then
# Update existing file
local sha
sha=$(printf '%s' "$existing" | jq -r '.sha')
if curl -sf -X PUT \
-H "Authorization: token ${FORGE_TOKEN}" \
-H "Content-Type: application/json" \
"${FORGE_URL}/api/v1/repos/${ops_owner}/${ops_repo}/contents/${file_path}" \
-d "$(jq -nc --arg content "$content" --arg sha "$sha" --arg msg "evidence: engagement ${report_date}" \
'{message: $msg, content: $content, sha: $sha}')" >/dev/null 2>&1; then
log "Updated evidence file in ops repo: ${file_path}"
return 0
else
log "ERROR: failed to update evidence file in ops repo"
return 1
fi
else
# Create new file
if curl -sf -X POST \
-H "Authorization: token ${FORGE_TOKEN}" \
-H "Content-Type: application/json" \
"${FORGE_URL}/api/v1/repos/${ops_owner}/${ops_repo}/contents/${file_path}" \
-d "$(jq -nc --arg content "$content" --arg msg "evidence: engagement ${report_date}" \
'{message: $msg, content: $content}')" >/dev/null 2>&1; then
log "Created evidence file in ops repo: ${file_path}"
return 0
else
log "ERROR: failed to create evidence file in ops repo"
return 1
fi
fi
}
# Attempt to commit evidence (non-fatal — data collection succeeded even if commit fails)
commit_evidence_via_forgejo "$OUTPUT" || log "WARNING: evidence commit skipped or failed — file exists locally at ${OUTPUT}"
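The create-vs-update branch above depends on the payload shape of the Forgejo contents API (POST to create, PUT with the current blob sha to update). A minimal sketch of the two jq payloads; the content, message, and sha values here are placeholders, not real data:

```shell
#!/usr/bin/env bash
# Sketch of the Forgejo contents-API payloads. The sha "0123abcd" and the
# sample content are made-up values for illustration only.
set -euo pipefail

content=$(printf '{"visitors": 42}' | base64 | tr -d '\n')

# Create: POST /api/v1/repos/{owner}/{repo}/contents/{path}
create_payload=$(jq -nc --arg content "$content" \
  --arg msg "evidence: engagement 2026-04-20" \
  '{message: $msg, content: $content}')

# Update: PUT to the same endpoint, echoing back the current file sha
update_payload=$(jq -nc --arg content "$content" --arg sha "0123abcd" \
  --arg msg "evidence: engagement 2026-04-20" \
  '{message: $msg, content: $content, sha: $sha}')

echo "$create_payload"
echo "$update_payload"
```

Omitting the sha on an update (or sending one on a create) makes the API reject the write, which is why the function probes for the existing file first.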


@@ -1,4 +1,4 @@
<!-- last-reviewed: 88222503d5a2ff1d25e9f1cb254ed31f13ccea7f -->
# Supervisor Agent
**Role**: Health monitoring and auto-remediation, executed as a formula-driven

tests/smoke-edge-subpath.sh Executable file

@@ -0,0 +1,310 @@
#!/usr/bin/env bash
# =============================================================================
# smoke-edge-subpath.sh — End-to-end subpath routing smoke test
#
# Verifies Forgejo, Woodpecker, and chat function correctly under subpaths:
# - Forgejo at /forge/
# - Woodpecker at /ci/
# - Chat at /chat/
# - Staging at /staging/
#
# Usage:
# smoke-edge-subpath.sh [--base-url BASE_URL]
#
# Environment variables:
# BASE_URL — Edge proxy URL (default: http://localhost)
# EDGE_TIMEOUT — Request timeout in seconds (default: 30)
# EDGE_MAX_RETRIES — Max retries per request (default: 3)
#
# Exit codes:
# 0 — All checks passed
# 1 — One or more checks failed
# =============================================================================
set -euo pipefail
# Script directory for relative paths
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
# Source common helpers if available
source "${SCRIPT_DIR}/../lib/env.sh" 2>/dev/null || true
# ─────────────────────────────────────────────────────────────────────────────
# Configuration
# ─────────────────────────────────────────────────────────────────────────────
BASE_URL="${BASE_URL:-http://localhost}"
EDGE_TIMEOUT="${EDGE_TIMEOUT:-30}"
EDGE_MAX_RETRIES="${EDGE_MAX_RETRIES:-3}"
# Subpaths to test
FORGE_PATH="/forge/"
CI_PATH="/ci/"
CHAT_PATH="/chat/"
STAGING_PATH="/staging/"
# Track overall test status
FAILED=0
PASSED=0
SKIPPED=0
# ─────────────────────────────────────────────────────────────────────────────
# Logging helpers
# ─────────────────────────────────────────────────────────────────────────────
log_info() {
echo "[INFO] $*"
}
log_pass() {
echo "[PASS] $*"
((PASSED++)) || true
}
log_fail() {
echo "[FAIL] $*"
((FAILED++)) || true
}
log_skip() {
echo "[SKIP] $*"
((SKIPPED++)) || true
}
log_section() {
echo ""
echo "=== $* ==="
echo ""
}
# ─────────────────────────────────────────────────────────────────────────────
# HTTP helpers
# ─────────────────────────────────────────────────────────────────────────────
# Make an HTTP request with retry logic
# Usage: http_request <method> <url> [options...]
# Returns: HTTP status code on stdout
http_request() {
  local method="$1"
  local url="$2"
  shift 2
  local retries=0
  local response status
  # HEAD must use -I: curl -X HEAD would wait for a body that never arrives
  local method_args=(-X "$method")
  [ "$method" = "HEAD" ] && method_args=(-I)
  while [ "$retries" -lt "$EDGE_MAX_RETRIES" ]; do
    response=$(curl -sS -w '\n%{http_code}' "${method_args[@]}" \
      --max-time "$EDGE_TIMEOUT" \
      -o /tmp/edge-response-$$ \
      "$@" "$url" 2>&1) || {
      retries=$((retries + 1))
      log_info "Retry $retries/$EDGE_MAX_RETRIES for $url"
      sleep 1
      continue
    }
    status=$(echo "$response" | tail -n1)
    rm -f /tmp/edge-response-$$
    echo "$status"
    return 0
  done
  log_fail "Max retries exceeded for $url"
  return 1
}
# Make a GET request and return status code
http_get() {
local url="$1"
shift || true
http_request "GET" "$url" "$@"
}
# Make a HEAD request (no body)
http_head() {
local url="$1"
shift || true
http_request "HEAD" "$url" "$@"
}
# Make a GET request and return the response body
http_get_body() {
local url="$1"
shift || true
curl -sS --max-time "$EDGE_TIMEOUT" "$@" "$url"
}
# ─────────────────────────────────────────────────────────────────────────────
# Test functions
# ─────────────────────────────────────────────────────────────────────────────
test_root_redirect() {
log_section "Test 1: Root redirect to /forge/"
local status
status=$(http_head "$BASE_URL/")
if [ "$status" = "302" ]; then
log_pass "Root / redirects with 302"
else
log_fail "Expected 302 redirect from /, got status $status"
fi
}
test_forgejo_subpath() {
log_section "Test 2: Forgejo at /forge/"
local status
status=$(http_head "$BASE_URL${FORGE_PATH}")
if [ "$status" -ge 200 ] && [ "$status" -lt 400 ]; then
log_pass "Forgejo at ${BASE_URL}${FORGE_PATH} returns status $status"
else
log_fail "Forgejo at ${BASE_URL}${FORGE_PATH} returned unexpected status $status"
fi
}
test_woodpecker_subpath() {
log_section "Test 3: Woodpecker at /ci/"
local status
status=$(http_head "$BASE_URL${CI_PATH}")
if [ "$status" -ge 200 ] && [ "$status" -lt 400 ]; then
log_pass "Woodpecker at ${BASE_URL}${CI_PATH} returns status $status"
else
log_fail "Woodpecker at ${BASE_URL}${CI_PATH} returned unexpected status $status"
fi
}
test_chat_subpath() {
log_section "Test 4: Chat at /chat/"
# Test chat login endpoint
local status
status=$(http_head "$BASE_URL${CHAT_PATH}login")
if [ "$status" -ge 200 ] && [ "$status" -lt 400 ]; then
log_pass "Chat login at ${BASE_URL}${CHAT_PATH}login returns status $status"
else
log_fail "Chat login at ${BASE_URL}${CHAT_PATH}login returned unexpected status $status"
fi
# Test chat OAuth callback endpoint
status=$(http_head "$BASE_URL${CHAT_PATH}oauth/callback")
if [ "$status" -ge 200 ] && [ "$status" -lt 400 ]; then
log_pass "Chat OAuth callback at ${BASE_URL}${CHAT_PATH}oauth/callback returns status $status"
else
log_fail "Chat OAuth callback at ${BASE_URL}${CHAT_PATH}oauth/callback returned unexpected status $status"
fi
}
test_staging_subpath() {
log_section "Test 5: Staging at /staging/"
local status
status=$(http_head "$BASE_URL${STAGING_PATH}")
if [ "$status" -ge 200 ] && [ "$status" -lt 400 ]; then
log_pass "Staging at ${BASE_URL}${STAGING_PATH} returns status $status"
else
log_fail "Staging at ${BASE_URL}${STAGING_PATH} returned unexpected status $status"
fi
}
test_forward_auth_rejection() {
log_section "Test 6: Forward auth on /chat/* rejects unauthenticated requests"
# Request a protected chat endpoint without auth header
# Should return 401 (Unauthorized) due to forward_auth
local status
status=$(http_head "$BASE_URL${CHAT_PATH}auth/verify")
if [ "$status" = "401" ]; then
log_pass "Unauthenticated /chat/auth/verify returns 401 (forward_auth working)"
elif [ "$status" -ge 200 ] && [ "$status" -lt 400 ]; then
log_skip "Unauthenticated /chat/auth/verify returns $status (forward_auth may be disabled)"
else
log_fail "Expected 401 for unauthenticated /chat/auth/verify, got status $status"
fi
}
test_forgejo_oauth_callback() {
log_section "Test 7: Forgejo OAuth callback for Woodpecker under subpath"
# Test that Forgejo OAuth callback path works (Woodpecker OAuth integration)
local status
status=$(http_head "$BASE_URL${FORGE_PATH}login/oauth/callback")
if [ "$status" -ge 200 ] && [ "$status" -lt 400 ]; then
log_pass "Forgejo OAuth callback at ${BASE_URL}${FORGE_PATH}login/oauth/callback works"
else
log_fail "Forgejo OAuth callback returned unexpected status $status"
fi
}
# ─────────────────────────────────────────────────────────────────────────────
# Main
# ─────────────────────────────────────────────────────────────────────────────
main() {
log_info "Starting subpath routing smoke test"
log_info "Base URL: $BASE_URL"
log_info "Timeout: ${EDGE_TIMEOUT}s, Max retries: ${EDGE_MAX_RETRIES}"
# Run all tests
test_root_redirect
test_forgejo_subpath
test_woodpecker_subpath
test_chat_subpath
test_staging_subpath
test_forward_auth_rejection
test_forgejo_oauth_callback
# Summary
log_section "Test Summary"
log_info "Passed: $PASSED"
log_info "Failed: $FAILED"
log_info "Skipped: $SKIPPED"
if [ "$FAILED" -gt 0 ]; then
log_fail "Some tests failed"
exit 1
fi
log_pass "All tests passed!"
exit 0
}
# Parse arguments
while [[ $# -gt 0 ]]; do
case "$1" in
--base-url)
BASE_URL="$2"
shift 2
;;
--base-url=*)
BASE_URL="${1#*=}"
shift
;;
--help)
echo "Usage: $0 [options]"
echo ""
echo "Options:"
echo " --base-url URL Set base URL (default: http://localhost)"
echo " --help Show this help message"
echo ""
echo "Environment variables:"
echo " BASE_URL Base URL for edge proxy (default: http://localhost)"
echo " EDGE_TIMEOUT Request timeout in seconds (default: 30)"
echo " EDGE_MAX_RETRIES Max retries per request (default: 3)"
exit 0
;;
*)
echo "Unknown option: $1" >&2
exit 1
;;
esac
done
main
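The status-extraction idiom in http_request (curl -w '\n%{http_code}' followed by tail -n1) can be checked offline. In this sketch, fake_curl is a hypothetical stand-in for the real curl call:

```shell
#!/usr/bin/env bash
# Why the forced '\n' before %{http_code} matters: the response body may not end
# in a newline, so the status is printed on its own line and tail -n1 finds it.
# fake_curl mimics curl's combined body + -w output (no network needed).
set -euo pipefail

fake_curl() {
  printf '%s' '<html>ok</html>'   # body without a trailing newline
  printf '\n%s' '302'             # what curl -w '\n%{http_code}' appends
}

response=$(fake_curl)
status=$(echo "$response" | tail -n1)
echo "$status"
```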


@@ -15,6 +15,7 @@
set -euo pipefail
FACTORY_ROOT="$(cd "$(dirname "$0")/.." && pwd)"
export FACTORY_ROOT_REAL="$FACTORY_ROOT"
# Always use localhost for mock Forgejo (in case FORGE_URL is set from docker-compose)
export FORGE_URL="http://localhost:3000"
MOCK_BIN="/tmp/smoke-mock-bin"
@@ -30,7 +31,8 @@ cleanup() {
rm -rf "$MOCK_BIN" /tmp/smoke-test-repo \
  "${FACTORY_ROOT}/projects/smoke-repo.toml" \
  /tmp/smoke-claude-shared /tmp/smoke-home-claude \
  /tmp/smoke-env-before-rerun /tmp/smoke-env-before-dryrun \
  "${FACTORY_ROOT}/docker-compose.yml"
# Restore .env only if we created the backup
if [ -f "${FACTORY_ROOT}/.env.smoke-backup" ]; then
  mv "${FACTORY_ROOT}/.env.smoke-backup" "${FACTORY_ROOT}/.env"
@@ -423,6 +425,51 @@ export CLAUDE_SHARED_DIR="$ORIG_CLAUDE_SHARED_DIR"
export CLAUDE_CONFIG_DIR="$ORIG_CLAUDE_CONFIG_DIR"
rm -rf /tmp/smoke-claude-shared /tmp/smoke-home-claude
# ── 8. Test duplicate service name detection ──────────────────────────────
echo "=== 8/8 Testing duplicate service name detection ==="
# Isolated factory root — do NOT touch the real ${FACTORY_ROOT}/projects/
SMOKE_DUP_ROOT=$(mktemp -d)
mkdir -p "$SMOKE_DUP_ROOT/projects"
cat > "$SMOKE_DUP_ROOT/projects/duplicate-test.toml" <<'TOMLEOF'
name = "duplicate-test"
description = "dup-detection smoke"
[ci]
woodpecker_repo_id = "999"
[agents.llama]
base_url = "http://localhost:8080"
model = "qwen:latest"
roles = ["dev"]
forge_user = "llama-bot"
TOMLEOF
# Call the generator directly — no `disinto init` to overwrite the TOML.
# FACTORY_ROOT tells generators.sh where projects/ + compose_file live.
(
export FACTORY_ROOT="$SMOKE_DUP_ROOT"
export ENABLE_LLAMA_AGENT=1
# shellcheck disable=SC1091
source "${FACTORY_ROOT_REAL:-$(cd "$(dirname "$0")/.." && pwd)}/lib/generators.sh"
# Use a temp file to capture output since pipefail will kill the pipeline
# when _generate_compose_impl returns non-zero
_generate_compose_impl > /tmp/smoke-dup-output.txt 2>&1 || true
if grep -q "Duplicate service name" /tmp/smoke-dup-output.txt; then
pass "Duplicate service detection: conflict between ENABLE_LLAMA_AGENT and [agents.llama] reported"
rm -f /tmp/smoke-dup-output.txt
exit 0
else
fail "Duplicate service detection: no error raised for ENABLE_LLAMA_AGENT + [agents.llama]"
cat /tmp/smoke-dup-output.txt >&2
rm -f /tmp/smoke-dup-output.txt
exit 1
fi
) || FAILED=1
rm -rf "$SMOKE_DUP_ROOT"
unset ENABLE_LLAMA_AGENT
# ── Summary ──────────────────────────────────────────────────────────────────
echo ""
if [ "$FAILED" -ne 0 ]; then

tests/test-caddyfile-routing.sh Executable file

@@ -0,0 +1,231 @@
#!/usr/bin/env bash
# =============================================================================
# test-caddyfile-routing.sh — Caddyfile routing block unit test
#
# Extracts the Caddyfile template from nomad/jobs/edge.hcl and validates its
# structure without requiring a running Caddy instance.
#
# Checks:
# - Forgejo subpath (/forge/* -> :3000)
# - Woodpecker subpath (/ci/* -> :8000)
# - Staging subpath (/staging/* -> nomadService discovery)
# - Chat subpath (/chat/* with forward_auth and OAuth routes)
# - Root redirect to /forge/
#
# Usage:
# test-caddyfile-routing.sh
#
# Exit codes:
# 0 — All checks passed
# 1 — One or more checks failed
# =============================================================================
set -euo pipefail
# Script directory for relative paths
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
REPO_ROOT="$(cd "${SCRIPT_DIR}/.." && pwd)"
EDGE_TEMPLATE="${REPO_ROOT}/nomad/jobs/edge.hcl"
# Track test status
FAILED=0
PASSED=0
# ─────────────────────────────────────────────────────────────────────────────
# Logging helpers
# ─────────────────────────────────────────────────────────────────────────────
tr_info() {
echo "[INFO] $*"
}
tr_pass() {
echo "[PASS] $*"
((PASSED++)) || true
}
tr_fail() {
echo "[FAIL] $*"
((FAILED++)) || true
}
tr_section() {
echo ""
echo "=== $* ==="
echo ""
}
# ─────────────────────────────────────────────────────────────────────────────
# Caddyfile extraction
# ─────────────────────────────────────────────────────────────────────────────
extract_caddyfile() {
local template_file="$1"
# Extract the Caddyfile template (content between <<EOT and EOT markers
# within the template stanza)
local caddyfile
caddyfile=$(sed -n '/data[[:space:]]*=[[:space:]]*<<[Ee][Oo][Tt]/,/^EOT$/p' "$template_file" | sed '1s/.*/# Caddyfile extracted from Nomad template/; $d')
if [ -z "$caddyfile" ]; then
echo "ERROR: Could not extract Caddyfile template from $template_file" >&2
return 1
fi
echo "$caddyfile"
}
# ─────────────────────────────────────────────────────────────────────────────
# Validation functions
# ─────────────────────────────────────────────────────────────────────────────
check_forgejo_routing() {
tr_section "Validating Forgejo routing"
# Check handle block for /forge/*
if echo "$CADDYFILE" | grep -q "handle /forge/\*"; then
tr_pass "Forgejo handle block (handle /forge/*)"
else
tr_fail "Missing Forgejo handle block (handle /forge/*)"
fi
# Check reverse_proxy to Forgejo on port 3000
if echo "$CADDYFILE" | grep -q "reverse_proxy 127.0.0.1:3000"; then
tr_pass "Forgejo reverse_proxy configured (127.0.0.1:3000)"
else
tr_fail "Missing Forgejo reverse_proxy (127.0.0.1:3000)"
fi
}
check_woodpecker_routing() {
tr_section "Validating Woodpecker routing"
# Check handle block for /ci/*
if echo "$CADDYFILE" | grep -q "handle /ci/\*"; then
tr_pass "Woodpecker handle block (handle /ci/*)"
else
tr_fail "Missing Woodpecker handle block (handle /ci/*)"
fi
# Check reverse_proxy to Woodpecker on port 8000
if echo "$CADDYFILE" | grep -q "reverse_proxy 127.0.0.1:8000"; then
tr_pass "Woodpecker reverse_proxy configured (127.0.0.1:8000)"
else
tr_fail "Missing Woodpecker reverse_proxy (127.0.0.1:8000)"
fi
}
check_staging_routing() {
tr_section "Validating Staging routing"
# Check handle block for /staging/*
if echo "$CADDYFILE" | grep -q "handle /staging/\*"; then
tr_pass "Staging handle block (handle /staging/*)"
else
tr_fail "Missing Staging handle block (handle /staging/*)"
fi
# Check for nomadService discovery (dynamic port)
if echo "$CADDYFILE" | grep -q "nomadService"; then
tr_pass "Staging uses Nomad service discovery"
else
tr_fail "Missing Nomad service discovery for staging"
fi
}
check_chat_routing() {
tr_section "Validating Chat routing"
# Check login endpoint
if echo "$CADDYFILE" | grep -q "handle /chat/login"; then
tr_pass "Chat login handle block (handle /chat/login)"
else
tr_fail "Missing Chat login handle block (handle /chat/login)"
fi
# Check OAuth callback endpoint
if echo "$CADDYFILE" | grep -q "handle /chat/oauth/callback"; then
tr_pass "Chat OAuth callback handle block (handle /chat/oauth/callback)"
else
tr_fail "Missing Chat OAuth callback handle block (handle /chat/oauth/callback)"
fi
# Check catch-all for /chat/*
if echo "$CADDYFILE" | grep -q "handle /chat/\*"; then
tr_pass "Chat catch-all handle block (handle /chat/*)"
else
tr_fail "Missing Chat catch-all handle block (handle /chat/*)"
fi
# Check reverse_proxy to Chat on port 8080
if echo "$CADDYFILE" | grep -q "reverse_proxy 127.0.0.1:8080"; then
tr_pass "Chat reverse_proxy configured (127.0.0.1:8080)"
else
tr_fail "Missing Chat reverse_proxy (127.0.0.1:8080)"
fi
# Check forward_auth block for /chat/*
if echo "$CADDYFILE" | grep -A10 "handle /chat/\*" | grep -q "forward_auth"; then
tr_pass "forward_auth block configured for /chat/*"
else
tr_fail "Missing forward_auth block for /chat/*"
fi
# Check forward_auth URI
if echo "$CADDYFILE" | grep -q "uri /chat/auth/verify"; then
tr_pass "forward_auth URI configured (/chat/auth/verify)"
else
tr_fail "Missing forward_auth URI (/chat/auth/verify)"
fi
}
check_root_redirect() {
tr_section "Validating root redirect"
# Check root redirect to /forge/
if echo "$CADDYFILE" | grep -q "redir /forge/ 302"; then
tr_pass "Root redirect to /forge/ configured (302)"
else
tr_fail "Missing root redirect to /forge/"
fi
}
# ─────────────────────────────────────────────────────────────────────────────
# Main
# ─────────────────────────────────────────────────────────────────────────────
main() {
tr_info "Extracting Caddyfile template from $EDGE_TEMPLATE"
# Extract Caddyfile (|| true so the emptiness check below runs even when
# extraction fails under set -e)
CADDYFILE=$(extract_caddyfile "$EDGE_TEMPLATE" || true)
if [ -z "$CADDYFILE" ]; then
tr_fail "Could not extract Caddyfile template"
exit 1
fi
tr_pass "Caddyfile template extracted successfully"
# Run all validation checks
check_forgejo_routing
check_woodpecker_routing
check_staging_routing
check_chat_routing
check_root_redirect
# Summary
tr_section "Test Summary"
tr_info "Passed: $PASSED"
tr_info "Failed: $FAILED"
if [ "$FAILED" -gt 0 ]; then
tr_fail "Some checks failed"
exit 1
fi
tr_pass "All routing blocks validated!"
exit 0
}
main
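The sed pipeline in extract_caddyfile can be exercised against a stub template. This sketch uses a made-up minimal edge.hcl fragment rather than the real nomad/jobs/edge.hcl:

```shell
#!/usr/bin/env bash
# Exercise the heredoc extraction: sed prints the range from the `data = <<EOT`
# line to the closing EOT, then the opener is replaced and the terminator dropped.
set -euo pipefail

template=$(mktemp)
cat > "$template" <<'HCL'
job "edge" {
  template {
    data = <<EOT
:80 {
  redir /forge/ 302
}
EOT
  }
}
HCL

caddyfile=$(sed -n '/data[[:space:]]*=[[:space:]]*<<[Ee][Oo][Tt]/,/^EOT$/p' "$template" \
  | sed '1s/.*/# extracted/; $d')
rm -f "$template"
echo "$caddyfile"
```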


@@ -0,0 +1,210 @@
#!/usr/bin/env bash
# tests/test-duplicate-service-detection.sh — Unit test for duplicate service detection
#
# Tests that the compose generator correctly detects duplicate service names
# between ENABLE_LLAMA_AGENT=1 and [agents.llama] TOML configuration.
set -euo pipefail
# Get the absolute path to the disinto root
DISINTO_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
TEST_DIR=$(mktemp -d)
trap "rm -rf \"\$TEST_DIR\"" EXIT
FAILED=0
fail() { printf 'FAIL: %s\n' "$*" >&2; FAILED=1; }
pass() { printf 'PASS: %s\n' "$*"; }
# Test 1: Duplicate between ENABLE_LLAMA_AGENT and [agents.llama]
echo "=== Test 1: Duplicate between ENABLE_LLAMA_AGENT and [agents.llama] ==="
# Create projects directory and test project TOML with an agent named "llama"
mkdir -p "${TEST_DIR}/projects"
cat > "${TEST_DIR}/projects/test-project.toml" <<'TOMLEOF'
name = "test-project"
description = "Test project for duplicate detection"
[ci]
woodpecker_repo_id = "123"
[agents.llama]
base_url = "http://localhost:8080"
model = "qwen:latest"
roles = ["dev"]
forge_user = "llama-bot"
TOMLEOF
# Create a minimal compose file
cat > "${TEST_DIR}/docker-compose.yml" <<'COMPOSEEOF'
# Test compose file
services:
agents:
image: test:latest
command: echo "hello"
volumes:
test-data:
networks:
test-net:
COMPOSEEOF
# Set up the test environment
export FACTORY_ROOT="${TEST_DIR}"
export PROJECT_NAME="test-project"
export ENABLE_LLAMA_AGENT="1"
export FORGE_TOKEN=""
export FORGE_PASS=""
export CLAUDE_TIMEOUT="7200"
export POLL_INTERVAL="300"
export GARDENER_INTERVAL="21600"
export ARCHITECT_INTERVAL="21600"
export PLANNER_INTERVAL="43200"
export SUPERVISOR_INTERVAL="1200"
# Source the generators module and run the compose generator directly
source "${DISINTO_ROOT}/lib/generators.sh"
# Delete the compose file to force regeneration
rm -f "${TEST_DIR}/docker-compose.yml"
# Run the compose generator directly
if _generate_compose_impl 3000 false 2>&1 | tee "${TEST_DIR}/output.txt"; then
# Check if the output contains the duplicate error message
if grep -q "Duplicate service name 'agents-llama'" "${TEST_DIR}/output.txt"; then
pass "Duplicate detection: correctly detected conflict between ENABLE_LLAMA_AGENT and [agents.llama]"
else
fail "Duplicate detection: should have detected conflict between ENABLE_LLAMA_AGENT and [agents.llama]"
cat "${TEST_DIR}/output.txt" >&2
fi
else
# Generator should fail with non-zero exit code
if grep -q "Duplicate service name 'agents-llama'" "${TEST_DIR}/output.txt"; then
pass "Duplicate detection: correctly detected conflict and returned non-zero exit code"
else
fail "Duplicate detection: should have failed with duplicate error"
cat "${TEST_DIR}/output.txt" >&2
fi
fi
# Test 2: No duplicate when only ENABLE_LLAMA_AGENT is set (no conflicting TOML)
echo ""
echo "=== Test 2: No duplicate when only ENABLE_LLAMA_AGENT is set ==="
# Remove the projects directory created in Test 1
rm -rf "${TEST_DIR}/projects"
# Create a fresh compose file
cat > "${TEST_DIR}/docker-compose.yml" <<'COMPOSEEOF'
# Test compose file
services:
agents:
image: test:latest
volumes:
test-data:
networks:
test-net:
COMPOSEEOF
# Set ENABLE_LLAMA_AGENT
export ENABLE_LLAMA_AGENT="1"
# Delete the compose file to force regeneration
rm -f "${TEST_DIR}/docker-compose.yml"
if _generate_compose_impl 3000 false 2>&1 | tee "${TEST_DIR}/output2.txt"; then
if grep -q "Duplicate" "${TEST_DIR}/output2.txt"; then
fail "No duplicate: should not detect duplicate when only ENABLE_LLAMA_AGENT is set"
else
pass "No duplicate: correctly generated compose without duplicates"
fi
else
# Non-zero exit is fine if there's a legitimate reason (e.g., missing files)
if grep -q "Duplicate" "${TEST_DIR}/output2.txt"; then
fail "No duplicate: should not detect duplicate when only ENABLE_LLAMA_AGENT is set"
else
pass "No duplicate: generator failed for other reason (acceptable)"
fi
fi
# Test 3: Duplicate between two TOML agents with same name
echo ""
echo "=== Test 3: Duplicate between two TOML agents with same name ==="
rm -f "${TEST_DIR}/docker-compose.yml"
# Create projects directory for Test 3
mkdir -p "${TEST_DIR}/projects"
cat > "${TEST_DIR}/projects/project1.toml" <<'TOMLEOF'
name = "project1"
description = "First project"
[ci]
woodpecker_repo_id = "1"
[agents.llama]
base_url = "http://localhost:8080"
model = "qwen:latest"
roles = ["dev"]
forge_user = "llama-bot1"
TOMLEOF
cat > "${TEST_DIR}/projects/project2.toml" <<'TOMLEOF'
name = "project2"
description = "Second project"
[ci]
woodpecker_repo_id = "2"
[agents.llama]
base_url = "http://localhost:8080"
model = "qwen:latest"
roles = ["dev"]
forge_user = "llama-bot2"
TOMLEOF
cat > "${TEST_DIR}/docker-compose.yml" <<'COMPOSEEOF'
# Test compose file
services:
agents:
image: test:latest
volumes:
test-data:
networks:
test-net:
COMPOSEEOF
unset ENABLE_LLAMA_AGENT
# Delete the compose file to force regeneration
rm -f "${TEST_DIR}/docker-compose.yml"
if _generate_compose_impl 3000 false 2>&1 | tee "${TEST_DIR}/output3.txt"; then
if grep -q "Duplicate service name 'agents-llama'" "${TEST_DIR}/output3.txt"; then
pass "Duplicate detection: correctly detected conflict between two [agents.llama] blocks"
else
fail "Duplicate detection: should have detected conflict between two [agents.llama] blocks"
cat "${TEST_DIR}/output3.txt" >&2
fi
else
if grep -q "Duplicate service name 'agents-llama'" "${TEST_DIR}/output3.txt"; then
pass "Duplicate detection: correctly detected conflict and returned non-zero exit code"
else
fail "Duplicate detection: should have failed with duplicate error"
cat "${TEST_DIR}/output3.txt" >&2
fi
fi
# Summary
echo ""
if [ "$FAILED" -ne 0 ]; then
echo "=== TESTS FAILED ==="
exit 1
fi
echo "=== ALL TESTS PASSED ==="


@@ -0,0 +1,129 @@
#!/usr/bin/env bash
# test-watchdog-process-group.sh — Test that claude_run_with_watchdog kills orphan children
#
# This test verifies that when claude_run_with_watchdog terminates the Claude process,
# all child processes (including those spawned by Claude's Bash tool) are also killed.
#
# Reproducer scenario:
# 1. Create a fake "claude" stub that:
# a. Spawns a long-running child process (sleep 3600)
# b. Writes a result marker to stdout to trigger idle detection
# c. Stays running
# 2. Run claude_run_with_watchdog with the stub
# 3. Before the fix: sleep child survives (orphaned to PID 1)
# 4. After the fix: sleep child dies (killed as part of process group with -PID)
#
# Usage: ./tests/test-watchdog-process-group.sh
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "$0")/.." && pwd)"
TEST_TMP="/tmp/test-watchdog-$$"
LOGFILE="${TEST_TMP}/log.txt"
PASS=true
# shellcheck disable=SC2317
cleanup_test() {
rm -rf "$TEST_TMP"
}
trap cleanup_test EXIT INT TERM
mkdir -p "$TEST_TMP"
log() {
printf '[TEST] %s\n' "$*" | tee -a "$LOGFILE"
}
fail() {
printf '[TEST] FAIL: %s\n' "$*" | tee -a "$LOGFILE"
PASS=false
}
pass() {
printf '[TEST] PASS: %s\n' "$*" | tee -a "$LOGFILE"
}
# Export required environment variables
export CLAUDE_TIMEOUT=10 # Short timeout for testing
export CLAUDE_IDLE_GRACE=2 # Short grace period for testing
export LOGFILE="${LOGFILE}" # Required by agent-sdk.sh
# Create a fake claude stub that:
# 1. Spawns a long-running child process (sleep 3600) that will become an orphan if parent is killed
# 2. Writes a result marker to stdout (to trigger the watchdog's idle-after-result path)
# 3. Stays running so the watchdog can kill it
cat > "${TEST_TMP}/fake-claude" << 'FAKE_CLAUDE_EOF'
#!/usr/bin/env bash
# Fake claude that spawns a child and stays running
# Simulates Claude's behavior when it spawns a Bash tool command
# Write result marker to stdout (triggers watchdog idle detection)
echo '{"type":"result","session_id":"test-session-123","verdict":"APPROVE"}'
# Spawn a child that simulates Claude's Bash tool hanging
# This is the process that should be killed when the parent is terminated
sleep 3600 &
CHILD_PID=$!
# Log the child PID for debugging
echo "FAKE_CLAUDE_CHILD_PID=$CHILD_PID" >&2
# Stay running - sleep in a loop so the watchdog can kill us
while true; do
sleep 3600 &
wait $! 2>/dev/null || true
done
FAKE_CLAUDE_EOF
chmod +x "${TEST_TMP}/fake-claude"
log "Testing claude_run_with_watchdog process group cleanup..."
# Source the library and run claude_run_with_watchdog
cd "$SCRIPT_DIR"
source lib/agent-sdk.sh
log "Starting claude_run_with_watchdog with fake claude..."
# Run the function directly (not as a script)
# We need to capture output and redirect stderr
OUTPUT_FILE="${TEST_TMP}/output.txt"
timeout 35 bash -c "
source '${SCRIPT_DIR}/lib/agent-sdk.sh'
CLAUDE_TIMEOUT=10 CLAUDE_IDLE_GRACE=2 LOGFILE='${LOGFILE}' claude_run_with_watchdog '${TEST_TMP}/fake-claude' > '${OUTPUT_FILE}' 2>&1
exit \$?
" || true
# Give the watchdog a moment to clean up
log "Waiting for cleanup..."
sleep 5
# More precise check: look for sleep 3600 processes
# These would be the orphans from our fake claude
# (grep -c prints "0" itself on no match, so guard the exit code with || true)
ORPHAN_COUNT=$(pgrep -a sleep 2>/dev/null | grep -c "sleep 3600" || true)
if [ "$ORPHAN_COUNT" -gt 0 ]; then
log "Found $ORPHAN_COUNT orphan sleep 3600 processes:"
pgrep -a sleep | grep "sleep 3600"
fail "Orphan children found - process group cleanup did not work"
else
pass "No orphan children found - process group cleanup worked"
fi
# Also verify that the fake claude itself is not running
# (pgrep -c prints "0" on no match; || echo "0" would produce a second line)
FAKE_CLAUDE_COUNT=$(pgrep -c -f "fake-claude" 2>/dev/null || true)
if [ "$FAKE_CLAUDE_COUNT" -gt 0 ]; then
log "Found $FAKE_CLAUDE_COUNT fake-claude processes still running"
fail "Fake claude process(es) still running"
else
pass "Fake claude process terminated"
fi
# Summary
echo ""
if [ "$PASS" = true ]; then
log "All tests passed!"
exit 0
else
log "Some tests failed. See log at $LOGFILE"
exit 1
fi
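One counting pitfall worth noting for tests like this: under set -euo pipefail, grep -c already prints "0" when nothing matches but still exits non-zero, so the reflexive || echo "0" yields two lines and breaks numeric comparisons:

```shell
#!/usr/bin/env bash
# grep -c prints a count even on no match (exit code 1), so appending
# `|| echo "0"` produces "0" twice; `|| true` keeps the single "0".
set -euo pipefail

bad=$(printf 'foo\n' | grep -c "bar" || echo "0")
good=$(printf 'foo\n' | grep -c "bar" || true)

printf 'bad=%s\n' "$bad"
printf 'good=%s\n' "$good"
```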


@@ -39,13 +39,10 @@ EOF
exit 1
}
# TODO(#713): Subdomain fallback — if subpath routing (#704/#708) fails, this
# function would need to register additional routes for forge.<project>,
# ci.<project>, chat.<project> subdomains (or accept a --subdomain parameter).
# See docs/edge-routing-fallback.md for the full pivot plan.
# Register a new tunnel
# Usage: do_register <project> <pubkey>
# When EDGE_ROUTING_MODE=subdomain, also registers forge.<project>, ci.<project>,
# and chat.<project> subdomain routes (see docs/edge-routing-fallback.md).
do_register() {
  local project="$1"
  local pubkey="$2"
@@ -79,17 +76,32 @@ do_register() {
  local port
  port=$(allocate_port "$project" "$full_pubkey" "${project}.${DOMAIN_SUFFIX}")
  # Add Caddy route for main project domain
  add_route "$project" "$port"
# Subdomain mode: register additional routes for per-service subdomains
local routing_mode="${EDGE_ROUTING_MODE:-subpath}"
if [ "$routing_mode" = "subdomain" ]; then
local subdomain
for subdomain in forge ci chat; do
add_route "${subdomain}.${project}" "$port"
done
fi
  # Rebuild authorized_keys for tunnel user
  rebuild_authorized_keys
  # Reload Caddy
  reload_caddy
  # Build JSON response
  local response="{\"port\":${port},\"fqdn\":\"${project}.${DOMAIN_SUFFIX}\""
if [ "$routing_mode" = "subdomain" ]; then
response="${response},\"routing_mode\":\"subdomain\""
response="${response},\"subdomains\":{\"forge\":\"forge.${project}.${DOMAIN_SUFFIX}\",\"ci\":\"ci.${project}.${DOMAIN_SUFFIX}\",\"chat\":\"chat.${project}.${DOMAIN_SUFFIX}\"}"
fi
response="${response}}"
echo "$response"
}
# Deregister a tunnel
@@ -109,9 +121,18 @@ do_deregister() {
  # Remove from registry
  free_port "$project" >/dev/null
  # Remove Caddy route for main project domain
  remove_route "$project"
# Subdomain mode: also remove per-service subdomain routes
local routing_mode="${EDGE_ROUTING_MODE:-subpath}"
if [ "$routing_mode" = "subdomain" ]; then
local subdomain
for subdomain in forge ci chat; do
remove_route "${subdomain}.${project}"
done
fi
  # Rebuild authorized_keys for tunnel user
  rebuild_authorized_keys


@@ -1,4 +1,4 @@
<!-- last-reviewed: 88222503d5a2ff1d25e9f1cb254ed31f13ccea7f -->
# vault/policies/ — Agent Instructions
HashiCorp Vault ACL policies for the disinto factory. One `.hcl` file per