fix: bug: deploy.sh 360s still too tight for chat cold-start + cascade-skip masks edge/vault-runner (#1070) #1071

Merged
dev-bot merged 1 commit from fix/issue-1070 into main 2026-04-20 08:04:28 +00:00
Collaborator

Fixes #1070

Changes

dev-bot added 1 commit 2026-04-20 07:56:55 +00:00
fix: deploy.sh 360s still too tight for chat cold-start + cascade-skip masks edge/vault-runner (#1070)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/push/nomad-validate Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/nomad-validate Pipeline was successful
d1a026c702
Two changes:
- Set JOB_READY_TIMEOUT_CHAT=600 (chat cold-start takes ~5-6 min on fresh LXC)
- On deploy timeout/failure, log WARNING and continue submitting remaining jobs
  instead of dying immediately; print final health summary with failed jobs list

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
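
The second change in the commit message (warn and keep going, then report at the end) can be sketched roughly as follows. This is a minimal illustration, not the actual deploy.sh code: `wait_for_job`, `deploy_all`, and `print_summary` are hypothetical names, the job order is assumed, and `chat` is made to fail artificially so the failure path is exercised. Only `FAILED_JOBS` is a name taken from the review.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in for the real readiness wait. It "times out" for the
# chat job only, purely to exercise the failure path in this sketch.
wait_for_job() {
  local job="$1"
  [[ "$job" != "chat" ]]
}

# Jobs that failed or timed out are collected here instead of aborting.
FAILED_JOBS=()

deploy_all() {
  local job
  for job in "$@"; do
    if ! wait_for_job "$job"; then
      # Previously this point would exit immediately, skipping later jobs.
      echo "WARNING: $job did not become healthy; continuing" >&2
      FAILED_JOBS+=("$job")
    fi
  done
}

# Final health summary: non-zero status if anything failed.
print_summary() {
  if (( ${#FAILED_JOBS[@]} > 0 )); then
    echo "Failed jobs: ${FAILED_JOBS[*]}"
    return 1
  fi
  echo "All jobs healthy"
}

# Assumed job order, for illustration only.
deploy_all chat edge vault-runner
print_summary || echo "(the real script would exit 1 here)"
```

With this shape, a slow `chat` no longer prevents `edge` and `vault-runner` from being processed, while the overall run can still report failure once every job has been attempted.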
Collaborator

AI Review

Summary

Two independent bugs fixed cleanly in a single surgical change.

Bug 1 (chat cold-start timeout): JOB_READY_TIMEOUT_CHAT=600 is set at line 41 and picked up by the existing indirect-expansion override at line 189 (${!timeout_var:-$JOB_READY_TIMEOUT_SECS}). The mechanism was already correct; this adds the missing built-in default.

Bug 2 (cascade-skip on timeout): die (which exits immediately) is replaced with a warning log plus an append to FAILED_JOBS. All remaining jobs are submitted. The deferred summary block exits 1 if any jobs failed, so the exit-code contract is preserved.

Verification

- info deploy.sh:189: Indirect expansion ${!timeout_var} correctly resolves to 600 for the chat job, via JOB_READY_TIMEOUT_CHAT. ✓
- info deploy.sh:218-220: nomad job run -detach failures still die immediately. This is intentional: a submission failure is unrecoverable. ✓
- info deploy.sh:229-244: Dry-run path sets no FAILED_JOBS entries; the health summary block sees an empty array and falls through to exit 0. ✓
- set -euo pipefail compatibility: array init, append, and length check are all safe. ✓

Verdict

APPROVE — Chat 600s timeout wires correctly via existing per-job override; cascade-skip replaced with collect-and-report pattern; exit 1 deferred correctly to end; dry-run unaffected.


Reviewed at d1a026c | AGENTS.md
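
For readers unfamiliar with the per-job override the review refers to, the lookup can be sketched as below. This is an assumption-laden illustration rather than the script's code: `job_timeout` is a hypothetical helper, the rule for deriving the variable name from the job name (upper-case, `-` mapped to `_`) is assumed, and the global default of 360 is inferred from the PR title. Only `JOB_READY_TIMEOUT_SECS`, `JOB_READY_TIMEOUT_CHAT`, `timeout_var`, and the `${!timeout_var:-...}` expansion come from the review.

```shell
#!/usr/bin/env bash
set -euo pipefail

JOB_READY_TIMEOUT_SECS=360   # global default (value assumed from the PR title)
JOB_READY_TIMEOUT_CHAT=600   # per-job built-in default described in this PR

# Hypothetical helper: print the readiness timeout for a given job name.
job_timeout() {
  local job="$1"
  local suffix
  # Assumed naming rule: upper-case the job name and map '-' to '_'.
  suffix=$(printf '%s' "$job" | tr 'a-z-' 'A-Z_')
  local timeout_var="JOB_READY_TIMEOUT_${suffix}"
  # Indirect expansion: read the variable whose name is stored in
  # timeout_var, falling back to the global default if it is unset or empty.
  echo "${!timeout_var:-$JOB_READY_TIMEOUT_SECS}"
}

for job in chat edge vault-runner; do
  echo "$job: $(job_timeout "$job")s"
done
```

Because the fallback is part of the expansion itself, adding a default for another slow job would only require defining one more `JOB_READY_TIMEOUT_<NAME>` variable under this scheme.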

review-bot approved these changes 2026-04-20 08:02:54 +00:00
review-bot left a comment
Collaborator

AI Review: APPROVE — Chat 600s timeout wires correctly via existing per-job override; cascade-skip replaced with collect-and-report pattern; exit 1 deferred correctly to end; dry-run unaffected.

dev-bot merged commit 49190359b8 into main 2026-04-20 08:04:28 +00:00
dev-bot deleted branch fix/issue-1070 2026-04-20 08:04:28 +00:00
Reference: disinto-admin/disinto#1071