fix: bug: supervisor hardcodes ops repo expectation — fails silently on deployments without one (#544)
Add OPS repo presence detection in supervisor-run.sh with degraded mode support:

- Detect if OPS_REPO_ROOT is missing and log WARNING message
- Set OPS_REPO_DEGRADED=1 flag and configure fallback paths
- Bundle minimal knowledge files as fallback for degraded mode
- Update formula to use OPS_KNOWLEDGE_ROOT, OPS_JOURNAL_ROOT, OPS_VAULT_ROOT
- Support local vault destination and journal fallback when ops repo absent

Knowledge files bundled: disk.md, memory.md, ci.md, git.md, dev-agent.md, review-agent.md, forge.md

The supervisor now runs with full functionality when ops repo is available, or gracefully degrades to local paths when absent, making the failure mode explicit rather than silent.
parent be5957f127
commit f299bae77b
11 changed files with 278 additions and 16 deletions

@@ -40,6 +40,12 @@ P3 (degraded PRs, circular deps, stale deps), P4 (housekeeping).
 - `PRIMARY_BRANCH`, `CLAUDE_MODEL` (set to sonnet by supervisor-run.sh)
 - `WOODPECKER_TOKEN`, `WOODPECKER_SERVER`, `WOODPECKER_DB_PASSWORD`, `WOODPECKER_DB_USER`, `WOODPECKER_DB_HOST`, `WOODPECKER_DB_NAME` — CI database queries

+**Degraded mode (Issue #544)**: When `OPS_REPO_ROOT` is not set or the directory doesn't exist, the supervisor runs in degraded mode:
+- Uses bundled knowledge files from `$FACTORY_ROOT/knowledge/` instead of ops repo playbooks
+- Writes journal locally to `$FACTORY_ROOT/state/supervisor-journal/` (not committed to git)
+- Files vault items locally to `$PROJECT_REPO_ROOT/vault/pending/`
+- Logs a WARNING message at startup indicating degraded mode
+
 **Lifecycle**: supervisor-run.sh (cron */20) → lock + memory guard → run
 preflight.sh (collect metrics) → load formula + context → run claude -p via agent-sdk.sh
 → Claude assesses health, auto-fixes, writes journal → `PHASE:done`.
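
The degraded-mode rules in this hunk can be checked outside the supervisor with a small dry-run helper. The sketch below is not part of the commit: `resolve_ops_paths` is a hypothetical name, the degraded paths are copied from the bullets above, and the full-mode paths are taken from the detection block in supervisor-run.sh that appears elsewhere in this commit.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run helper (not part of the commit): print the mode and the
# three paths the supervisor would use for a given set of roots.
resolve_ops_paths() {
  local ops_root="${1:-}" factory_root="${2:-}" project_root="${3:-}"
  if [ -n "$ops_root" ] && [ -d "$ops_root" ]; then
    # Ops repo present: everything lives inside it.
    printf 'mode=full\n'
    printf 'knowledge=%s\n' "${ops_root}/knowledge"
    printf 'journal=%s\n'   "${ops_root}/journal/supervisor"
    printf 'vault=%s\n'     "${ops_root}/vault/pending"
  else
    # Ops repo unset or missing: fall back to local paths.
    printf 'mode=degraded\n'
    printf 'knowledge=%s\n' "${factory_root}/knowledge"
    printf 'journal=%s\n'   "${factory_root}/state/supervisor-journal"
    printf 'vault=%s\n'     "${project_root}/vault/pending"
  fi
}

# Example call (the roots are whatever the deployment exports):
# resolve_ops_paths "${OPS_REPO_ROOT:-}" "$FACTORY_ROOT" "$PROJECT_REPO_ROOT"
```

Running it with an empty first argument shows the degraded layout without touching the filesystem, which may be handy when checking a new deployment before the cron job fires.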

@@ -214,7 +214,9 @@ echo ""

 echo "## Pending Vault Items"
 _found_vault=false
-for _vf in "${OPS_REPO_ROOT}/vault/pending/"*.md; do
+# Use OPS_VAULT_ROOT if set (from supervisor-run.sh degraded mode detection), otherwise default to OPS_REPO_ROOT
+_va_root="${OPS_VAULT_ROOT:-${OPS_REPO_ROOT}/vault/pending}"
+for _vf in "${_va_root}"/*.md; do
   [ -f "$_vf" ] || continue
   _found_vault=true
   _vtitle=$(grep -m1 '^# ' "$_vf" | sed 's/^# //' || basename "$_vf")
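
The loop in this hunk combines two shell idioms: a `${VAR:-default}` fallback for the directory, and a `[ -f ... ] || continue` guard for the case where the glob matches nothing. A minimal sketch of both, assuming bash with default glob settings (no `nullglob`); `list_pending` is a hypothetical name used only for this example:

```shell
#!/usr/bin/env bash
# Sketch only: list Markdown files in a vault directory, preferring an
# explicit override over the ops-repo default.
list_pending() {
  local override="${1:-}" ops_root="${2:-}"
  # ${var:-default} uses the default when var is unset OR empty.
  local root="${override:-${ops_root}/vault/pending}"
  local f found=false
  for f in "${root}"/*.md; do
    # When nothing matches, bash leaves the pattern as a literal string,
    # so anything that is not a regular file is skipped.
    [ -f "$f" ] || continue
    found=true
    basename "$f"
  done
  # $found expands to the command `true` or `false`.
  $found || echo "(none)"
}
```

Without the `-f` guard, an empty or missing directory would produce one bogus iteration with the unexpanded pattern as the file name.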

@@ -50,6 +50,26 @@ WORKTREE="/tmp/${PROJECT_NAME}-supervisor-run"
 # shellcheck disable=SC2034 # consumed by agent-sdk.sh and env.sh log()
 LOG_AGENT="supervisor"

+# ── OPS Repo Detection (Issue #544) ──────────────────────────────────────
+# Detect if OPS_REPO_ROOT is available and set degraded mode flag if not.
+# This allows the supervisor to run with fallback knowledge files and
+# local journal/vault paths when the ops repo is absent.
+if [ -z "${OPS_REPO_ROOT:-}" ] || [ ! -d "${OPS_REPO_ROOT}" ]; then
+  log "WARNING: OPS_REPO_ROOT not set or directory missing — running in degraded mode (no playbooks, no journal continuity, no vault destination)"
+  export OPS_REPO_DEGRADED=1
+  # Set fallback paths for degraded mode
+  export OPS_KNOWLEDGE_ROOT="${FACTORY_ROOT}/knowledge"
+  export OPS_JOURNAL_ROOT="${FACTORY_ROOT}/state/supervisor-journal"
+  export OPS_VAULT_ROOT="${PROJECT_REPO_ROOT}/vault/pending"
+  mkdir -p "$OPS_JOURNAL_ROOT" "$OPS_VAULT_ROOT" 2>/dev/null || true
+else
+  export OPS_REPO_DEGRADED=0
+  export OPS_KNOWLEDGE_ROOT="${OPS_REPO_ROOT}/knowledge"
+  export OPS_JOURNAL_ROOT="${OPS_REPO_ROOT}/journal/supervisor"
+  export OPS_VAULT_ROOT="${OPS_REPO_ROOT}/vault/pending"
+  mkdir -p "$OPS_JOURNAL_ROOT" "$OPS_VAULT_ROOT" 2>/dev/null || true
+fi
+
 # Override log() to append to supervisor-specific log file
 # shellcheck disable=SC2034
 log() {
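
Both branches of the detection block create their directories with `mkdir -p ... 2>/dev/null || true`, so a fallback path that cannot be created goes unreported. One possible variant, shown only as a sketch, keeps the run going but records the failure. `ensure_dir` and `warn` are hypothetical names; the real script has its own `log()` function, which is not reproduced here.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the script's logger; prints to stderr.
warn() { printf 'WARNING: %s\n' "$*" >&2; }

# Create each directory in turn. Report every one that could not be
# created and return non-zero if any of them failed.
ensure_dir() {
  local d rc=0
  for d in "$@"; do
    if ! mkdir -p "$d" 2>/dev/null; then
      warn "cannot create $d"
      rc=1
    fi
  done
  return "$rc"
}

# Possible call site: continue on failure, but leave a trace in the log.
# ensure_dir "$OPS_JOURNAL_ROOT" "$OPS_VAULT_ROOT" || true
```

This keeps the non-fatal behaviour of the original line while making a failed directory creation visible, in the same spirit as the WARNING emitted for a missing ops repo.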
@@ -105,6 +125,25 @@ export CLAUDE_MODEL="sonnet"
 # ── Create worktree (before prompt assembly so trap is set early) ────────
 formula_worktree_setup "$WORKTREE"

+# Inject OPS repo status into prompt
+if [ "${OPS_REPO_DEGRADED:-0}" = "1" ]; then
+  OPS_STATUS="
+## OPS Repo Status
+**DEGRADED MODE**: OPS repo is not available. Using bundled knowledge files and local journal/vault paths.
+- Knowledge files: ${OPS_KNOWLEDGE_ROOT:-<unset>}
+- Journal: ${OPS_JOURNAL_ROOT:-<unset>}
+- Vault destination: ${OPS_VAULT_ROOT:-<unset>}
+"
+else
+  OPS_STATUS="
+## OPS Repo Status
+**FULL MODE**: OPS repo available at ${OPS_REPO_ROOT}
+- Knowledge files: ${OPS_KNOWLEDGE_ROOT:-<unset>}
+- Journal: ${OPS_JOURNAL_ROOT:-<unset>}
+- Vault destination: ${OPS_VAULT_ROOT:-<unset>}
+"
+fi
+
 PROMPT="You are the supervisor agent for ${FORGE_REPO}. Work through the formula below.

 You have full shell access and --dangerously-skip-permissions.
@@ -117,6 +156,7 @@ ${PREFLIGHT_OUTPUT}
 ${CONTEXT_BLOCK}$(formula_lessons_block)
 ${SCRATCH_CONTEXT:+${SCRATCH_CONTEXT}
 }
+${OPS_STATUS}
 Priority order: P0 memory > P1 disk > P2 stopped > P3 degraded > P4 housekeeping

 ${FORMULA_CONTENT}
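
The prompt assembly relies on two parameter expansions: `${VAR:+text}` emits `text` only when `VAR` is set and non-empty (used for the optional scratch context and its trailing newline), and `${VAR:-<unset>}` substitutes a visible placeholder when a path variable is missing. A minimal sketch of both, with `render_block` as a hypothetical helper that is not part of the commit:

```shell
#!/usr/bin/env bash
# Sketch only: show how the two expansions behave when values are missing.
render_block() {
  local scratch="${1:-}" journal="${2:-}"
  # Emits the scratch text plus a newline, or nothing at all when empty.
  printf '%s' "${scratch:+${scratch}
}"
  # Falls back to the literal placeholder <unset> when the path is empty.
  printf -- '- Journal: %s\n' "${journal:-<unset>}"
}

# render_block "" ""          prints only the journal line with <unset>
# render_block "notes" "/j"   prints "notes", then the journal line with /j
```

The placeholder makes a missing variable obvious in the rendered prompt rather than leaving a blank after the colon.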