fix: bug: supervisor hardcodes ops repo expectation — fails silently on deployments without one (#544)

Add OPS repo presence detection in supervisor-run.sh with degraded mode support:

- Detect if OPS_REPO_ROOT is missing and log WARNING message
- Set OPS_REPO_DEGRADED=1 flag and configure fallback paths
- Bundle minimal knowledge files as fallback for degraded mode
- Update formula to use OPS_KNOWLEDGE_ROOT, OPS_JOURNAL_ROOT, OPS_VAULT_ROOT
- Support local vault destination and journal fallback when ops repo absent

Knowledge files bundled: disk.md, memory.md, ci.md, git.md, dev-agent.md, review-agent.md, forge.md

The supervisor now runs with full functionality when ops repo is available, or gracefully degrades to local paths when absent, making the failure mode explicit rather than silent.
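
For readers without the full diff at hand, the detection and fallback described in this message could look roughly like the sketch below. It is not the actual content of supervisor-run.sh: the function name `configure_ops_paths`, the `fallback_root` argument, the `knowledge`/`journal`/`vault` subdirectory names, and the warning wording are all assumptions made for illustration. Only the variable names `OPS_REPO_ROOT`, `OPS_REPO_DEGRADED`, `OPS_KNOWLEDGE_ROOT`, `OPS_JOURNAL_ROOT`, and `OPS_VAULT_ROOT` come from the commit message.

```shell
# Sketch only: function name, directory layout, and log wording are
# assumptions, not the contents of supervisor-run.sh.

# configure_ops_paths FALLBACK_ROOT
# FALLBACK_ROOT is a hypothetical local directory that holds the bundled
# knowledge files and receives journal/vault output in degraded mode.
configure_ops_paths() {
  local fallback_root="$1"

  if [ -n "${OPS_REPO_ROOT:-}" ] && [ -d "$OPS_REPO_ROOT" ]; then
    # Ops repo present: full functionality, all paths point into it.
    OPS_REPO_DEGRADED=0
    OPS_KNOWLEDGE_ROOT="$OPS_REPO_ROOT/knowledge"
    OPS_JOURNAL_ROOT="$OPS_REPO_ROOT/journal"
    OPS_VAULT_ROOT="$OPS_REPO_ROOT/vault"
  else
    # Ops repo absent: say so explicitly, then fall back to local paths.
    echo "WARNING: ops repo not found at '${OPS_REPO_ROOT:-<unset>}', running in degraded mode" >&2
    OPS_REPO_DEGRADED=1
    OPS_KNOWLEDGE_ROOT="$fallback_root/knowledge"
    OPS_JOURNAL_ROOT="$fallback_root/journal"
    OPS_VAULT_ROOT="$fallback_root/vault"
  fi

  export OPS_REPO_DEGRADED OPS_KNOWLEDGE_ROOT OPS_JOURNAL_ROOT OPS_VAULT_ROOT
}
```

A caller would invoke something like `configure_ops_paths "$SCRIPT_DIR"` once at startup (with `SCRIPT_DIR` again being a hypothetical name) and branch on `OPS_REPO_DEGRADED` wherever behaviour differs. Writing the warning to stderr is what makes the failure mode visible in logs rather than silent.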
2026-04-10 08:16:03 +00:00 · f299bae77b
commit f299bae77b
parent be5957f127
11 changed files with 278 additions and 16 deletions
--- a/knowledge/ci.md
+++ b/knowledge/ci.md
@@ -0,0 +1,28 @@
+# CI/CD — Best Practices
+
+## CI Pipeline Issues (P2)
+
+When CI pipelines are stuck running >20min or pending >30min:
+
+### Investigation Steps
+1. Check pipeline status via Forgejo API:
+   ```bash
+   curl -sf -H "Authorization: token $FORGE_TOKEN" \
+     "$FORGE_API/pipelines?limit=50" | jq '.[] | {number, status, created}'
+   ```
+
+2. Check Woodpecker CI if configured:
+   ```bash
+   curl -sf -H "Authorization: Bearer $WOODPECKER_TOKEN" \
+     "$WOODPECKER_SERVER/api/repos/${WOODPECKER_REPO_ID}/pipelines?limit=10"
+   ```
+
+### Common Fixes
+- **Stuck pipeline**: Cancel via Forgejo API, retrigger
+- **Pending pipeline**: Check queue depth, scale CI runners
+- **Failed pipeline**: Review logs, fix failing test/step
+
+### Prevention
+- Set timeout limits on CI pipelines
+- Monitor runner capacity and scale as needed
+- Use caching for dependencies to reduce build time
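
The thresholds in ci.md (running for more than 20 minutes, pending for more than 30 minutes) can be turned into a small check. The sketch below is illustrative and not part of this commit: `classify_pipeline` is a hypothetical helper, and it assumes the caller has already extracted a status string (`running` or `pending`) and the pipeline's age in seconds from whichever API response is in use.

```shell
# Thresholds taken from knowledge/ci.md: running > 20 min, pending > 30 min.
RUNNING_LIMIT_SECS=$((20 * 60))
PENDING_LIMIT_SECS=$((30 * 60))

# classify_pipeline STATUS AGE_SECS
# Hypothetical helper: prints "stuck" when the pipeline has exceeded the
# limit for its status, "ok" otherwise. The status names are assumptions
# about what the CI API returns.
classify_pipeline() {
  local status="$1" age_secs="$2"
  case "$status" in
    running)
      [ "$age_secs" -gt "$RUNNING_LIMIT_SECS" ] && { echo stuck; return; }
      ;;
    pending)
      [ "$age_secs" -gt "$PENDING_LIMIT_SECS" ] && { echo stuck; return; }
      ;;
  esac
  # Any other status, or an age within the limit, is not treated as stuck.
  echo ok
}
```

For instance, `classify_pipeline running 1500` reports a stuck pipeline because 1500 seconds exceeds the 20-minute limit, while `classify_pipeline pending 1500` does not, since the pending limit is 30 minutes. Keeping the limits in named variables makes them easy to adjust per deployment.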