fix: bug: supervisor hardcodes ops repo expectation — fails silently on deployments without one (#544)

Add OPS repo presence detection in supervisor-run.sh with degraded mode support:

- Detect if OPS_REPO_ROOT is missing and log WARNING message
- Set OPS_REPO_DEGRADED=1 flag and configure fallback paths
- Bundle minimal knowledge files as fallback for degraded mode
- Update formula to use OPS_KNOWLEDGE_ROOT, OPS_JOURNAL_ROOT, OPS_VAULT_ROOT
- Support local vault destination and journal fallback when ops repo absent

Knowledge files bundled: disk.md, memory.md, ci.md, git.md, dev-agent.md, review-agent.md, forge.md

The supervisor now runs with full functionality when ops repo is available, or gracefully degrades to local paths when absent, making the failure mode explicit rather than silent.
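
The detection flow described above could look roughly like the sketch below. It is not the code from supervisor-run.sh: the function name `detect_ops_repo`, its fallback-root argument, and the `journal` and `vault` subdirectory names are assumptions made for illustration. Only the variable names come from the commit message.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of ops-repo detection with a degraded-mode fallback.
# detect_ops_repo and the subdirectory layout are assumptions, not the commit's code.

# Usage: detect_ops_repo <fallback_root>
detect_ops_repo() {
  local fallback_root="${1:?fallback root required}"
  local base

  if [ -n "${OPS_REPO_ROOT:-}" ] && [ -d "$OPS_REPO_ROOT" ]; then
    # Ops repo present: full functionality.
    OPS_REPO_DEGRADED=0
    base="$OPS_REPO_ROOT"
  else
    # Ops repo absent: say so loudly instead of failing silently.
    echo "WARNING: ops repo not found at '${OPS_REPO_ROOT:-<unset>}', running in degraded mode" >&2
    OPS_REPO_DEGRADED=1
    base="$fallback_root"
  fi

  OPS_KNOWLEDGE_ROOT="$base/knowledge"
  OPS_JOURNAL_ROOT="$base/journal"   # assumed layout
  OPS_VAULT_ROOT="$base/vault"       # assumed layout
  export OPS_REPO_DEGRADED OPS_KNOWLEDGE_ROOT OPS_JOURNAL_ROOT OPS_VAULT_ROOT
}
```

Callers would then read `OPS_KNOWLEDGE_ROOT`, `OPS_JOURNAL_ROOT`, and `OPS_VAULT_ROOT` rather than building paths from `OPS_REPO_ROOT` directly, which is what keeps the degraded path explicit.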
2026-04-10 08:16:03 +00:00 · f299bae77b
commit f299bae77b
parent be5957f127
11 changed files with 278 additions and 16 deletions
--- a/knowledge/memory.md
+++ b/knowledge/memory.md
@@ -0,0 +1,27 @@
+# Memory Management — Best Practices
+
+## Memory Crisis Response (P0)
+
+When available RAM drops below 500MB or swap usage exceeds 3GB, take these actions:
+
+### Immediate Actions
+1. **Kill stale claude processes** (>3 hours old):
+   ```bash
+   pgrep -f "claude -p" --older 10800 2>/dev/null | xargs kill 2>/dev/null || true
+   ```
+
+2. **Drop filesystem caches**:
+   ```bash
+   sync && echo 3 | sudo tee /proc/sys/vm/drop_caches >/dev/null 2>&1 || true
+   ```
+
+### Prevention
+- Set memory_guard to 2000MB minimum (default in env.sh)
+- Configure swap usage alerts at 2GB
+- Monitor for memory leaks in long-running processes
+- Use cgroups for process memory limits
+
+### When to Escalate
+- RAM stays <500MB after cache drop
+- Swap continues growing after process kills
+- System becomes unresponsive (OOM killer active)
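
Reviewer note: the P0 trigger in this runbook (under 500MB available RAM or over 3GB swap in use) can be checked mechanically. The sketch below is one possible way to do it and is not part of the commit; `memory_crisis` is a hypothetical helper name. It assumes a Linux `/proc/meminfo` with `MemAvailable`, `SwapTotal`, and `SwapFree` fields, and treats 3GB as 3072MB.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of this commit.
# Thresholds mirror the runbook: <500MB available RAM or >3GB swap used.

# Usage: memory_crisis [meminfo_path]
# Prints the measured values in MB; returns 0 if a threshold is crossed, 1 otherwise.
memory_crisis() {
  local meminfo="${1:-/proc/meminfo}"
  local avail_kb swap_total_kb swap_free_kb

  # /proc/meminfo reports values in kB in the second column.
  avail_kb=$(awk '/^MemAvailable:/ {print $2}' "$meminfo")
  swap_total_kb=$(awk '/^SwapTotal:/ {print $2}' "$meminfo")
  swap_free_kb=$(awk '/^SwapFree:/ {print $2}' "$meminfo")

  # A missing field is treated as 0 (an assumption; a missing MemAvailable
  # therefore reads as a crisis).
  local avail_mb=$(( ${avail_kb:-0} / 1024 ))
  local swap_used_mb=$(( (${swap_total_kb:-0} - ${swap_free_kb:-0}) / 1024 ))

  echo "avail_mb=${avail_mb} swap_used_mb=${swap_used_mb}"
  (( avail_mb < 500 || swap_used_mb > 3072 ))
}
```

A caller could gate the immediate actions on it, for example `if memory_crisis >/dev/null; then ...; fi`, and run it again after the cache drop to decide whether to escalate.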