fix: bug: supervisor hardcodes ops repo expectation — fails silently on deployments without one (#544)
Add OPS repo presence detection in supervisor-run.sh with degraded mode support:

- Detect if OPS_REPO_ROOT is missing and log WARNING message
- Set OPS_REPO_DEGRADED=1 flag and configure fallback paths
- Bundle minimal knowledge files as fallback for degraded mode
- Update formula to use OPS_KNOWLEDGE_ROOT, OPS_JOURNAL_ROOT, OPS_VAULT_ROOT
- Support local vault destination and journal fallback when ops repo absent

Knowledge files bundled: disk.md, memory.md, ci.md, git.md, dev-agent.md, review-agent.md, forge.md

The supervisor now runs with full functionality when ops repo is available, or gracefully degrades to local paths when absent, making the failure mode explicit rather than silent.
parent be5957f127
commit f299bae77b
11 changed files with 278 additions and 16 deletions
27 knowledge/memory.md (Normal file)
@@ -0,0 +1,27 @@
# Memory Management — Best Practices

## Memory Crisis Response (P0)

When RAM available drops below 500MB or swap usage exceeds 3GB, take these actions:

### Immediate Actions
1. **Kill stale claude processes** (>3 hours old):
```bash
pgrep -f "claude -p" --older 10800 2>/dev/null | xargs kill 2>/dev/null || true
```

2. **Drop filesystem caches**:
```bash
sync && echo 3 | sudo tee /proc/sys/vm/drop_caches >/dev/null 2>&1 || true
```
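
The thresholds that trigger this response could be checked from a script along these lines. This is a minimal sketch, not the supervisor's actual logic: the helper names `read_mem_mb` and `memory_crisis` are hypothetical, 3GB is taken to mean 3072 MB, and it assumes a Linux host that exposes `MemAvailable` in `/proc/meminfo`.

```bash
# Hypothetical helpers (not part of the supervisor), shown for illustration.

# read_mem_mb [meminfo_path]
# Prints "<available_mb> <swap_used_mb>" from /proc/meminfo (values there are in kB).
read_mem_mb() {
  awk '
    /^MemAvailable:/ { avail = $2 }
    /^SwapTotal:/    { total = $2 }
    /^SwapFree:/     { free  = $2 }
    END { printf "%d %d\n", avail / 1024, (total - free) / 1024 }
  ' "${1:-/proc/meminfo}"
}

# memory_crisis <available_mb> <swap_used_mb>
# Returns 0 when available RAM < 500 MB or swap used > 3072 MB (assumed 3GB).
memory_crisis() {
  local avail_mb=$1 swap_used_mb=$2
  (( avail_mb < 500 || swap_used_mb > 3072 ))
}
```

One way to use it: `read -r avail swap < <(read_mem_mb)` and then `memory_crisis "$avail" "$swap" && echo "P0: memory crisis"`.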

### Prevention
- Set memory_guard to 2000MB minimum (default in env.sh)
- Configure swap usage alerts at 2GB
- Monitor for memory leaks in long-running processes
- Use cgroups for process memory limits
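
For the cgroups item, one possible approach on hosts running systemd with cgroup v2 is to start a process in a transient scope with a `MemoryMax` limit. The sketch below is an assumption, not something this repository ships: `run_limited`, `MEM_LIMIT`, `DRY_RUN`, and the 2G default are all hypothetical, and `--user` assumes a systemd user session is available.

```bash
# Hypothetical wrapper (not part of the supervisor), shown for illustration.
# run_limited <command> [args...]
# Runs the command in a transient systemd scope with a hard memory cap.
run_limited() {
  local limit=${MEM_LIMIT:-2G}   # assumed default, override via MEM_LIMIT
  local cmd=(systemd-run --user --scope --quiet -p "MemoryMax=${limit}" -- "$@")
  if [[ -n ${DRY_RUN:-} ]]; then
    printf '%s\n' "${cmd[*]}"    # print the command instead of running it
  else
    "${cmd[@]}"
  fi
}
```

Setting `DRY_RUN=1` prints the `systemd-run` invocation without executing it, which is a cheap way to check the limit before applying it to a long-running process.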

### When to Escalate
- RAM stays <500MB after cache drop
- Swap continues growing after process kills
- System becomes unresponsive (OOM killer active)
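
The first two escalation conditions could be expressed as a small check over measurements taken before and after the immediate actions. This is only a sketch: `should_escalate` is a hypothetical name, the inputs are assumed to be in MB, and the third condition (an unresponsive system with the OOM killer active) is not covered here.

```bash
# Hypothetical helper (not part of the supervisor), shown for illustration.
# should_escalate <avail_mb_after_cache_drop> <swap_mb_before_kills> <swap_mb_after_kills>
# Returns 0 when RAM is still below 500 MB or swap grew despite the process kills.
should_escalate() {
  local avail_after=$1 swap_before=$2 swap_after=$3
  (( avail_after < 500 || swap_after > swap_before ))
}
```

For instance, `should_escalate 300 1000 900` succeeds because available RAM is still under 500MB, while `should_escalate 800 1000 900` does not.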