fix: feat: define phase-signaling protocol for persistent Claude sessions (#79)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
openhands 2026-03-17 18:53:15 +00:00
parent a74eb8ba08
commit d87b7db8f3
4 changed files with 433 additions and 0 deletions

114
AGENTS.md Normal file
View file

@ -0,0 +1,114 @@
<!-- last-reviewed: a74eb8b -->
# Disinto — Agent Instructions
## What this repo is
Disinto is an autonomous code factory. It manages four agents (dev, review,
gardener, supervisor) that pick up issues from Codeberg, implement them,
review PRs, and keep the system healthy — all via cron and `claude -p`.
See `README.md` for the full architecture and `BOOTSTRAP.md` for setup.
## Directory layout
```
disinto/
├── dev/ dev-poll.sh, dev-agent.sh — issue implementation
├── review/ review-poll.sh, review-pr.sh — PR review
├── gardener/ gardener-poll.sh — backlog grooming
├── supervisor/ supervisor-poll.sh — health monitoring
├── lib/ env.sh, ci-debug.sh, matrix_listener.sh
├── projects/ *.toml — per-project config
└── docs/ Protocol docs (PHASE-PROTOCOL.md, etc.)
```
## Tech stack
- **Shell**: bash (all agents are bash scripts)
- **AI**: `claude -p` (one-shot) or `claude` (interactive/tmux sessions)
- **CI**: Woodpecker CI (queried via REST API + Postgres)
- **VCS**: Codeberg (git + Gitea REST API)
- **Notifications**: Matrix (optional)
## Coding conventions
- All scripts start with `#!/usr/bin/env bash` and `set -euo pipefail`
- Source shared environment: `source "$(dirname "$0")/../lib/env.sh"`
- Log to `$LOGFILE` using the `log()` function from env.sh or defined locally
- Never hardcode secrets — all come from `.env` or TOML project files
- ShellCheck must pass (CI runs `shellcheck` on all `.sh` files)
- Avoid duplicate code — shared helpers go in `lib/`
## How to lint and test
```bash
# ShellCheck all scripts
shellcheck dev/dev-poll.sh dev/dev-agent.sh review/review-poll.sh \
review/review-pr.sh gardener/gardener-poll.sh \
supervisor/supervisor-poll.sh lib/env.sh lib/ci-debug.sh \
lib/parse-deps.sh lib/matrix_listener.sh
# Run phase protocol test
bash dev/phase-test.sh
```
---
## Phase-Signaling Protocol (for persistent tmux sessions)
When running as a **persistent tmux session** (issue #80+), Claude must signal
the orchestrator at each phase boundary by writing to a well-known file.
### Phase file path
```
/tmp/dev-session-{PROJECT_NAME}-{ISSUE}.phase
```
### Required phase sentinels
Write exactly one of these lines (with `>`, not `>>`) when a phase ends:
```bash
PHASE_FILE="/tmp/dev-session-${PROJECT_NAME:-project}-${ISSUE:-0}.phase"
# After pushing a PR branch — waiting for CI
echo "PHASE:awaiting_ci" > "$PHASE_FILE"
# After CI passes — waiting for review
echo "PHASE:awaiting_review" > "$PHASE_FILE"
# Blocked on human decision (ambiguous spec, architectural question)
echo "PHASE:needs_human" > "$PHASE_FILE"
# PR is merged and issue is done
echo "PHASE:done" > "$PHASE_FILE"
# Unrecoverable failure
printf 'PHASE:failed\nReason: %s\n' "describe what failed" > "$PHASE_FILE"
```
### When to write each phase
1. **After `git push origin $BRANCH`** → write `PHASE:awaiting_ci`
2. **After receiving "CI passed" injection** → write `PHASE:awaiting_review`
3. **After receiving review feedback** → address it, push, write `PHASE:awaiting_review`
4. **After receiving "Approved" injection** → merge (or wait for orchestrator to merge), write `PHASE:done`
5. **When stuck on human-only decision** → write `PHASE:needs_human`, then wait for input
6. **When a step fails unrecoverably** → write `PHASE:failed`
### Crash recovery
If this session was restarted after a crash, the orchestrator will inject:
- The issue body
- `git diff` of work completed before the crash
- The last known phase
- Any CI results or review comments
Read that context, then resume from where you left off. The git worktree is
the checkpoint — your code changes survived the crash.
### Full protocol reference
See `docs/PHASE-PROTOCOL.md` for the complete spec including the orchestrator
reaction matrix and sequence diagram.

View file

@ -11,6 +11,7 @@ Before starting, ensure you have:
- [ ] A **second Codeberg account** for the review bot (branch protection requires reviews from a different user)
- [ ] A **local clone** of the target repo on the same machine as disinto
- [ ] `claude` CLI installed and authenticated (`claude --version`)
- [ ] `tmux` installed (`tmux -V`) — required for persistent dev sessions (issue #80+)
## 1. Configure `.env`

140
dev/phase-test.sh Executable file
View file

@ -0,0 +1,140 @@
#!/usr/bin/env bash
# phase-test.sh — Integration test for the phase-signaling protocol
#
# Simulates a Claude session writing phases and an orchestrator reading them.
# Tests all phase values and verifies the read/write contract.
#
# Usage: bash dev/phase-test.sh
set -euo pipefail
PROJECT="testproject"
ISSUE="999"
PHASE_FILE="/tmp/dev-session-${PROJECT}-${ISSUE}.phase"
PASS=0
FAIL=0
ok() {
printf '[PASS] %s\n' "$1"
PASS=$((PASS + 1))
}
fail() {
printf '[FAIL] %s\n' "$1"
FAIL=$((FAIL + 1))
}
# Cleanup
rm -f "$PHASE_FILE"
# ── Test 1: phase file path convention ────────────────────────────────────────
expected_path="/tmp/dev-session-${PROJECT}-${ISSUE}.phase"
if [ "$PHASE_FILE" = "$expected_path" ]; then
ok "phase file path follows /tmp/dev-session-{project}-{issue}.phase convention"
else
fail "phase file path mismatch: got $PHASE_FILE, expected $expected_path"
fi
# ── Test 2: write and read each phase sentinel ─────────────────────────────────
check_phase() {
local sentinel="$1"
echo "$sentinel" > "$PHASE_FILE"
local got
got=$(cat "$PHASE_FILE" | tr -d '[:space:]')
if [ "$got" = "$sentinel" ]; then
ok "write/read: $sentinel"
else
fail "write/read: expected '$sentinel', got '$got'"
fi
}
check_phase "PHASE:awaiting_ci"
check_phase "PHASE:awaiting_review"
check_phase "PHASE:needs_human"
check_phase "PHASE:done"
check_phase "PHASE:failed"
# ── Test 3: write overwrites (not appends) ─────────────────────────────────────
echo "PHASE:awaiting_ci" > "$PHASE_FILE"
echo "PHASE:awaiting_review" > "$PHASE_FILE"
line_count=$(wc -l < "$PHASE_FILE")
if [ "$line_count" -eq 1 ]; then
ok "phase file overwrite (single line after two writes)"
else
fail "phase file should have 1 line, got $line_count"
fi
# ── Test 4: failed phase with reason ──────────────────────────────────────────
printf 'PHASE:failed\nReason: %s\n' "shellcheck failed on ci.sh" > "$PHASE_FILE"
first_line=$(head -1 "$PHASE_FILE")
second_line=$(sed -n '2p' "$PHASE_FILE")
if [ "$first_line" = "PHASE:failed" ] && echo "$second_line" | grep -q "^Reason:"; then
ok "PHASE:failed with reason line"
else
fail "PHASE:failed format: first='$first_line' second='$second_line'"
fi
# ── Test 5: orchestrator read function ────────────────────────────────────────
read_phase() {
local pfile="$1"
# Allow cat to fail (missing file) — pipeline exits 0 via || true
{ cat "$pfile" 2>/dev/null || true; } | head -1 | tr -d '[:space:]'
}
echo "PHASE:awaiting_ci" > "$PHASE_FILE"
phase=$(read_phase "$PHASE_FILE")
if [ "$phase" = "PHASE:awaiting_ci" ]; then
ok "orchestrator read_phase() extracts first line"
else
fail "orchestrator read_phase() got: '$phase'"
fi
# ── Test 6: missing file returns empty ────────────────────────────────────────
rm -f "$PHASE_FILE"
phase=$(read_phase "$PHASE_FILE")
if [ -z "$phase" ]; then
ok "missing phase file returns empty string"
else
fail "missing phase file should return empty, got: '$phase'"
fi
# ── Test 7: all valid phase values are recognized ─────────────────────────────
is_valid_phase() {
local p="$1"
case "$p" in
PHASE:awaiting_ci|PHASE:awaiting_review|PHASE:needs_human|PHASE:done|PHASE:failed)
return 0 ;;
*)
return 1 ;;
esac
}
for p in "PHASE:awaiting_ci" "PHASE:awaiting_review" "PHASE:needs_human" \
"PHASE:done" "PHASE:failed"; do
if is_valid_phase "$p"; then
ok "is_valid_phase: $p"
else
fail "is_valid_phase rejected valid phase: $p"
fi
done
if ! is_valid_phase "PHASE:unknown"; then
ok "is_valid_phase rejects unknown phase"
else
fail "is_valid_phase should reject PHASE:unknown"
fi
# ── Cleanup ───────────────────────────────────────────────────────────────────
rm -f "$PHASE_FILE"
# ── Summary ───────────────────────────────────────────────────────────────────
echo ""
printf 'Results: %d passed, %d failed\n' "$PASS" "$FAIL"
if [ "$FAIL" -eq 0 ]; then
echo "All tests passed."
exit 0
else
echo "Some tests failed."
exit 1
fi

178
docs/PHASE-PROTOCOL.md Normal file
View file

@ -0,0 +1,178 @@
# Phase-Signaling Protocol for Persistent Claude Sessions
## Overview
When dev-agent runs Claude in a persistent tmux session (rather than a
one-shot `claude -p` invocation), Claude needs a way to signal the
orchestrator (`dev-poll.sh`) that a phase has completed.
Claude writes a sentinel line to a **phase file** — a well-known path based
on project name and issue number. The orchestrator watches that file and
reacts accordingly.
## Phase File Path Convention
```
/tmp/dev-session-{project}-{issue}.phase
```
Where:
- `{project}` = the project name from the TOML (`name` field), e.g. `harb`
- `{issue}` = the issue number, e.g. `42`
Example: `/tmp/dev-session-harb-42.phase`
## Phase Values
Claude writes exactly one of these lines to the phase file when a phase ends:
| Sentinel | Meaning | Orchestrator action |
|----------|---------|---------------------|
| `PHASE:awaiting_ci` | PR pushed, waiting for CI to run | Poll CI; inject result when done |
| `PHASE:awaiting_review` | CI passed, PR open, waiting for review | Wait for `review-poll` to inject feedback |
| `PHASE:needs_human` | Blocked on human decision | Send Matrix notification; wait for reply |
| `PHASE:done` | Work complete, PR merged | Verify merge, kill tmux session, clean up |
| `PHASE:failed` | Unrecoverable failure | Escalate to gardener/supervisor |
### Writing a phase (from within Claude's session)
```bash
PHASE_FILE="/tmp/dev-session-${PROJECT_NAME}-${ISSUE}.phase"
# Signal awaiting CI
echo "PHASE:awaiting_ci" > "$PHASE_FILE"
# Signal awaiting review
echo "PHASE:awaiting_review" > "$PHASE_FILE"
# Signal needs human
echo "PHASE:needs_human" > "$PHASE_FILE"
# Signal done
echo "PHASE:done" > "$PHASE_FILE"
# Signal failure
echo "PHASE:failed" > "$PHASE_FILE"
```
The orchestrator reads with:
```bash
phase=$(cat "$PHASE_FILE" 2>/dev/null | tr -d '[:space:]')
```
## Orchestrator Reaction Matrix
```
PHASE:awaiting_ci → poll CI every 30s
on success → inject "CI passed" into tmux session
on failure → inject CI error log into tmux session
on timeout → inject "CI timeout" + escalate
PHASE:awaiting_review → wait for review-poll.sh to post review comment
on REQUEST_CHANGES → inject review text into session
on APPROVE → inject "approved" into session
on timeout (3h) → inject "no review, escalating"
PHASE:needs_human → send Matrix notification with issue/PR link
on reply → inject human reply into session
on timeout → re-notify, then escalate after 24h
PHASE:done → verify PR merged on Codeberg
if merged → kill tmux session, clean labels, close issue
if not → inject "PR not merged yet" into session
PHASE:failed → write escalation to supervisor/escalations-{project}.jsonl
kill tmux session
restore backlog label on issue
```
## Crash Recovery
If the tmux session dies (Claude crash, OOM, kernel OOM-kill, compaction):
### Detection
`dev-poll.sh` detects a crash via:
1. `tmux has-session -t "dev-{project}-{issue}"` returns non-zero, OR
2. Phase file is stale (mtime > `CLAUDE_TIMEOUT` seconds with no `PHASE:done`)
### Recovery procedure
```bash
# 1. Read current state from disk
git_diff=$(git -C "$WORKTREE" diff origin/main..HEAD --stat 2>/dev/null)
last_phase=$(cat "$PHASE_FILE" 2>/dev/null || echo "PHASE:unknown")
last_ci=$(cat "/tmp/ci-result-${PROJECT_NAME}-${ISSUE}.txt" 2>/dev/null || echo "")
review_comments=$(curl -sf ... "${API}/issues/${PR}/comments" | jq ...)
# 2. Spawn new tmux session in same worktree
tmux new-session -d -s "dev-${PROJECT_NAME}-${ISSUE}" \
-c "$WORKTREE" \
"claude --dangerously-skip-permissions"
# 3. Inject recovery context
tmux send-keys -t "dev-${PROJECT_NAME}-${ISSUE}" \
"$(cat recovery-prompt.txt)" Enter
```
**Recovery context injected into new session:**
- Issue body (what to implement)
- `git diff` of work done so far (git is the checkpoint, not memory)
- Last known phase (where we left off)
- Last CI result (if phase was `awaiting_ci`)
- Latest review comments (if phase was `awaiting_review`)
**Key principle:** Git is the checkpoint. The worktree persists across crashes.
Claude can read `git log`, `git diff`, and `git status` to understand exactly
what was done before the crash. No state needs to be stored beyond the phase
file and git history.
### State files summary
| File | Created by | Purpose |
|------|-----------|---------|
| `/tmp/dev-session-{proj}-{issue}.phase` | Claude (in session) | Current phase |
| `/tmp/ci-result-{proj}-{issue}.txt` | Orchestrator | Last CI output for injection |
| `/tmp/dev-{proj}-{issue}.log` | Orchestrator | Session transcript |
| `WORKTREE` (git worktree) | dev-agent.sh | Code checkpoint |
## Sequence Diagram
```
Claude session Orchestrator (dev-poll.sh)
────────────── ──────────────────────────
implement issue
push PR branch
echo "PHASE:awaiting_ci" ───→ read phase file
poll CI
CI passes
←── tmux send-keys "CI passed"
echo "PHASE:awaiting_review" → read phase file
wait for review-poll
review: REQUEST_CHANGES
←── tmux send-keys "Review: ..."
address review comments
push fixes
echo "PHASE:awaiting_review" → read phase file
review: APPROVE
←── tmux send-keys "Approved"
merge PR
echo "PHASE:done" ──────────→ read phase file
verify merged
kill session
close issue
```
## Notes
- The phase file is write-once-per-phase (always overwritten with `>`).
The orchestrator reads it, acts, then waits for the next write.
- Claude should write the phase sentinel **as the last action** of each phase,
after any git push or other side effects are complete.
- If Claude writes `PHASE:failed`, it should include a reason on the next line:
```bash
printf 'PHASE:failed\nReason: %s\n' "$reason" > "$PHASE_FILE"
```
- Phase files are cleaned up by the orchestrator after `PHASE:done` or
`PHASE:failed`.