Merge pull request 'fix: feat: define phase-signaling protocol for persistent Claude sessions (#79)' (#84) from fix/issue-79 into main
This commit is contained in:
commit
64b53aaad5
4 changed files with 444 additions and 0 deletions
115
AGENTS.md
Normal file
115
AGENTS.md
Normal file
|
|
@ -0,0 +1,115 @@
|
|||
# Disinto — Agent Instructions
|
||||
|
||||
## What this repo is
|
||||
|
||||
Disinto is an autonomous code factory. It manages four agents (dev, review,
|
||||
gardener, supervisor) that pick up issues from Codeberg, implement them,
|
||||
review PRs, and keep the system healthy — all via cron and `claude -p`.
|
||||
|
||||
See `README.md` for the full architecture and `BOOTSTRAP.md` for setup.
|
||||
|
||||
## Directory layout
|
||||
|
||||
```
|
||||
disinto/
|
||||
├── dev/ dev-poll.sh, dev-agent.sh — issue implementation
|
||||
├── review/ review-poll.sh, review-pr.sh — PR review
|
||||
├── gardener/ gardener-poll.sh — backlog grooming
|
||||
├── supervisor/ supervisor-poll.sh — health monitoring
|
||||
├── lib/ env.sh, ci-debug.sh, matrix_listener.sh
|
||||
├── projects/ *.toml — per-project config
|
||||
└── docs/ Protocol docs (PHASE-PROTOCOL.md, etc.)
|
||||
```
|
||||
|
||||
## Tech stack
|
||||
|
||||
- **Shell**: bash (all agents are bash scripts)
|
||||
- **AI**: `claude -p` (one-shot) or `claude` (interactive/tmux sessions)
|
||||
- **CI**: Woodpecker CI (queried via REST API + Postgres)
|
||||
- **VCS**: Codeberg (git + Gitea REST API)
|
||||
- **Notifications**: Matrix (optional)
|
||||
|
||||
## Coding conventions
|
||||
|
||||
- All scripts start with `#!/usr/bin/env bash` and `set -euo pipefail`
|
||||
- Source shared environment: `source "$(dirname "$0")/../lib/env.sh"`
|
||||
- Log to `$LOGFILE` using the `log()` function from env.sh or defined locally
|
||||
- Never hardcode secrets — all come from `.env` or TOML project files
|
||||
- ShellCheck must pass (CI runs `shellcheck` on all `.sh` files)
|
||||
- Avoid duplicate code — shared helpers go in `lib/`
|
||||
|
||||
## How to lint and test
|
||||
|
||||
```bash
|
||||
# ShellCheck all scripts
|
||||
shellcheck dev/dev-poll.sh dev/dev-agent.sh dev/phase-test.sh \
|
||||
review/review-poll.sh review/review-pr.sh \
|
||||
gardener/gardener-poll.sh \
|
||||
supervisor/supervisor-poll.sh supervisor/update-prompt.sh \
|
||||
lib/env.sh lib/ci-debug.sh lib/load-project.sh \
|
||||
lib/parse-deps.sh lib/matrix_listener.sh
|
||||
|
||||
# Run phase protocol test
|
||||
bash dev/phase-test.sh
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Phase-Signaling Protocol (for persistent tmux sessions)
|
||||
|
||||
When running as a **persistent tmux session** (issue #80+), Claude must signal
|
||||
the orchestrator at each phase boundary by writing to a well-known file.
|
||||
|
||||
### Phase file path
|
||||
|
||||
```
|
||||
/tmp/dev-session-{project}-{issue}.phase
|
||||
```
|
||||
|
||||
### Required phase sentinels
|
||||
|
||||
Write exactly one of these lines (with `>`, not `>>`) when a phase ends:
|
||||
|
||||
```bash
|
||||
PHASE_FILE="/tmp/dev-session-${PROJECT_NAME:-project}-${ISSUE:-0}.phase"
|
||||
|
||||
# After pushing a PR branch — waiting for CI
|
||||
echo "PHASE:awaiting_ci" > "$PHASE_FILE"
|
||||
|
||||
# After CI passes — waiting for review
|
||||
echo "PHASE:awaiting_review" > "$PHASE_FILE"
|
||||
|
||||
# Blocked on human decision (ambiguous spec, architectural question)
|
||||
echo "PHASE:needs_human" > "$PHASE_FILE"
|
||||
|
||||
# PR is merged and issue is done
|
||||
echo "PHASE:done" > "$PHASE_FILE"
|
||||
|
||||
# Unrecoverable failure
|
||||
printf 'PHASE:failed\nReason: %s\n' "describe what failed" > "$PHASE_FILE"
|
||||
```
|
||||
|
||||
### When to write each phase
|
||||
|
||||
1. **After `git push origin $BRANCH`** → write `PHASE:awaiting_ci`
|
||||
2. **After receiving "CI passed" injection** → write `PHASE:awaiting_review`
|
||||
3. **After receiving review feedback** → address it, push, write `PHASE:awaiting_review`
|
||||
4. **After receiving "Approved" injection** → merge (or wait for orchestrator to merge), write `PHASE:done`
|
||||
5. **When stuck on human-only decision** → write `PHASE:needs_human`, then wait for input
|
||||
6. **When a step fails unrecoverably** → write `PHASE:failed`
|
||||
|
||||
### Crash recovery
|
||||
|
||||
If this session was restarted after a crash, the orchestrator will inject:
|
||||
- The issue body
|
||||
- `git diff` of work completed before the crash
|
||||
- The last known phase
|
||||
- Any CI results or review comments
|
||||
|
||||
Read that context, then resume from where you left off. The git worktree is
|
||||
the checkpoint — your code changes survived the crash.
|
||||
|
||||
### Full protocol reference
|
||||
|
||||
See `docs/PHASE-PROTOCOL.md` for the complete spec including the orchestrator
|
||||
reaction matrix and sequence diagram.
|
||||
|
|
@ -11,6 +11,7 @@ Before starting, ensure you have:
|
|||
- [ ] A **second Codeberg account** for the review bot (branch protection requires reviews from a different user)
|
||||
- [ ] A **local clone** of the target repo on the same machine as disinto
|
||||
- [ ] `claude` CLI installed and authenticated (`claude --version`)
|
||||
- [ ] `tmux` installed (`tmux -V`) — required for persistent dev sessions (issue #80+)
|
||||
|
||||
## 1. Configure `.env`
|
||||
|
||||
|
|
|
|||
146
dev/phase-test.sh
Executable file
146
dev/phase-test.sh
Executable file
|
|
@ -0,0 +1,146 @@
|
|||
#!/usr/bin/env bash
|
||||
# phase-test.sh — Integration test for the phase-signaling protocol
|
||||
#
|
||||
# Simulates a Claude session writing phases and an orchestrator reading them.
|
||||
# Tests all phase values and verifies the read/write contract.
|
||||
#
|
||||
# Usage: bash dev/phase-test.sh
|
||||
|
||||
set -euo pipefail
|
||||
|
||||
PROJECT="testproject"
|
||||
ISSUE="999"
|
||||
PHASE_FILE="/tmp/dev-session-${PROJECT}-${ISSUE}.phase"
|
||||
|
||||
PASS=0
|
||||
FAIL=0
|
||||
|
||||
ok() {
|
||||
printf '[PASS] %s\n' "$1"
|
||||
PASS=$((PASS + 1))
|
||||
}
|
||||
|
||||
fail() {
|
||||
printf '[FAIL] %s\n' "$1"
|
||||
FAIL=$((FAIL + 1))
|
||||
}
|
||||
|
||||
# Cleanup
|
||||
rm -f "$PHASE_FILE"
|
||||
|
||||
# ── Test 1: phase file path convention ────────────────────────────────────────
|
||||
expected_path="/tmp/dev-session-${PROJECT}-${ISSUE}.phase"
|
||||
if [ "$PHASE_FILE" = "$expected_path" ]; then
|
||||
ok "phase file path follows /tmp/dev-session-{project}-{issue}.phase convention"
|
||||
else
|
||||
fail "phase file path mismatch: got $PHASE_FILE, expected $expected_path"
|
||||
fi
|
||||
|
||||
# ── Test 2: write and read each phase sentinel ─────────────────────────────────
|
||||
check_phase() {
|
||||
local sentinel="$1"
|
||||
echo "$sentinel" > "$PHASE_FILE"
|
||||
local got
|
||||
got=$(tr -d '[:space:]' < "$PHASE_FILE")
|
||||
if [ "$got" = "$sentinel" ]; then
|
||||
ok "write/read: $sentinel"
|
||||
else
|
||||
fail "write/read: expected '$sentinel', got '$got'"
|
||||
fi
|
||||
}
|
||||
|
||||
check_phase "PHASE:awaiting_ci"
|
||||
check_phase "PHASE:awaiting_review"
|
||||
check_phase "PHASE:needs_human"
|
||||
check_phase "PHASE:done"
|
||||
check_phase "PHASE:failed"
|
||||
|
||||
# ── Test 3: write overwrites (not appends) ─────────────────────────────────────
|
||||
echo "PHASE:awaiting_ci" > "$PHASE_FILE"
|
||||
echo "PHASE:awaiting_review" > "$PHASE_FILE"
|
||||
line_count=$(wc -l < "$PHASE_FILE")
|
||||
file_content=$(< "$PHASE_FILE")
|
||||
if [ "$line_count" -eq 1 ]; then
|
||||
ok "phase file overwrite (single line after two writes)"
|
||||
else
|
||||
fail "phase file should have 1 line, got $line_count"
|
||||
fi
|
||||
if [ "$file_content" = "PHASE:awaiting_review" ]; then
|
||||
ok "phase file overwrite (content is second write, not first)"
|
||||
else
|
||||
fail "phase file content should be 'PHASE:awaiting_review', got '$file_content'"
|
||||
fi
|
||||
|
||||
# ── Test 4: failed phase with reason ──────────────────────────────────────────
|
||||
printf 'PHASE:failed\nReason: %s\n' "shellcheck failed on ci.sh" > "$PHASE_FILE"
|
||||
first_line=$(head -1 "$PHASE_FILE")
|
||||
second_line=$(sed -n '2p' "$PHASE_FILE")
|
||||
if [ "$first_line" = "PHASE:failed" ] && echo "$second_line" | grep -q "^Reason:"; then
|
||||
ok "PHASE:failed with reason line"
|
||||
else
|
||||
fail "PHASE:failed format: first='$first_line' second='$second_line'"
|
||||
fi
|
||||
|
||||
# ── Test 5: orchestrator read function ────────────────────────────────────────
|
||||
read_phase() {
|
||||
local pfile="$1"
|
||||
# Allow cat to fail (missing file) — pipeline exits 0 via || true
|
||||
{ cat "$pfile" 2>/dev/null || true; } | head -1 | tr -d '[:space:]'
|
||||
}
|
||||
|
||||
echo "PHASE:awaiting_ci" > "$PHASE_FILE"
|
||||
phase=$(read_phase "$PHASE_FILE")
|
||||
if [ "$phase" = "PHASE:awaiting_ci" ]; then
|
||||
ok "orchestrator read_phase() extracts first line"
|
||||
else
|
||||
fail "orchestrator read_phase() got: '$phase'"
|
||||
fi
|
||||
|
||||
# ── Test 6: missing file returns empty ────────────────────────────────────────
|
||||
rm -f "$PHASE_FILE"
|
||||
phase=$(read_phase "$PHASE_FILE")
|
||||
if [ -z "$phase" ]; then
|
||||
ok "missing phase file returns empty string"
|
||||
else
|
||||
fail "missing phase file should return empty, got: '$phase'"
|
||||
fi
|
||||
|
||||
# ── Test 7: all valid phase values are recognized ─────────────────────────────
|
||||
is_valid_phase() {
|
||||
local p="$1"
|
||||
case "$p" in
|
||||
PHASE:awaiting_ci|PHASE:awaiting_review|PHASE:needs_human|PHASE:done|PHASE:failed)
|
||||
return 0 ;;
|
||||
*)
|
||||
return 1 ;;
|
||||
esac
|
||||
}
|
||||
|
||||
for p in "PHASE:awaiting_ci" "PHASE:awaiting_review" "PHASE:needs_human" \
|
||||
"PHASE:done" "PHASE:failed"; do
|
||||
if is_valid_phase "$p"; then
|
||||
ok "is_valid_phase: $p"
|
||||
else
|
||||
fail "is_valid_phase rejected valid phase: $p"
|
||||
fi
|
||||
done
|
||||
|
||||
if ! is_valid_phase "PHASE:unknown"; then
|
||||
ok "is_valid_phase rejects unknown phase"
|
||||
else
|
||||
fail "is_valid_phase should reject PHASE:unknown"
|
||||
fi
|
||||
|
||||
# ── Cleanup ───────────────────────────────────────────────────────────────────
|
||||
rm -f "$PHASE_FILE"
|
||||
|
||||
# ── Summary ───────────────────────────────────────────────────────────────────
|
||||
echo ""
|
||||
printf 'Results: %d passed, %d failed\n' "$PASS" "$FAIL"
|
||||
if [ "$FAIL" -eq 0 ]; then
|
||||
echo "All tests passed."
|
||||
exit 0
|
||||
else
|
||||
echo "Some tests failed."
|
||||
exit 1
|
||||
fi
|
||||
182
docs/PHASE-PROTOCOL.md
Normal file
182
docs/PHASE-PROTOCOL.md
Normal file
|
|
@ -0,0 +1,182 @@
|
|||
# Phase-Signaling Protocol for Persistent Claude Sessions
|
||||
|
||||
## Overview
|
||||
|
||||
When dev-agent runs Claude in a persistent tmux session (rather than a
|
||||
one-shot `claude -p` invocation), Claude needs a way to signal the
|
||||
orchestrator (`dev-poll.sh`) that a phase has completed.
|
||||
|
||||
Claude writes a sentinel line to a **phase file** — a well-known path based
|
||||
on project name and issue number. The orchestrator watches that file and
|
||||
reacts accordingly.
|
||||
|
||||
## Phase File Path Convention
|
||||
|
||||
```
|
||||
/tmp/dev-session-{project}-{issue}.phase
|
||||
```
|
||||
|
||||
Where:
|
||||
- `{project}` = the project name from the TOML (`name` field), e.g. `harb`
|
||||
- `{issue}` = the issue number, e.g. `42`
|
||||
|
||||
Example: `/tmp/dev-session-harb-42.phase`
|
||||
|
||||
## Phase Values
|
||||
|
||||
Claude writes exactly one of these lines to the phase file when a phase ends:
|
||||
|
||||
| Sentinel | Meaning | Orchestrator action |
|
||||
|----------|---------|---------------------|
|
||||
| `PHASE:awaiting_ci` | PR pushed, waiting for CI to run | Poll CI; inject result when done |
|
||||
| `PHASE:awaiting_review` | CI passed, PR open, waiting for review | Wait for `review-poll` to inject feedback |
|
||||
| `PHASE:needs_human` | Blocked on human decision | Send Matrix notification; wait for reply |
|
||||
| `PHASE:done` | Work complete, PR merged | Verify merge, kill tmux session, clean up |
|
||||
| `PHASE:failed` | Unrecoverable failure | Escalate to gardener/supervisor |
|
||||
|
||||
### Writing a phase (from within Claude's session)
|
||||
|
||||
```bash
|
||||
PHASE_FILE="/tmp/dev-session-${PROJECT_NAME:-project}-${ISSUE:-0}.phase"
|
||||
|
||||
# Signal awaiting CI
|
||||
echo "PHASE:awaiting_ci" > "$PHASE_FILE"
|
||||
|
||||
# Signal awaiting review
|
||||
echo "PHASE:awaiting_review" > "$PHASE_FILE"
|
||||
|
||||
# Signal needs human
|
||||
echo "PHASE:needs_human" > "$PHASE_FILE"
|
||||
|
||||
# Signal done
|
||||
echo "PHASE:done" > "$PHASE_FILE"
|
||||
|
||||
# Signal failure
|
||||
echo "PHASE:failed" > "$PHASE_FILE"
|
||||
```
|
||||
|
||||
The orchestrator reads with:
|
||||
|
||||
```bash
|
||||
phase=$(head -1 "$PHASE_FILE" 2>/dev/null | tr -d '[:space:]')
|
||||
```
|
||||
|
||||
Using `head -1` is required: `PHASE:failed` may have a reason line on line 2,
|
||||
and reading all lines would produce `PHASE:failedReason:...` which never matches.
|
||||
|
||||
## Orchestrator Reaction Matrix
|
||||
|
||||
```
|
||||
PHASE:awaiting_ci → poll CI every 30s
|
||||
on success → inject "CI passed" into tmux session
|
||||
on failure → inject CI error log into tmux session
|
||||
on timeout → inject "CI timeout" + escalate
|
||||
|
||||
PHASE:awaiting_review → wait for review-poll.sh to post review comment
|
||||
on REQUEST_CHANGES → inject review text into session
|
||||
on APPROVE → inject "approved" into session
|
||||
on timeout (3h) → inject "no review, escalating"
|
||||
|
||||
PHASE:needs_human → send Matrix notification with issue/PR link
|
||||
on reply → inject human reply into session
|
||||
on timeout → re-notify, then escalate after 24h
|
||||
|
||||
PHASE:done → verify PR merged on Codeberg
|
||||
if merged → kill tmux session, clean labels, close issue
|
||||
if not → inject "PR not merged yet" into session
|
||||
|
||||
PHASE:failed → write escalation to supervisor/escalations-{project}.jsonl
|
||||
kill tmux session
|
||||
restore backlog label on issue
|
||||
```
|
||||
|
||||
## Crash Recovery
|
||||
|
||||
If the tmux session dies (Claude crash, OOM, kernel OOM-kill, compaction):
|
||||
|
||||
### Detection
|
||||
|
||||
`dev-poll.sh` detects a crash via:
|
||||
1. `tmux has-session -t "dev-{project}-{issue}"` returns non-zero, OR
|
||||
2. Phase file is stale (mtime > `CLAUDE_TIMEOUT` seconds with no `PHASE:done`)
|
||||
|
||||
### Recovery procedure
|
||||
|
||||
```bash
|
||||
# 1. Read current state from disk
|
||||
git_diff=$(git -C "$WORKTREE" diff origin/main..HEAD --stat 2>/dev/null)
|
||||
last_phase=$(head -1 "$PHASE_FILE" 2>/dev/null | tr -d '[:space:]')
|
||||
last_phase="${last_phase:-PHASE:unknown}"
|
||||
last_ci=$(cat "/tmp/ci-result-${PROJECT_NAME}-${ISSUE}.txt" 2>/dev/null || echo "")
|
||||
review_comments=$(curl -sf ... "${API}/issues/${PR}/comments" | jq ...)
|
||||
|
||||
# 2. Spawn new tmux session in same worktree
|
||||
tmux new-session -d -s "dev-${PROJECT_NAME}-${ISSUE}" \
|
||||
-c "$WORKTREE" \
|
||||
"claude --dangerously-skip-permissions"
|
||||
|
||||
# 3. Inject recovery context
|
||||
tmux send-keys -t "dev-${PROJECT_NAME}-${ISSUE}" \
|
||||
"$(cat recovery-prompt.txt)" Enter
|
||||
```
|
||||
|
||||
**Recovery context injected into new session:**
|
||||
- Issue body (what to implement)
|
||||
- `git diff` of work done so far (git is the checkpoint, not memory)
|
||||
- Last known phase (where we left off)
|
||||
- Last CI result (if phase was `awaiting_ci`)
|
||||
- Latest review comments (if phase was `awaiting_review`)
|
||||
|
||||
**Key principle:** Git is the checkpoint. The worktree persists across crashes.
|
||||
Claude can read `git log`, `git diff`, and `git status` to understand exactly
|
||||
what was done before the crash. No state needs to be stored beyond the phase
|
||||
file and git history.
|
||||
|
||||
### State files summary
|
||||
|
||||
| File | Created by | Purpose |
|
||||
|------|-----------|---------|
|
||||
| `/tmp/dev-session-{proj}-{issue}.phase` | Claude (in session) | Current phase |
|
||||
| `/tmp/ci-result-{proj}-{issue}.txt` | Orchestrator | Last CI output for injection |
|
||||
| `/tmp/dev-{proj}-{issue}.log` | Orchestrator | Session transcript (aspirational — path TBD when tmux session manager is implemented in #80) |
|
||||
| `WORKTREE` (git worktree) | dev-agent.sh | Code checkpoint |
|
||||
|
||||
## Sequence Diagram
|
||||
|
||||
```
|
||||
Claude session Orchestrator (dev-poll.sh)
|
||||
────────────── ──────────────────────────
|
||||
implement issue
|
||||
push PR branch
|
||||
echo "PHASE:awaiting_ci" ───→ read phase file
|
||||
poll CI
|
||||
CI passes
|
||||
←── tmux send-keys "CI passed"
|
||||
echo "PHASE:awaiting_review" → read phase file
|
||||
wait for review-poll
|
||||
review: REQUEST_CHANGES
|
||||
←── tmux send-keys "Review: ..."
|
||||
address review comments
|
||||
push fixes
|
||||
echo "PHASE:awaiting_review" → read phase file
|
||||
review: APPROVE
|
||||
←── tmux send-keys "Approved"
|
||||
merge PR
|
||||
echo "PHASE:done" ──────────→ read phase file
|
||||
verify merged
|
||||
kill session
|
||||
close issue
|
||||
```
|
||||
|
||||
## Notes
|
||||
|
||||
- The phase file is write-once-per-phase (always overwritten with `>`).
|
||||
The orchestrator reads it, acts, then waits for the next write.
|
||||
- Claude should write the phase sentinel **as the last action** of each phase,
|
||||
after any git push or other side effects are complete.
|
||||
- If Claude writes `PHASE:failed`, it should include a reason on the next line:
|
||||
```bash
|
||||
printf 'PHASE:failed\nReason: %s\n' "$reason" > "$PHASE_FILE"
|
||||
```
|
||||
- Phase files are cleaned up by the orchestrator after `PHASE:done` or
|
||||
`PHASE:failed`.
|
||||
Loading…
Add table
Add a link
Reference in a new issue