diff --git a/AGENTS.md b/AGENTS.md new file mode 100644 index 0000000..a6750e2 --- /dev/null +++ b/AGENTS.md @@ -0,0 +1,114 @@ + +# Disinto — Agent Instructions + +## What this repo is + +Disinto is an autonomous code factory. It manages four agents (dev, review, +gardener, supervisor) that pick up issues from Codeberg, implement them, +review PRs, and keep the system healthy — all via cron and `claude -p`. + +See `README.md` for the full architecture and `BOOTSTRAP.md` for setup. + +## Directory layout + +``` +disinto/ +├── dev/ dev-poll.sh, dev-agent.sh — issue implementation +├── review/ review-poll.sh, review-pr.sh — PR review +├── gardener/ gardener-poll.sh — backlog grooming +├── supervisor/ supervisor-poll.sh — health monitoring +├── lib/ env.sh, ci-debug.sh, matrix_listener.sh +├── projects/ *.toml — per-project config +└── docs/ Protocol docs (PHASE-PROTOCOL.md, etc.) +``` + +## Tech stack + +- **Shell**: bash (all agents are bash scripts) +- **AI**: `claude -p` (one-shot) or `claude` (interactive/tmux sessions) +- **CI**: Woodpecker CI (queried via REST API + Postgres) +- **VCS**: Codeberg (git + Gitea REST API) +- **Notifications**: Matrix (optional) + +## Coding conventions + +- All scripts start with `#!/usr/bin/env bash` and `set -euo pipefail` +- Source shared environment: `source "$(dirname "$0")/../lib/env.sh"` +- Log to `$LOGFILE` using the `log()` function from env.sh or defined locally +- Never hardcode secrets — all come from `.env` or TOML project files +- ShellCheck must pass (CI runs `shellcheck` on all `.sh` files) +- Avoid duplicate code — shared helpers go in `lib/` + +## How to lint and test + +```bash +# ShellCheck all scripts +shellcheck dev/dev-poll.sh dev/dev-agent.sh review/review-poll.sh \ + review/review-pr.sh gardener/gardener-poll.sh \ + supervisor/supervisor-poll.sh lib/env.sh lib/ci-debug.sh \ + lib/parse-deps.sh lib/matrix_listener.sh + +# Run phase protocol test +bash dev/phase-test.sh +``` + +--- + +## Phase-Signaling 
Protocol (for persistent tmux sessions) + +When running as a **persistent tmux session** (issue #80+), Claude must signal +the orchestrator at each phase boundary by writing to a well-known file. + +### Phase file path + +``` +/tmp/dev-session-{PROJECT_NAME}-{ISSUE}.phase +``` + +### Required phase sentinels + +Write exactly one of these lines (with `>`, not `>>`) when a phase ends: + +```bash +PHASE_FILE="/tmp/dev-session-${PROJECT_NAME:-project}-${ISSUE:-0}.phase" + +# After pushing a PR branch — waiting for CI +echo "PHASE:awaiting_ci" > "$PHASE_FILE" + +# After CI passes — waiting for review +echo "PHASE:awaiting_review" > "$PHASE_FILE" + +# Blocked on human decision (ambiguous spec, architectural question) +echo "PHASE:needs_human" > "$PHASE_FILE" + +# PR is merged and issue is done +echo "PHASE:done" > "$PHASE_FILE" + +# Unrecoverable failure +printf 'PHASE:failed\nReason: %s\n' "describe what failed" > "$PHASE_FILE" +``` + +### When to write each phase + +1. **After `git push origin $BRANCH`** → write `PHASE:awaiting_ci` +2. **After receiving "CI passed" injection** → write `PHASE:awaiting_review` +3. **After receiving review feedback** → address it, push, write `PHASE:awaiting_review` +4. **After receiving "Approved" injection** → merge (or wait for orchestrator to merge), write `PHASE:done` +5. **When stuck on human-only decision** → write `PHASE:needs_human`, then wait for input +6. **When a step fails unrecoverably** → write `PHASE:failed` + +### Crash recovery + +If this session was restarted after a crash, the orchestrator will inject: +- The issue body +- `git diff` of work completed before the crash +- The last known phase +- Any CI results or review comments + +Read that context, then resume from where you left off. The git worktree is +the checkpoint — your code changes survived the crash. + +### Full protocol reference + +See `docs/PHASE-PROTOCOL.md` for the complete spec including the orchestrator +reaction matrix and sequence diagram. 
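The sentinel writes above could be wrapped in a single helper so that a typo in a phase name is caught before it reaches the phase file. This is a minimal sketch, not part of the repo: `write_phase` and its argument order are hypothetical, and the fallback reason text `unspecified` is an assumption.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not in the repo).
# Usage: write_phase <phase-file> <phase> [reason]
write_phase() {
  local pfile="$1" phase="$2" reason="${3:-}"
  case "$phase" in
    awaiting_ci|awaiting_review|needs_human|done)
      # Overwrite with '>', never append: the file holds one phase at a time.
      printf 'PHASE:%s\n' "$phase" > "$pfile" ;;
    failed)
      # Sentinel on line 1, human-readable reason on line 2.
      printf 'PHASE:failed\nReason: %s\n' "${reason:-unspecified}" > "$pfile" ;;
    *)
      # Unknown phase: leave the file untouched and report the error.
      printf 'write_phase: unknown phase: %s\n' "$phase" >&2
      return 1 ;;
  esac
}
```

A session would then call, for example, `write_phase "$PHASE_FILE" awaiting_ci` after a push, or `write_phase "$PHASE_FILE" failed "describe what failed"` on an unrecoverable error.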
diff --git a/BOOTSTRAP.md b/BOOTSTRAP.md index 80530a5..87bf1fd 100644 --- a/BOOTSTRAP.md +++ b/BOOTSTRAP.md @@ -11,6 +11,7 @@ Before starting, ensure you have: - [ ] A **second Codeberg account** for the review bot (branch protection requires reviews from a different user) - [ ] A **local clone** of the target repo on the same machine as disinto - [ ] `claude` CLI installed and authenticated (`claude --version`) +- [ ] `tmux` installed (`tmux -V`) — required for persistent dev sessions (issue #80+) ## 1. Configure `.env` diff --git a/dev/phase-test.sh b/dev/phase-test.sh new file mode 100755 index 0000000..f89a0c0 --- /dev/null +++ b/dev/phase-test.sh @@ -0,0 +1,140 @@ +#!/usr/bin/env bash +# phase-test.sh — Integration test for the phase-signaling protocol +# +# Simulates a Claude session writing phases and an orchestrator reading them. +# Tests all phase values and verifies the read/write contract. +# +# Usage: bash dev/phase-test.sh + +set -euo pipefail + +PROJECT="testproject" +ISSUE="999" +PHASE_FILE="/tmp/dev-session-${PROJECT}-${ISSUE}.phase" + +PASS=0 +FAIL=0 + +ok() { + printf '[PASS] %s\n' "$1" + PASS=$((PASS + 1)) +} + +fail() { + printf '[FAIL] %s\n' "$1" + FAIL=$((FAIL + 1)) +} + +# Cleanup +rm -f "$PHASE_FILE" + +# ── Test 1: phase file path convention ──────────────────────────────────────── +expected_path="/tmp/dev-session-${PROJECT}-${ISSUE}.phase" +if [ "$PHASE_FILE" = "$expected_path" ]; then + ok "phase file path follows /tmp/dev-session-{project}-{issue}.phase convention" +else + fail "phase file path mismatch: got $PHASE_FILE, expected $expected_path" +fi + +# ── Test 2: write and read each phase sentinel ───────────────────────────────── +check_phase() { + local sentinel="$1" + echo "$sentinel" > "$PHASE_FILE" + local got + got=$(tr -d '[:space:]' < "$PHASE_FILE") + if [ "$got" = "$sentinel" ]; then + ok "write/read: $sentinel" + else + fail "write/read: expected '$sentinel', got '$got'" + fi +} + +check_phase "PHASE:awaiting_ci" 
+check_phase "PHASE:awaiting_review" +check_phase "PHASE:needs_human" +check_phase "PHASE:done" +check_phase "PHASE:failed" + +# ── Test 3: write overwrites (not appends) ───────────────────────────────────── +echo "PHASE:awaiting_ci" > "$PHASE_FILE" +echo "PHASE:awaiting_review" > "$PHASE_FILE" +line_count=$(wc -l < "$PHASE_FILE") +if [ "$line_count" -eq 1 ]; then + ok "phase file overwrite (single line after two writes)" +else + fail "phase file should have 1 line, got $line_count" +fi + +# ── Test 4: failed phase with reason ────────────────────────────────────────── +printf 'PHASE:failed\nReason: %s\n' "shellcheck failed on ci.sh" > "$PHASE_FILE" +first_line=$(head -1 "$PHASE_FILE") +second_line=$(sed -n '2p' "$PHASE_FILE") +if [ "$first_line" = "PHASE:failed" ] && echo "$second_line" | grep -q "^Reason:"; then + ok "PHASE:failed with reason line" +else + fail "PHASE:failed format: first='$first_line' second='$second_line'" +fi + +# ── Test 5: orchestrator read function ──────────────────────────────────────── +read_phase() { + local pfile="$1" + # Allow cat to fail (missing file) — pipeline exits 0 via || true + { cat "$pfile" 2>/dev/null || true; } | head -1 | tr -d '[:space:]' +} + +echo "PHASE:awaiting_ci" > "$PHASE_FILE" +phase=$(read_phase "$PHASE_FILE") +if [ "$phase" = "PHASE:awaiting_ci" ]; then + ok "orchestrator read_phase() extracts first line" +else + fail "orchestrator read_phase() got: '$phase'" +fi + +# ── Test 6: missing file returns empty ──────────────────────────────────────── +rm -f "$PHASE_FILE" +phase=$(read_phase "$PHASE_FILE") +if [ -z "$phase" ]; then + ok "missing phase file returns empty string" +else + fail "missing phase file should return empty, got: '$phase'" +fi + +# ── Test 7: all valid phase values are recognized ───────────────────────────── +is_valid_phase() { + local p="$1" + case "$p" in + PHASE:awaiting_ci|PHASE:awaiting_review|PHASE:needs_human|PHASE:done|PHASE:failed) + return 0 ;; + *) + return 1 ;; + esac +} + +for p 
in "PHASE:awaiting_ci" "PHASE:awaiting_review" "PHASE:needs_human" \ + "PHASE:done" "PHASE:failed"; do + if is_valid_phase "$p"; then + ok "is_valid_phase: $p" + else + fail "is_valid_phase rejected valid phase: $p" + fi +done + +if ! is_valid_phase "PHASE:unknown"; then + ok "is_valid_phase rejects unknown phase" +else + fail "is_valid_phase should reject PHASE:unknown" +fi + +# ── Cleanup ─────────────────────────────────────────────────────────────────── +rm -f "$PHASE_FILE" + +# ── Summary ─────────────────────────────────────────────────────────────────── +echo "" +printf 'Results: %d passed, %d failed\n' "$PASS" "$FAIL" +if [ "$FAIL" -eq 0 ]; then + echo "All tests passed." + exit 0 +else + echo "Some tests failed." + exit 1 +fi diff --git a/docs/PHASE-PROTOCOL.md b/docs/PHASE-PROTOCOL.md new file mode 100644 index 0000000..b981a65 --- /dev/null +++ b/docs/PHASE-PROTOCOL.md @@ -0,0 +1,178 @@ +# Phase-Signaling Protocol for Persistent Claude Sessions + +## Overview + +When dev-agent runs Claude in a persistent tmux session (rather than a +one-shot `claude -p` invocation), Claude needs a way to signal the +orchestrator (`dev-poll.sh`) that a phase has completed. + +Claude writes a sentinel line to a **phase file** — a well-known path based +on project name and issue number. The orchestrator watches that file and +reacts accordingly. + +## Phase File Path Convention + +``` +/tmp/dev-session-{project}-{issue}.phase +``` + +Where: +- `{project}` = the project name from the TOML (`name` field), e.g. `harb` +- `{issue}` = the issue number, e.g. 
`42` + +Example: `/tmp/dev-session-harb-42.phase` + +## Phase Values + +Claude writes exactly one of these lines to the phase file when a phase ends: + +| Sentinel | Meaning | Orchestrator action | +|----------|---------|---------------------| +| `PHASE:awaiting_ci` | PR pushed, waiting for CI to run | Poll CI; inject result when done | +| `PHASE:awaiting_review` | CI passed, PR open, waiting for review | Wait for `review-poll` to inject feedback | +| `PHASE:needs_human` | Blocked on human decision | Send Matrix notification; wait for reply | +| `PHASE:done` | Work complete, PR merged | Verify merge, kill tmux session, clean up | +| `PHASE:failed` | Unrecoverable failure | Escalate to gardener/supervisor | + +### Writing a phase (from within Claude's session) + +```bash +PHASE_FILE="/tmp/dev-session-${PROJECT_NAME}-${ISSUE}.phase" + +# Signal awaiting CI +echo "PHASE:awaiting_ci" > "$PHASE_FILE" + +# Signal awaiting review +echo "PHASE:awaiting_review" > "$PHASE_FILE" + +# Signal needs human +echo "PHASE:needs_human" > "$PHASE_FILE" + +# Signal done +echo "PHASE:done" > "$PHASE_FILE" + +# Signal failure (sentinel on line 1, reason on line 2) +printf 'PHASE:failed\nReason: %s\n' "describe what failed" > "$PHASE_FILE" +``` + +The orchestrator reads with: + +```bash +# Take only the first line: PHASE:failed is followed by a Reason line +phase=$({ head -n 1 "$PHASE_FILE" 2>/dev/null || true; } | tr -d '[:space:]') +``` + +## Orchestrator Reaction Matrix + +``` +PHASE:awaiting_ci → poll CI every 30s + on success → inject "CI passed" into tmux session + on failure → inject CI error log into tmux session + on timeout → inject "CI timeout" + escalate + +PHASE:awaiting_review → wait for review-poll.sh to post review comment + on REQUEST_CHANGES → inject review text into session + on APPROVE → inject "Approved" into session + on timeout (3h) → inject "no review, escalating" + +PHASE:needs_human → send Matrix notification with issue/PR link + on reply → inject human reply into session + on timeout → re-notify, then escalate after 24h + +PHASE:done → verify PR merged on Codeberg + if merged → kill tmux session, clean labels, close issue + if 
not → inject "PR not merged yet" into session + +PHASE:failed → write escalation to supervisor/escalations-{project}.jsonl + kill tmux session + restore backlog label on issue +``` + +## Crash Recovery + +If the tmux session dies (Claude crash, OOM, kernel OOM-kill, compaction), the +orchestrator detects it and restarts the session in the same worktree. + +### Detection + +`dev-poll.sh` detects a crash via: +1. `tmux has-session -t "dev-{project}-{issue}"` returns non-zero, OR +2. Phase file is stale (not modified for more than `CLAUDE_TIMEOUT` seconds, with no `PHASE:done`) + +### Recovery procedure + +```bash +# 1. Read current state from disk +git_diff=$(git -C "$WORKTREE" diff origin/main..HEAD --stat 2>/dev/null) +# First line only: a PHASE:failed file also carries a Reason line +last_phase=$(head -n 1 "$PHASE_FILE" 2>/dev/null || echo "PHASE:unknown") +last_ci=$(cat "/tmp/ci-result-${PROJECT_NAME}-${ISSUE}.txt" 2>/dev/null || echo "") +review_comments=$(curl -sf ... "${API}/issues/${PR}/comments" | jq ...) + +# 2. Spawn new tmux session in same worktree +tmux new-session -d -s "dev-${PROJECT_NAME}-${ISSUE}" \ + -c "$WORKTREE" \ + "claude --dangerously-skip-permissions" + +# 3. Inject recovery context +tmux send-keys -t "dev-${PROJECT_NAME}-${ISSUE}" \ + "$(cat recovery-prompt.txt)" Enter +``` + +**Recovery context injected into new session:** +- Issue body (what to implement) +- `git diff` of work done so far (git is the checkpoint, not memory) +- Last known phase (where we left off) +- Last CI result (if phase was `awaiting_ci`) +- Latest review comments (if phase was `awaiting_review`) + +**Key principle:** Git is the checkpoint. The worktree persists across crashes. +Claude can read `git log`, `git diff`, and `git status` to understand exactly +what was done before the crash. No state needs to be stored beyond the phase +file and git history. 
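The two conditions listed under Detection could be checked roughly as follows. This is a sketch under stated assumptions, not the code in `dev-poll.sh`: the function names `phase_is_stale` and `session_crashed` are hypothetical, and the staleness check relies on a `find` that supports `-mmin` (GNU and BSD `find` both do), which only resolves whole minutes.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: succeeds when the phase file exists, does not record
# PHASE:done, and has not been modified for more than <timeout-seconds>.
phase_is_stale() {
  local pfile="$1" timeout_s="$2"
  [ -f "$pfile" ] || return 1
  local first
  first=$(head -n 1 "$pfile" | tr -d '[:space:]')
  [ "$first" != "PHASE:done" ] || return 1
  # find -mmin works in whole minutes, so round the timeout up.
  local minutes=$(( (timeout_s + 59) / 60 ))
  [ -n "$(find "$pfile" -mmin +"$minutes" 2>/dev/null)" ]
}

# Hypothetical helper: succeeds when either detection condition holds.
session_crashed() {
  local project="$1" issue="$2" timeout_s="$3"
  local pfile="/tmp/dev-session-${project}-${issue}.phase"
  # Condition 1: the tmux session no longer exists.
  if ! tmux has-session -t "dev-${project}-${issue}" 2>/dev/null; then
    return 0
  fi
  # Condition 2: the session exists but the phase file has gone stale.
  phase_is_stale "$pfile" "$timeout_s"
}
```

An orchestrator loop might call `session_crashed "$PROJECT_NAME" "$ISSUE" "$CLAUDE_TIMEOUT"` once per poll and run the recovery procedure when it succeeds.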
+ +### State files summary + +| File | Created by | Purpose | +|------|-----------|---------| +| `/tmp/dev-session-{proj}-{issue}.phase` | Claude (in session) | Current phase | +| `/tmp/ci-result-{proj}-{issue}.txt` | Orchestrator | Last CI output for injection | +| `/tmp/dev-{proj}-{issue}.log` | Orchestrator | Session transcript | +| `WORKTREE` (git worktree) | dev-agent.sh | Code checkpoint | + +## Sequence Diagram + +``` +Claude session Orchestrator (dev-poll.sh) +────────────── ────────────────────────── +implement issue +push PR branch +echo "PHASE:awaiting_ci" ───→ read phase file + poll CI + CI passes + ←── tmux send-keys "CI passed" +echo "PHASE:awaiting_review" → read phase file + wait for review-poll + review: REQUEST_CHANGES + ←── tmux send-keys "Review: ..." +address review comments +push fixes +echo "PHASE:awaiting_review" → read phase file + review: APPROVE + ←── tmux send-keys "Approved" +merge PR +echo "PHASE:done" ──────────→ read phase file + verify merged + kill session + close issue +``` + +## Notes + +- The phase file is write-once-per-phase (always overwritten with `>`). + The orchestrator reads it, acts, then waits for the next write. +- Claude should write the phase sentinel **as the last action** of each phase, + after any git push or other side effects are complete. +- If Claude writes `PHASE:failed`, it should include a reason on the next line: + ```bash + printf 'PHASE:failed\nReason: %s\n' "$reason" > "$PHASE_FILE" + ``` +- Phase files are cleaned up by the orchestrator after `PHASE:done` or + `PHASE:failed`.
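The read side of the contract described in these notes might look like the sketch below. `read_phase` mirrors the function of the same name in `dev/phase-test.sh`; `read_reason` and `next_action` are hypothetical, and the action names they print are placeholders for the reactions in the matrix, not real functions in `dev-poll.sh`.

```bash
#!/usr/bin/env bash
set -euo pipefail

# First line of the phase file with whitespace stripped; empty if missing.
read_phase() {
  { cat "$1" 2>/dev/null || true; } | head -n 1 | tr -d '[:space:]'
}

# Text after "Reason: " on line 2 (only written for PHASE:failed); empty otherwise.
read_reason() {
  sed -n '2s/^Reason: //p' "$1" 2>/dev/null || true
}

# Map a phase to a placeholder action name (hypothetical labels).
next_action() {
  case "$1" in
    PHASE:awaiting_ci)     echo "poll_ci" ;;
    PHASE:awaiting_review) echo "wait_for_review" ;;
    PHASE:needs_human)     echo "notify_matrix" ;;
    PHASE:done)            echo "verify_merge" ;;
    PHASE:failed)          echo "escalate" ;;
    "")                    echo "no_phase_yet" ;;
    *)                     echo "unknown_phase" ;;
  esac
}
```

Reading only the first line keeps the `Reason:` text out of the phase comparison, and an empty result separates "nothing written yet" from an unrecognised sentinel.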