Commit graph

34 commits

Author SHA1 Message Date
openhands
23949083c0 fix: Remove Matrix integration — notifications move to forge + OpenClaw (#732)
Remove all Matrix/Dendrite infrastructure:
- Delete lib/matrix_listener.sh (long-poll daemon), lib/matrix_listener.service
  (systemd unit), lib/hooks/on-stop-matrix.sh (response streaming hook)
- Remove matrix_send() and matrix_send_ctx() from lib/env.sh
- Remove MATRIX_HOMESERVER auto-detection, MATRIX_THREAD_MAP from lib/env.sh
- Remove [matrix] section parsing from lib/load-project.sh
- Remove Matrix hook installation from lib/agent-session.sh
- Remove notify/notify_ctx helpers and Matrix thread tracking from
  dev/dev-agent.sh and action/action-agent.sh
- Remove all matrix_send calls from dev-poll.sh, phase-handler.sh,
  action-poll.sh, vault-poll.sh, vault-fire.sh, vault-reject.sh,
  review-poll.sh, review-pr.sh, supervisor-poll.sh, formula-session.sh
- Remove Matrix listener startup from docker/agents/entrypoint.sh
- Remove append_dendrite_compose() and setup_matrix() from bin/disinto
- Remove --matrix flag from disinto init
- Clean Matrix references from .env.example, projects/*.toml.example,
  formulas/*.toml, AGENTS.md, BOOTSTRAP.md, README.md, RESOURCES.md,
  PHASE-PROTOCOL.md, and all agent AGENTS.md/PROMPT.md files

Status visibility now via Codeberg PR/issue activity. Human interaction
via vault items through forge. Proactive alerts via OpenClaw heartbeats.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 14:53:56 +00:00
openhands
af39b833af fix: Session lock must not block during idle phases (awaiting_review/awaiting_ci) (#724)
Restructure session.lock from command-wrapper flock to fd-based flock so
the lock can be released when Claude is idle and re-acquired before
injecting the next prompt.

- agent-session.sh: add session_lock_acquire/release helpers, open fd in
  create_agent_session instead of wrapping claude with flock, auto-acquire
  in agent_inject_into_session before injecting
- phase-handler.sh: call session_lock_release at start of awaiting_ci and
  awaiting_review handlers (Claude is idle during CI polling / review wait)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 10:11:50 +00:00
openhands
ff8d773d7a fix: use flock -w 300 instead of -n to queue concurrent agent sessions
Non-blocking flock (-n) silently drops work items when concurrent agents
race for the lock. Switch to -w 300 so sessions queue up to 5 minutes,
and single-quote the lock path to handle spaces in $HOME.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 17:54:48 +00:00
openhands
cf6400e8f3 fix: shared Claude OAuth credentials in containers — mount + flock to prevent token rotation race (#693)
- Make ~/.claude volume mount read-write (was :ro) so containers can
  write back refreshed OAuth tokens
- Wrap Claude CLI in flock(1) inside tmux sessions using
  ~/.claude/session.lock — prevents concurrent token refresh races
  across agents sharing the same credentials
- Add ANTHROPIC_API_KEY detection in entrypoint.sh: when set, skips
  OAuth entirely (no rotation issues, metered billing)
- Log active auth method (API key vs OAuth vs missing) at container
  startup for easier 401 debugging
- Document 'claude auth login' requirement in disinto init output

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 17:48:21 +00:00
openhands
742b64e743 fix: stop hook should nudge Claude when PHASE file is empty — prevents silent exit without PHASE:done (#585)
When Claude finishes a response but hasn't written to the PHASE file,
the stop hook now injects a nudge into the tmux session instead of just
marking idle. This gives Claude another chance to complete the phase
protocol before the monitor loop times out.

Key changes:
- on-idle-stop.sh: check phase file emptiness, nudge via tmux (max 2)
- agent-session.sh: pass phase_file + session to stop hook, clean up
  nudge counter on session teardown

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 09:56:49 +00:00
openhands
b769eaa182 fix: monitor_phase_loop docstring lists 'break' as a possible _MONITOR_LOOP_EXIT value but it is never set (#435)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 09:39:48 +00:00
openhands
575ab427d2 Revert "Merge pull request 'fix: inject skipDangerousModePermissionPrompt into worktree settings (#514)' (#522) from fix/agent-session-skip-permissions into main"
This reverts commit 0631b71aa5, reversing
changes made to 93d8249d3a.
2026-03-21 20:48:41 +00:00
johba
0631b71aa5 Merge pull request 'fix: inject skipDangerousModePermissionPrompt into worktree settings (#514)' (#522) from fix/agent-session-skip-permissions into main 2026-03-21 20:54:02 +01:00
openhands
bc6fe1beee fix: inject skipDangerousModePermissionPrompt into worktree settings
Project-level .claude/settings.json overrides global ~/.claude/settings.json.
When agent-session.sh creates settings with hooks but without the skip flag,
Claude shows an interactive bypass-permissions confirmation dialog that blocks
all non-interactive tmux agent sessions.

Fixes #514.
2026-03-21 19:50:49 +00:00
openhands
5822dc89d9 fix: feat: unified escalation — single PHASE:escalate path for all agents (#510)
Replace PHASE:needs_human with PHASE:escalate across all agent types.
Consolidates 6 overlapping escalation mechanisms into one unified path:
detect → notify via Matrix → session stays alive → human reply injected → resume.

Key changes:
- PHASE:escalate replaces PHASE:needs_human everywhere (16 files)
- CI exhausted now escalates instead of immediately marking blocked
- Matrix listener routes free-text replies to vault tmux sessions
- Vault agent writes PHASE:escalate files for procurement requests
- Supervisor monitors PHASE:escalate sessions in health checks
- 24h timeout on escalation → blocked label + session killed
- All 38 phase protocol tests updated and passing

Supersedes #462, #458, #465.
2026-03-21 19:39:04 +00:00
openhands
f6dd91389f fix: PreToolUse guard — allow formula agents to access FACTORY_ROOT from worktrees (#487)
- Add session name as third arg to guard hook (passed from agent-session.sh)
- Detect formula sessions (supervisor-*, gardener-*, planner-*, predictor-*)
- Guard 6: block filesystem access to factory root from worktrees, exempt formulas
- Guard 7: restrict system commands (kill, docker, tmux) to supervisor only
- Guard 2: allow formula agents rm -rf within factory root

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 18:09:28 +00:00
openhands
1604d1e062 fix: idle_pane_count not reset in phase-changed branch of monitor_phase_loop (#436)
When a phase change is detected (mtime changes), idle_elapsed was reset
but idle_pane_count was not. This meant idle counts accumulated before a
phase write carried into subsequent polls, so N consecutive idle polls
could be reached with fewer than N actual consecutive idle polls.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 17:14:54 +00:00
openhands
ab122c9701 fix: PHASE:needs_human missing from crash-path terminal set in monitor_phase_loop (#342)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 03:50:21 +00:00
openhands
e3895ad3ac fix: feat: SessionStart compact hook re-injects phase protocol after context compaction (#274)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 23:27:32 +00:00
openhands
898f958196 fix: add model=opus to run-planner formula and wire through action-agent
TOML declares model = "opus". planner-poll.sh includes model: opus in
the issue YAML front matter. action-agent.sh extracts it and exports
CLAUDE_MODEL. create_agent_session passes --model to claude if set.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 09:08:17 +00:00
openhands
f66fcd666c fix: address review — terminal phase guard, explicit marker var, test coverage
- Guard against overwriting terminal phases (PHASE:done, PHASE:merged)
  in on-stop-failure.sh to prevent false failures from same-turn race
- Declare sf_phase_marker explicitly in StopFailure block instead of
  relying on phase_marker from PostToolUse block
- Add authentication_failed test (10c) and terminal phase guard tests
  (10g, 10h)
- Fix fragile nested command substitution in test 10f fail() message

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 01:52:46 +00:00
openhands
eaf2841494 fix: feat: StopFailure hook writes phase file on API error / rate limit (#275)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 01:43:00 +00:00
openhands
0a095c4656 fix: feat: SessionEnd hook for guaranteed cleanup on session exit (#276)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 01:11:26 +00:00
openhands
de8dcef81e fix: feat: PreToolUse hook guards destructive operations in dev-agent sessions (#277)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 00:10:27 +00:00
openhands
424a53c9f7 fix: feat: stream action-agent Claude output to Matrix thread (#293)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 23:43:29 +00:00
openhands
63d4179f38 fix: Deduplicate hook entries in settings.json on repeated create_agent_session calls (#299)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 21:19:58 +00:00
openhands
809dd93c3b fix: distinguish phase file writes from reads in PostToolUse hook
- Parse tool_name via jq: Write tool checks file_path match,
  Bash tool checks for redirect operator (>) with phase file path
- Reads (cat, head) no longer trigger false-positive markers
- Split guard into separate statements for clarity
- Move marker cleanup inside hook-install guard
- Expand tests: 5 cases covering Bash write, Write tool, Bash read,
  unrelated Bash, and Write to different file

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 18:14:49 +00:00
openhands
ac04dc29a6 fix: feat: PostToolUse hook detects phase file writes in real-time (eliminates polling latency) (#278)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 17:55:06 +00:00
johba
ab3efa2402 fix: replace fragile pane grep with Stop hook for idle detection (#272)
## Summary

- Claude Code v2.1.79 permanently shows `❯` in the input area even while actively thinking, causing `monitor_phase_loop` to false-positive on idle detection and kill working sessions after 90 seconds
- Replace `tmux capture-pane | grep ❯` with a Claude Code Stop hook (`lib/hooks/on-idle-stop.sh`) that writes a marker file only when Claude actually finishes responding
- Hook is installed per-worktree in `.claude/settings.json` by `create_agent_session`; marker cleaned up on inject/kill

## Test plan

- [x] Verified hook installs correctly in fresh worktree
- [x] Verified marker file appears only after Claude finishes responding (not during active thinking)
- [x] Verified live dev-agent session picks up fix and Claude works without being killed
- [x] Verified `agent_inject_into_session` clears marker before new work

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: openhands <openhands@all-hands.dev>
Reviewed-on: https://codeberg.org/johba/disinto/pulls/272
2026-03-19 14:57:54 +01:00
johba
d5c2c213a3 fix: bug: gardener hangs forever when Claude finishes without writing phase file (#261) (#263)
Fixes #261

## Changes
Fixed gardener hanging forever when Claude skips phase protocol. Three changes: (1) gardener-agent.sh: replaced 999999s timeout with 7200s (2h, matching dev-agent); (2) lib/agent-session.sh: added idle-prompt detection to monitor_phase_loop — if Claude returns to the ❯ prompt for 3 consecutive polls with no phase file written, exits immediately with _MONITOR_LOOP_EXIT=idle_prompt (only fires when phase file is empty, so awaiting_ci/review waits are unaffected); (3) gardener prompt: removed 'no time limit' wording, replaced with explicit phase-write requirement.

Co-authored-by: openhands <openhands@all-hands.dev>
Reviewed-on: https://codeberg.org/johba/disinto/pulls/263
Reviewed-by: Disinto_bot <disinto_bot@noreply.codeberg.org>
2026-03-19 13:47:10 +01:00
openhands
e853949b47 fix: Callbacks can't see the resolved _session from monitor_phase_loop (#200)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-19 01:05:21 +00:00
openhands
e0f37ede12 fix: agent-session.sh: monitor_phase_loop should accept SESSION_NAME as a parameter (#187)
Add optional 4th parameter to monitor_phase_loop for SESSION_NAME,
falling back to the $SESSION_NAME global for backwards-compatibility.
Document the full function signature in both the file header and inline comment.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 20:15:14 +00:00
openhands
61c518116a fix: fix: phase-handler.sh references LAST_PHASE_MTIME which is now internal to monitor_phase_loop (#181)
Export LAST_PHASE_MTIME from monitor_phase_loop before invoking the callback
so that phase-handler.sh can compare phase file mtimes inside the awaiting_review
inner poll loop without hitting an unbound variable error under set -u.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 17:26:54 +00:00
openhands
55b5628e24 feat: CI smoke test — syntax check + function resolution for all agent scripts (#177)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 17:11:02 +00:00
openhands
014965494b fix: implement monitor_phase_loop in agent-session.sh
Generic phase monitoring loop with callback for agent-specific handling.
Handles: idle timeout, session crash detection, terminal phases (done,
failed, needs_human, merged). Sets _MONITOR_LOOP_EXIT for the caller.
2026-03-18 16:40:51 +00:00
openhands
d83098f382 fix: pass SESSION_NAME to all agent-session.sh function calls
Library functions need explicit session name argument — they no longer
have closure over $SESSION_NAME from the parent script.

- agent_kill_session: add $SESSION_NAME to all 11 call sites
- agent_inject_into_session: add $SESSION_NAME to all call sites in
  phase-handler.sh and gardener-agent.sh
- agent_kill_session: guard against missing arg (defensive)
2026-03-18 16:24:58 +00:00
openhands
350acccd8b fix: add create_agent_session and inject_formula to agent-session.sh
Both dev-agent.sh and gardener-agent.sh call these functions but they
were never implemented during the #158 extraction. Adds:
- create_agent_session(session, workdir) — tmux + claude + wait for ready
- inject_formula(session, text) — alias for agent_inject_into_session
2026-03-18 16:21:05 +00:00
johba
6d5cc4458f fix: feat: gardener-agent.sh — tmux + Claude interactive gardener using agent-session.sh (#159) (#163)
Fixes #159

## Changes
Add gardener-agent.sh (tmux+Claude) and lib/agent-session.sh (shared helpers). gardener-poll.sh slimmed to cron wrapper; grooming delegated to new agent; recipe engine for CI escalations unchanged.

Co-authored-by: openhands <openhands@all-hands.dev>
Reviewed-on: https://codeberg.org/johba/disinto/pulls/163
Reviewed-by: review_bot <review_bot@noreply.codeberg.org>
2026-03-18 16:21:07 +01:00
openhands
cbd8c81da8 refactor: extract lib/agent-session.sh — reusable tmux + Claude agent runtime (#158)
Move generic agent infrastructure from dev/dev-agent.sh into lib/agent-session.sh:
- log, status, notify, notify_ctx, read_phase, wait_for_claude_ready,
  inject_into_session, kill_tmux_session extracted verbatim
- create_agent_session(session_name, workdir) — new: tmux session creation
- inject_formula(session_name, formula_text, context) — new: prompt injection
- monitor_phase_loop(phase_file, idle_timeout, callback_fn) — new: phase loop
  with session health check, crash recovery, and idle timeout detection

dev-agent.sh: sources the library, implements _on_phase_change() callback,
calls monitor_phase_loop(); idle-timeout and crash-recovery-failed cleanup
handled via _MONITOR_LOOP_EXIT signal variable. Behavior unchanged.
2026-03-18 14:36:36 +00:00