disinto/predictor/AGENTS.md
openhands 23949083c0 fix: Remove Matrix integration — notifications move to forge + OpenClaw (#732)
2026-03-26 14:53:56 +00:00

Predictor Agent

Role: Abstract adversary (the "goblin"). Runs a 2-step formula (preflight → find-weakness-and-act) via interactive tmux Claude session (sonnet). Finds the project's biggest weakness, challenges planner claims, and generates evidence through explore/exploit decisions:

  • Explore (low confidence) — file a prediction/unreviewed issue for the planner to triage
  • Exploit (high confidence) — file a prediction AND dispatch a formula via an action issue to generate evidence before the planner even runs

The predictor's own prediction history (open + closed issues) serves as its memory — it reviews what was actioned, dismissed, or deferred to decide where to focus next. No hardcoded signal categories; Claude decides where to look based on available data: prerequisite tree, evidence directories, VISION.md, RESOURCES.md, open issues, agent logs, and external signals (via web search).

Files up to 5 actions per run (predictions + dispatches combined); each exploit counts as 2 (prediction + action dispatch). The predictor MUST NOT emit feature work, only observations that challenge claims, expose gaps, and surface risks.
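The budget arithmetic above can be sketched in shell. This is a minimal illustration only: `MAX_ACTIONS`, `action_cost`, and `budget_used` are hypothetical names that do not come from the repository, and in practice the limit is applied by Claude following the formula rather than by a script.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of the repository): tally the per-run
# action budget described above.
MAX_ACTIONS=5   # cap stated in the text

# action_cost KIND -> prints how many actions one decision consumes
action_cost() {
  case "$1" in
    explore) echo 1 ;;   # prediction only
    exploit) echo 2 ;;   # prediction + action dispatch
    *) echo "unknown decision: $1" >&2; return 1 ;;
  esac
}

# budget_used KIND... -> prints the total cost; returns non-zero when the
# total exceeds MAX_ACTIONS or a decision kind is unknown
budget_used() {
  local total=0 kind cost
  for kind in "$@"; do
    cost=$(action_cost "$kind") || return 1
    total=$((total + cost))
  done
  echo "$total"
  [ "$total" -le "$MAX_ACTIONS" ]
}
```

For example, `budget_used exploit exploit explore` prints 5 and succeeds, while three exploits total 6 and the function returns non-zero.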

Trigger: predictor-run.sh runs daily at 06:00 UTC via cron (1h before the planner at 07:00). Sources lib/guard.sh and calls check_active predictor first; skips the run if $FACTORY_ROOT/state/.predictor-active is absent. Also guarded by a PID lock (/tmp/predictor-run.lock) and a memory check (skips if available RAM < 2000 MB).
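The three guards could look roughly like the sketch below. The function names `is_active`, `acquire_lock`, and `has_memory` are hypothetical stand-ins; the real `check_active` lives in lib/guard.sh and may behave differently. The lock here is a simple PID file and is not atomic, and reading `MemAvailable` from /proc/meminfo is an assumption about how available RAM is measured.

```shell
#!/usr/bin/env bash
# Sketch of the pre-run guards. All function names are hypothetical.
# A matching crontab entry, assuming the host clock is UTC, might be:
#   0 6 * * * /path/to/predictor/predictor-run.sh

# Guard 1: the active-state marker must exist.
is_active() {   # is_active FACTORY_ROOT AGENT
  [ -e "$1/state/.$2-active" ]
}

# Guard 2: PID lock. Succeeds when the lock is taken; fails when the PID
# recorded in the lock file still belongs to a live process.
acquire_lock() {   # acquire_lock LOCKFILE
  local lock="$1" pid
  if [ -f "$lock" ]; then
    pid=$(cat "$lock" 2>/dev/null)
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
      return 1   # another run is still alive
    fi
  fi
  echo "$$" > "$lock"   # missing or stale lock: take it
}

# Guard 3: memory. Reads MemAvailable (kB) from a meminfo-style file and
# succeeds only when at least MIN_MB megabytes are available.
has_memory() {   # has_memory MIN_MB [MEMINFO_FILE]
  local min_mb="$1" file="${2:-/proc/meminfo}" avail_kb
  avail_kb=$(awk '/^MemAvailable:/ {print $2}' "$file")
  [ -n "$avail_kb" ] && [ $((avail_kb / 1024)) -ge "$min_mb" ]
}
```

A wrapper would chain them in the order given above, for instance `is_active "$FACTORY_ROOT" predictor && acquire_lock /tmp/predictor-run.lock && has_memory 2000 || exit 0`, so that any failing guard skips the run quietly.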

Key files:

  • predictor/predictor-run.sh — Cron wrapper + orchestrator: active-state guard, lock, memory guard, sources disinto project config, builds structural analysis via lib/formula-session.sh:build_graph_section() (full-project scan — results included in prompt as ## Structural analysis; failures non-fatal), builds prompt with formula + forge API reference, creates tmux session (sonnet), monitors phase file, handles crash recovery via run_formula_and_monitor
  • formulas/run-predictor.toml — Execution spec: two steps (preflight, find-weakness-and-act) with needs dependencies. Claude reviews prediction history, explores/exploits weaknesses, and files issues in a single interactive session

Environment variables consumed:

  • FORGE_TOKEN, FORGE_REPO, FORGE_API, PROJECT_NAME, PROJECT_REPO_ROOT
  • PRIMARY_BRANCH, CLAUDE_MODEL (set to sonnet by predictor-run.sh)
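A pre-flight check for these variables might look like the sketch below. The variable names are taken from the list above; `REQUIRED_VARS` and `missing_env` are hypothetical and not part of the repository. CLAUDE_MODEL is left out because predictor-run.sh sets it itself.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check: fail early when a required variable is
# unset or empty.
REQUIRED_VARS="FORGE_TOKEN FORGE_REPO FORGE_API PROJECT_NAME PROJECT_REPO_ROOT PRIMARY_BRANCH"

# missing_env -> prints the name of every unset-or-empty variable, one per
# line, and returns non-zero if any is missing
missing_env() {
  local name value missing=0
  for name in $REQUIRED_VARS; do
    # Indirect lookup; safe because the names come from the fixed list above.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name"
      missing=1
    fi
  done
  [ "$missing" -eq 0 ]
}
```

Called near the top of a wrapper script, `missing_env >&2 || exit 1` would report the missing names and stop before any tmux session is created.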

Lifecycle: predictor-run.sh (daily 06:00 cron) → active-state guard + lock + memory guard → load formula + context (AGENTS.md, RESOURCES.md, VISION.md, prerequisite-tree.md) → create tmux session → Claude fetches prediction history (open + closed) → reviews track record (actioned/dismissed/watching) → finds weaknesses (prerequisite tree gaps, thin evidence, stale watches, external risks) → dedup against existing open predictions → explore (file prediction) or exploit (file prediction + dispatch formula via action issue) → PHASE:done. The planner's Phase 1 later triages these predictions.
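The final step of the lifecycle, waiting for PHASE:done, can be illustrated with a small polling loop. This is a sketch under assumptions: it treats the phase file as plain text in which the session writes a line reading exactly `PHASE:done`, and the names `phase_is_done` and `wait_for_done` are hypothetical. The actual monitoring and crash recovery are handled by run_formula_and_monitor.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of phase-file polling; the file format is an assumption.

# phase_is_done PHASE_FILE -> succeeds once the file has a PHASE:done line
phase_is_done() {
  [ -f "$1" ] && grep -qx 'PHASE:done' "$1"
}

# wait_for_done PHASE_FILE TIMEOUT_S [INTERVAL_S]
# Polls until the phase file reports done; returns non-zero on timeout.
wait_for_done() {
  local file="$1" timeout="$2" interval="${3:-5}" waited=0
  while ! phase_is_done "$file"; do
    [ "$waited" -ge "$timeout" ] && return 1   # give up
    sleep "$interval"
    waited=$((waited + interval))
  done
}
```

A timeout keeps a crashed or stalled session from blocking the wrapper indefinitely; the caller decides what recovery means when `wait_for_done` returns non-zero.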