fix: feat: predictor v3 — abstract adversary with explore/exploit and formula dispatch (#609)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
parent 537a4ae567
commit 14e1c9ecde
3 changed files with 191 additions and 386 deletions
@@ -1,22 +1,26 @@
<!-- last-reviewed: eb7e24cb1df028c6061f47ddfdf9b4ebec33e1cf -->
# Predictor Agent

**Role**: Risk oracle and opportunity spotter (the "goblin"). Runs a 4-step
formula (preflight → collect-signals → re-evaluate-backlog → analyze-and-predict)
via interactive tmux Claude session (sonnet). Collects three categories of signals:
**Role**: Abstract adversary (the "goblin"). Runs a 2-step formula
(preflight → find-weakness-and-act) via interactive tmux Claude session
(sonnet). Finds the project's biggest weakness, challenges planner claims,
and generates evidence through explore/exploit decisions:

1. **Health signals** — CI pipeline trends (Woodpecker), stale issues, agent
health (tmux sessions + logs), resource patterns (RAM, disk, load, containers)
2. **Outcome signals** — output freshness (formula journals/artifacts), capacity
utilization (idle agents vs dispatchable backlog), throughput (closed issues,
merged PRs, churn detection)
3. **External signals** — dependency security advisories, upstream breaking
changes, deprecation notices, ecosystem shifts (via targeted web search)
- **Explore** (low confidence) — file a `prediction/unreviewed` issue for
the planner to triage
- **Exploit** (high confidence) — file a prediction AND dispatch a formula
via an `action` issue to generate evidence before the planner even runs

Files up to 5 `prediction/unreviewed` issues for the Planner to triage.
Predictions cover both "things going wrong" and "opportunities being missed".
The predictor MUST NOT emit feature work — only observations about health,
outcomes, and external risks/opportunities.
The predictor's own prediction history (open + closed issues) serves as its
memory — it reviews what was actioned, dismissed, or deferred to decide where
to focus next. No hardcoded signal categories; Claude decides where to look
based on available data: prerequisite tree, evidence directories, VISION.md,
RESOURCES.md, open issues, agent logs, and external signals (via web search).

Files up to 5 actions per run (predictions + dispatches combined). Each
exploit counts as 2 (prediction + action dispatch). The predictor MUST NOT
emit feature work — only observations challenging claims, exposing gaps,
and surfacing risks.
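
The per-run budget described above (at most 5 actions, with each exploit costing 2) can be sketched in shell. This is a minimal illustration only: the commit states the rule in prose, and the names `MAX_ACTIONS`, `action_cost`, and `plan_fits` are hypothetical and do not appear in the repository.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the action budget; names are illustrative assumptions.

MAX_ACTIONS=5   # from the text: up to 5 actions per run

# action_cost KIND: print the budget cost of one decision.
action_cost() {
  case "$1" in
    explore) echo 1 ;;   # one prediction issue
    exploit) echo 2 ;;   # prediction issue + action dispatch
    *) return 1 ;;       # unknown decision kind
  esac
}

# plan_fits KIND...: succeed if all listed decisions fit in the budget.
plan_fits() {
  local total=0 kind cost
  for kind in "$@"; do
    cost=$(action_cost "$kind") || return 2
    total=$((total + cost))
  done
  [ "$total" -le "$MAX_ACTIONS" ]
}
```

Under these assumptions, two exploits plus one explore use the full budget of 5, while three exploits (cost 6) would not fit.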

**Trigger**: `predictor-run.sh` runs daily at 06:00 UTC via cron (1h before
the planner at 07:00). Guarded by PID lock (`/tmp/predictor-run.lock`) and
@@ -27,22 +31,21 @@ memory check (skips if available RAM < 2000 MB).
sources disinto project config, builds prompt with formula + Codeberg API
reference, creates tmux session (sonnet), monitors phase file, handles crash
recovery via `run_formula_and_monitor`
- `formulas/run-predictor.toml` — Execution spec: four steps (preflight,
collect-signals, re-evaluate-backlog, analyze-and-predict) with `needs`
dependencies. Claude collects signals, re-evaluates watched predictions,
and files prediction issues in a single interactive session
- `formulas/run-predictor.toml` — Execution spec: two steps (preflight,
find-weakness-and-act) with `needs` dependencies. Claude reviews prediction
history, explores/exploits weaknesses, and files issues in a single
interactive session

**Environment variables consumed**:
- `CODEBERG_TOKEN`, `CODEBERG_REPO`, `CODEBERG_API`, `PROJECT_NAME`, `PROJECT_REPO_ROOT`
- `PRIMARY_BRANCH`, `CLAUDE_MODEL` (set to sonnet by predictor-run.sh)
- `WOODPECKER_TOKEN`, `WOODPECKER_SERVER` — CI pipeline trend queries (optional; skipped if unset)
- `MATRIX_TOKEN`, `MATRIX_ROOM_ID`, `MATRIX_HOMESERVER` — Notifications (optional)

**Lifecycle**: predictor-run.sh (daily 06:00 cron) → lock + memory guard →
load formula + context → create tmux session → Claude collects signals
(health: CI trends, stale issues, agent health, resources; outcomes: output
freshness, capacity utilization, throughput; external: dependency advisories,
ecosystem changes via web search) → dedup against existing open predictions →
re-evaluate prediction/backlog watches (close stale, supersede changed) →
file `prediction/unreviewed` issues → `PHASE:done`.
load formula + context (AGENTS.md, RESOURCES.md, VISION.md, prerequisite-tree.md)
→ create tmux session → Claude fetches prediction history (open + closed) →
reviews track record (actioned/dismissed/watching) → finds weaknesses
(prerequisite tree gaps, thin evidence, stale watches, external risks) →
dedup against existing open predictions → explore (file prediction) or exploit
(file prediction + dispatch formula via action issue) → `PHASE:done`.
The planner's Phase 1 later triages these predictions.
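
The "lock + memory guard" step at the start of the lifecycle might look roughly like the sketch below. It is not the code in `predictor-run.sh`, which this diff does not show: the helper names (`has_enough_memory`, `available_mb`, `lock_is_held`) are hypothetical, and reading `MemAvailable` from `/proc/meminfo` is an assumption that only holds on Linux. Only the 2000 MB threshold and the lock path come from the text.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a PID lock plus memory guard; not the real script.

MIN_AVAILABLE_MB=2000   # from the text: skip the run below 2000 MB available

# has_enough_memory MB: succeed when MB meets the threshold.
has_enough_memory() {
  [ "$1" -ge "$MIN_AVAILABLE_MB" ]
}

# available_mb: print available RAM in MB (assumes Linux /proc/meminfo).
available_mb() {
  awk '/^MemAvailable:/ { print int($2 / 1024) }' /proc/meminfo
}

# lock_is_held LOCKFILE: succeed only if the file names a live process.
lock_is_held() {
  local lockfile=$1 pid
  [ -f "$lockfile" ] || return 1
  pid=$(cat "$lockfile" 2>/dev/null) || return 1
  # kill -0 sends no signal; it only checks that the PID exists.
  [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}
```

A caller could exit early when `lock_is_held /tmp/predictor-run.lock` succeeds or when `has_enough_memory "$(available_mb)"` fails, and otherwise write `$$` to the lock file before starting the session. Checking that the recorded PID is still alive lets a stale lock left by a crashed run be ignored.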
@@ -45,7 +45,7 @@ log "--- Predictor run start ---"

# ── Load formula + context ───────────────────────────────────────────────
load_formula "$FACTORY_ROOT/formulas/run-predictor.toml"
build_context_block AGENTS.md RESOURCES.md
build_context_block AGENTS.md RESOURCES.md VISION.md planner/prerequisite-tree.md

# ── Read scratch file (compaction survival) ───────────────────────────────
SCRATCH_CONTEXT=$(read_scratch_context "$SCRATCH_FILE")
@@ -57,16 +57,16 @@ build_prompt_footer
# shellcheck disable=SC2034 # consumed by run_formula_and_monitor
PROMPT="You are the prediction agent (goblin) for ${CODEBERG_REPO}. Work through the formula below. You MUST write PHASE:done to '${PHASE_FILE}' when finished — the orchestrator will time you out if you return to the prompt without signalling.

Your role: spot patterns across three signal categories and file them as prediction issues:
1. Health signals — CI trends, agent status, resource pressure, stale issues
2. Outcome signals — output freshness, capacity utilization, throughput
3. External signals — dependency advisories, upstream changes, ecosystem shifts
Your role: abstract adversary. Find the project's biggest weakness, challenge
planner claims, and generate evidence. Explore when uncertain (file a prediction),
exploit when confident (file a prediction AND dispatch a formula via an action issue).

Your prediction history IS your memory — review it to decide where to focus.
The planner (adult) will triage every prediction before acting.
You MUST NOT emit feature work or implementation issues — only predictions
about health, outcomes, and external risks/opportunities.
challenging claims, exposing gaps, and surfacing risks.
Use WebSearch for external signal scanning — be targeted (project dependencies
and tools only, not general news). Limit to 5 web searches per run.
and tools only, not general news). Limit to 3 web searches per run.

## Project context
${CONTEXT_BLOCK}