From 514de48f583dfd13087d2446e707b750717551c7 Mon Sep 17 00:00:00 2001 From: Claude Date: Tue, 7 Apr 2026 18:05:41 +0000 Subject: [PATCH 1/4] chore: gardener housekeeping 2026-04-07 --- AGENTS.md | 2 +- architect/AGENTS.md | 2 +- dev/AGENTS.md | 4 ++-- gardener/AGENTS.md | 2 +- gardener/pending-actions.json | 8 +++++++- lib/AGENTS.md | 8 ++++---- planner/AGENTS.md | 2 +- predictor/AGENTS.md | 2 +- review/AGENTS.md | 2 +- supervisor/AGENTS.md | 2 +- 10 files changed, 20 insertions(+), 14 deletions(-) diff --git a/AGENTS.md b/AGENTS.md index d68b85a..78f1c29 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -1,4 +1,4 @@ - + # Disinto — Agent Instructions ## What this repo is diff --git a/architect/AGENTS.md b/architect/AGENTS.md index 64521ed..64b325e 100644 --- a/architect/AGENTS.md +++ b/architect/AGENTS.md @@ -1,4 +1,4 @@ - + # Architect — Agent Instructions ## What this agent is diff --git a/dev/AGENTS.md b/dev/AGENTS.md index 3d649b9..e8a0ead 100644 --- a/dev/AGENTS.md +++ b/dev/AGENTS.md @@ -1,4 +1,4 @@ - + # Dev Agent **Role**: Implement issues autonomously — write code, push branches, address @@ -14,7 +14,7 @@ in-progress issues are also picked up. The direct-merge scan runs before the loc check so approved PRs get merged even while a dev-agent session is active. **Key files**: -- `dev/dev-poll.sh` — Cron scheduler: finds next ready issue, handles merge/rebase of approved PRs, tracks CI fix attempts. Formula guard skips issues labeled `formula`, `prediction/dismissed`, or `prediction/unreviewed`. **Race prevention**: checks issue assignee before claiming — skips if assigned to a different bot user. **Stale branch abandonment**: closes PRs and deletes branches that are behind `$PRIMARY_BRANCH` (restarts poll cycle for a fresh start). **Stale in-progress recovery**: on each poll cycle, scans for issues labeled `in-progress`. If an issue has no assignee, no open PR, and no agent lock file — removes `in-progress`, adds `blocked` with a human-triage comment. If the issue has an assignee, trusts active work and skips (agent may be running in another container). +- `dev/dev-poll.sh` — Cron scheduler: finds next ready issue, handles merge/rebase of approved PRs, tracks CI fix attempts. Formula guard skips issues labeled `formula`, `prediction/dismissed`, or `prediction/unreviewed`. **Race prevention**: checks issue assignee before claiming — skips if assigned to a different bot user. **Stale branch abandonment**: closes PRs and deletes branches that are behind `$PRIMARY_BRANCH` (restarts poll cycle for a fresh start). **Stale in-progress recovery**: on each poll cycle, scans for issues labeled `in-progress`. If the issue is assigned to `$BOT_USER` (this agent), sets `BLOCKED_BY_INPROGRESS=true` — my thread is busy. If assigned to another agent, logs and falls through (does not block). If no assignee, no open PR, and no agent lock file — removes `in-progress`, adds `blocked` with a human-triage comment. **Per-agent open-PR gate**: before starting new work, filters open waiting PRs to only those assigned to this agent (`$BOT_USER`). Other agents' PRs do not block this agent's pipeline (#358, #369). - `dev/dev-agent.sh` — Orchestrator: claims issue, creates worktree + tmux session with interactive `claude`, monitors phase file, injects CI results and review feedback, merges on approval - `dev/phase-test.sh` — Integration test for the phase protocol diff --git a/gardener/AGENTS.md b/gardener/AGENTS.md index f898e63..2a5dcb3 100644 --- a/gardener/AGENTS.md +++ b/gardener/AGENTS.md @@ -1,4 +1,4 @@ - + # Gardener Agent **Role**: Backlog grooming — detect duplicate issues, missing acceptance diff --git a/gardener/pending-actions.json b/gardener/pending-actions.json index fe51488..a148369 100644 --- a/gardener/pending-actions.json +++ b/gardener/pending-actions.json @@ -1 +1,7 @@ -[] +[ + { + "action": "edit_body", + "issue": 356, + "body": "## Problem\n\nThe entrypoint hardcodes `REPRODUCE_FORMULA` to `formulas/reproduce.toml` (line 26) and never checks the `DISINTO_FORMULA` environment variable passed by the dispatcher for triage runs.\n\nThe dispatcher sets `-e DISINTO_FORMULA=triage` for triage dispatch, but the entrypoint ignores it — always running the reproduce formula.\n\n## Fix\n\nAt line 26, select the formula based on `DISINTO_FORMULA`:\n\n```bash\ncase \"${DISINTO_FORMULA:-reproduce}\" in\n triage)\n ACTIVE_FORMULA=\"${DISINTO_DIR}/formulas/triage.toml\"\n ;;\n *)\n ACTIVE_FORMULA=\"${DISINTO_DIR}/formulas/reproduce.toml\"\n ;;\nesac\n```\n\nThen use `ACTIVE_FORMULA` everywhere `REPRODUCE_FORMULA` is currently used.\n\nAlso update log messages to reflect which formula is running (\"Starting triage-agent\" vs \"Starting reproduce-agent\").\n\n## Affected files\n\n- `docker/reproduce/entrypoint-reproduce.sh` — line 26 and all references to REPRODUCE_FORMULA\n\n## Acceptance criteria\n\n- [ ] `DISINTO_FORMULA=triage` selects `formulas/triage.toml` in the entrypoint\n- [ ] `DISINTO_FORMULA=reproduce` (or unset) still runs `formulas/reproduce.toml`\n- [ ] Log messages reflect which formula is active (\"Starting triage-agent\" / \"Starting reproduce-agent\")\n- [ ] All `REPRODUCE_FORMULA` references replaced with `ACTIVE_FORMULA`\n" + } +] diff --git a/lib/AGENTS.md b/lib/AGENTS.md index e684824..a70e9a7 100644 --- a/lib/AGENTS.md +++ b/lib/AGENTS.md @@ -1,4 +1,4 @@ - + # Shared Helpers (`lib/`) All agents source `lib/env.sh` as their first action. Additional helpers are @@ -6,7 +6,7 @@ sourced as needed. | File | What it provides | Sourced by | |---|---|---| -| `lib/env.sh` | Loads `.env`, sets `FACTORY_ROOT`, exports project config (`FORGE_REPO`, `PROJECT_NAME`, etc.), defines `log()`, `forge_api()`, `forge_api_all()` (paginates all pages; accepts optional second TOKEN parameter, defaults to `$FORGE_TOKEN`; handles invalid/empty JSON responses gracefully — returns empty on parse error instead of crashing), `woodpecker_api()`, `wpdb()`, `memory_guard()` (skips agent if RAM < threshold). Auto-loads project TOML if `PROJECT_TOML` is set. Exports per-agent tokens (`FORGE_PLANNER_TOKEN`, `FORGE_GARDENER_TOKEN`, `FORGE_VAULT_TOKEN`, `FORGE_SUPERVISOR_TOKEN`, `FORGE_PREDICTOR_TOKEN`) — each falls back to `$FORGE_TOKEN` if not set. **Vault-only token guard (AD-006)**: `unset GITHUB_TOKEN CLAWHUB_TOKEN` so agents never hold external-action tokens — only the runner container receives them. **Container note**: when `DISINTO_CONTAINER=1`, `.env` is NOT re-sourced — compose already injects env vars (including `FORGE_URL=http://forgejo:3000`) and re-sourcing would clobber them. | Every agent | +| `lib/env.sh` | Loads `.env`, sets `FACTORY_ROOT`, exports project config (`FORGE_REPO`, `PROJECT_NAME`, etc.), defines `log()`, `forge_api()`, `forge_api_all()` (paginates all pages; accepts optional second TOKEN parameter, defaults to `$FORGE_TOKEN`; handles invalid/empty JSON responses gracefully — returns empty on parse error instead of crashing), `woodpecker_api()`, `wpdb()`, `memory_guard()` (skips agent if RAM < threshold). Auto-loads project TOML if `PROJECT_TOML` is set. Exports per-agent tokens (`FORGE_PLANNER_TOKEN`, `FORGE_GARDENER_TOKEN`, `FORGE_VAULT_TOKEN`, `FORGE_SUPERVISOR_TOKEN`, `FORGE_PREDICTOR_TOKEN`) — each falls back to `$FORGE_TOKEN` if not set. **Vault-only token guard (AD-006)**: `unset GITHUB_TOKEN CLAWHUB_TOKEN` so agents never hold external-action tokens — only the runner container receives them. **Container note**: when `DISINTO_CONTAINER=1`, `.env` is NOT re-sourced — compose already injects env vars (including `FORGE_URL=http://forgejo:3000`) and re-sourcing would clobber them. **Save/restore scope (#364)**: only `FORGE_URL` is preserved across `.env` re-sourcing (compose injects `http://forgejo:3000`, `.env` has `http://localhost:3000`). `FORGE_TOKEN` is NOT preserved so refreshed tokens in `.env` take effect immediately. **Required env var**: `FORGE_PASS` — bot password for git HTTP push (Forgejo 11.x rejects API tokens for `git push`, #361). | Every agent | | `lib/ci-helpers.sh` | `ci_passed()` — returns 0 if CI state is "success" (or no CI configured). `ci_required_for_pr()` — returns 0 if PR has code files (CI required), 1 if non-code only (CI not required). `is_infra_step()` — returns 0 if a single CI step failure matches infra heuristics (clone/git exit 128, any exit 137, log timeout patterns). `classify_pipeline_failure()` — returns "infra \" if any failed Woodpecker step matches infra heuristics via `is_infra_step()`, else "code". `ensure_priority_label()` — looks up (or creates) the `priority` label and returns its ID; caches in `_PRIORITY_LABEL_ID`. `ci_commit_status ` — queries Woodpecker directly for CI state, falls back to forge commit status API. `ci_pipeline_number ` — returns the Woodpecker pipeline number for a commit, falls back to parsing forge status `target_url`. `ci_promote ` — promotes a pipeline to a named Woodpecker environment (vault-gated deployment: vault approves, vault-fire calls this — vault redesign in progress, see #73-#77). `ci_get_logs [--step ]` — reads CI logs from Woodpecker SQLite database via `lib/ci-log-reader.py`; outputs last 200 lines to stdout. Requires mounted woodpecker-data volume at /woodpecker-data. | dev-poll, review-poll, review-pr | | `lib/ci-debug.sh` | CLI tool for Woodpecker CI: `list`, `status`, `logs`, `failures` subcommands. Not sourced — run directly. | Humans / dev-agent (tool access) | | `lib/ci-log-reader.py` | Python tool: reads CI logs from Woodpecker SQLite database. ` [--step ]` — returns last 200 lines from failed steps (or specified step). Used by `ci_get_logs()` in ci-helpers.sh. Requires `WOODPECKER_DATA_DIR` (default: /woodpecker-data). | ci-helpers.sh | @@ -25,8 +25,8 @@ sourced as needed. | `lib/vault.sh` | **Vault PR helper** — create vault action PRs on ops repo via Forgejo API (works from containers without SSH). `vault_request ` validates TOML (using `validate_vault_action` from `vault/vault-env.sh`), creates branch `vault/`, writes `vault/actions/.toml`, creates PR targeting `main` with title `vault: ` and body from context field, returns PR number. Idempotent: if PR exists, returns existing number. Requires `FORGE_TOKEN`, `FORGE_URL`, `FORGE_REPO`, `FORGE_OPS_REPO`. Uses the calling agent's own token (saves/restores `FORGE_TOKEN` around sourcing `vault-env.sh`), so approval workflow respects individual agent identities. | dev-agent (vault actions), future vault dispatcher | | `lib/branch-protection.sh` | Branch protection helpers for Forgejo repos. `setup_vault_branch_protection()` — configures admin-only merge protection on main (require 1 approval, restrict merge to admin role, block direct pushes). `setup_profile_branch_protection()` — same protection for `.profile` repos. `verify_branch_protection()` — checks protection is correctly configured. `remove_branch_protection()` — removes protection (cleanup/testing). Handles race condition after initial push: retries with backoff if Forgejo hasn't processed the branch yet. Requires `FORGE_TOKEN`, `FORGE_URL`, `FORGE_OPS_REPO`. | bin/disinto (hire-an-agent) | | `lib/agent-sdk.sh` | `agent_run([--resume SESSION_ID] [--worktree DIR] PROMPT)` — one-shot `claude -p` invocation with session persistence. Saves session ID to `SID_FILE`, reads it back on resume. `agent_recover_session()` — restore previous session ID from `SID_FILE` on startup. **Nudge guard**: skips nudge injection if the worktree is clean and no push is expected, preventing spurious re-invocations. Callers must define `SID_FILE`, `LOGFILE`, and `log()` before sourcing. | formula-driven agents (dev-agent, planner-run, predictor-run, gardener-run) | -| `lib/forge-setup.sh` | `setup_forge()` — Forgejo instance provisioning: creates admin user, bot accounts, org, repos (code + ops), configures webhooks, sets repo topics. Extracted from `bin/disinto`. Requires `FORGE_URL`, `FORGE_TOKEN`, `FACTORY_ROOT`. | bin/disinto (init) | -| `lib/forge-push.sh` | `push_to_forge()` — pushes a local clone to the Forgejo remote and verifies the push. `_assert_forge_push_globals()` validates required env vars before use. Requires `FORGE_URL`, `FORGE_TOKEN`, `FACTORY_ROOT`, `PRIMARY_BRANCH`. | bin/disinto (init) | +| `lib/forge-setup.sh` | `setup_forge()` — Forgejo instance provisioning: creates admin user, bot accounts, org, repos (code + ops), configures webhooks, sets repo topics. Extracted from `bin/disinto`. Requires `FORGE_URL`, `FORGE_TOKEN`, `FACTORY_ROOT`. **Password storage (#361)**: after creating each bot account, stores its password in `.env` as `FORGE__PASS` (e.g. `FORGE_PASS`, `FORGE_REVIEW_PASS`, etc.) for use by `forge-push.sh`. | bin/disinto (init) | +| `lib/forge-push.sh` | `push_to_forge()` — pushes a local clone to the Forgejo remote and verifies the push. `_assert_forge_push_globals()` validates required env vars before use. Requires `FORGE_URL`, `FORGE_PASS`, `FACTORY_ROOT`, `PRIMARY_BRANCH`. **Auth**: uses `FORGE_PASS` (bot password) for git HTTP push — Forgejo 11.x rejects API tokens for `git push` (#361). | bin/disinto (init) | | `lib/ops-setup.sh` | `setup_ops_repo()` — creates ops repo on Forgejo if it doesn't exist, configures bot collaborators, clones/initializes ops repo locally, seeds directory structure (vault, knowledge, evidence). Exports `_ACTUAL_OPS_SLUG`. | bin/disinto (init) | | `lib/ci-setup.sh` | `_install_cron_impl()` — installs crontab entries for project agents. `_create_woodpecker_oauth_impl()` — creates OAuth2 app on Forgejo for Woodpecker. `_generate_woodpecker_token_impl()` — auto-generates WOODPECKER_TOKEN via OAuth2 flow. `_activate_woodpecker_repo_impl()` — activates repo in Woodpecker. All gated by `_load_ci_context()` which validates required env vars. | bin/disinto (init) | | `lib/generators.sh` | Template generation for `disinto init`: `generate_compose()` — docker-compose.yml, `generate_caddyfile()` — Caddyfile, `generate_staging_index()` — staging index, `generate_deploy_pipelines()` — Woodpecker deployment pipeline configs. Requires `FACTORY_ROOT`, `PROJECT_NAME`, `PRIMARY_BRANCH`. | bin/disinto (init) | diff --git a/planner/AGENTS.md b/planner/AGENTS.md index 9914835..7343b7c 100644 --- a/planner/AGENTS.md +++ b/planner/AGENTS.md @@ -1,4 +1,4 @@ - + # Planner Agent **Role**: Strategic planning using a Prerequisite Tree (Theory of Constraints), diff --git a/predictor/AGENTS.md b/predictor/AGENTS.md index b9e3edc..d0bae51 100644 --- a/predictor/AGENTS.md +++ b/predictor/AGENTS.md @@ -1,4 +1,4 @@ - + # Predictor Agent **Role**: Abstract adversary (the "goblin"). Runs a 2-step formula diff --git a/review/AGENTS.md b/review/AGENTS.md index e75ca21..6976c04 100644 --- a/review/AGENTS.md +++ b/review/AGENTS.md @@ -1,4 +1,4 @@ - + # Review Agent **Role**: AI-powered PR review — post structured findings and formal diff --git a/supervisor/AGENTS.md b/supervisor/AGENTS.md index 0478d93..3348c86 100644 --- a/supervisor/AGENTS.md +++ b/supervisor/AGENTS.md @@ -1,4 +1,4 @@ - + # Supervisor Agent **Role**: Health monitoring and auto-remediation, executed as a formula-driven From d077b1455fb832395a1124d9462da9bcf9b0ae22 Mon Sep 17 00:00:00 2001 From: Agent Date: Tue, 7 Apr 2026 18:13:13 +0000 Subject: [PATCH 2/4] =?UTF-8?q?fix:=20fix:=20entrypoint-reproduce.sh=20ign?= =?UTF-8?q?ores=20DISINTO=5FFORMULA=20env=20var=20=E2=80=94=20always=20run?= =?UTF-8?q?s=20reproduce=20formula=20(#356)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- docker/reproduce/entrypoint-reproduce.sh | 55 ++++++++++++++++++------ 1 file changed, 42 insertions(+), 13 deletions(-) diff --git a/docker/reproduce/entrypoint-reproduce.sh b/docker/reproduce/entrypoint-reproduce.sh index 2cbb3f9..723c5dc 100644 --- a/docker/reproduce/entrypoint-reproduce.sh +++ b/docker/reproduce/entrypoint-reproduce.sh @@ -23,7 +23,17 @@ set -euo pipefail DISINTO_DIR="${DISINTO_DIR:-/home/agent/disinto}" -REPRODUCE_FORMULA="${DISINTO_DIR}/formulas/reproduce.toml" + +# Select formula based on DISINTO_FORMULA env var (set by dispatcher) +case "${DISINTO_FORMULA:-reproduce}" in + triage) + ACTIVE_FORMULA="${DISINTO_DIR}/formulas/triage.toml" + ;; + *) + ACTIVE_FORMULA="${DISINTO_DIR}/formulas/reproduce.toml" + ;; +esac + REPRODUCE_TIMEOUT="${REPRODUCE_TIMEOUT_MINUTES:-15}" LOGFILE="/home/agent/data/logs/reproduce.log" SCREENSHOT_DIR="/home/agent/data/screenshots" @@ -75,7 +85,11 @@ export PROJECT_NAME PROJECT_REPO_ROOT="/home/agent/repos/${PROJECT_NAME}" -log "Starting reproduce-agent for issue #${ISSUE_NUMBER} (project: ${PROJECT_NAME})" +if [ "${DISINTO_FORMULA:-reproduce}" = "triage" ]; then + log "Starting triage-agent for issue #${ISSUE_NUMBER} (project: ${PROJECT_NAME})" +else + log "Starting reproduce-agent for issue #${ISSUE_NUMBER} (project: ${PROJECT_NAME})" +fi # --------------------------------------------------------------------------- # Verify claude CLI is available (mounted from host) @@ -99,20 +113,20 @@ LOCK_HOLDER="reproduce-agent-${ISSUE_NUMBER}" FORMULA_STACK_SCRIPT="" FORMULA_TIMEOUT_MINUTES="${REPRODUCE_TIMEOUT}" -if [ -f "$REPRODUCE_FORMULA" ]; then +if [ -f "$ACTIVE_FORMULA" ]; then FORMULA_STACK_SCRIPT=$(python3 -c " import sys, tomllib with open(sys.argv[1], 'rb') as f: d = tomllib.load(f) print(d.get('stack_script', '')) -" "$REPRODUCE_FORMULA" 2>/dev/null || echo "") +" "$ACTIVE_FORMULA" 2>/dev/null || echo "") _tm=$(python3 -c " import sys, tomllib with open(sys.argv[1], 'rb') as f: d = tomllib.load(f) print(d.get('timeout_minutes', '${REPRODUCE_TIMEOUT}')) -" "$REPRODUCE_FORMULA" 2>/dev/null || echo "${REPRODUCE_TIMEOUT}") +" "$ACTIVE_FORMULA" 2>/dev/null || echo "${REPRODUCE_TIMEOUT}") FORMULA_TIMEOUT_MINUTES="$_tm" fi @@ -255,7 +269,11 @@ PROMPT # --------------------------------------------------------------------------- # Run Claude with Playwright MCP # --------------------------------------------------------------------------- -log "Starting Claude reproduction session (timeout: ${FORMULA_TIMEOUT_MINUTES}m)..." +if [ "${DISINTO_FORMULA:-reproduce}" = "triage" ]; then + log "Starting triage-agent session (timeout: ${FORMULA_TIMEOUT_MINUTES}m)..." +else + log "Starting Claude reproduction session (timeout: ${FORMULA_TIMEOUT_MINUTES}m)..." +fi CLAUDE_EXIT=0 timeout "$(( FORMULA_TIMEOUT_MINUTES * 60 ))" \ @@ -296,7 +314,11 @@ FINDINGS="" if [ -f "/tmp/reproduce-findings-${ISSUE_NUMBER}.md" ]; then FINDINGS=$(cat "/tmp/reproduce-findings-${ISSUE_NUMBER}.md") else - FINDINGS="Reproduce-agent completed but did not write a findings report. Claude output:\n\`\`\`\n$(tail -100 "/tmp/reproduce-claude-output-${ISSUE_NUMBER}.txt" 2>/dev/null || echo '(no output)')\n\`\`\`" + if [ "${DISINTO_FORMULA:-reproduce}" = "triage" ]; then + FINDINGS="Triage-agent completed but did not write a findings report. Claude output:\n\`\`\`\n$(tail -100 "/tmp/reproduce-claude-output-${ISSUE_NUMBER}.txt" 2>/dev/null || echo '(no output)')\n\`\`\`" + else + FINDINGS="Reproduce-agent completed but did not write a findings report. Claude output:\n\`\`\`\n$(tail -100 "/tmp/reproduce-claude-output-${ISSUE_NUMBER}.txt" 2>/dev/null || echo '(no output)')\n\`\`\`" + fi fi # --------------------------------------------------------------------------- @@ -381,6 +403,13 @@ _post_comment() { BUG_REPORT_ID=$(_label_id "bug-report" "#e4e669") _remove_label "$ISSUE_NUMBER" "$BUG_REPORT_ID" +# Determine agent name for comments +if [ "${DISINTO_FORMULA:-reproduce}" = "triage" ]; then + AGENT_NAME="Triage-agent" +else + AGENT_NAME="Reproduce-agent" +fi + # Determine outcome and apply appropriate labels LABEL_NAME="" LABEL_COLOR="" @@ -396,13 +425,13 @@ case "$OUTCOME" in # Obvious cause → add reproduced status label, create backlog issue for dev-agent LABEL_NAME="reproduced" LABEL_COLOR="#0075ca" - COMMENT_HEADER="## Reproduce-agent: **Reproduced with obvious cause** :white_check_mark: :zap:" + COMMENT_HEADER="## ${AGENT_NAME}: **Reproduced with obvious cause** :white_check_mark: :zap:" CREATE_BACKLOG_ISSUE=true else # Cause unclear → in-triage → Triage-agent LABEL_NAME="in-triage" LABEL_COLOR="#d93f0b" - COMMENT_HEADER="## Reproduce-agent: **Reproduced, cause unclear** :white_check_mark: :mag:" + COMMENT_HEADER="## ${AGENT_NAME}: **Reproduced, cause unclear** :white_check_mark: :mag:" fi ;; @@ -410,14 +439,14 @@ case "$OUTCOME" in # Cannot reproduce → rejected → Human review LABEL_NAME="rejected" LABEL_COLOR="#e4e669" - COMMENT_HEADER="## Reproduce-agent: **Cannot reproduce** :x:" + COMMENT_HEADER="## ${AGENT_NAME}: **Cannot reproduce** :x:" ;; needs-triage) # Inconclusive (timeout, env issues) → blocked → Gardener/human LABEL_NAME="blocked" LABEL_COLOR="#e11d48" - COMMENT_HEADER="## Reproduce-agent: **Inconclusive, blocked** :construction:" + COMMENT_HEADER="## ${AGENT_NAME}: **Inconclusive, blocked** :construction:" ;; esac @@ -460,9 +489,9 @@ COMMENT_BODY="${COMMENT_HEADER} ${FINDINGS}${SCREENSHOT_LIST} --- -*Reproduce-agent run at $(date -u '+%Y-%m-%d %H:%M:%S UTC') — project: ${PROJECT_NAME}*" +*${AGENT_NAME} run at $(date -u '+%Y-%m-%d %H:%M:%S UTC') — project: ${PROJECT_NAME}*" _post_comment "$ISSUE_NUMBER" "$COMMENT_BODY" log "Posted findings to issue #${ISSUE_NUMBER}" -log "Reproduce-agent done. Outcome: ${OUTCOME}" +log "${AGENT_NAME} done. Outcome: ${OUTCOME}" From 071cf83f5355a297dc741b4c1ab94f0ed010e6b7 Mon Sep 17 00:00:00 2001 From: Agent Date: Tue, 7 Apr 2026 18:13:13 +0000 Subject: [PATCH 3/4] =?UTF-8?q?fix:=20fix:=20entrypoint-reproduce.sh=20ign?= =?UTF-8?q?ores=20DISINTO=5FFORMULA=20env=20var=20=E2=80=94=20always=20run?= =?UTF-8?q?s=20reproduce=20formula=20(#356)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- docker/reproduce/entrypoint-reproduce.sh | 55 ++++++++++++++++++------ 1 file changed, 42 insertions(+), 13 deletions(-) diff --git a/docker/reproduce/entrypoint-reproduce.sh b/docker/reproduce/entrypoint-reproduce.sh index 2cbb3f9..723c5dc 100644 --- a/docker/reproduce/entrypoint-reproduce.sh +++ b/docker/reproduce/entrypoint-reproduce.sh @@ -23,7 +23,17 @@ set -euo pipefail DISINTO_DIR="${DISINTO_DIR:-/home/agent/disinto}" -REPRODUCE_FORMULA="${DISINTO_DIR}/formulas/reproduce.toml" + +# Select formula based on DISINTO_FORMULA env var (set by dispatcher) +case "${DISINTO_FORMULA:-reproduce}" in + triage) + ACTIVE_FORMULA="${DISINTO_DIR}/formulas/triage.toml" + ;; + *) + ACTIVE_FORMULA="${DISINTO_DIR}/formulas/reproduce.toml" + ;; +esac + REPRODUCE_TIMEOUT="${REPRODUCE_TIMEOUT_MINUTES:-15}" LOGFILE="/home/agent/data/logs/reproduce.log" SCREENSHOT_DIR="/home/agent/data/screenshots" @@ -75,7 +85,11 @@ export PROJECT_NAME PROJECT_REPO_ROOT="/home/agent/repos/${PROJECT_NAME}" -log "Starting reproduce-agent for issue #${ISSUE_NUMBER} (project: ${PROJECT_NAME})" +if [ "${DISINTO_FORMULA:-reproduce}" = "triage" ]; then + log "Starting triage-agent for issue #${ISSUE_NUMBER} (project: ${PROJECT_NAME})" +else + log "Starting reproduce-agent for issue #${ISSUE_NUMBER} (project: ${PROJECT_NAME})" +fi # --------------------------------------------------------------------------- # Verify claude CLI is available (mounted from host) @@ -99,20 +113,20 @@ LOCK_HOLDER="reproduce-agent-${ISSUE_NUMBER}" FORMULA_STACK_SCRIPT="" FORMULA_TIMEOUT_MINUTES="${REPRODUCE_TIMEOUT}" -if [ -f "$REPRODUCE_FORMULA" ]; then +if [ -f "$ACTIVE_FORMULA" ]; then FORMULA_STACK_SCRIPT=$(python3 -c " import sys, tomllib with open(sys.argv[1], 'rb') as f: d = tomllib.load(f) print(d.get('stack_script', '')) -" "$REPRODUCE_FORMULA" 2>/dev/null || echo "") +" "$ACTIVE_FORMULA" 2>/dev/null || echo "") _tm=$(python3 -c " import sys, tomllib with open(sys.argv[1], 'rb') as f: d = tomllib.load(f) print(d.get('timeout_minutes', '${REPRODUCE_TIMEOUT}')) -" "$REPRODUCE_FORMULA" 2>/dev/null || echo "${REPRODUCE_TIMEOUT}") +" "$ACTIVE_FORMULA" 2>/dev/null || echo "${REPRODUCE_TIMEOUT}") FORMULA_TIMEOUT_MINUTES="$_tm" fi @@ -255,7 +269,11 @@ PROMPT # --------------------------------------------------------------------------- # Run Claude with Playwright MCP # --------------------------------------------------------------------------- -log "Starting Claude reproduction session (timeout: ${FORMULA_TIMEOUT_MINUTES}m)..." +if [ "${DISINTO_FORMULA:-reproduce}" = "triage" ]; then + log "Starting triage-agent session (timeout: ${FORMULA_TIMEOUT_MINUTES}m)..." +else + log "Starting Claude reproduction session (timeout: ${FORMULA_TIMEOUT_MINUTES}m)..." +fi CLAUDE_EXIT=0 timeout "$(( FORMULA_TIMEOUT_MINUTES * 60 ))" \ @@ -296,7 +314,11 @@ FINDINGS="" if [ -f "/tmp/reproduce-findings-${ISSUE_NUMBER}.md" ]; then FINDINGS=$(cat "/tmp/reproduce-findings-${ISSUE_NUMBER}.md") else - FINDINGS="Reproduce-agent completed but did not write a findings report. Claude output:\n\`\`\`\n$(tail -100 "/tmp/reproduce-claude-output-${ISSUE_NUMBER}.txt" 2>/dev/null || echo '(no output)')\n\`\`\`" + if [ "${DISINTO_FORMULA:-reproduce}" = "triage" ]; then + FINDINGS="Triage-agent completed but did not write a findings report. Claude output:\n\`\`\`\n$(tail -100 "/tmp/reproduce-claude-output-${ISSUE_NUMBER}.txt" 2>/dev/null || echo '(no output)')\n\`\`\`" + else + FINDINGS="Reproduce-agent completed but did not write a findings report. Claude output:\n\`\`\`\n$(tail -100 "/tmp/reproduce-claude-output-${ISSUE_NUMBER}.txt" 2>/dev/null || echo '(no output)')\n\`\`\`" + fi fi # --------------------------------------------------------------------------- @@ -381,6 +403,13 @@ _post_comment() { BUG_REPORT_ID=$(_label_id "bug-report" "#e4e669") _remove_label "$ISSUE_NUMBER" "$BUG_REPORT_ID" +# Determine agent name for comments +if [ "${DISINTO_FORMULA:-reproduce}" = "triage" ]; then + AGENT_NAME="Triage-agent" +else + AGENT_NAME="Reproduce-agent" +fi + # Determine outcome and apply appropriate labels LABEL_NAME="" LABEL_COLOR="" @@ -396,13 +425,13 @@ case "$OUTCOME" in # Obvious cause → add reproduced status label, create backlog issue for dev-agent LABEL_NAME="reproduced" LABEL_COLOR="#0075ca" - COMMENT_HEADER="## Reproduce-agent: **Reproduced with obvious cause** :white_check_mark: :zap:" + COMMENT_HEADER="## ${AGENT_NAME}: **Reproduced with obvious cause** :white_check_mark: :zap:" CREATE_BACKLOG_ISSUE=true else # Cause unclear → in-triage → Triage-agent LABEL_NAME="in-triage" LABEL_COLOR="#d93f0b" - COMMENT_HEADER="## Reproduce-agent: **Reproduced, cause unclear** :white_check_mark: :mag:" + COMMENT_HEADER="## ${AGENT_NAME}: **Reproduced, cause unclear** :white_check_mark: :mag:" fi ;; @@ -410,14 +439,14 @@ case "$OUTCOME" in # Cannot reproduce → rejected → Human review LABEL_NAME="rejected" LABEL_COLOR="#e4e669" - COMMENT_HEADER="## Reproduce-agent: **Cannot reproduce** :x:" + COMMENT_HEADER="## ${AGENT_NAME}: **Cannot reproduce** :x:" ;; needs-triage) # Inconclusive (timeout, env issues) → blocked → Gardener/human LABEL_NAME="blocked" LABEL_COLOR="#e11d48" - COMMENT_HEADER="## Reproduce-agent: **Inconclusive, blocked** :construction:" + COMMENT_HEADER="## ${AGENT_NAME}: **Inconclusive, blocked** :construction:" ;; esac @@ -460,9 +489,9 @@ COMMENT_BODY="${COMMENT_HEADER} ${FINDINGS}${SCREENSHOT_LIST} --- -*Reproduce-agent run at $(date -u '+%Y-%m-%d %H:%M:%S UTC') — project: ${PROJECT_NAME}*" +*${AGENT_NAME} run at $(date -u '+%Y-%m-%d %H:%M:%S UTC') — project: ${PROJECT_NAME}*" _post_comment "$ISSUE_NUMBER" "$COMMENT_BODY" log "Posted findings to issue #${ISSUE_NUMBER}" -log "Reproduce-agent done. Outcome: ${OUTCOME}" +log "${AGENT_NAME} done. Outcome: ${OUTCOME}" From ac17585d5a7c8d964b9e423ea572d3b7df2566f4 Mon Sep 17 00:00:00 2001 From: Agent Date: Tue, 7 Apr 2026 18:22:30 +0000 Subject: [PATCH 4/4] =?UTF-8?q?fix:=20fix:=20entrypoint-reproduce.sh=20ign?= =?UTF-8?q?ores=20DISINTO=5FFORMULA=20env=20var=20=E2=80=94=20always=20run?= =?UTF-8?q?s=20reproduce=20formula=20(#356)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- docker/reproduce/entrypoint-reproduce.sh | 218 ++++++++++++++++++++++- 1 file changed, 209 insertions(+), 9 deletions(-) diff --git a/docker/reproduce/entrypoint-reproduce.sh b/docker/reproduce/entrypoint-reproduce.sh index 723c5dc..c36192a 100644 --- a/docker/reproduce/entrypoint-reproduce.sh +++ b/docker/reproduce/entrypoint-reproduce.sh @@ -38,11 +38,20 @@ REPRODUCE_TIMEOUT="${REPRODUCE_TIMEOUT_MINUTES:-15}" LOGFILE="/home/agent/data/logs/reproduce.log" SCREENSHOT_DIR="/home/agent/data/screenshots" +# --------------------------------------------------------------------------- +# Determine agent type early for log prefix +# --------------------------------------------------------------------------- +if [ "${DISINTO_FORMULA:-reproduce}" = "triage" ]; then + AGENT_TYPE="triage" +else + AGENT_TYPE="reproduce" +fi + # --------------------------------------------------------------------------- # Logging # --------------------------------------------------------------------------- log() { - printf '[%s] reproduce: %s\n' "$(date -u '+%Y-%m-%d %H:%M:%S UTC')" "$*" | tee -a "$LOGFILE" + printf '[%s] %s: %s\n' "$(date -u '+%Y-%m-%d %H:%M:%S UTC')" "$AGENT_TYPE" "$*" | tee -a "$LOGFILE" } # --------------------------------------------------------------------------- @@ -85,7 +94,7 @@ export PROJECT_NAME PROJECT_REPO_ROOT="/home/agent/repos/${PROJECT_NAME}" -if [ "${DISINTO_FORMULA:-reproduce}" = "triage" ]; then +if [ "$AGENT_TYPE" = "triage" ]; then log "Starting triage-agent for issue #${ISSUE_NUMBER} (project: ${PROJECT_NAME})" else log "Starting reproduce-agent for issue #${ISSUE_NUMBER} (project: ${PROJECT_NAME})" @@ -198,12 +207,202 @@ elif [ -n "$FORMULA_STACK_SCRIPT" ]; then fi # --------------------------------------------------------------------------- -# Build Claude prompt for reproduction +# Build Claude prompt based on agent type # --------------------------------------------------------------------------- TIMESTAMP=$(date -u '+%Y%m%d-%H%M%S') SCREENSHOT_PREFIX="${SCREENSHOT_DIR}/issue-${ISSUE_NUMBER}-${TIMESTAMP}" -CLAUDE_PROMPT=$(cat < + e. Search for related issues or TODOs in the code: + grep -r "TODO\|FIXME\|HACK" -- + +Capture for each layer: + - The data shape flowing in and out (field names, types, nullability) + - Whether the layer's behavior matches its documented contract + - Any discrepancy found + +If a clear root cause becomes obvious during tracing, note it and continue +checking whether additional causes exist downstream. + +### Step 3: Add debug instrumentation on a throwaway branch +Use ~30% of your total turn budget here. Only instrument after tracing has +identified the most likely failure points — do not instrument blindly. + +1. Create a throwaway debug branch (NEVER commit this to main): + cd "\$PROJECT_REPO_ROOT" + git checkout -b debug/triage-\${ISSUE_NUMBER} + +2. Add targeted logging at the layer boundaries identified during tracing: + - Console.log / structured log statements around the suspicious code path + - Log the actual values flowing through: inputs, outputs, intermediate state + - Add verbose mode flags if the stack supports them + - Keep instrumentation minimal — only what confirms or refutes the hypothesis + +3. Restart the stack using the configured script (if set): + \${stack_script:-"# No stack_script configured — restart manually or connect to staging"} + +4. Re-run the reproduction steps from the reproduce-agent findings. + +5. Observe and capture new output: + - Paste relevant log lines into your working notes + - Note whether the observed values match or contradict the hypothesis + +6. If the first instrumentation pass is inconclusive, iterate: + - Narrow the scope to the next most suspicious boundary + - Re-instrument, restart, re-run + - Maximum 2-3 instrumentation rounds before declaring inconclusive + +Do NOT push the debug branch. It will be deleted in the cleanup step. + +### Step 4: Decompose root causes into backlog issues +After tracing and instrumentation, articulate each distinct root cause. + +For each root cause found: + +1. Determine the relationship to other causes: + - Layered (one causes another) → use Depends-on in the issue body + - Independent (separate code paths fail independently) → use Related + +2. Create a backlog issue for each root cause: + curl -sf -X POST "\${FORGE_API}/issues" \\ + -H "Authorization: token \${FORGE_TOKEN}" \\ + -H "Content-Type: application/json" \\ + -d '{ + "title": "fix: ", + "body": "## Root cause\\n\\n\\n## Fix suggestion\\n\\n\\n## Context\\nDecomposed from #\${ISSUE_NUMBER} (cause N of M)\\n\\n## Dependencies\\n<#X if this depends on another cause being fixed first>", + "labels": ["backlog"] + }' + +3. Note the newly created issue numbers. + +If only one root cause is found, still create a single backlog issue with +the specific code location and fix suggestion. + +If the investigation is inconclusive (no clear root cause found), skip this +step and proceed directly to link-back with the inconclusive outcome. + +### Step 5: Update original issue and relabel +Post a summary comment on the original issue and update its labels. + +#### If root causes were found (conclusive): + +Post a comment: + "## Triage findings + + Found N root cause(s): + - #X — (cause 1 of N) + - #Y — (cause 2 of N, depends on #X) + + Data flow traced: + Instrumentation: + + Next step: backlog issues above will be implemented in dependency order." + +Then swap labels: + - Remove: in-triage + - Add: in-progress + +#### If investigation was inconclusive (turn budget exhausted): + +Post a comment: + "## Triage — inconclusive + + Traced: + Tried: + Hypothesis: + + No definitive root cause identified. Leaving in-triage for supervisor + to handle as a stale triage session." + +Do NOT relabel. Leave in-triage. The supervisor monitors stale triage +sessions and will escalate or reassign. + +### Step 6: Delete throwaway debug branch +Always delete the debug branch, even if the investigation was inconclusive. + +1. Switch back to the main branch: + cd "\$PROJECT_REPO_ROOT" + git checkout "\$PRIMARY_BRANCH" + +2. Delete the local debug branch: + git branch -D debug/triage-\${ISSUE_NUMBER} + +3. Confirm no remote was pushed (if accidentally pushed, delete it too): + git push origin --delete debug/triage-\${ISSUE_NUMBER} 2>/dev/null || true + +4. Verify the worktree is clean: + git status + git worktree list + +A clean repo is a prerequisite for the next dev-agent run. Never leave +debug branches behind — they accumulate and pollute the branch list. + +## Notes +- The application is accessible at localhost (network_mode: host) +- Budget: 70% tracing data flow, 30% instrumented re-runs +- Timeout: \${FORMULA_TIMEOUT_MINUTES} minutes total (or until turn limit) +- Stack lock is held for the full run +- If stack_script is empty, connect to existing staging environment + +Begin now. +PROMPT + ) +else + # Reproduce-agent prompt: reproduce the bug and report findings + CLAUDE_PROMPT=$(cat </dev/null || echo '(no output)')\n\`\`\`" else FINDINGS="Reproduce-agent completed but did not write a findings report. Claude output:\n\`\`\`\n$(tail -100 "/tmp/reproduce-claude-output-${ISSUE_NUMBER}.txt" 2>/dev/null || echo '(no output)')\n\`\`\`" @@ -403,8 +603,8 @@ _post_comment() { BUG_REPORT_ID=$(_label_id "bug-report" "#e4e669") _remove_label "$ISSUE_NUMBER" "$BUG_REPORT_ID" -# Determine agent name for comments -if [ "${DISINTO_FORMULA:-reproduce}" = "triage" ]; then +# Determine agent name for comments (based on AGENT_TYPE set at script start) +if [ "$AGENT_TYPE" = "triage" ]; then AGENT_NAME="Triage-agent" else AGENT_NAME="Reproduce-agent"