4.6 KiB
Dev Agent
Role: Implement issues autonomously — write code, push branches, address CI failures and review feedback.
Trigger: dev-poll.sh is invoked by the polling loop in docker/agents/entrypoint.sh
every 5 minutes (iteration math at line 171-175). Sources lib/guard.sh and calls
check_active dev first — skips if $FACTORY_ROOT/state/.dev-active is absent. Then
performs a direct-merge scan (approved + CI green PRs — including chore/gardener PRs
without issue numbers), then checks the agent lock and scans for ready issues using a
two-tier priority queue: (1) priority+backlog issues first (FIFO within tier), then
(2) plain backlog issues (FIFO). Orphaned in-progress issues are also picked up. The
direct-merge scan runs before the lock check so approved PRs get merged even while a
dev-agent session is active.
Key files:
dev/dev-poll.sh— Polling loop participant: finds next ready issue, handles merge/rebase of approved PRs, tracks CI fix attempts. Invoked bydocker/agents/entrypoint.shevery 5 minutes.BOT_USERis resolved once at startup via the Forge/userAPI and cached for all assignee checks. Formula guard skips issues labeledformula,prediction/dismissed, orprediction/unreviewed. Race prevention: checks issue assignee before claiming — skips if assigned to a different bot user. Stale branch abandonment: closes PRs and deletes branches that are behind$PRIMARY_BRANCH(restarts poll cycle for a fresh start). Stale in-progress recovery: on each poll cycle, scans for issues labeledin-progress. If the issue has avisionlabel, setsBLOCKED_BY_INPROGRESS=trueand skips further stale checks (vision issues are managed by the architect). If the issue is assigned to$BOT_USER(this agent), checks for pending review feedback first — if an open PR hasREQUEST_CHANGES, spawns the dev-agent to address it before settingBLOCKED_BY_INPROGRESS=true; otherwise just sets blocked. If assigned to another agent, logs and falls through (does not block). If no assignee, no open PR, and no agent lock file — removesin-progress, addsblockedwith a human-triage comment. Per-agent open-PR gate: before starting new work, filters open waiting PRs to only those assigned to this agent ($BOT_USER). Other agents' PRs do not block this agent's pipeline (#358, #369). Pre-lock merge scan own-PRs only: the direct-merge scan only merges PRs whose linked issue is assigned to this agent — skips PRs owned by other bot users (#374).dev/dev-agent.sh— Orchestrator: claims issue, creates worktree + tmux session with interactiveclaude, monitors phase file, injects CI results and review feedback, merges on approvaldev/phase-test.sh— Integration test for the phase protocol
Environment variables consumed (via lib/env.sh + project TOML):
FORGE_TOKEN— Dev-agent token (push, PR creation, merge) — use the dedicated bot accountFORGE_REPO,FORGE_API,FORGE_URL— Target repository (FORGE_URL used to auto-detect git remote)PROJECT_NAME,PROJECT_REPO_ROOT— Local checkout pathPRIMARY_BRANCH— Branch to merge into (e.g.main,master)WOODPECKER_REPO_ID— CI pipeline lookupsCLAUDE_TIMEOUT— Max seconds for a Claude session (default 7200)
FORGE_REMOTE: dev-agent.sh auto-detects which git remote corresponds to FORGE_URL by matching the remote's push URL hostname. This is exported as FORGE_REMOTE and used for all git push/pull/worktree operations. Defaults to origin if no match found. This ensures correct behaviour when the forge is local Forgejo (remote typically named forgejo) rather than Codeberg (origin).
Session lock: fd-based flock — released during idle phases (awaiting_review, awaiting_ci) so other agents can proceed; re-acquired before injecting the next prompt. This prevents the lock from blocking the whole factory while the dev session waits.
Crash recovery: on PHASE:crashed or non-zero exit, the worktree is preserved (not destroyed) for debugging. Location logged. Supervisor housekeeping removes stale crashed worktrees older than 24h.
Lifecycle: dev-poll.sh (invoked by polling loop, check_active dev) → dev-agent.sh →
tmux session → phase file drives CI/review loop → merge + mirror_push() → close issue.
On respawn after PHASE:escalate, the stale phase file is cleared first so the session
starts clean; the reinject prompt tells Claude not to re-escalate for the same reason.
On respawn for any active PR, the prompt explicitly tells Claude the PR already exists
and not to create a new one via API.