bug: hire-an-agent does not add the new agent as collaborator on the project repo #856
Reference: disinto-admin/disinto#856
Problem
`disinto hire-an-agent <name> <role>` creates the Forgejo user, token, password, `.profile` repo, and project TOML entry, but does not add the new user as a collaborator on the project repo (e.g. `disinto-admin/disinto`). Without write collaborator status, the agent's `issue_claim` PATCH request returns `403 Forbidden` from Forgejo.

`lib/issue-lifecycle.sh:issue_claim()` swallows the 403 because the PATCH uses `curl -sf ... >/dev/null 2>&1 || return 1` with silenced stderr, so no diagnostic surfaces. Control falls through to the post-PATCH verify added in #830, which reads the assignee as still empty and logs `claim lost to <none> — skipping`. The caller (`dev-agent.sh`) emits the generic `SKIP: failed to claim issue #<N> (already assigned to another agent)`, which is misleading: nothing else is assigned; the agent simply has no write permission.
End result: a freshly hired agent container polls forever, launches `dev-agent` every cycle, each dev-agent fails to claim, exits, and no work ever gets done. Silent zombie.

Repro
- `disinto hire-an-agent dev-qwen2 dev --local-model <url> --model <name>`
- `/home/agent/data/logs/dev/dev-agent.log`: every cycle logs `SKIP: failed to claim issue #<N> (already assigned to another agent)`.

Fix
Two complementary changes:
1. `lib/hire-agent.sh`: add collaborator step (primary fix)

After the user + token are created, add the new agent as a write collaborator on the project repo(s). Mirror the pattern already used for the other bot users in `lib/forge-setup.sh` (which is how `dev-qwen`, `dev-bot`, `review-bot`, etc. got collab status originally).

Optionally extend to ops repo for roles that need it (planner/architect/vault). For plain `dev` role, project repo is sufficient.

2. `lib/issue-lifecycle.sh:issue_claim()`: surface the 403 (defense in depth)

The silent `|| return 1` on PATCH hides the root cause. Capture the HTTP status and log it on failure. This would have made the collaborator miss diagnosable in seconds instead of minutes.
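A minimal sketch of how `issue_claim()` might surface the status code. `FORGE_API` (assumed to hold the repo's API base, e.g. `<forge>/api/v1/repos/<owner>/<repo>`), `FORGE_TOKEN`, and both function names are hypothetical; the real code in `lib/issue-lifecycle.sh` may be structured differently. The key change is dropping `-f ... >/dev/null 2>&1` in favour of `-w '%{http_code}'`, so the status is captured rather than discarded.

```shell
# Sketch only: FORGE_API, FORGE_TOKEN and these function names are
# assumptions, not the actual contents of lib/issue-lifecycle.sh.

# Map the HTTP status of the claim PATCH to a specific log message.
claim_status_message() {
  local status="$1" issue="$2"
  case "$status" in
    2??) echo "claim PATCH ok for issue #${issue} (HTTP ${status})" ;;
    403) echo "claim PATCH forbidden for issue #${issue} (HTTP 403): is the agent a write collaborator?" ;;
    404) echo "claim PATCH failed for issue #${issue} (HTTP 404): repo or issue not found" ;;
    5??) echo "claim PATCH failed for issue #${issue} (HTTP ${status}): forge server error" ;;
    000) echo "claim PATCH failed for issue #${issue}: no HTTP response" ;;
    *)   echo "claim PATCH failed for issue #${issue} (HTTP ${status})" ;;
  esac
}

# PATCH the assignee and log the HTTP status instead of swallowing it.
claim_patch() {
  local issue="$1" agent="$2" status
  status=$(curl -s -o /dev/null -w '%{http_code}' -X PATCH \
    -H "Authorization: token ${FORGE_TOKEN}" \
    -H 'Content-Type: application/json' \
    -d "{\"assignees\":[\"${agent}\"]}" \
    "${FORGE_API}/issues/${issue}") || status="000"
  claim_status_message "$status" "$issue" >&2
  case "$status" in
    2??) return 0 ;;
    *)   return 1 ;;
  esac
}
```

With this shape, a missing collaborator grant shows up in the log as an explicit HTTP 403 line on the first failed claim, and the existing post-PATCH verify can stay in place as a second check.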
Acceptance
- After `disinto hire-an-agent X dev`, `X` appears in `GET /repos/<project>/collaborators` with `write` permission
- `dev-agent` successfully claims its first ready issue on the first cycle (no more `claim lost to <none>` spam)
- `issue_claim` logs a distinct, specific message on HTTP error (403/404/5xx) rather than the generic `claim lost` message
- `hire-an-agent` for an existing agent is idempotent (re-adding an existing collaborator returns 204 without error)

Affected files
- `lib/hire-agent.sh`: add collaborator PUT after user + token creation
- `lib/issue-lifecycle.sh`: `issue_claim()` surface PATCH HTTP errors

Context
Caught today while bringing up `dev-qwen2` as a second parallel llama dev agent. Container polled for ~6 minutes, launching a new dev-agent every iteration, each failing silently. Unblocked by manually adding the agent as a collaborator on the project repo; the next claim cycle succeeded.
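For reference, adding a collaborator through the Forgejo API is a single PUT to `/repos/{owner}/{repo}/collaborators/{collaborator}`. The sketch below shows roughly what the step proposed for `lib/hire-agent.sh` could look like; it is not the command that was actually run, and `FORGE_URL`, `FORGE_ADMIN_TOKEN`, and `add_collaborator` are hypothetical names rather than existing helpers.

```shell
# Sketch only: FORGE_URL, FORGE_ADMIN_TOKEN and add_collaborator are
# assumptions, not the actual helpers in lib/hire-agent.sh.

# Add (or re-add) a user as a collaborator on <owner>/<repo>.
# A repeat call for an existing collaborator is expected to return 204
# as well, which keeps the step idempotent.
add_collaborator() {
  local repo="$1" user="$2" permission="${3:-write}" status
  status=$(curl -s -o /dev/null -w '%{http_code}' -X PUT \
    -H "Authorization: token ${FORGE_ADMIN_TOKEN}" \
    -H 'Content-Type: application/json' \
    -d "{\"permission\":\"${permission}\"}" \
    "${FORGE_URL}/api/v1/repos/${repo}/collaborators/${user}") || status="000"
  case "$status" in
    204) echo "collaborator ${user} has ${permission} on ${repo}" ;;
    *)   echo "failed to add ${user} to ${repo} (HTTP ${status})" >&2
         return 1 ;;
  esac
}

# Example call (commented out so the sketch has no side effects):
# add_collaborator disinto-admin/disinto dev-qwen2 write
```

The token used here needs admin rights on the target repo; the freshly created agent token would not be enough, since the agent is not yet a collaborator.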
Non-goals
`cd` fails silently #861