fix: standardize logging across all agents — capture errors, log exit codes, consistent format #367
Labels
No labels
action
backlog
blocked
bug-report
in-progress
prediction/actioned
prediction/dismissed
prediction/unreviewed
priority
tech-debt
underspecified
vision
No milestone
No project
No assignees
1 participant
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference: disinto-admin/disinto#367
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Problem
Agent logging is inconsistent and swallows errors. When things fail, the logs say "failed" with no context — no exit code, no stderr, no output. This made a simple PATH issue (claude not found) take hours to diagnose.
Audit findings
1. Subprocess output discarded on failure
These call subprocesses but discard their output when they fail:
review/review-poll.shlines 129, 169: review-pr.sh output goes nowhere on failure, only logs "review failed"lib/agent-sdk.shline 51:|| trueswallows claude exit code — timeout (124) vs crash (1) vs success (0) are indistinguishablelib/agent-sdk.shline 82: same for nudge invocationlib/pr-lifecycle.shline 364: pr_close failure fully silencedlib/pr-lifecycle.shlines 403-405: git rebase/push output printed to stdout then discarded2. Inconsistent log format
Two formats in use:
[2026-04-03T14:00:00Z] messagevia echo[2026-04-03 14:00:00 UTC] messagevia printf3. Inconsistent log destinations
Some agents log to DISINTO_LOG_DIR, some to SCRIPT_DIR (which is read-only in containers). The gardener LOG_FILE bug (#210) was fixed but the pattern persists in other agents.
Fix
A. Capture and log subprocess errors
For every subprocess call that can fail, capture output and log last lines on error:
Specific changes:
review/review-poll.shlines 129, 169: capture review-pr.sh output, log tail on failurelib/agent-sdk.shlines 51, 82: capture claude exit code, log it (distinguish timeout/crash/success)lib/pr-lifecycle.shline 364: log pr_close failure with HTTP codelib/pr-lifecycle.shlines 403-405: log git rebase/push failure outputgardener-run.shmanifest API calls: log HTTP status on failure (replace -sf with -s -o /dev/null -w '%{http_code}')B. Standardize log format
Use a single log() function from lib/env.sh. All agents should use the same format and not redefine log(). Recommended format:
Where agent is the agent name (dev, review, gardener, etc.).
C. Standardize log destinations
All agents should log to
${DISINTO_LOG_DIR}/<agent>/<agent>.log. Never to$SCRIPT_DIR(read-only in containers).Affected files
Acceptance criteria
fix: review-poll discards review-pr.sh error output — failures have no contextto fix: standardize logging across all agents — capture errors, log exit codes, consistent format