fix: make _generate_compose_impl the canonical compose source, remove tracked docker-compose.yml + update docs (#603)

Claude 2026-04-10 16:29:06 +00:00
parent 532ce257d5
commit 398c618cc4
6 changed files with 69 additions and 208 deletions

3
.gitignore vendored
@ -28,3 +28,6 @@ secrets/
# Pre-built binaries for Docker builds (avoid network calls during build)
docker/agents/bin/
# Generated docker-compose.yml (run 'bin/disinto init' to regenerate)
docker-compose.yml

@ -72,6 +72,8 @@ cd disinto
disinto init https://github.com/yourorg/yourproject
```
This will generate a `docker-compose.yml` file.
Or configure manually — edit `.env` with your values:
```bash
@ -97,7 +99,7 @@ CLAUDE_TIMEOUT=7200 # max seconds per Claude invocation (default: 2h)
docker compose up -d
# 4. Verify the entrypoint loop is running
docker exec disinto-agents-1 tail -f /home/agent/data/agent-entrypoint.log
docker exec disinto-agents tail -f /home/agent/data/agent-entrypoint.log
```
## Directory Structure

@ -1,35 +1,63 @@
# Lessons learned
## Remediation & deployment
## Debugging & Diagnostics
**Escalate gradually.** Cheapest fix first, re-measure, escalate only if it persists. Single-shot fixes are either too weak or cause collateral damage.
**Map the environment before changing code.** Silent failures often stem from runtime assumptions—missing paths, wrong user context, or unmet prerequisites. Verify the actual environment first.
**Parameterize deployment boundaries.** Entrypoint references to a specific project name are config values waiting to escape. `${VAR:-default}` preserves compat and unlocks reuse.
**Silent termination is a logging failure.** When a script exits non-zero with no output, the bug is in error handling, not the command. Log at operation entry points, not just on success.
**Fail loudly over silent defaults.** A fatal error with a clear message beats a wrong default that appears to work.
**Pipefail is not a silver bullet.** It propagates exit codes but doesn't guarantee visibility. Pair with explicit error logging for external commands (git, curl, etc.).
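A minimal sketch of the pairing described above, assuming Bash. The `run_logged` helper is hypothetical (it is not part of this repository): `pipefail` surfaces the failing status of a pipeline, while the wrapper adds the log line that would otherwise be missing.

```bash
#!/usr/bin/env bash
set -o pipefail

# Hypothetical helper: run a command and report on stderr if it fails.
run_logged() {
  local rc=0
  "$@" || rc=$?
  if [ "$rc" -ne 0 ]; then
    echo "ERROR: '$*' exited with status $rc" >&2
  fi
  return "$rc"
}

# pipefail makes the pipeline report the failure of `false`,
# yet nothing is printed to explain what went wrong.
if false | cat; then pipeline_rc=0; else pipeline_rc=$?; fi

# The wrapper keeps the same exit status and adds the missing log line.
logged_rc=0
log_line=$(run_logged false 2>&1) || logged_rc=$?
```

In a real script the wrapper would sit in front of external commands such as `git` or `curl`, for example `run_logged git fetch origin`.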
**Audit the whole file when fixing one value.** Hardcoded assumptions cluster. Fixing one while leaving siblings produces multi-commit churn.
**Debug the pattern, not the symptom.** If one HTTP call fails with 403, audit all similar calls. If one script has a bug, find where else it is duplicated.
## Documentation
## Shell Scripting Patterns
**Per-context rewrites, not batch replacement.** Each doc mention sits in a different narrative. Blanket substitution produces awkward text.
**Exit codes don't indicate output.** Commands like `grep -c` exit 1 when count is 0 but still output a number. Test both output and exit status independently.
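As an illustration, the sketch below captures the output and the exit status of `grep -c` separately. The temporary file and the variable names are placeholders chosen for the example.

```bash
# Sample input with no line matching 'gamma'.
tmpfile=$(mktemp)
printf 'alpha\nbeta\n' > "$tmpfile"

# grep -c prints the count in every case. It exits 0 when the count is
# nonzero, 1 when nothing matched, and greater than 1 on a real error.
if count=$(grep -c 'gamma' "$tmpfile"); then status=0; else status=$?; fi

if [ "$status" -gt 1 ]; then
  echo "grep failed with status $status" >&2
elif [ "$count" -eq 0 ]; then
  echo "no matches (count=$count, status=$status)"
else
  echo "matches: $count"
fi
rm -f "$tmpfile"
```

Capturing the status inside an `if` also keeps the script alive under `set -e`, where a bare `count=$(grep -c ...)` with zero matches would abort it.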
**Search for implicit references too.** After keyword matches, check for instructions that assume the old mechanism without naming it.
**The `||` pattern is fragile.** In `cmd || echo fallback`, the fallback is appended to whatever `cmd` already printed; it does not replace that output. Use command grouping or conditionals when output clarity matters.
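The difference can be seen with `grep -c`, which prints `0` before exiting non-zero. This is a sketch with placeholder names; the conditional form is one possible alternative, not the only one.

```bash
tmpfile=$(mktemp)
printf 'alpha\nbeta\n' > "$tmpfile"

# Fragile: grep -c has already printed "0" when it exits 1, so the
# fallback echo adds a second line instead of replacing the first.
fragile=$(grep -c 'gamma' "$tmpfile" || echo 0)

# Sturdier: capture first, then apply a default only if nothing was printed.
if robust=$(grep -c 'gamma' "$tmpfile"); then
  :  # at least one match; the count is already in $robust
else
  robust=${robust:-0}
fi
rm -f "$tmpfile"
```

Here `fragile` holds two lines (`0` twice) while `robust` holds a single `0`.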
## Code review
**Arithmetic contexts are unforgiving.** `(( ))` fails on anything non-numeric. A stray newline or extra digit breaks everything.
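One way to guard an arithmetic context is to validate the operand first. The sketch below assumes Bash; `is_uint` and `above_threshold` are hypothetical helpers introduced only for this example.

```bash
#!/usr/bin/env bash

# Hypothetical helper: succeed only for a non-empty string of digits.
is_uint() {
  case $1 in
    '' | *[!0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Hypothetical helper: compare only after validating the operand.
# Returns 0 if value > threshold, 1 if not, 2 if value is not an integer.
above_threshold() {
  local value="$1" threshold="$2"
  if ! is_uint "$value"; then
    echo "ERROR: expected an integer, got: $(printf '%q' "$value")" >&2
    return 2
  fi
  (( value > threshold ))
}
```

A call such as `above_threshold "$count" 5` then rejects a two-line value with a clear message instead of failing inside `(( ))`.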
**Approval means "safe to ship," not "how I'd write it."** Distinguish "wrong" from "different" — only the former blocks.
**Source file boundaries matter.** Variables defined in sourced files are visible in the sourcing shell but not in child processes unless exported. Trace the lifecycle: definition → export → usage.
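A small sketch of that boundary: a plain assignment in a sourced file reaches the sourcing shell, but only an exported one reaches a child process. The file and the variable names are made up for the example.

```bash
# Stand-in for a sourced config or library file.
conf=$(mktemp)
cat > "$conf" <<'EOF'
PLAIN_SETTING=from-file
export EXPORTED_SETTING=from-file
EOF

. "$conf"

# Visible in the sourcing shell whether exported or not.
in_shell="$PLAIN_SETTING"

# A child process only sees what was exported.
child_plain=$(sh -c 'printf "%s" "${PLAIN_SETTING:-unset}"')
child_exported=$(sh -c 'printf "%s" "${EXPORTED_SETTING:-unset}"')
rm -f "$conf"
```

After this runs, `in_shell` and `child_exported` hold `from-file`, while `child_plain` holds `unset`.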
**Scale scrutiny to blast radius.** A targeted fix warrants less ceremony than a cross-cutting refactor.
## Environment & Deployment
**Be specific; separate blockers from preferences.** Concrete observations invite fixes; vague concerns invite debate.
**User context matters at every layer.** When using `gosu`/`su-exec`, ensure all file operations occur under the target user. Create resources with explicit `chown` before dropping privileges.
**Read diffs top-down: intent, behavior, edge cases.** Verify the change matches its stated goal before examining lines.
**Test under final runtime conditions.** Reproduce the exact user context the application will run under, not just "container runs."
## Issue authoring & retry
**Fail fast with actionable diagnostics.** Entrypoints should exit immediately on dependency failures with clear messages explaining *why* and *what to do*.
**Self-contained issue bodies.** The agent reads the body, not comments. On retry, update the body with exact error and fix guidance.
**Throttle retry loops.** Infinite retries without backoff mask underlying problems and look identical to healthy startups.
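A bounded retry with exponential backoff might look like the sketch below. `retry_with_backoff` is a hypothetical helper, and the attempt limit and delays are illustrative values rather than settings from this project.

```bash
# Hypothetical helper: retry a command with exponential backoff, then give up.
# Usage: retry_with_backoff <max_attempts> <initial_delay_seconds> <command...>
retry_with_backoff() {
  local max_attempts="$1" delay="$2" attempt=1
  shift 2
  while :; do
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay}s: $*" >&2
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}
```

For example, `retry_with_backoff 5 2 curl -fsS "$FORGE_URL"` would wait 2, 4, 8 and 16 seconds between attempts and then exit non-zero with a message, so a persistent failure no longer looks like a slow startup.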
**Clean stale branches before retry.** Old branches trigger recovery on stale code. Close PR, delete branch, relabel.
## API & Integration
**Diagnose CI failures externally.** The agent sees pass/fail, not logs. After repeated failures, read logs yourself and put findings in the issue.
**Validate semantic types, not just names.** Don't infer resource type from naming conventions. Explicitly resolve whether an identifier is a user, org, or team before constructing URLs.
**403 errors can signal semantic mismatches.** When debugging auth failures, consider whether the request is going to the wrong resource type.
**Auth failures are rarely isolated.** If one endpoint requires credentials, scan for other unauthenticated calls. Environment assumptions about public access commonly break.
**Test against the most restrictive environment first.** If it works on a locked-down instance, it'll work everywhere.
## State & Configuration
**Idempotency requires state awareness.** Distinguish "needs setup" from "already configured." A naive always-rotate approach breaks reproducibility.
**Audit the full dependency chain.** When modifying shared resources, trace all consumers. Embedded tokens create hidden coupling.
**Check validity, not just existence.** Never assume a credential is invalid just because it exists. Verify expiry, permissions, or other validity criteria.
**Conservative defaults become problematic defaults.** Timeouts and limits should reflect real-world expectations, not worst-case scenarios. When in doubt, start aggressive and fail fast.
**Documentation and defaults must stay in sync.** When a default changes, docs should immediately reflect why.
## Validation & Testing
**Add validation after critical operations.** If a migration commits N commits, verify N commits exist afterward. The extra lines are cheaper than debugging incomplete work.
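The check itself can be very small. In the sketch below, `verify_count` is a hypothetical helper and the file-writing loop is only a stand-in for a real migration.

```bash
# Hypothetical helper: fail loudly when an operation produced a different
# number of results than it was supposed to.
verify_count() {
  local label="$1" want="$2" got="$3"
  if [ "$got" -ne "$want" ]; then
    echo "ERROR: $label: expected $want, found $got" >&2
    return 1
  fi
}

# Stand-in for a migration: write three records, then check all three landed.
workdir=$(mktemp -d)
expected=3
for i in 1 2 3; do
  echo "record $i" > "$workdir/record-$i.txt"
done
actual=$(find "$workdir" -type f -name 'record-*.txt' | wc -l | tr -d ' ')

verify_rc=0
verify_count "migrated records" "$expected" "$actual" || verify_rc=$?
rm -rf "$workdir"
```

For the commit case mentioned above, the actual count could come from `git rev-list --count <base>..HEAD`, with `<base>` being whatever ref the migration started from.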
**Integration tests should cover both paths.** Test org and user scenarios, empty inputs, and edge cases explicitly.
**Reproduce with minimal examples.** Running the exact pipeline with test cases that trigger edge conditions catches bugs early.
**Treat "works locally but not in production" as environmental, not code.** The bug is in assumptions about the runtime, not the logic itself.

@ -1,154 +0,0 @@
version: "3.8"
services:
agents:
build:
context: .
dockerfile: docker/agents/Dockerfile
image: disinto/agents:latest
container_name: disinto-agents
volumes:
- ./data/agents:/home/agent/data
- ./disinto:/home/agent/disinto:ro
- /usr/local/bin/claude:/usr/local/bin/claude:ro
environment:
- FORGE_URL=http://forgejo:3000
- FORGE_TOKEN=${FORGE_TOKEN:-}
- FORGE_REVIEW_TOKEN=${FORGE_REVIEW_TOKEN:-}
- FORGE_GARDENER_TOKEN=${FORGE_GARDENER_TOKEN:-}
- FORGE_SUPERVISOR_TOKEN=${FORGE_SUPERVISOR_TOKEN:-}
- FORGE_PREDICTOR_TOKEN=${FORGE_PREDICTOR_TOKEN:-}
- FORGE_ARCHITECT_TOKEN=${FORGE_ARCHITECT_TOKEN:-}
- FORGE_VAULT_TOKEN=${FORGE_VAULT_TOKEN:-}
- FORGE_PLANNER_TOKEN=${FORGE_PLANNER_TOKEN:-}
- FORGE_BOT_USERNAMES=${FORGE_BOT_USERNAMES:-}
- WOODPECKER_TOKEN=${WOODPECKER_TOKEN:-}
- CLAUDE_TIMEOUT=${CLAUDE_TIMEOUT:-7200}
- CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=${CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC:-1}
- ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY:-}
- FORGE_ADMIN_PASS=${FORGE_ADMIN_PASS:-}
- DISINTO_CONTAINER=1
- DISINTO_AGENTS=review,gardener
depends_on:
- forgejo
agents-llama:
build:
context: .
dockerfile: docker/agents/Dockerfile
image: disinto/agents-llama:latest
container_name: disinto-agents-llama
volumes:
- ./data/llama:/home/agent/data
- ./disinto:/home/agent/disinto:ro
- /usr/local/bin/claude:/usr/local/bin/claude:ro
environment:
- FORGE_URL=http://forgejo:3000
- FORGE_TOKEN=${FORGE_TOKEN_LLAMA:-}
- FORGE_PASS=${FORGE_PASS_LLAMA:-}
- FORGE_SUPERVISOR_TOKEN=${FORGE_SUPERVISOR_TOKEN:-}
- FORGE_PREDICTOR_TOKEN=${FORGE_PREDICTOR_TOKEN:-}
- FORGE_ARCHITECT_TOKEN=${FORGE_ARCHITECT_TOKEN:-}
- FORGE_VAULT_TOKEN=${FORGE_VAULT_TOKEN:-}
- FORGE_PLANNER_TOKEN=${FORGE_PLANNER_TOKEN:-}
- FORGE_BOT_USERNAMES=${FORGE_BOT_USERNAMES:-}
- WOODPECKER_TOKEN=${WOODPECKER_TOKEN:-}
- CLAUDE_TIMEOUT=${CLAUDE_TIMEOUT:-7200}
- CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=${CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC:-1}
- CLAUDE_AUTOCOMPACT_PCT_OVERRIDE=60
- CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS=1
- ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY:-}
- ANTHROPIC_BASE_URL=${ANTHROPIC_BASE_URL:-}
- FORGE_ADMIN_PASS=${FORGE_ADMIN_PASS:-}
- DISINTO_CONTAINER=1
- DISINTO_AGENTS=dev
- PROJECT_TOML=projects/disinto.toml
- FORGE_REPO=${FORGE_REPO:-disinto-admin/disinto}
- POLL_INTERVAL=${POLL_INTERVAL:-300}
- AGENT_ROLES=dev
depends_on:
- forgejo
runner:
image: disinto/agents:latest
profiles: ["runner"]
volumes:
- /var/run/docker.sock:/var/run/docker.sock
- /usr/local/bin/claude:/usr/local/bin/claude:ro
- ${HOME}/.claude:/home/agent/.claude
- ${HOME}/.claude.json:/home/agent/.claude.json:ro
entrypoint: ["bash", "/home/agent/disinto/docker/runner/entrypoint-runner.sh"]
environment:
- DISINTO_CONTAINER=1
- FORGE_URL=${FORGE_URL:-}
- FORGE_TOKEN=${FORGE_TOKEN:-}
- FORGE_REPO=${FORGE_REPO:-disinto-admin/disinto}
- FORGE_OPS_REPO=${FORGE_OPS_REPO:-}
- PRIMARY_BRANCH=${PRIMARY_BRANCH:-main}
- ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY:-}
- CLAUDE_MODEL=${CLAUDE_MODEL:-}
networks:
- default
reproduce:
build:
context: .
dockerfile: docker/reproduce/Dockerfile
image: disinto-reproduce:latest
network_mode: host
profiles: ["reproduce"]
volumes:
- /var/run/docker.sock:/var/run/docker.sock
- agent-data:/home/agent/data
- project-repos:/home/agent/repos
- ${HOME}/.claude:/home/agent/.claude
- /usr/local/bin/claude:/usr/local/bin/claude:ro
- ${HOME}/.ssh:/home/agent/.ssh:ro
env_file:
- .env
edge:
build:
context: docker/edge
dockerfile: Dockerfile
image: disinto/edge:latest
container_name: disinto-edge
volumes:
- /var/run/docker.sock:/var/run/docker.sock
- /usr/local/bin/claude:/usr/local/bin/claude:ro
- ${HOME}/.claude:/home/agent/.claude
- ${HOME}/.claude.json:/home/agent/.claude.json:ro
- disinto-logs:/opt/disinto-logs
- ./docker-compose.yml:/opt/docker-compose.yml:ro
- ./projects:/opt/disinto-projects:ro
environment:
- FORGE_SUPERVISOR_TOKEN=${FORGE_SUPERVISOR_TOKEN:-}
- ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY:-}
- CLAUDE_MODEL=claude-sonnet-4-6
- FORGE_TOKEN=${FORGE_TOKEN:-}
- FORGE_URL=http://forgejo:3000
- DISINTO_CONTAINER=1
- HOST_PROJECT_DIR=${HOST_PROJECT_DIR:-.}
- PROJECTS_DIR=/opt/disinto-projects
ports:
- "80:80"
- "443:443"
depends_on:
- forgejo
forgejo:
image: codeberg.org/forgejo/forgejo:1
container_name: disinto-forgejo
volumes:
- ./data/forgejo:/data
environment:
- FORGEJO__database__DB_TYPE=sqlite3
- FORGEJO__service__REGISTER_EMAIL_CONFIRMATION=false
- FORGEJO__service__ENABLE_NOTIFY_MAIL=false
- FORGEJO__service__DISABLE_REGISTRATION=true
- FORGEJO__service__REQUIRE_SIGNIN_VIEW=true
ports:
- "3000:3000"
volumes:
disinto-logs:

@ -18,7 +18,12 @@ git stash # save any local fixes
git merge devbox/main
```
If merge conflicts on `docker-compose.yml`: delete it and regenerate in step 3.
## Note: docker-compose.yml is generator-only
The `docker-compose.yml` file is now generated exclusively by `bin/disinto init`.
The tracked file has been removed. If you have a local `docker-compose.yml` from
before this change, it is now "yours" and won't be touched by future updates.
To pick up generator improvements, delete the existing file and run `bin/disinto init`.
## Step 2: Preserve local config
@ -31,9 +36,9 @@ cp projects/harb.toml projects/harb.toml.backup
cp docker-compose.override.yml docker-compose.override.yml.backup 2>/dev/null
```
## Step 3: Regenerate docker-compose.yml (if needed)
## Step 3: Regenerate docker-compose.yml
Only needed if `generate_compose()` changed or the compose was deleted.
If `generate_compose()` changed or you need a fresh compose file:
```bash
rm docker-compose.yml
@ -47,41 +52,15 @@ init errors out.
### Known post-regeneration fixes (until #429 lands)
The generated compose has several issues on LXD deployments:
Most generator issues have been fixed. The following items no longer apply:
**1. AppArmor (#492)** — Add to ALL services:
```bash
sed -i '/^ forgejo:/a\ security_opt:\n - apparmor=unconfined' docker-compose.yml
sed -i '/^ agents:/a\ security_opt:\n - apparmor=unconfined' docker-compose.yml
# repeat for: agents-llama, edge, woodpecker, woodpecker-agent, staging, reproduce
```
- **AppArmor (#492)** — Fixed: all services now have `apparmor=unconfined`
- **Forgejo image tag (#493)** — Fixed: generator uses `forgejo:11.0`
- **Agent credential mounts (#495)** — Fixed: `.claude`, `.claude.json`, `.ssh`, and `project-repos` volumes are auto-generated
- **Repo path (#494)** — Not applicable: `projects/*.toml` files are gitignored and preserved
**2. Forgejo image tag (#493)**:
```bash
sed -i 's|forgejo/forgejo:.*|forgejo/forgejo:11.0|' docker-compose.yml
```
**3. Agent credential mounts (#495)** — Add to agents volumes:
```yaml
- ${HOME}/.claude:/home/agent/.claude
- ${HOME}/.claude.json:/home/agent/.claude.json:ro
- ${HOME}/.ssh:/home/agent/.ssh:ro
- project-repos:/home/agent/repos
```
**4. Repo path (#494)** — Fix `projects/harb.toml` if init overwrote it:
```bash
sed -i 's|repo_root.*=.*"/home/johba/harb"|repo_root = "/home/agent/repos/harb"|' projects/harb.toml
sed -i 's|ops_repo_root.*=.*"/home/johba/harb-ops"|ops_repo_root = "/home/agent/repos/harb-ops"|' projects/harb.toml
```
**5. Add missing volumes** to the `volumes:` section at the bottom:
```yaml
volumes:
project-repos:
project-repos-llama:
disinto-logs:
```
If you need to add custom volumes, edit the generated `docker-compose.yml` directly.
It will not be overwritten by future `init` runs (the generator skips existing files).
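The skip-if-present behaviour can be pictured with the sketch below. `generate_if_missing` is a hypothetical stand-in modelled on the description above, not the actual `_generate_compose_impl`, and the file content it writes is a placeholder.

```bash
# Hypothetical stand-in for the generator: write the file only if it is absent.
generate_if_missing() {
  local target="$1"
  if [ -e "$target" ]; then
    echo "skip: $target already exists" >&2
    return 0
  fi
  # Placeholder content; the real generator emits the full service list.
  printf 'services: {}\n' > "$target"
}
```

Under this assumption, running the function twice leaves local edits in place, and deleting the file before the next run is what produces a fresh copy.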
## Step 4: Rebuild and restart

@ -221,6 +221,9 @@ for name, config in agents.items():
}
# Generate docker-compose.yml in the factory root.
# **CANONICAL SOURCE**: This generator is the single source of truth for docker-compose.yml.
# The tracked docker-compose.yml file has been removed. Operators must run 'bin/disinto init'
# to materialize a working stack on a fresh checkout.
_generate_compose_impl() {
local forge_port="${1:-3000}"
local compose_file="${FACTORY_ROOT}/docker-compose.yml"