From 398c618cc4828bdc2ae83eb4f43ff7b071a4772b Mon Sep 17 00:00:00 2001 From: Claude Date: Fri, 10 Apr 2026 16:29:06 +0000 Subject: [PATCH] =?UTF-8?q?fix:=20fix:=20make=20=5Fgenerate=5Fcompose=5Fim?= =?UTF-8?q?pl=20the=20canonical=20compose=20source=20=E2=80=94=20remove=20?= =?UTF-8?q?tracked=20docker-compose.yml=20+=20update=20docs=20(#603)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- .gitignore | 3 + README.md | 4 +- disinto-factory/lessons-learned.md | 62 ++++++++---- docker-compose.yml | 154 ----------------------------- docs/updating-factory.md | 51 +++------- lib/generators.sh | 3 + 6 files changed, 69 insertions(+), 208 deletions(-) delete mode 100644 docker-compose.yml diff --git a/.gitignore b/.gitignore index fc2d715..be3ca82 100644 --- a/.gitignore +++ b/.gitignore @@ -28,3 +28,6 @@ secrets/ # Pre-built binaries for Docker builds (avoid network calls during build) docker/agents/bin/ + +# Generated docker-compose.yml (run 'bin/disinto init' to regenerate) +docker-compose.yml diff --git a/README.md b/README.md index 836d74c..bea1caf 100644 --- a/README.md +++ b/README.md @@ -72,6 +72,8 @@ cd disinto disinto init https://github.com/yourorg/yourproject ``` +This will generate a `docker-compose.yml` file. + Or configure manually — edit `.env` with your values: ```bash @@ -97,7 +99,7 @@ CLAUDE_TIMEOUT=7200 # max seconds per Claude invocation (default: 2h) docker compose up -d # 4. 
Verify the entrypoint loop is running -docker exec disinto-agents-1 tail -f /home/agent/data/agent-entrypoint.log +docker exec disinto-agents tail -f /home/agent/data/agent-entrypoint.log ``` ## Directory Structure diff --git a/disinto-factory/lessons-learned.md b/disinto-factory/lessons-learned.md index 9f618cb..398f108 100644 --- a/disinto-factory/lessons-learned.md +++ b/disinto-factory/lessons-learned.md @@ -1,35 +1,63 @@ # Lessons learned -## Remediation & deployment +## Debugging & Diagnostics -**Escalate gradually.** Cheapest fix first, re-measure, escalate only if it persists. Single-shot fixes are either too weak or cause collateral damage. +**Map the environment before changing code.** Silent failures often stem from runtime assumptions: missing paths, wrong user context, or unmet prerequisites. Verify the actual environment first. -**Parameterize deployment boundaries.** Entrypoint references to a specific project name are config values waiting to escape. `${VAR:-default}` preserves compat and unlocks reuse. +**Silent termination is a logging failure.** When a script exits non-zero with no output, the bug is in error handling, not the command. Log at operation entry points, not just on success. -**Fail loudly over silent defaults.** A fatal error with a clear message beats a wrong default that appears to work. +**Pipefail is not a silver bullet.** It propagates exit codes but doesn't guarantee visibility. Pair with explicit error logging for external commands (git, curl, etc.). -**Audit the whole file when fixing one value.** Hardcoded assumptions cluster. Fixing one while leaving siblings produces multi-commit churn. +**Debug the pattern, not the symptom.** If one HTTP call fails with 403, audit all similar calls. If one script has a bug, find where the same code is duplicated. -## Documentation +## Shell Scripting Patterns -**Per-context rewrites, not batch replacement.** Each doc mention sits in a different narrative.
Blanket substitution produces awkward text. +**Exit codes don't indicate output.** Commands like `grep -c` exit 1 when count is 0 but still output a number. Test both output and exit status independently. -**Search for implicit references too.** After keyword matches, check for instructions that assume the old mechanism without naming it. +**The `||` pattern is fragile.** It appends on failure, doesn't replace output. Use command grouping or conditionals when output clarity matters. -## Code review +**Arithmetic contexts are unforgiving.** `(( ))` fails on anything non-numeric. A stray newline or extra digit breaks everything. -**Approval means "safe to ship," not "how I'd write it."** Distinguish "wrong" from "different" — only the former blocks. +**Source file boundaries matter.** Variables defined in sourced files are visible in the sourcing shell but are not passed to child processes unless exported. Trace the lifecycle: definition → export → usage. -**Scale scrutiny to blast radius.** A targeted fix warrants less ceremony than a cross-cutting refactor. +## Environment & Deployment -**Be specific; separate blockers from preferences.** Concrete observations invite fixes; vague concerns invite debate. +**User context matters at every layer.** When using `gosu`/`su-exec`, ensure all file operations occur under the target user. Create resources with explicit `chown` before dropping privileges. -**Read diffs top-down: intent, behavior, edge cases.** Verify the change matches its stated goal before examining lines. +**Test under final runtime conditions.** Reproduce the exact user context the application will run under, not just "container runs." -## Issue authoring & retry +**Fail fast with actionable diagnostics.** Entrypoints should exit immediately on dependency failures with clear messages explaining *why* and *what to do*. -**Self-contained issue bodies.** The agent reads the body, not comments. On retry, update the body with exact error and fix guidance.
+**Throttle retry loops.** Infinite retries without backoff mask underlying problems and look identical to healthy startups. -**Clean stale branches before retry.** Old branches trigger recovery on stale code. Close PR, delete branch, relabel. +## API & Integration -**Diagnose CI failures externally.** The agent sees pass/fail, not logs. After repeated failures, read logs yourself and put findings in the issue. +**Validate semantic types, not just names.** Don't infer resource type from naming conventions. Explicitly resolve whether an identifier is a user, org, or team before constructing URLs. + +**403 errors can signal semantic mismatches.** When debugging auth failures, consider whether the request is going to the wrong resource type. + +**Auth failures are rarely isolated.** If one endpoint requires credentials, scan for other unauthenticated calls. Environment assumptions about public access commonly break. + +**Test against the most restrictive environment first.** If it works on a locked-down instance, it'll work everywhere. + +## State & Configuration + +**Idempotency requires state awareness.** Distinguish "needs setup" from "already configured." A naive always-rotate approach breaks reproducibility. + +**Audit the full dependency chain.** When modifying shared resources, trace all consumers. Embedded tokens create hidden coupling. + +**Check validity, not just existence.** Never assume a credential is invalid just because it exists. Verify expiry, permissions, or other validity criteria. + +**Conservative defaults become problematic defaults.** Timeouts and limits should reflect real-world expectations, not worst-case scenarios. When in doubt, start aggressive and fail fast. + +**Documentation and defaults must stay in sync.** When a default changes, docs should immediately reflect why. + +## Validation & Testing + +**Add validation after critical operations.** If a migration commits N commits, verify N commits exist afterward. 
The extra lines are cheaper than debugging incomplete work. + +**Integration tests should cover both paths.** Test org and user scenarios, empty inputs, and edge cases explicitly. + +**Reproduce with minimal examples.** Running the exact pipeline with test cases that trigger edge conditions catches bugs early. + +**Treat "works locally but not in production" as environmental, not code.** The bug is in assumptions about the runtime, not the logic itself. diff --git a/docker-compose.yml b/docker-compose.yml deleted file mode 100644 index d28529e..0000000 --- a/docker-compose.yml +++ /dev/null @@ -1,154 +0,0 @@ -version: "3.8" - -services: - agents: - build: - context: . - dockerfile: docker/agents/Dockerfile - image: disinto/agents:latest - container_name: disinto-agents - volumes: - - ./data/agents:/home/agent/data - - ./disinto:/home/agent/disinto:ro - - /usr/local/bin/claude:/usr/local/bin/claude:ro - environment: - - FORGE_URL=http://forgejo:3000 - - FORGE_TOKEN=${FORGE_TOKEN:-} - - FORGE_REVIEW_TOKEN=${FORGE_REVIEW_TOKEN:-} - - FORGE_GARDENER_TOKEN=${FORGE_GARDENER_TOKEN:-} - - FORGE_SUPERVISOR_TOKEN=${FORGE_SUPERVISOR_TOKEN:-} - - FORGE_PREDICTOR_TOKEN=${FORGE_PREDICTOR_TOKEN:-} - - FORGE_ARCHITECT_TOKEN=${FORGE_ARCHITECT_TOKEN:-} - - FORGE_VAULT_TOKEN=${FORGE_VAULT_TOKEN:-} - - FORGE_PLANNER_TOKEN=${FORGE_PLANNER_TOKEN:-} - - FORGE_BOT_USERNAMES=${FORGE_BOT_USERNAMES:-} - - WOODPECKER_TOKEN=${WOODPECKER_TOKEN:-} - - CLAUDE_TIMEOUT=${CLAUDE_TIMEOUT:-7200} - - CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=${CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC:-1} - - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY:-} - - FORGE_ADMIN_PASS=${FORGE_ADMIN_PASS:-} - - DISINTO_CONTAINER=1 - - DISINTO_AGENTS=review,gardener - depends_on: - - forgejo - - agents-llama: - build: - context: . 
- dockerfile: docker/agents/Dockerfile - image: disinto/agents-llama:latest - container_name: disinto-agents-llama - volumes: - - ./data/llama:/home/agent/data - - ./disinto:/home/agent/disinto:ro - - /usr/local/bin/claude:/usr/local/bin/claude:ro - environment: - - FORGE_URL=http://forgejo:3000 - - FORGE_TOKEN=${FORGE_TOKEN_LLAMA:-} - - FORGE_PASS=${FORGE_PASS_LLAMA:-} - - FORGE_SUPERVISOR_TOKEN=${FORGE_SUPERVISOR_TOKEN:-} - - FORGE_PREDICTOR_TOKEN=${FORGE_PREDICTOR_TOKEN:-} - - FORGE_ARCHITECT_TOKEN=${FORGE_ARCHITECT_TOKEN:-} - - FORGE_VAULT_TOKEN=${FORGE_VAULT_TOKEN:-} - - FORGE_PLANNER_TOKEN=${FORGE_PLANNER_TOKEN:-} - - FORGE_BOT_USERNAMES=${FORGE_BOT_USERNAMES:-} - - WOODPECKER_TOKEN=${WOODPECKER_TOKEN:-} - - CLAUDE_TIMEOUT=${CLAUDE_TIMEOUT:-7200} - - CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=${CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC:-1} - - CLAUDE_AUTOCOMPACT_PCT_OVERRIDE=60 - - CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS=1 - - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY:-} - - ANTHROPIC_BASE_URL=${ANTHROPIC_BASE_URL:-} - - FORGE_ADMIN_PASS=${FORGE_ADMIN_PASS:-} - - DISINTO_CONTAINER=1 - - DISINTO_AGENTS=dev - - PROJECT_TOML=projects/disinto.toml - - FORGE_REPO=${FORGE_REPO:-disinto-admin/disinto} - - POLL_INTERVAL=${POLL_INTERVAL:-300} - - AGENT_ROLES=dev - depends_on: - - forgejo - - runner: - image: disinto/agents:latest - profiles: ["runner"] - volumes: - - /var/run/docker.sock:/var/run/docker.sock - - /usr/local/bin/claude:/usr/local/bin/claude:ro - - ${HOME}/.claude:/home/agent/.claude - - ${HOME}/.claude.json:/home/agent/.claude.json:ro - entrypoint: ["bash", "/home/agent/disinto/docker/runner/entrypoint-runner.sh"] - environment: - - DISINTO_CONTAINER=1 - - FORGE_URL=${FORGE_URL:-} - - FORGE_TOKEN=${FORGE_TOKEN:-} - - FORGE_REPO=${FORGE_REPO:-disinto-admin/disinto} - - FORGE_OPS_REPO=${FORGE_OPS_REPO:-} - - PRIMARY_BRANCH=${PRIMARY_BRANCH:-main} - - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY:-} - - CLAUDE_MODEL=${CLAUDE_MODEL:-} - networks: - - default - - reproduce: 
- build: - context: . - dockerfile: docker/reproduce/Dockerfile - image: disinto-reproduce:latest - network_mode: host - profiles: ["reproduce"] - volumes: - - /var/run/docker.sock:/var/run/docker.sock - - agent-data:/home/agent/data - - project-repos:/home/agent/repos - - ${HOME}/.claude:/home/agent/.claude - - /usr/local/bin/claude:/usr/local/bin/claude:ro - - ${HOME}/.ssh:/home/agent/.ssh:ro - env_file: - - .env - - edge: - build: - context: docker/edge - dockerfile: Dockerfile - image: disinto/edge:latest - container_name: disinto-edge - volumes: - - /var/run/docker.sock:/var/run/docker.sock - - /usr/local/bin/claude:/usr/local/bin/claude:ro - - ${HOME}/.claude:/home/agent/.claude - - ${HOME}/.claude.json:/home/agent/.claude.json:ro - - disinto-logs:/opt/disinto-logs - - ./docker-compose.yml:/opt/docker-compose.yml:ro - - ./projects:/opt/disinto-projects:ro - environment: - - FORGE_SUPERVISOR_TOKEN=${FORGE_SUPERVISOR_TOKEN:-} - - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY:-} - - CLAUDE_MODEL=claude-sonnet-4-6 - - FORGE_TOKEN=${FORGE_TOKEN:-} - - FORGE_URL=http://forgejo:3000 - - DISINTO_CONTAINER=1 - - HOST_PROJECT_DIR=${HOST_PROJECT_DIR:-.} - - PROJECTS_DIR=/opt/disinto-projects - ports: - - "80:80" - - "443:443" - depends_on: - - forgejo - - forgejo: - image: codeberg.org/forgejo/forgejo:1 - container_name: disinto-forgejo - volumes: - - ./data/forgejo:/data - environment: - - FORGEJO__database__DB_TYPE=sqlite3 - - FORGEJO__service__REGISTER_EMAIL_CONFIRMATION=false - - FORGEJO__service__ENABLE_NOTIFY_MAIL=false - - FORGEJO__service__DISABLE_REGISTRATION=true - - FORGEJO__service__REQUIRE_SIGNIN_VIEW=true - ports: - - "3000:3000" - -volumes: - disinto-logs: diff --git a/docs/updating-factory.md b/docs/updating-factory.md index 51b17c8..07c156b 100644 --- a/docs/updating-factory.md +++ b/docs/updating-factory.md @@ -18,7 +18,12 @@ git stash # save any local fixes git merge devbox/main ``` -If merge conflicts on `docker-compose.yml`: delete it and regenerate in step 
3. +## Note: docker-compose.yml is generator-only + +The `docker-compose.yml` file is now generated exclusively by `bin/disinto init`. +The tracked file has been removed. If you have a local `docker-compose.yml` from +before this change, it is now "yours" and won't be touched by future updates. +To pick up generator improvements, delete the existing file and run `bin/disinto init`. ## Step 2: Preserve local config @@ -31,9 +36,9 @@ cp projects/harb.toml projects/harb.toml.backup cp docker-compose.override.yml docker-compose.override.yml.backup 2>/dev/null ``` -## Step 3: Regenerate docker-compose.yml (if needed) +## Step 3: Regenerate docker-compose.yml -Only needed if `generate_compose()` changed or the compose was deleted. +If `generate_compose()` changed or you need a fresh compose file: ```bash rm docker-compose.yml @@ -47,41 +52,15 @@ init errors out. ### Known post-regeneration fixes (until #429 lands) -The generated compose has several issues on LXD deployments: +Most generator issues have been fixed. The following items no longer apply: -**1. AppArmor (#492)** — Add to ALL services: -```bash -sed -i '/^ forgejo:/a\ security_opt:\n - apparmor=unconfined' docker-compose.yml -sed -i '/^ agents:/a\ security_opt:\n - apparmor=unconfined' docker-compose.yml -# repeat for: agents-llama, edge, woodpecker, woodpecker-agent, staging, reproduce -``` +- **AppArmor (#492)** — Fixed: all services now have `apparmor=unconfined` +- **Forgejo image tag (#493)** — Fixed: generator uses `forgejo:11.0` +- **Agent credential mounts (#495)** — Fixed: `.claude`, `.claude.json`, `.ssh`, and `project-repos` volumes are auto-generated +- **Repo path (#494)** — Not applicable: `projects/*.toml` files are gitignored and preserved -**2. Forgejo image tag (#493)**: -```bash -sed -i 's|forgejo/forgejo:.*|forgejo/forgejo:11.0|' docker-compose.yml -``` - -**3. 
Agent credential mounts (#495)** — Add to agents volumes: -```yaml -- ${HOME}/.claude:/home/agent/.claude -- ${HOME}/.claude.json:/home/agent/.claude.json:ro -- ${HOME}/.ssh:/home/agent/.ssh:ro -- project-repos:/home/agent/repos -``` - -**4. Repo path (#494)** — Fix `projects/harb.toml` if init overwrote it: -```bash -sed -i 's|repo_root.*=.*"/home/johba/harb"|repo_root = "/home/agent/repos/harb"|' projects/harb.toml -sed -i 's|ops_repo_root.*=.*"/home/johba/harb-ops"|ops_repo_root = "/home/agent/repos/harb-ops"|' projects/harb.toml -``` - -**5. Add missing volumes** to the `volumes:` section at the bottom: -```yaml -volumes: - project-repos: - project-repos-llama: - disinto-logs: -``` +If you need to add custom volumes, edit the generated `docker-compose.yml` directly. +It will not be overwritten by future `init` runs (the generator skips existing files). ## Step 4: Rebuild and restart diff --git a/lib/generators.sh b/lib/generators.sh index 4185753..4de088e 100644 --- a/lib/generators.sh +++ b/lib/generators.sh @@ -221,6 +221,9 @@ for name, config in agents.items(): } # Generate docker-compose.yml in the factory root. +# **CANONICAL SOURCE**: This generator is the single source of truth for docker-compose.yml. +# The tracked docker-compose.yml file has been removed. Operators must run 'bin/disinto init' +# to materialize a working stack on a fresh checkout. _generate_compose_impl() { local forge_port="${1:-3000}" local compose_file="${FACTORY_ROOT}/docker-compose.yml"