fix: fix: make _generate_compose_impl the canonical compose source — remove tracked docker-compose.yml + update docs (#603)

2026-04-10 16:29:06 +00:00 · 2026-04-10 16:29:06 +00:00 · 398c618cc4
commit 398c618cc4
parent 532ce257d5
6 changed files with 69 additions and 208 deletions
--- a/disinto-factory/lessons-learned.md
+++ b/disinto-factory/lessons-learned.md
@ -1,35 +1,63 @@
 # Lessons learned

-## Remediation & deployment
+## Debugging & Diagnostics

-**Escalate gradually.** Cheapest fix first, re-measure, escalate only if it persists. Single-shot fixes are either too weak or cause collateral damage.
+**Map the environment before changing code.** Silent failures often stem from runtime assumptions—missing paths, wrong user context, or unmet prerequisites. Verify the actual environment first.

-**Parameterize deployment boundaries.** Entrypoint references to a specific project name are config values waiting to escape. `${VAR:-default}` preserves compat and unlocks reuse.
+**Silent termination is a logging failure.** When a script exits non-zero with no output, the bug is in error handling, not the command. Log at operation entry points, not just on success.

-**Fail loudly over silent defaults.** A fatal error with a clear message beats a wrong default that appears to work.
+**Pipefail is not a silver bullet.** It propagates exit codes but doesn't guarantee visibility. Pair with explicit error logging for external commands (git, curl, etc.).

-**Audit the whole file when fixing one value.** Hardcoded assumptions cluster. Fixing one while leaving siblings produces multi-commit churn.
+**Debug the pattern, not the symptom.** If one HTTP call fails with 403, audit all similar calls. If one script has the same bug, find where it's duplicated.

-## Documentation
+## Shell Scripting Patterns

-**Per-context rewrites, not batch replacement.** Each doc mention sits in a different narrative. Blanket substitution produces awkward text.
+**Exit codes don't indicate output.** Commands like `grep -c` exit 1 when count is 0 but still output a number. Test both output and exit status independently.

-**Search for implicit references too.** After keyword matches, check for instructions that assume the old mechanism without naming it.
+**The `||` pattern is fragile.** It appends on failure, doesn't replace output. Use command grouping or conditionals when output clarity matters.

-## Code review
+**Arithmetic contexts are unforgiving.** `(( ))` fails on anything non-numeric. A stray newline or extra digit breaks everything.

-**Approval means "safe to ship," not "how I'd write it."** Distinguish "wrong" from "different" — only the former blocks.
+**Source file boundaries matter.** Variables defined in sourced files are local unless exported. Trace the lifecycle: definition → export → usage.

-**Scale scrutiny to blast radius.** A targeted fix warrants less ceremony than a cross-cutting refactor.
+## Environment & Deployment

-**Be specific; separate blockers from preferences.** Concrete observations invite fixes; vague concerns invite debate.
+**User context matters at every layer.** When using `gosu`/`su-exec`, ensure all file operations occur under the target user. Create resources with explicit `chown` before dropping privileges.

-**Read diffs top-down: intent, behavior, edge cases.** Verify the change matches its stated goal before examining lines.
+**Test under final runtime conditions.** Reproduce the exact user context the application will run under, not just "container runs."

-## Issue authoring & retry
+**Fail fast with actionable diagnostics.** Entrypoints should exit immediately on dependency failures with clear messages explaining *why* and *what to do*.

-**Self-contained issue bodies.** The agent reads the body, not comments. On retry, update the body with exact error and fix guidance.
+**Throttle retry loops.** Infinite retries without backoff mask underlying problems and look identical to healthy startups.

-**Clean stale branches before retry.** Old branches trigger recovery on stale code. Close PR, delete branch, relabel.
+## API & Integration

-**Diagnose CI failures externally.** The agent sees pass/fail, not logs. After repeated failures, read logs yourself and put findings in the issue.
+**Validate semantic types, not just names.** Don't infer resource type from naming conventions. Explicitly resolve whether an identifier is a user, org, or team before constructing URLs.
+
+**403 errors can signal semantic mismatches.** When debugging auth failures, consider whether the request is going to the wrong resource type.
+
+**Auth failures are rarely isolated.** If one endpoint requires credentials, scan for other unauthenticated calls. Environment assumptions about public access commonly break.
+
+**Test against the most restrictive environment first.** If it works on a locked-down instance, it'll work everywhere.
+
+## State & Configuration
+
+**Idempotency requires state awareness.** Distinguish "needs setup" from "already configured." A naive always-rotate approach breaks reproducibility.
+
+**Audit the full dependency chain.** When modifying shared resources, trace all consumers. Embedded tokens create hidden coupling.
+
+**Check validity, not just existence.** Never assume a credential is invalid just because it exists. Verify expiry, permissions, or other validity criteria.
+
+**Conservative defaults become problematic defaults.** Timeouts and limits should reflect real-world expectations, not worst-case scenarios. When in doubt, start aggressive and fail fast.
+
+**Documentation and defaults must stay in sync.** When a default changes, docs should immediately reflect why.
+
+## Validation & Testing
+
+**Add validation after critical operations.** If a migration commits N commits, verify N commits exist afterward. The extra lines are cheaper than debugging incomplete work.
+
+**Integration tests should cover both paths.** Test org and user scenarios, empty inputs, and edge cases explicitly.
+
+**Reproduce with minimal examples.** Running the exact pipeline with test cases that trigger edge conditions catches bugs early.
+
+**Treat "works locally but not in production" as environmental, not code.** The bug is in assumptions about the runtime, not the logic itself.