diff --git a/.woodpecker/agent-smoke.sh b/.woodpecker/agent-smoke.sh index 9a37bf4..94e9258 100644 --- a/.woodpecker/agent-smoke.sh +++ b/.woodpecker/agent-smoke.sh @@ -199,9 +199,9 @@ check_script lib/ci-debug.sh check_script lib/parse-deps.sh # Agent scripts — list cross-sourced files where function scope flows across files. -# phase-handler.sh defines default callback stubs; sourcing agents may override. +# phase-handler.sh calls helpers defined by its sourcing agent (action-agent.sh). check_script dev/dev-agent.sh -check_script dev/phase-handler.sh lib/secret-scan.sh +check_script dev/phase-handler.sh action/action-agent.sh lib/secret-scan.sh check_script dev/dev-poll.sh check_script dev/phase-test.sh check_script gardener/gardener-run.sh @@ -215,7 +215,7 @@ check_script vault/vault-fire.sh check_script vault/vault-poll.sh check_script vault/vault-reject.sh check_script action/action-poll.sh -check_script action/action-agent.sh +check_script action/action-agent.sh dev/phase-handler.sh check_script supervisor/supervisor-run.sh check_script supervisor/preflight.sh check_script predictor/predictor-run.sh diff --git a/AGENTS.md b/AGENTS.md index ffc5561..4d5a91f 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -8,7 +8,7 @@ gardener, supervisor, planner, predictor, action, vault) that pick up issues fro implement them, review PRs, plan from the vision, gate dangerous actions, and keep the system healthy — all via cron and `claude -p`. -See `README.md` for the full architecture and `disinto-factory/SKILL.md` for setup. +See `README.md` for the full architecture and `BOOTSTRAP.md` for setup. ## Directory layout diff --git a/BOOTSTRAP.md b/BOOTSTRAP.md new file mode 100644 index 0000000..80e7408 --- /dev/null +++ b/BOOTSTRAP.md @@ -0,0 +1,460 @@ +# Bootstrapping a New Project + +How to point disinto at a new target project and get all agents running. + +## Prerequisites + +Before starting, ensure you have: + +- [ ] A **git repo** (GitHub, Codeberg, or any URL) with at least one issue labeled `backlog` +- [ ] A **Woodpecker CI** pipeline (`.woodpecker/` dir with at least one `.yml`) +- [ ] **Docker** installed (for local Forgejo provisioning) — or a running Forgejo instance +- [ ] A **local clone** of the target repo on the same machine as disinto +- [ ] `claude` CLI installed and authenticated (`claude --version`) +- [ ] `tmux` installed (`tmux -V`) — required for persistent dev sessions (issue #80+) + +## Quick Start + +The fastest path is `disinto init`, which provisions a local Forgejo instance, creates bot users and tokens, clones the repo, and sets up cron — all in one command: + +```bash +disinto init https://github.com/org/repo +``` + +This will: +1. Start a local Forgejo instance via Docker (at `http://localhost:3000`) +2. Create admin + bot users (dev-bot, review-bot) with API tokens +3. Create the repo on Forgejo and push your code +4. Generate a `projects/.toml` config +5. Create standard labels (backlog, in-progress, blocked, etc.) +6. Install cron entries for the agents + +No external accounts or tokens needed. + +## 1. Secret Management (SOPS + age) + +Disinto encrypts secrets at rest using [SOPS](https://github.com/getsops/sops) with [age](https://age-encryption.org/) encryption. When `sops` and `age` are installed, `disinto init` automatically: + +1. Generates an age key at `~/.config/sops/age/keys.txt` (if none exists) +2. Creates `.sops.yaml` pinning the age public key +3. Encrypts all secrets into `.env.enc` (safe to commit) +4. Removes the plaintext `.env` + +**Install the tools:** + +```bash +# age (key generation) +apt install age # Debian/Ubuntu +brew install age # macOS + +# sops (encryption/decryption) +# Download from https://github.com/getsops/sops/releases +``` + +**The age private key** at `~/.config/sops/age/keys.txt` is the single file that must be protected. Back it up securely — without it, `.env.enc` cannot be decrypted. LUKS disk encryption on the VPS protects this key at rest. + +**Managing secrets after setup:** + +```bash +disinto secrets edit # Opens .env.enc in $EDITOR, re-encrypts on save +disinto secrets show # Prints decrypted secrets (for debugging) +disinto secrets migrate # Converts existing plaintext .env -> .env.enc +``` + +**Fallback:** If `sops`/`age` are not installed, `disinto init` writes secrets to a plaintext `.env` file with a warning. All agents load secrets transparently — `lib/env.sh` checks for `.env.enc` first, then falls back to `.env`. + +## 2. Configure `.env` + +```bash +cp .env.example .env +``` + +Fill in: + +```bash +# ── Forge (auto-populated by disinto init) ───────────────── +FORGE_URL=http://localhost:3000 # local Forgejo instance +FORGE_TOKEN= # dev-bot token (auto-generated) +FORGE_REVIEW_TOKEN= # review-bot token (auto-generated) + +# ── Woodpecker CI ─────────────────────────────────────────── +WOODPECKER_TOKEN=tok_xxxxxxxx +WOODPECKER_SERVER=http://localhost:8000 +# WOODPECKER_REPO_ID — now per-project, set in projects/*.toml [ci] section + +# Woodpecker Postgres (for direct pipeline queries) +WOODPECKER_DB_PASSWORD=secret +WOODPECKER_DB_USER=woodpecker +WOODPECKER_DB_HOST=127.0.0.1 +WOODPECKER_DB_NAME=woodpecker + +# ── Tuning ────────────────────────────────────────────────── +CLAUDE_TIMEOUT=7200 # seconds per Claude invocation +``` + +### Backwards compatibility + +If you have an existing deployment using `CODEBERG_TOKEN` / `REVIEW_BOT_TOKEN` in `.env`, those still work — `env.sh` falls back to the old names automatically. No migration needed. + +## 3. Configure Project TOML + +Each project needs a `projects/.toml` file with box-specific settings +(absolute paths, Woodpecker CI IDs, forge URL). These files are +**gitignored** — they are local installation config, not shared code. + +To create one: + +```bash +# Automatic — generates TOML, clones repo, sets up cron: +disinto init https://github.com/org/repo + +# Manual — copy a template and fill in your values: +cp projects/myproject.toml.example projects/myproject.toml +vim projects/myproject.toml +``` + +The `forge_url` field in the TOML tells all agents where to find the forge API: + +```toml +name = "myproject" +repo = "org/myproject" +forge_url = "http://localhost:3000" +``` + +The repo ships `projects/*.toml.example` templates showing the expected +structure. See any `.toml.example` file for the full field reference. + +## 4. Claude Code Global Settings + +Configure `~/.claude/settings.json` with **only** permissions and `skipDangerousModePermissionPrompt`. Do not add hooks to the global settings — `agent-session.sh` injects per-worktree hooks automatically. + +Match the configuration from harb-staging exactly. The file should contain only permission grants and the dangerous-mode flag: + +```json +{ + "permissions": { + "allow": [ + "..." + ] + }, + "skipDangerousModePermissionPrompt": true +} +``` + +### Seed `~/.claude.json` + +Run `claude --dangerously-skip-permissions` once interactively to create `~/.claude.json`. This file must exist before cron-driven agents can run. + +```bash +claude --dangerously-skip-permissions +# Exit after it initializes successfully +``` + +## 5. File Ownership + +Everything under `/home/debian` must be owned by `debian:debian`. Root-owned files cause permission errors when agents run as the `debian` user. + +```bash +chown -R debian:debian /home/debian/harb /home/debian/dark-factory +``` + +Verify no root-owned files exist in agent temp directories: + +```bash +# These should return nothing +find /tmp/dev-* /tmp/harb-* /tmp/review-* -not -user debian 2>/dev/null +``` + +## 5b. Woodpecker CI + Forgejo Integration + +`disinto init` automatically configures Woodpecker to use the local Forgejo instance as its forge backend if `WOODPECKER_SERVER` is set in `.env`. This includes: + +1. Creating an OAuth2 application on Forgejo for Woodpecker +2. Writing `WOODPECKER_FORGEJO_*` env vars to `.env` +3. Activating the repo in Woodpecker + +### Manual setup (if Woodpecker runs outside of `disinto init`) + +If you manage Woodpecker separately, configure these env vars in its server config: + +```bash +WOODPECKER_FORGEJO=true +WOODPECKER_FORGEJO_URL=http://localhost:3000 +WOODPECKER_FORGEJO_CLIENT= +WOODPECKER_FORGEJO_SECRET= +``` + +To create the OAuth2 app on Forgejo: + +```bash +# Create OAuth2 application (redirect URI = Woodpecker authorize endpoint) +curl -X POST \ + -H "Authorization: token ${FORGE_TOKEN}" \ + -H "Content-Type: application/json" \ + "http://localhost:3000/api/v1/user/applications/oauth2" \ + -d '{"name":"woodpecker-ci","redirect_uris":["http://localhost:8000/authorize"],"confidential_client":true}' +``` + +The response contains `client_id` and `client_secret` for `WOODPECKER_FORGEJO_CLIENT` / `WOODPECKER_FORGEJO_SECRET`. + +To activate the repo in Woodpecker: + +```bash +woodpecker-cli repo add / +# Or via API: +curl -X POST \ + -H "Authorization: Bearer ${WOODPECKER_TOKEN}" \ + "http://localhost:8000/api/repos" \ + -d '{"forge_remote_id":"/"}' +``` + +Woodpecker will now trigger pipelines on pushes to Forgejo and push commit status back. Disinto queries Woodpecker directly for CI status (with a forge API fallback), so pipeline results are visible even if Woodpecker's status push to Forgejo is delayed. + +## 6. Prepare the Target Repo + +### Required: CI pipeline + +The repo needs at least one Woodpecker pipeline. Disinto monitors CI status to decide when a PR is ready for review and when it can merge. + +### Required: `CLAUDE.md` + +Create a `CLAUDE.md` in the repo root. This is the context document that dev-agent and review-agent read before working. It should cover: + +- **What the project is** (one paragraph) +- **Tech stack** (languages, frameworks, DB) +- **How to build/run/test** (`npm install`, `npm test`, etc.) +- **Coding conventions** (import style, naming, linting rules) +- **Project structure** (key directories and what lives where) + +The dev-agent reads this file via `claude -p` before implementing any issue. The better this file, the better the output. + +### Required: Issue labels + +`disinto init` creates these automatically. If setting up manually, create these labels on the forge repo: + +| Label | Purpose | +|-------|---------| +| `backlog` | Issues ready to be picked up by dev-agent | +| `in-progress` | Managed by dev-agent (auto-applied, auto-removed) | + +Optional but recommended: + +| Label | Purpose | +|-------|---------| +| `tech-debt` | Gardener can promote these to `backlog` | +| `blocked` | Dev-agent marks issues with unmet dependencies | +| `formula` | **Not yet functional.** Formula dispatch lives on the unmerged `feat/formula` branch. Dev-agent will skip any issue with this label until that branch is merged. Template files exist in `formulas/` for future use. | + +### Required: Branch protection + +On Forgejo, set up branch protection for your primary branch: + +- **Require pull request reviews**: enabled +- **Required approvals**: 1 (from the review bot account) +- **Restrict push**: only allow merges via PR + +This ensures dev-agent can't merge its own PRs — it must wait for review-agent (running as the bot account) to approve. + +> **Common pitfall:** Approvals alone are not enough. You must also: +> 1. Add `review-bot` as a **write** collaborator on the repo (Settings → Collaborators) +> 2. Set both `approvals_whitelist_username` **and** `merge_whitelist_usernames` to include `review-bot` in the branch protection rule +> +> Without write access, the bot's approval is counted but the merge API returns HTTP 405. + +### Required: Seed the `AGENTS.md` tree + +The planner maintains an `AGENTS.md` tree — architecture docs with +per-file `` watermarks. You must seed this before +the first planner run, otherwise the planner sees no watermarks and treats the +entire repo as "new", generating a noisy first-run diff. + +1. **Create `AGENTS.md` in the repo root** with a one-page overview of the + project: what it is, tech stack, directory layout, key conventions. Link + to sub-directory AGENTS.md files. + +2. **Create sub-directory `AGENTS.md` files** for each major directory + (e.g. `frontend/AGENTS.md`, `backend/AGENTS.md`). Keep each under ~200 + lines — architecture and conventions, not implementation details. + +3. **Set the watermark** on line 1 of every AGENTS.md file to the current HEAD: + ```bash + SHA=$(git rev-parse --short HEAD) + for f in $(find . -name "AGENTS.md" -not -path "./.git/*"); do + sed -i "1s/^/\n/" "$f" + done + ``` + +4. **Symlink `CLAUDE.md`** so Claude Code picks up the same file: + ```bash + ln -sf AGENTS.md CLAUDE.md + ``` + +5. Commit and push. The planner will now see 0 changes on its first run and + only update files when real commits land. + +See `formulas/run-planner.toml` (agents-update step) for the full AGENTS.md conventions. + +## 7. Write Good Issues + +Dev-agent works best with issues that have: + +- **Clear title** describing the change (e.g., "Add email validation to customer form") +- **Acceptance criteria** — what "done" looks like +- **Dependencies** — reference blocking issues with `#NNN` in the body or a `## Dependencies` section: + ``` + ## Dependencies + - #4 + - #7 + ``` + +Dev-agent checks that all referenced issues are closed (= merged) before starting work. If any are open, the issue is skipped and checked again next cycle. + +## 8. Install Cron + +```bash +crontab -e +``` + +### Single project + +Add (adjust paths): + +```cron +FACTORY_ROOT=/home/you/disinto + +# Supervisor — health checks, auto-healing (every 10 min) +0,10,20,30,40,50 * * * * $FACTORY_ROOT/supervisor/supervisor-poll.sh + +# Review agent — find unreviewed PRs (every 10 min, offset +3) +3,13,23,33,43,53 * * * * $FACTORY_ROOT/review/review-poll.sh $FACTORY_ROOT/projects/myproject.toml + +# Dev agent — find ready issues, implement (every 10 min, offset +6) +6,16,26,36,46,56 * * * * $FACTORY_ROOT/dev/dev-poll.sh $FACTORY_ROOT/projects/myproject.toml + +# Gardener — backlog grooming (daily) +15 8 * * * $FACTORY_ROOT/gardener/gardener-poll.sh + +# Planner — AGENTS.md maintenance + gap analysis (weekly) +0 9 * * 1 $FACTORY_ROOT/planner/planner-poll.sh +``` + +`review-poll.sh`, `dev-poll.sh`, and `gardener-poll.sh` all take a project TOML file as their first argument. + +### Multiple projects + +Stagger each project's polls so they don't overlap. With the example below, cross-project gaps are 2 minutes: + +```cron +FACTORY_ROOT=/home/you/disinto + +# Supervisor (shared) +0,10,20,30,40,50 * * * * $FACTORY_ROOT/supervisor/supervisor-poll.sh + +# Project A — review +3, dev +6 +3,13,23,33,43,53 * * * * $FACTORY_ROOT/review/review-poll.sh $FACTORY_ROOT/projects/project-a.toml +6,16,26,36,46,56 * * * * $FACTORY_ROOT/dev/dev-poll.sh $FACTORY_ROOT/projects/project-a.toml + +# Project B — review +8, dev +1 (2-min gap from project A) +8,18,28,38,48,58 * * * * $FACTORY_ROOT/review/review-poll.sh $FACTORY_ROOT/projects/project-b.toml +1,11,21,31,41,51 * * * * $FACTORY_ROOT/dev/dev-poll.sh $FACTORY_ROOT/projects/project-b.toml + +# Gardener — per-project backlog grooming (daily) +15 8 * * * $FACTORY_ROOT/gardener/gardener-poll.sh $FACTORY_ROOT/projects/project-a.toml +45 8 * * * $FACTORY_ROOT/gardener/gardener-poll.sh $FACTORY_ROOT/projects/project-b.toml + +# Planner — AGENTS.md maintenance + gap analysis (weekly) +0 9 * * 1 $FACTORY_ROOT/planner/planner-poll.sh +``` + +The staggered offsets prevent agents from competing for resources. Each project gets its own lock file (`/tmp/dev-agent-{name}.lock`) derived from the `name` field in its TOML, so concurrent runs across projects are safe. + +## 9. Verify + +```bash +# Should complete with "all clear" (no problems to fix) +bash supervisor/supervisor-poll.sh + +# Should list backlog issues (or "no backlog issues") +bash dev/dev-poll.sh + +# Should find no unreviewed PRs (or review one if exists) +bash review/review-poll.sh +``` + +Check logs after a few cycles: + +```bash +tail -30 supervisor/supervisor.log +tail -30 dev/dev-agent.log +tail -30 review/review.log +``` + +## Lifecycle + +Once running, the system operates autonomously: + +``` +You write issues (with backlog label) + → dev-poll finds ready issues + → dev-agent implements in a worktree, opens PR + → CI runs (Woodpecker) + → review-agent reviews, approves or requests changes + → dev-agent addresses feedback (if any) + → merge, close issue, clean up + +Meanwhile: + supervisor-poll monitors health, kills stale processes, manages resources + gardener grooms backlog: closes duplicates, promotes tech-debt, escalates ambiguity + planner rebuilds AGENTS.md from git history, gap-analyses against VISION.md +``` + +## Troubleshooting + +| Symptom | Check | +|---------|-------| +| Dev-agent not picking up issues | `cat /tmp/dev-agent.lock` — is another instance running? Issues labeled `backlog`? Dependencies met? | +| PR not getting reviewed | `tail review/review.log` — CI must pass first. Review bot token valid? | +| CI stuck | `bash lib/ci-debug.sh` — check Woodpecker. Rate-limited? (exit 128 = wait 15 min) | +| Claude not found | `which claude` — must be in PATH. Check `lib/env.sh` adds `~/.local/bin`. | +| Merge fails | Branch protection misconfigured? Review bot needs write access to the repo. | +| Memory issues | Supervisor auto-heals at <500 MB free. Check `supervisor/supervisor.log` for P0 alerts. | +| Works on one box but not another | Diff configs first (`~/.claude/settings.json`, `.env`, crontab, branch protection). Write code never — config mismatches are the #1 cause of cross-box failures. | + +### Multi-project common blockers + +| Symptom | Cause | Fix | +|---------|-------|-----| +| Dev-agent for project B never starts | Shared lock file path | Each TOML `name` field must be unique — lock is `/tmp/dev-agent-{name}.lock` | +| Review-poll skips all PRs | CI gate with no CI configured | Set `woodpecker_repo_id = 0` in the TOML `[ci]` section to bypass the CI check | +| Approved PRs never merge (HTTP 405) | `review-bot` not in merge/approvals whitelist | Add as write collaborator; set both `approvals_whitelist_username` and `merge_whitelist_usernames` in branch protection | +| Dev-agent churns through issues without waiting for open PRs to land | No single-threaded enforcement | `WAITING_PRS` check in dev-poll holds new work — verify TOML `name` is consistent across invocations | +| Label ping-pong (issue reopened then immediately re-closed) | `already_done` handler doesn't close issue | Review dev-agent log; `already_done` status should auto-close the issue | + +## Security: Docker Socket Sharing in CI + +The `woodpecker-agent` service mounts `/var/run/docker.sock` to execute `type: docker` CI pipelines. This grants root-equivalent access to the Docker host — any CI pipeline step can run privileged containers, mount arbitrary host paths, or access other containers' data. + +**Mitigations:** + +- **Run disinto in an LXD/VM container, not on bare metal.** When the Docker daemon runs inside an LXD container, LXD's user namespace mapping and resource limits contain the blast radius. A compromised CI step cannot reach the real host. +- **`WOODPECKER_MAX_WORKFLOWS: 1`** limits concurrent CI resource usage, preventing a runaway pipeline from exhausting host resources. +- **`WOODPECKER_AGENT_SECRET`** authenticates the agent↔server gRPC connection. `disinto init` auto-generates this secret and stores it in `.env` (or `.env.enc` when SOPS is available). +- Consider setting `WOODPECKER_BACKEND_DOCKER_VOLUMES` on the agent to restrict which host volumes CI pipelines can mount. + +**Threat model:** PRs are created by the dev-agent (Claude) and auto-reviewed by the review-bot. A crafted backlog issue could theoretically produce a PR whose CI step exploits the Docker socket. The LXD containment boundary is the primary defense — treat the LXD container as the trust boundary, not the Docker daemon inside it. + +## Action Runner — disinto (harb-staging) + +Added 2026-03-19. Polls disinto repo for `action`-labeled issues. + +``` +*/5 * * * * cd /home/debian/dark-factory && bash action/action-poll.sh projects/disinto.toml >> /tmp/action-disinto-cron.log 2>&1 +``` + +Runs locally on harb-staging — same box where Caddy/site live. For formulas that need local resources (publish-site, etc). + +### Fix applied: action-agent.sh needs +x +The script wasn't executable after git clone. Run: +```bash +chmod +x action/action-agent.sh action/action-poll.sh +``` diff --git a/CLAUDE.md b/CLAUDE.md deleted file mode 100644 index 9671180..0000000 --- a/CLAUDE.md +++ /dev/null @@ -1,6 +0,0 @@ -# CLAUDE.md - -This repo is **disinto** — an autonomous code factory. - -Read `AGENTS.md` for architecture, coding conventions, and per-file documentation. -For setup and operations, load the `disinto-factory` skill (`disinto-factory/SKILL.md`). diff --git a/dev/phase-handler.sh b/dev/phase-handler.sh index 8f3b3b4..ab099d6 100644 --- a/dev/phase-handler.sh +++ b/dev/phase-handler.sh @@ -34,17 +34,6 @@ source "$(dirname "${BASH_SOURCE[0]}")/../lib/ci-helpers.sh" # shellcheck source=../lib/mirrors.sh source "$(dirname "${BASH_SOURCE[0]}")/../lib/mirrors.sh" -# --- Default callback stubs (agents can override after sourcing) --- -# cleanup_worktree and cleanup_labels are called during phase transitions. -# Provide no-op defaults so phase-handler.sh is self-contained; sourcing -# agents override these with real implementations. -if ! declare -f cleanup_worktree >/dev/null 2>&1; then - cleanup_worktree() { :; } -fi -if ! declare -f cleanup_labels >/dev/null 2>&1; then - cleanup_labels() { :; } -fi - # --- Default globals (agents can override after sourcing) --- : "${CI_POLL_TIMEOUT:=1800}" : "${REVIEW_POLL_TIMEOUT:=10800}" diff --git a/disinto-factory/SKILL.md b/disinto-factory/SKILL.md deleted file mode 100644 index 8e17508..0000000 --- a/disinto-factory/SKILL.md +++ /dev/null @@ -1,209 +0,0 @@ ---- -name: disinto-factory -description: Set up and operate a disinto autonomous code factory. Use when bootstrapping a new factory instance, checking on agents and CI, managing the backlog, or troubleshooting the stack. ---- - -# Disinto Factory - -You are helping the user set up and operate a **disinto autonomous code factory** — a system -of bash scripts and Claude CLI that automates the full development lifecycle: picking up -issues, implementing via Claude, creating PRs, running CI, reviewing, merging, and mirroring. - -## First-time setup - -Walk the user through these steps interactively. Ask questions where marked with [ASK]. - -### 1. Environment - -[ASK] Where will the factory run? Options: -- **LXD container** (recommended for isolation) — need Debian 12, Docker, nesting enabled -- **Bare VM or server** — need Debian/Ubuntu with Docker -- **Existing container** — check prerequisites - -Verify prerequisites: -```bash -docker --version && git --version && jq --version && curl --version && tmux -V && python3 --version && claude --version -``` - -Any missing tool — help the user install it before continuing. - -### 2. Clone and init - -```bash -git clone https://codeberg.org/johba/disinto.git && cd disinto -``` - -[ASK] What repo should the factory develop? Options: -- **Itself** (self-development): `bin/disinto init https://codeberg.org/johba/disinto --yes --repo-root $(pwd)` -- **Another project**: `bin/disinto init --yes` - -Run the init and watch for: -- All bot users created (dev-bot, review-bot, etc.) -- `WOODPECKER_TOKEN` generated and saved -- Stack containers all started - -### 3. Post-init verification - -Run this checklist — fix any failures before proceeding: - -```bash -# Stack healthy? -docker ps --format "table {{.Names}}\t{{.Status}}" -# Expected: forgejo, woodpecker (healthy), woodpecker-agent (healthy), agents, edge, staging - -# Token generated? -grep WOODPECKER_TOKEN .env | grep -v "^$" && echo "OK" || echo "MISSING — see references/troubleshooting.md" - -# Agent cron active? -docker exec -u agent disinto-agents-1 crontab -l -u agent - -# Agent can reach Forgejo? -docker exec disinto-agents-1 bash -c "source /home/agent/disinto/.env && curl -sf http://forgejo:3000/api/v1/version | jq .version" - -# Agent repo cloned? -docker exec -u agent disinto-agents-1 ls /home/agent/repos/ -``` - -If the agent repo is missing, clone it: -```bash -docker exec disinto-agents-1 chown -R agent:agent /home/agent/repos -docker exec -u agent disinto-agents-1 bash -c "source /home/agent/disinto/.env && git clone http://dev-bot:\${FORGE_TOKEN}@forgejo:3000//.git /home/agent/repos/" -``` - -### 4. Mirrors (optional) - -[ASK] Should the factory mirror to external forges? If yes, which? -- GitHub: need repo URL and SSH key added to GitHub account -- Codeberg: need repo URL and SSH key added to Codeberg account - -Show the user their public key: -```bash -cat ~/.ssh/id_ed25519.pub -``` - -Test SSH access: -```bash -ssh -T git@github.com 2>&1; ssh -T git@codeberg.org 2>&1 -``` - -If SSH host keys are missing: `ssh-keyscan github.com codeberg.org >> ~/.ssh/known_hosts 2>/dev/null` - -Edit `projects/.toml` to add mirrors: -```toml -[mirrors] -github = "git@github.com:Org/repo.git" -codeberg = "git@codeberg.org:user/repo.git" -``` - -Test with a manual push: -```bash -source .env && source lib/env.sh && export PROJECT_TOML=projects/.toml && source lib/load-project.sh && source lib/mirrors.sh && mirror_push -``` - -### 5. Seed the backlog - -[ASK] What should the factory work on first? Brainstorm with the user. - -Help them create issues on the local Forgejo. Each issue needs: -- A clear title prefixed with `fix:`, `feat:`, or `chore:` -- A body describing what to change, which files, and any constraints -- The `backlog` label (so the dev-agent picks it up) - -```bash -source .env -BACKLOG_ID=$(curl -sf "http://localhost:3000/api/v1/repos///labels" \ - -H "Authorization: token $FORGE_TOKEN" | jq -r '.[] | select(.name=="backlog") | .id') - -curl -sf -X POST "http://localhost:3000/api/v1/repos///issues" \ - -H "Authorization: token $FORGE_TOKEN" \ - -H "Content-Type: application/json" \ - -d "{\"title\": \"\", \"body\": \"<body>\", \"labels\": [$BACKLOG_ID]}" -``` - -For issues with dependencies, add `Depends-on: #N` in the body — the dev-agent checks -these before starting. - -Use labels: -- `backlog` — ready for the dev-agent -- `blocked` — parked, not for the factory -- No label — tracked but not for autonomous work - -### 6. Watch it work - -The dev-agent polls every 5 minutes. Trigger manually to see it immediately: -```bash -docker exec -u agent disinto-agents-1 bash -c "cd /home/agent/disinto && bash dev/dev-poll.sh projects/<name>.toml" -``` - -Then monitor: -```bash -# Watch the agent work -docker exec disinto-agents-1 tail -f /home/agent/data/logs/dev/dev-agent.log - -# Check for Claude running -docker exec disinto-agents-1 bash -c "for f in /proc/[0-9]*/cmdline; do cmd=\$(tr '\0' ' ' < \$f 2>/dev/null); echo \$cmd | grep -q 'claude.*-p' && echo 'Claude is running'; done" -``` - -## Ongoing operations - -### Check factory status - -```bash -source .env - -# Issues -curl -sf "http://localhost:3000/api/v1/repos/<org>/<repo>/issues?state=open" \ - -H "Authorization: token $FORGE_TOKEN" \ - | jq -r '.[] | "#\(.number) [\(.labels | map(.name) | join(","))] \(.title)"' - -# PRs -curl -sf "http://localhost:3000/api/v1/repos/<org>/<repo>/pulls?state=open" \ - -H "Authorization: token $FORGE_TOKEN" \ - | jq -r '.[] | "PR #\(.number) [\(.head.ref)] \(.title)"' - -# Agent logs -docker exec disinto-agents-1 tail -20 /home/agent/data/logs/dev/dev-agent.log -``` - -### Check CI - -```bash -source .env -WP_CSRF=$(curl -sf -b "user_sess=$WOODPECKER_TOKEN" http://localhost:8000/web-config.js \ - | sed -n 's/.*WOODPECKER_CSRF = "\([^"]*\)".*/\1/p') -curl -sf -b "user_sess=$WOODPECKER_TOKEN" -H "X-CSRF-Token: $WP_CSRF" \ - "http://localhost:8000/api/repos/1/pipelines?page=1&per_page=5" \ - | jq '.[] | {number, status, event}' -``` - -### Unstick a blocked issue - -When a dev-agent run fails (CI timeout, implementation error), the issue gets labeled `blocked`: - -1. Close stale PR and delete the branch -2. `docker exec disinto-agents-1 rm -f /tmp/dev-agent-*.json /tmp/dev-agent-*.lock` -3. Relabel the issue to `backlog` -4. Update agent repo: `docker exec -u agent disinto-agents-1 bash -c "cd /home/agent/repos/<name> && git fetch origin && git reset --hard origin/main"` - -### Access Forgejo UI - -If running in an LXD container with reverse tunnel: -```bash -# From your machine: -ssh -L 3000:localhost:13000 user@jump-host -# Open http://localhost:3000 -``` - -Reset admin password if needed: -```bash -docker exec disinto-forgejo-1 su -c "forgejo admin user change-password --username disinto-admin --password <new-pw> --must-change-password=false" git -``` - -## Important context - -- Read `AGENTS.md` for per-agent architecture and file-level docs -- Read `VISION.md` for project philosophy -- The factory uses a single internal Forgejo as its forge, regardless of where mirrors go -- Dev-agent uses `claude -p --resume` for session continuity across CI/review cycles -- Mirror pushes happen automatically after every merge (fire-and-forget) -- Cron schedule: dev-poll every 5min, review-poll every 5min, gardener 4x/day diff --git a/disinto-factory/references/troubleshooting.md b/disinto-factory/references/troubleshooting.md deleted file mode 100644 index 0d1b282..0000000 --- a/disinto-factory/references/troubleshooting.md +++ /dev/null @@ -1,53 +0,0 @@ -# Troubleshooting - -## WOODPECKER_TOKEN empty after init - -The OAuth2 flow failed. Common causes: - -1. **URL-encoded redirect_uri mismatch**: Forgejo logs show "Unregistered Redirect URI". - The init script must rewrite both plain and URL-encoded Docker hostnames. - -2. **Forgejo must_change_password**: Admin user was created with forced password change. - The init script calls `--must-change-password=false` but Forgejo 11.x sometimes ignores it. - -3. **WOODPECKER_OPEN not set**: WP refuses first-user OAuth registration without it. - -Manual fix: reset admin password and re-run the token generation manually, or -use the Woodpecker UI to create a token. - -## WP CI agent won't connect (DeadlineExceeded) - -gRPC over Docker bridge fails in LXD (and possibly other nested container environments). -The compose template uses `network_mode: host` + `privileged: true` for the agent. -If you see this error, check: -- Server exposes port 9000: `grep "9000:9000" docker-compose.yml` -- Agent uses `localhost:9000`: `grep "WOODPECKER_SERVER" docker-compose.yml` -- Agent has `network_mode: host` - -## CI clone fails (could not resolve host) - -CI containers need to resolve Docker service names (e.g., `forgejo`). -Check `WOODPECKER_BACKEND_DOCKER_NETWORK` is set on the agent. - -## Webhooks not delivered - -Forgejo blocks outgoing webhooks by default. Check: -```bash -docker logs disinto-forgejo-1 2>&1 | grep "webhook.*ALLOWED_HOST_LIST" -``` -Fix: add `FORGEJO__webhook__ALLOWED_HOST_LIST: "private"` to Forgejo environment. - -Also verify the webhook exists: -```bash -curl -sf -u "disinto-admin:<password>" "http://localhost:3000/api/v1/repos/<org>/<repo>/hooks" | jq '.[].config.url' -``` -If missing, deactivate and reactivate the repo in Woodpecker to auto-create it. - -## Dev-agent fails with "cd: no such file or directory" - -`PROJECT_REPO_ROOT` inside the agents container points to a host path that doesn't -exist in the container. Check the compose env: -```bash -docker inspect disinto-agents-1 --format '{{range .Config.Env}}{{println .}}{{end}}' | grep PROJECT_REPO_ROOT -``` -Should be `/home/agent/repos/<name>`, not `/home/<user>/<name>`. diff --git a/disinto-factory/scripts/factory-status.sh b/disinto-factory/scripts/factory-status.sh deleted file mode 100755 index 457ac9a..0000000 --- a/disinto-factory/scripts/factory-status.sh +++ /dev/null @@ -1,44 +0,0 @@ -#!/usr/bin/env bash -# factory-status.sh — Quick status check for a running disinto factory -set -euo pipefail - -FACTORY_ROOT="${1:-$(cd "$(dirname "$0")/../.." && pwd)}" -source "${FACTORY_ROOT}/.env" 2>/dev/null || { echo "No .env found at ${FACTORY_ROOT}"; exit 1; } - -FORGE_URL="${FORGE_URL:-http://localhost:3000}" -REPO=$(grep '^repo ' "${FACTORY_ROOT}/projects/"*.toml 2>/dev/null | head -1 | sed 's/.*= *"//;s/"//') -[ -z "$REPO" ] && { echo "No project TOML found"; exit 1; } - -echo "=== Stack ===" -docker ps --format "table {{.Names}}\t{{.Status}}" 2>/dev/null | grep disinto - -echo "" -echo "=== Open Issues ===" -curl -sf "${FORGE_URL}/api/v1/repos/${REPO}/issues?state=open&limit=20" \ - -H "Authorization: token ${FORGE_TOKEN}" \ - | jq -r '.[] | "#\(.number) [\(.labels | map(.name) | join(","))] \(.title)"' 2>/dev/null || echo "(API error)" - -echo "" -echo "=== Open PRs ===" -curl -sf "${FORGE_URL}/api/v1/repos/${REPO}/pulls?state=open&limit=10" \ - -H "Authorization: token ${FORGE_TOKEN}" \ - | jq -r '.[] | "PR #\(.number) [\(.head.ref)] \(.title)"' 2>/dev/null || echo "none" - -echo "" -echo "=== Agent Activity ===" -docker exec disinto-agents-1 bash -c "tail -5 /home/agent/data/logs/dev/dev-agent.log 2>/dev/null" || echo "(no logs)" - -echo "" -echo "=== Claude Running? ===" -docker exec disinto-agents-1 bash -c " - found=false - for f in /proc/[0-9]*/cmdline; do - cmd=\$(tr '\0' ' ' < \"\$f\" 2>/dev/null) - if echo \"\$cmd\" | grep -q 'claude.*-p'; then found=true; echo 'Yes — Claude is actively working'; break; fi - done - \$found || echo 'No — idle' -" 2>/dev/null - -echo "" -echo "=== Mirrors ===" -cd "${FACTORY_ROOT}" 2>/dev/null && git remote -v | grep -E 'github|codeberg' | grep push || echo "none configured"