fix: Migrate action-agent.sh to SDK + shared libraries (#5)

Rewrite action-agent from tmux session + phase-handler pattern to synchronous SDK pattern (agent_run via claude -p).

Uses shared libraries:
- agent-sdk.sh for one-shot Claude invocation
- issue-lifecycle.sh for issue_check_deps/issue_close/issue_block
- pr-lifecycle.sh for pr_create/pr_walk_to_merge
- worktree.sh for worktree_create/worktree_cleanup

Add default callback stubs to phase-handler.sh (cleanup_worktree, cleanup_labels) so it is self-contained now that action-agent.sh no longer sources it. Update agent-smoke.sh accordingly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
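The callback-stub guard mentioned above can be illustrated with a small standalone sketch. It assumes bash (for `declare -f`) and reuses the two callback names from the commit message; the agent-side override is hypothetical and only there to show that an existing definition survives the guard.

```bash
#!/usr/bin/env bash
# Sketch only: mimics the "define a no-op unless already defined" guard.

# Hypothetical agent-side override, defined before the library code runs.
cleanup_worktree() { echo "agent cleanup_worktree"; }

# Library side: install a no-op default only when the callback is undefined.
if ! declare -f cleanup_worktree >/dev/null 2>&1; then
  cleanup_worktree() { :; }
fi
if ! declare -f cleanup_labels >/dev/null 2>&1; then
  cleanup_labels() { :; }
fi

worktree_out=$(cleanup_worktree)   # the agent's override is kept
labels_out=$(cleanup_labels)       # falls back to the silent no-op
echo "worktree: '${worktree_out}' labels: '${labels_out}'"
```

An agent that defines its callback after sourcing would simply replace the stub, since a later function definition in bash overrides an earlier one.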
2026-03-28 11:15:10 +00:00
8 changed files with 327 additions and 464 deletions
--- a/.woodpecker/agent-smoke.sh
+++ b/.woodpecker/agent-smoke.sh
@@ -199,9 +199,9 @@ check_script lib/ci-debug.sh
 check_script lib/parse-deps.sh
 # Agent scripts — list cross-sourced files where function scope flows across files.
-# phase-handler.sh calls helpers defined by its sourcing agent (action-agent.sh).
+# phase-handler.sh defines default callback stubs; sourcing agents may override.
 check_script dev/dev-agent.sh
-check_script dev/phase-handler.sh      action/action-agent.sh lib/secret-scan.sh
+check_script dev/phase-handler.sh      lib/secret-scan.sh
 check_script dev/dev-poll.sh
 check_script dev/phase-test.sh
 check_script gardener/gardener-run.sh
@@ -215,7 +215,7 @@ check_script vault/vault-fire.sh
 check_script vault/vault-poll.sh
 check_script vault/vault-reject.sh
 check_script action/action-poll.sh
-check_script action/action-agent.sh    dev/phase-handler.sh
+check_script action/action-agent.sh
 check_script supervisor/supervisor-run.sh
 check_script supervisor/preflight.sh
 check_script predictor/predictor-run.sh
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -8,7 +8,7 @@ gardener, supervisor, planner, predictor, action, vault) that pick up issues fro
 implement them, review PRs, plan from the vision, gate dangerous actions, and
 keep the system healthy — all via cron and `claude -p`.
-See `README.md` for the full architecture and `BOOTSTRAP.md` for setup.
+See `README.md` for the full architecture and `disinto-factory/SKILL.md` for setup.
 ## Directory layout
--- a/BOOTSTRAP.md
+++ b/BOOTSTRAP.md
@@ -1,460 +0,0 @@
 # Bootstrapping a New Project
 How to point disinto at a new target project and get all agents running.
 ## Prerequisites
 Before starting, ensure you have:
 - [ ] A **git repo** (GitHub, Codeberg, or any URL) with at least one issue labeled `backlog`
 - [ ] A **Woodpecker CI** pipeline (`.woodpecker/` dir with at least one `.yml`)
 - [ ] **Docker** installed (for local Forgejo provisioning) — or a running Forgejo instance
 - [ ] A **local clone** of the target repo on the same machine as disinto
 - [ ] `claude` CLI installed and authenticated (`claude --version`)
 - [ ] `tmux` installed (`tmux -V`) — required for persistent dev sessions (issue #80+)
 ## Quick Start
 The fastest path is `disinto init`, which provisions a local Forgejo instance, creates bot users and tokens, clones the repo, and sets up cron — all in one command:
 ```bash
 disinto init https://github.com/org/repo
 ```
 This will:
 1. Start a local Forgejo instance via Docker (at `http://localhost:3000`)
 2. Create admin + bot users (dev-bot, review-bot) with API tokens
 3. Create the repo on Forgejo and push your code
 4. Generate a `projects/<name>.toml` config
 5. Create standard labels (backlog, in-progress, blocked, etc.)
 6. Install cron entries for the agents
 No external accounts or tokens needed.
 ## 1. Secret Management (SOPS + age)
 Disinto encrypts secrets at rest using [SOPS](https://github.com/getsops/sops) with [age](https://age-encryption.org/) encryption. When `sops` and `age` are installed, `disinto init` automatically:
 1. Generates an age key at `~/.config/sops/age/keys.txt` (if none exists)
 2. Creates `.sops.yaml` pinning the age public key
 3. Encrypts all secrets into `.env.enc` (safe to commit)
 4. Removes the plaintext `.env`
 **Install the tools:**
 ```bash
 # age (key generation)
 apt install age          # Debian/Ubuntu
 brew install age         # macOS
 # sops (encryption/decryption)
 # Download from https://github.com/getsops/sops/releases
 ```
 **The age private key** at `~/.config/sops/age/keys.txt` is the single file that must be protected. Back it up securely — without it, `.env.enc` cannot be decrypted. LUKS disk encryption on the VPS protects this key at rest.
 **Managing secrets after setup:**
 ```bash
 disinto secrets edit     # Opens .env.enc in $EDITOR, re-encrypts on save
 disinto secrets show     # Prints decrypted secrets (for debugging)
 disinto secrets migrate  # Converts existing plaintext .env -> .env.enc
 ```
 **Fallback:** If `sops`/`age` are not installed, `disinto init` writes secrets to a plaintext `.env` file with a warning. All agents load secrets transparently — `lib/env.sh` checks for `.env.enc` first, then falls back to `.env`.
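The lookup order described above can be sketched as a small shell helper. This is an illustration under assumptions, not the actual `lib/env.sh` code: `pick_env_file` is a hypothetical name, and the sketch only selects the file, leaving decryption out.

```bash
# Hypothetical helper: prefer the encrypted file, fall back to plaintext.
pick_env_file() {
  local root="$1"
  if [ -f "${root}/.env.enc" ]; then
    printf '%s\n' "${root}/.env.enc"
  elif [ -f "${root}/.env" ]; then
    printf '%s\n' "${root}/.env"
  else
    return 1   # neither file exists
  fi
}

# Demo in a throwaway directory so no real secrets are touched.
demo_root=$(mktemp -d)
touch "${demo_root}/.env"
plain_choice=$(pick_env_file "$demo_root")   # only .env exists
touch "${demo_root}/.env.enc"
enc_choice=$(pick_env_file "$demo_root")     # .env.enc now wins
rm -rf "$demo_root"
echo "${plain_choice##*/} then ${enc_choice##*/}"
```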
 ## 2. Configure `.env`
 ```bash
 cp .env.example .env
 ```
 Fill in:
 ```bash
 # ── Forge (auto-populated by disinto init) ─────────────────
 FORGE_URL=http://localhost:3000        # local Forgejo instance
 FORGE_TOKEN=                           # dev-bot token (auto-generated)
 FORGE_REVIEW_TOKEN=                    # review-bot token (auto-generated)
 # ── Woodpecker CI ───────────────────────────────────────────
 WOODPECKER_TOKEN=tok_xxxxxxxx
 WOODPECKER_SERVER=http://localhost:8000
 # WOODPECKER_REPO_ID — now per-project, set in projects/*.toml [ci] section
 # Woodpecker Postgres (for direct pipeline queries)
 WOODPECKER_DB_PASSWORD=secret
 WOODPECKER_DB_USER=woodpecker
 WOODPECKER_DB_HOST=127.0.0.1
 WOODPECKER_DB_NAME=woodpecker
 # ── Tuning ──────────────────────────────────────────────────
 CLAUDE_TIMEOUT=7200                   # seconds per Claude invocation
 ```
 ### Backwards compatibility
 If you have an existing deployment using `CODEBERG_TOKEN` / `REVIEW_BOT_TOKEN` in `.env`, those still work — `env.sh` falls back to the old names automatically. No migration needed.
 ## 3. Configure Project TOML
 Each project needs a `projects/<name>.toml` file with box-specific settings
 (absolute paths, Woodpecker CI IDs, forge URL). These files are
 **gitignored** — they are local installation config, not shared code.
 To create one:
 ```bash
 # Automatic — generates TOML, clones repo, sets up cron:
 disinto init https://github.com/org/repo
 # Manual — copy a template and fill in your values:
 cp projects/myproject.toml.example projects/myproject.toml
 vim projects/myproject.toml
 ```
 The `forge_url` field in the TOML tells all agents where to find the forge API:
 ```toml
 name            = "myproject"
 repo            = "org/myproject"
 forge_url       = "http://localhost:3000"
 ```
 The repo ships `projects/*.toml.example` templates showing the expected
 structure. See any `.toml.example` file for the full field reference.
 ## 4. Claude Code Global Settings
 Configure `~/.claude/settings.json` with **only** permissions and `skipDangerousModePermissionPrompt`. Do not add hooks to the global settings — `agent-session.sh` injects per-worktree hooks automatically.
 Match the configuration from harb-staging exactly. The file should contain only permission grants and the dangerous-mode flag:
 ```json
 {
  "permissions": {
    "allow": [
      "..."
    ]
  },
  "skipDangerousModePermissionPrompt": true
 }
 ```
 ### Seed `~/.claude.json`
 Run `claude --dangerously-skip-permissions` once interactively to create `~/.claude.json`. This file must exist before cron-driven agents can run.
 ```bash
 claude --dangerously-skip-permissions
 # Exit after it initializes successfully
 ```
 ## 5. File Ownership
 Everything under `/home/debian` must be owned by `debian:debian`. Root-owned files cause permission errors when agents run as the `debian` user.
 ```bash
 chown -R debian:debian /home/debian/harb /home/debian/dark-factory
 ```
 Verify no root-owned files exist in agent temp directories:
 ```bash
 # These should return nothing
 find /tmp/dev-* /tmp/harb-* /tmp/review-* -not -user debian 2>/dev/null
 ```
 ## 5b. Woodpecker CI + Forgejo Integration
 `disinto init` automatically configures Woodpecker to use the local Forgejo instance as its forge backend if `WOODPECKER_SERVER` is set in `.env`. This includes:
 1. Creating an OAuth2 application on Forgejo for Woodpecker
 2. Writing `WOODPECKER_FORGEJO_*` env vars to `.env`
 3. Activating the repo in Woodpecker
 ### Manual setup (if Woodpecker runs outside of `disinto init`)
 If you manage Woodpecker separately, configure these env vars in its server config:
 ```bash
 WOODPECKER_FORGEJO=true
 WOODPECKER_FORGEJO_URL=http://localhost:3000
 WOODPECKER_FORGEJO_CLIENT=<oauth2-client-id>
 WOODPECKER_FORGEJO_SECRET=<oauth2-client-secret>
 ```
 To create the OAuth2 app on Forgejo:
 ```bash
 # Create OAuth2 application (redirect URI = Woodpecker authorize endpoint)
 curl -X POST \
  -H "Authorization: token ${FORGE_TOKEN}" \
  -H "Content-Type: application/json" \
  "http://localhost:3000/api/v1/user/applications/oauth2" \
  -d '{"name":"woodpecker-ci","redirect_uris":["http://localhost:8000/authorize"],"confidential_client":true}'
 ```
 The response contains `client_id` and `client_secret` for `WOODPECKER_FORGEJO_CLIENT` / `WOODPECKER_FORGEJO_SECRET`.
 To activate the repo in Woodpecker:
 ```bash
 woodpecker-cli repo add <org>/<repo>
 # Or via API:
 curl -X POST \
  -H "Authorization: Bearer ${WOODPECKER_TOKEN}" \
  "http://localhost:8000/api/repos" \
  -d '{"forge_remote_id":"<org>/<repo>"}'
 ```
 Woodpecker will now trigger pipelines on pushes to Forgejo and push commit status back. Disinto queries Woodpecker directly for CI status (with a forge API fallback), so pipeline results are visible even if Woodpecker's status push to Forgejo is delayed.
 ## 6. Prepare the Target Repo
 ### Required: CI pipeline
 The repo needs at least one Woodpecker pipeline. Disinto monitors CI status to decide when a PR is ready for review and when it can merge.
 ### Required: `CLAUDE.md`
 Create a `CLAUDE.md` in the repo root. This is the context document that dev-agent and review-agent read before working. It should cover:
 - **What the project is** (one paragraph)
 - **Tech stack** (languages, frameworks, DB)
 - **How to build/run/test** (`npm install`, `npm test`, etc.)
 - **Coding conventions** (import style, naming, linting rules)
 - **Project structure** (key directories and what lives where)
 The dev-agent reads this file via `claude -p` before implementing any issue. The better this file, the better the output.
 ### Required: Issue labels
 `disinto init` creates these automatically. If setting up manually, create these labels on the forge repo:
 | Label | Purpose |
 |-------|---------|
 | `backlog` | Issues ready to be picked up by dev-agent |
 | `in-progress` | Managed by dev-agent (auto-applied, auto-removed) |
 Optional but recommended:
 | Label | Purpose |
 |-------|---------|
 | `tech-debt` | Gardener can promote these to `backlog` |
 | `blocked` | Dev-agent marks issues with unmet dependencies |
 | `formula` | **Not yet functional.** Formula dispatch lives on the unmerged `feat/formula` branch. Dev-agent will skip any issue with this label until that branch is merged. Template files exist in `formulas/` for future use. |
 ### Required: Branch protection
 On Forgejo, set up branch protection for your primary branch:
 - **Require pull request reviews**: enabled
 - **Required approvals**: 1 (from the review bot account)
 - **Restrict push**: only allow merges via PR
 This ensures dev-agent can't merge its own PRs — it must wait for review-agent (running as the bot account) to approve.
 > **Common pitfall:** Approvals alone are not enough. You must also:
 > 1. Add `review-bot` as a **write** collaborator on the repo (Settings → Collaborators)
 > 2. Set both `approvals_whitelist_username` **and** `merge_whitelist_usernames` to include `review-bot` in the branch protection rule
 >
 > Without write access, the bot's approval is counted but the merge API returns HTTP 405.
 ### Required: Seed the `AGENTS.md` tree
 The planner maintains an `AGENTS.md` tree — architecture docs with
 per-file `<!-- last-reviewed: SHA -->` watermarks. You must seed this before
 the first planner run, otherwise the planner sees no watermarks and treats the
 entire repo as "new", generating a noisy first-run diff.
 1. **Create `AGENTS.md` in the repo root** with a one-page overview of the
   project: what it is, tech stack, directory layout, key conventions. Link
   to sub-directory AGENTS.md files.
 2. **Create sub-directory `AGENTS.md` files** for each major directory
   (e.g. `frontend/AGENTS.md`, `backend/AGENTS.md`). Keep each under ~200
   lines — architecture and conventions, not implementation details.
 3. **Set the watermark** on line 1 of every AGENTS.md file to the current HEAD:
   ```bash
   SHA=$(git rev-parse --short HEAD)
   for f in $(find . -name "AGENTS.md" -not -path "./.git/*"); do
     sed -i "1s/^/<!-- last-reviewed: ${SHA} -->\n/" "$f"
   done
   ```
 4. **Symlink `CLAUDE.md`** so Claude Code picks up the same file:
   ```bash
   ln -sf AGENTS.md CLAUDE.md
   ```
 5. Commit and push. The planner will now see 0 changes on its first run and
   only update files when real commits land.
 See `formulas/run-planner.toml` (agents-update step) for the full AGENTS.md conventions.
 ## 7. Write Good Issues
 Dev-agent works best with issues that have:
 - **Clear title** describing the change (e.g., "Add email validation to customer form")
 - **Acceptance criteria** — what "done" looks like
 - **Dependencies** — reference blocking issues with `#NNN` in the body or a `## Dependencies` section:
  ```
  ## Dependencies
  - #4
  - #7
  ```
 Dev-agent checks that all referenced issues are closed (= merged) before starting work. If any are open, the issue is skipped and checked again next cycle.
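As a rough illustration of the first half of that check, the sketch below pulls `#NNN` references out of an issue body. It assumes a plain `grep` scan; `extract_deps` is a hypothetical helper, and the follow-up query for each issue's open/closed state on the forge is not shown.

```bash
# Hypothetical helper: list the issue numbers referenced as #NNN in a body,
# one per line, sorted numerically and de-duplicated.
extract_deps() {
  printf '%s\n' "$1" | grep -oE '#[0-9]+' | tr -d '#' | sort -un
}

issue_body='Add email validation to customer form.

## Dependencies
- #4
- #7'

deps=$(extract_deps "$issue_body")
echo "$deps"
```

Each number returned would then be looked up on the forge, and the issue skipped for this cycle if any dependency is still open.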
 ## 8. Install Cron
 ```bash
 crontab -e
 ```
 ### Single project
 Add (adjust paths):
 ```cron
 FACTORY_ROOT=/home/you/disinto
 # Supervisor — health checks, auto-healing (every 10 min)
 0,10,20,30,40,50 * * * * $FACTORY_ROOT/supervisor/supervisor-poll.sh
 # Review agent — find unreviewed PRs (every 10 min, offset +3)
 3,13,23,33,43,53 * * * * $FACTORY_ROOT/review/review-poll.sh $FACTORY_ROOT/projects/myproject.toml
 # Dev agent — find ready issues, implement (every 10 min, offset +6)
 6,16,26,36,46,56 * * * * $FACTORY_ROOT/dev/dev-poll.sh $FACTORY_ROOT/projects/myproject.toml
 # Gardener — backlog grooming (daily)
 15 8 * * *                $FACTORY_ROOT/gardener/gardener-poll.sh
 # Planner — AGENTS.md maintenance + gap analysis (weekly)
 0 9 * * 1                 $FACTORY_ROOT/planner/planner-poll.sh
 ```
 `review-poll.sh`, `dev-poll.sh`, and `gardener-poll.sh` all take a project TOML file as their first argument.
 ### Multiple projects
 Stagger each project's polls so they don't overlap. With the example below, cross-project gaps are 2 minutes:
 ```cron
 FACTORY_ROOT=/home/you/disinto
 # Supervisor (shared)
 0,10,20,30,40,50 * * * * $FACTORY_ROOT/supervisor/supervisor-poll.sh
 # Project A — review +3, dev +6
 3,13,23,33,43,53 * * * * $FACTORY_ROOT/review/review-poll.sh $FACTORY_ROOT/projects/project-a.toml
 6,16,26,36,46,56 * * * * $FACTORY_ROOT/dev/dev-poll.sh     $FACTORY_ROOT/projects/project-a.toml
 # Project B — review +8, dev +1  (2-min gap from project A)
 8,18,28,38,48,58 * * * * $FACTORY_ROOT/review/review-poll.sh $FACTORY_ROOT/projects/project-b.toml
 1,11,21,31,41,51 * * * * $FACTORY_ROOT/dev/dev-poll.sh     $FACTORY_ROOT/projects/project-b.toml
 # Gardener — per-project backlog grooming (daily)
 15 8 * * *                $FACTORY_ROOT/gardener/gardener-poll.sh $FACTORY_ROOT/projects/project-a.toml
 45 8 * * *                $FACTORY_ROOT/gardener/gardener-poll.sh $FACTORY_ROOT/projects/project-b.toml
 # Planner — AGENTS.md maintenance + gap analysis (weekly)
 0 9 * * 1                 $FACTORY_ROOT/planner/planner-poll.sh
 ```
 The staggered offsets prevent agents from competing for resources. Each project gets its own lock file (`/tmp/dev-agent-{name}.lock`) derived from the `name` field in its TOML, so concurrent runs across projects are safe.
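A minimal sketch of how such a lock path could be derived from the TOML follows. It is an assumption-laden illustration: `lock_path_for` is a hypothetical helper, and the `sed` parse assumes a single top-level `name = "..."` line rather than using a real TOML parser.

```bash
# Hypothetical helper: read `name` from a project TOML and build the lock path.
lock_path_for() {
  local toml="$1" name
  name=$(sed -n 's/^name *= *"\([^"]*\)".*/\1/p' "$toml" | head -n 1)
  [ -n "$name" ] || return 1   # no name field found
  printf '/tmp/dev-agent-%s.lock\n' "$name"
}

# Demo against a throwaway TOML file.
demo_toml=$(mktemp)
printf 'name            = "project-a"\nrepo            = "org/project-a"\n' > "$demo_toml"
demo_lock=$(lock_path_for "$demo_toml")
rm -f "$demo_toml"
echo "$demo_lock"
```

Because the path depends only on `name`, two TOML files that share a name would also share a lock, which is why the names must be unique.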
 ## 9. Verify
 ```bash
 # Should complete with "all clear" (no problems to fix)
 bash supervisor/supervisor-poll.sh
 # Should list backlog issues (or "no backlog issues")
 bash dev/dev-poll.sh
 # Should find no unreviewed PRs (or review one if exists)
 bash review/review-poll.sh
 ```
 Check logs after a few cycles:
 ```bash
 tail -30 supervisor/supervisor.log
 tail -30 dev/dev-agent.log
 tail -30 review/review.log
 ```
 ## Lifecycle
 Once running, the system operates autonomously:
 ```
 You write issues (with backlog label)
  → dev-poll finds ready issues
    → dev-agent implements in a worktree, opens PR
      → CI runs (Woodpecker)
        → review-agent reviews, approves or requests changes
          → dev-agent addresses feedback (if any)
            → merge, close issue, clean up
 Meanwhile:
  supervisor-poll monitors health, kills stale processes, manages resources
  gardener grooms backlog: closes duplicates, promotes tech-debt, escalates ambiguity
  planner rebuilds AGENTS.md from git history, gap-analyses against VISION.md
 ```
 ## Troubleshooting
 | Symptom | Check |
 |---------|-------|
 | Dev-agent not picking up issues | `cat /tmp/dev-agent.lock` — is another instance running? Issues labeled `backlog`? Dependencies met? |
 | PR not getting reviewed | `tail review/review.log` — CI must pass first. Review bot token valid? |
 | CI stuck | `bash lib/ci-debug.sh` — check Woodpecker. Rate-limited? (exit 128 = wait 15 min) |
 | Claude not found | `which claude` — must be in PATH. Check `lib/env.sh` adds `~/.local/bin`. |
 | Merge fails | Branch protection misconfigured? Review bot needs write access to the repo. |
 | Memory issues | Supervisor auto-heals at <500 MB free. Check `supervisor/supervisor.log` for P0 alerts. |
 | Works on one box but not another | Diff configs first (`~/.claude/settings.json`, `.env`, crontab, branch protection). Do not write code to work around it: config mismatches are the #1 cause of cross-box failures. |
 ### Multi-project common blockers
 | Symptom | Cause | Fix |
 |---------|-------|-----|
 | Dev-agent for project B never starts | Shared lock file path | Each TOML `name` field must be unique — lock is `/tmp/dev-agent-{name}.lock` |
 | Review-poll skips all PRs | CI gate with no CI configured | Set `woodpecker_repo_id = 0` in the TOML `[ci]` section to bypass the CI check |
 | Approved PRs never merge (HTTP 405) | `review-bot` not in merge/approvals whitelist | Add as write collaborator; set both `approvals_whitelist_username` and `merge_whitelist_usernames` in branch protection |
 | Dev-agent churns through issues without waiting for open PRs to land | No single-threaded enforcement | `WAITING_PRS` check in dev-poll holds new work — verify TOML `name` is consistent across invocations |
 | Label ping-pong (issue reopened then immediately re-closed) | `already_done` handler doesn't close issue | Review dev-agent log; `already_done` status should auto-close the issue |
 ## Security: Docker Socket Sharing in CI
 The `woodpecker-agent` service mounts `/var/run/docker.sock` to execute `type: docker` CI pipelines. This grants root-equivalent access to the Docker host — any CI pipeline step can run privileged containers, mount arbitrary host paths, or access other containers' data.
 **Mitigations:**
 - **Run disinto in an LXD/VM container, not on bare metal.** When the Docker daemon runs inside an LXD container, LXD's user namespace mapping and resource limits contain the blast radius. A compromised CI step cannot reach the real host.
 - **`WOODPECKER_MAX_WORKFLOWS: 1`** limits concurrent CI resource usage, preventing a runaway pipeline from exhausting host resources.
 - **`WOODPECKER_AGENT_SECRET`** authenticates the agent↔server gRPC connection. `disinto init` auto-generates this secret and stores it in `.env` (or `.env.enc` when SOPS is available).
 - Consider setting `WOODPECKER_BACKEND_DOCKER_VOLUMES` on the agent to restrict which host volumes CI pipelines can mount.
 **Threat model:** PRs are created by the dev-agent (Claude) and auto-reviewed by the review-bot. A crafted backlog issue could theoretically produce a PR whose CI step exploits the Docker socket. The LXD containment boundary is the primary defense — treat the LXD container as the trust boundary, not the Docker daemon inside it.
 ## Action Runner — disinto (harb-staging)
 Added 2026-03-19. Polls disinto repo for `action`-labeled issues.
 ```
 */5 * * * * cd /home/debian/dark-factory && bash action/action-poll.sh projects/disinto.toml >> /tmp/action-disinto-cron.log 2>&1
 ```
 Runs locally on harb-staging — same box where Caddy/site live. For formulas that need local resources (publish-site, etc).
 ### Fix applied: action-agent.sh needs +x
 The script wasn't executable after git clone. Run:
 ```bash
 chmod +x action/action-agent.sh action/action-poll.sh
 ```
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -0,0 +1,6 @@
 # CLAUDE.md
 This repo is **disinto** — an autonomous code factory.
 Read `AGENTS.md` for architecture, coding conventions, and per-file documentation.
 For setup and operations, load the `disinto-factory` skill (`disinto-factory/SKILL.md`).
--- a/dev/phase-handler.sh
+++ b/dev/phase-handler.sh
@@ -34,6 +34,17 @@ source "$(dirname "${BASH_SOURCE[0]}")/../lib/ci-helpers.sh"
 # shellcheck source=../lib/mirrors.sh
 source "$(dirname "${BASH_SOURCE[0]}")/../lib/mirrors.sh"
 # --- Default callback stubs (agents can override after sourcing) ---
 # cleanup_worktree and cleanup_labels are called during phase transitions.
 # Provide no-op defaults so phase-handler.sh is self-contained; sourcing
 # agents override these with real implementations.
 if ! declare -f cleanup_worktree >/dev/null 2>&1; then
  cleanup_worktree() { :; }
 fi
 if ! declare -f cleanup_labels >/dev/null 2>&1; then
  cleanup_labels() { :; }
 fi
 # --- Default globals (agents can override after sourcing) ---
 : "${CI_POLL_TIMEOUT:=1800}"
 : "${REVIEW_POLL_TIMEOUT:=10800}"
--- a/disinto-factory/SKILL.md
+++ b/disinto-factory/SKILL.md
@@ -0,0 +1,209 @@
 ---
 name: disinto-factory
 description: Set up and operate a disinto autonomous code factory. Use when bootstrapping a new factory instance, checking on agents and CI, managing the backlog, or troubleshooting the stack.
 ---
 # Disinto Factory
 You are helping the user set up and operate a **disinto autonomous code factory** — a system
 of bash scripts and Claude CLI that automates the full development lifecycle: picking up
 issues, implementing via Claude, creating PRs, running CI, reviewing, merging, and mirroring.
 ## First-time setup
 Walk the user through these steps interactively. Ask questions where marked with [ASK].
 ### 1. Environment
 [ASK] Where will the factory run? Options:
 - **LXD container** (recommended for isolation) — need Debian 12, Docker, nesting enabled
 - **Bare VM or server** — need Debian/Ubuntu with Docker
 - **Existing container** — check prerequisites
 Verify prerequisites:
 ```bash
 docker --version && git --version && jq --version && curl --version && tmux -V && python3 --version && claude --version
 ```
 Any missing tool — help the user install it before continuing.
 ### 2. Clone and init
 ```bash
 git clone https://codeberg.org/johba/disinto.git && cd disinto
 ```
 [ASK] What repo should the factory develop? Options:
 - **Itself** (self-development): `bin/disinto init https://codeberg.org/johba/disinto --yes --repo-root $(pwd)`
 - **Another project**: `bin/disinto init <repo-url> --yes`
 Run the init and watch for:
 - All bot users created (dev-bot, review-bot, etc.)
 - `WOODPECKER_TOKEN` generated and saved
 - Stack containers all started
 ### 3. Post-init verification
 Run this checklist — fix any failures before proceeding:
 ```bash
 # Stack healthy?
 docker ps --format "table {{.Names}}\t{{.Status}}"
 # Expected: forgejo, woodpecker (healthy), woodpecker-agent (healthy), agents, edge, staging
 # Token generated?
 grep WOODPECKER_TOKEN .env | grep -v "^$" && echo "OK" || echo "MISSING — see references/troubleshooting.md"
 # Agent cron active?
 docker exec -u agent disinto-agents-1 crontab -l -u agent
 # Agent can reach Forgejo?
 docker exec disinto-agents-1 bash -c "source /home/agent/disinto/.env && curl -sf http://forgejo:3000/api/v1/version | jq .version"
 # Agent repo cloned?
 docker exec -u agent disinto-agents-1 ls /home/agent/repos/
 ```
 If the agent repo is missing, clone it:
 ```bash
 docker exec disinto-agents-1 chown -R agent:agent /home/agent/repos
 docker exec -u agent disinto-agents-1 bash -c "source /home/agent/disinto/.env && git clone http://dev-bot:\${FORGE_TOKEN}@forgejo:3000/<org>/<repo>.git /home/agent/repos/<name>"
 ```
 ### 4. Mirrors (optional)
 [ASK] Should the factory mirror to external forges? If yes, which?
 - GitHub: need repo URL and SSH key added to GitHub account
 - Codeberg: need repo URL and SSH key added to Codeberg account
 Show the user their public key:
 ```bash
 cat ~/.ssh/id_ed25519.pub
 ```
 Test SSH access:
 ```bash
 ssh -T git@github.com 2>&1; ssh -T git@codeberg.org 2>&1
 ```
 If SSH host keys are missing: `ssh-keyscan github.com codeberg.org >> ~/.ssh/known_hosts 2>/dev/null`
 Edit `projects/<name>.toml` to add mirrors:
 ```toml
 [mirrors]
 github   = "git@github.com:Org/repo.git"
 codeberg = "git@codeberg.org:user/repo.git"
 ```
 Test with a manual push:
 ```bash
 source .env && source lib/env.sh && export PROJECT_TOML=projects/<name>.toml && source lib/load-project.sh && source lib/mirrors.sh && mirror_push
 ```
 ### 5. Seed the backlog
 [ASK] What should the factory work on first? Brainstorm with the user.
 Help them create issues on the local Forgejo. Each issue needs:
 - A clear title prefixed with `fix:`, `feat:`, or `chore:`
 - A body describing what to change, which files, and any constraints
 - The `backlog` label (so the dev-agent picks it up)
 ```bash
 source .env
 BACKLOG_ID=$(curl -sf "http://localhost:3000/api/v1/repos/<org>/<repo>/labels" \
  -H "Authorization: token $FORGE_TOKEN" | jq -r '.[] | select(.name=="backlog") | .id')
 curl -sf -X POST "http://localhost:3000/api/v1/repos/<org>/<repo>/issues" \
  -H "Authorization: token $FORGE_TOKEN" \
  -H "Content-Type: application/json" \
  -d "{\"title\": \"<title>\", \"body\": \"<body>\", \"labels\": [$BACKLOG_ID]}"
 ```
 For issues with dependencies, add `Depends-on: #N` in the body — the dev-agent checks
 these before starting.
 Use labels:
 - `backlog` — ready for the dev-agent
 - `blocked` — parked, not for the factory
 - No label — tracked but not for autonomous work
 ### 6. Watch it work
 The dev-agent polls every 5 minutes. Trigger manually to see it immediately:
 ```bash
 docker exec -u agent disinto-agents-1 bash -c "cd /home/agent/disinto && bash dev/dev-poll.sh projects/<name>.toml"
 ```
 Then monitor:
 ```bash
 # Watch the agent work
 docker exec disinto-agents-1 tail -f /home/agent/data/logs/dev/dev-agent.log
 # Check for Claude running
 docker exec disinto-agents-1 bash -c "for f in /proc/[0-9]*/cmdline; do cmd=\$(tr '\0' ' ' < \$f 2>/dev/null); echo \$cmd | grep -q 'claude.*-p' && echo 'Claude is running'; done"
 ```
 ## Ongoing operations
 ### Check factory status
 ```bash
 source .env
 # Issues
 curl -sf "http://localhost:3000/api/v1/repos/<org>/<repo>/issues?state=open" \
  -H "Authorization: token $FORGE_TOKEN" \
  | jq -r '.[] | "#\(.number) [\(.labels | map(.name) | join(","))] \(.title)"'
 # PRs
 curl -sf "http://localhost:3000/api/v1/repos/<org>/<repo>/pulls?state=open" \
  -H "Authorization: token $FORGE_TOKEN" \
  | jq -r '.[] | "PR #\(.number) [\(.head.ref)] \(.title)"'
 # Agent logs
 docker exec disinto-agents-1 tail -20 /home/agent/data/logs/dev/dev-agent.log
 ```
 ### Check CI
 ```bash
 source .env
 WP_CSRF=$(curl -sf -b "user_sess=$WOODPECKER_TOKEN" http://localhost:8000/web-config.js \
  | sed -n 's/.*WOODPECKER_CSRF = "\([^"]*\)".*/\1/p')
 curl -sf -b "user_sess=$WOODPECKER_TOKEN" -H "X-CSRF-Token: $WP_CSRF" \
  "http://localhost:8000/api/repos/1/pipelines?page=1&per_page=5" \
  | jq '.[] | {number, status, event}'
 ```
 ### Unstick a blocked issue
 When a dev-agent run fails (CI timeout, implementation error), the issue gets labeled `blocked`:
 1. Close stale PR and delete the branch
 2. `docker exec disinto-agents-1 rm -f /tmp/dev-agent-*.json /tmp/dev-agent-*.lock`
 3. Relabel the issue to `backlog`
 4. Update agent repo: `docker exec -u agent disinto-agents-1 bash -c "cd /home/agent/repos/<name> && git fetch origin && git reset --hard origin/main"`
 ### Access Forgejo UI
 If running in an LXD container with reverse tunnel:
 ```bash
 # From your machine:
 ssh -L 3000:localhost:13000 user@jump-host
 # Open http://localhost:3000
 ```
 Reset admin password if needed:
 ```bash
 docker exec disinto-forgejo-1 su -c "forgejo admin user change-password --username disinto-admin --password <new-pw> --must-change-password=false" git
 ```
 ## Important context
 - Read `AGENTS.md` for per-agent architecture and file-level docs
 - Read `VISION.md` for project philosophy
 - The factory uses a single internal Forgejo as its forge, regardless of where mirrors go
 - Dev-agent uses `claude -p --resume` for session continuity across CI/review cycles
 - Mirror pushes happen automatically after every merge (fire-and-forget)
 - Cron schedule: dev-poll every 5min, review-poll every 5min, gardener 4x/day
--- a/disinto-factory/references/troubleshooting.md
+++ b/disinto-factory/references/troubleshooting.md
@@ -0,0 +1,53 @@
 # Troubleshooting
 ## WOODPECKER_TOKEN empty after init
 The OAuth2 flow failed. Common causes:
 1. **URL-encoded redirect_uri mismatch**: Forgejo logs show "Unregistered Redirect URI".
   The init script must rewrite both plain and URL-encoded Docker hostnames.
 2. **Forgejo must_change_password**: Admin user was created with forced password change.
   The init script calls `--must-change-password=false` but Forgejo 11.x sometimes ignores it.
 3. **WOODPECKER_OPEN not set**: WP refuses first-user OAuth registration without it.
 Manual fix: reset admin password and re-run the token generation manually, or
 use the Woodpecker UI to create a token.
 ## WP CI agent won't connect (DeadlineExceeded)
 gRPC over Docker bridge fails in LXD (and possibly other nested container environments).
 The compose template uses `network_mode: host` + `privileged: true` for the agent.
 If you see this error, check:
 - Server exposes port 9000: `grep "9000:9000" docker-compose.yml`
 - Agent uses `localhost:9000`: `grep "WOODPECKER_SERVER" docker-compose.yml`
 - Agent has `network_mode: host`
 ## CI clone fails (could not resolve host)
 CI containers need to resolve Docker service names (e.g., `forgejo`).
 Check `WOODPECKER_BACKEND_DOCKER_NETWORK` is set on the agent.
 ## Webhooks not delivered
 Forgejo blocks outgoing webhooks by default. Check:
 ```bash
 docker logs disinto-forgejo-1 2>&1 | grep "webhook.*ALLOWED_HOST_LIST"
 ```
 Fix: add `FORGEJO__webhook__ALLOWED_HOST_LIST: "private"` to Forgejo environment.
 Also verify the webhook exists:
 ```bash
 curl -sf -u "disinto-admin:<password>" "http://localhost:3000/api/v1/repos/<org>/<repo>/hooks" | jq '.[].config.url'
 ```
 If missing, deactivate and reactivate the repo in Woodpecker to auto-create it.
 ## Dev-agent fails with "cd: no such file or directory"
 `PROJECT_REPO_ROOT` inside the agents container points to a host path that doesn't
 exist in the container. Check the compose env:
 ```bash
 docker inspect disinto-agents-1 --format '{{range .Config.Env}}{{println .}}{{end}}' | grep PROJECT_REPO_ROOT
 ```
 Should be `/home/agent/repos/<name>`, not `/home/<user>/<name>`.
--- a/disinto-factory/scripts/factory-status.sh
+++ b/disinto-factory/scripts/factory-status.sh
@ -0,0 +1,44 @@
 #!/usr/bin/env bash
 # factory-status.sh — Quick status check for a running disinto factory
 set -euo pipefail
 FACTORY_ROOT="${1:-$(cd "$(dirname "$0")/../.." && pwd)}"
 source "${FACTORY_ROOT}/.env" 2>/dev/null || { echo "No .env found at ${FACTORY_ROOT}"; exit 1; }
 FORGE_URL="${FORGE_URL:-http://localhost:3000}"
 # "|| true" keeps pipefail from aborting before the empty-REPO check below.
 REPO=$(grep '^repo ' "${FACTORY_ROOT}/projects/"*.toml 2>/dev/null | head -1 | sed 's/.*= *"//;s/"//' || true)
 [ -z "$REPO" ] && { echo "No project TOML found"; exit 1; }
 echo "=== Stack ==="
 docker ps --format "table {{.Names}}\t{{.Status}}" 2>/dev/null | grep disinto || echo "(no disinto containers running)"
 echo ""
 echo "=== Open Issues ==="
 curl -sf "${FORGE_URL}/api/v1/repos/${REPO}/issues?state=open&limit=20" \
  -H "Authorization: token ${FORGE_TOKEN}" \
  | jq -r '.[] | "#\(.number) [\(.labels | map(.name) | join(","))] \(.title)"' 2>/dev/null || echo "(API error)"
 echo ""
 echo "=== Open PRs ==="
 curl -sf "${FORGE_URL}/api/v1/repos/${REPO}/pulls?state=open&limit=10" \
  -H "Authorization: token ${FORGE_TOKEN}" \
  | jq -r '.[] | "PR #\(.number) [\(.head.ref)] \(.title)"' 2>/dev/null || echo "none"
 echo ""
 echo "=== Agent Activity ==="
 docker exec disinto-agents-1 bash -c "tail -5 /home/agent/data/logs/dev/dev-agent.log 2>/dev/null" || echo "(no logs)"
 echo ""
 echo "=== Claude Running? ==="
 # The [c]laude bracket pattern stops the check from matching its own bash -c command line.
 docker exec disinto-agents-1 bash -c "
  found=false
  for f in /proc/[0-9]*/cmdline; do
    cmd=\$(tr '\0' ' ' < \"\$f\" 2>/dev/null)
    if echo \"\$cmd\" | grep -q '[c]laude.*-p'; then found=true; echo 'Yes — Claude is actively working'; break; fi
  done
  \$found || echo 'No — idle'
 " 2>/dev/null || echo "(agents container not reachable)"
 echo ""
 echo "=== Mirrors ==="
 cd "${FACTORY_ROOT}" 2>/dev/null && git remote -v | grep -E 'github|codeberg' | grep push || echo "none configured"
Author	SHA1	Message	Date
Claude	6f64013fc6	fix: Migrate action-agent.sh to SDK + shared libraries (#5) Rewrite action-agent from tmux session + phase-handler pattern to synchronous SDK pattern (agent_run via claude -p). Uses shared libraries: - agent-sdk.sh for one-shot Claude invocation - issue-lifecycle.sh for issue_check_deps/issue_close/issue_block - pr-lifecycle.sh for pr_create/pr_walk_to_merge - worktree.sh for worktree_create/worktree_cleanup Add default callback stubs to phase-handler.sh (cleanup_worktree, cleanup_labels) so it is self-contained now that action-agent.sh no longer sources it. Update agent-smoke.sh accordingly. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-28 11:15:10 +00:00
Claude	83ab2930e6	fix: Migrate action-agent.sh to SDK + shared libraries (#5) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-28 11:15:10 +00:00
johba	02dd03eaaf	chore: remove BOOTSTRAP.md, slim CLAUDE.md BOOTSTRAP.md is superseded by the disinto-factory skill (SKILL.md). CLAUDE.md now just points to AGENTS.md and the skill. Updated AGENTS.md reference accordingly. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-28 11:14:42 +00:00
johba	cbe5df52b2	feat: add disinto-factory skill for guided setup and operations Distributable skill file (SKILL.md) that walks an AI agent through: - First-time factory setup with interactive [ASK] prompts - Post-init verification checklist - Mirror configuration to GitHub/Codeberg - Backlog seeding and issue creation - Ongoing monitoring: agent status, CI, PRs - Unsticking blocked issues Includes: - scripts/factory-status.sh — one-command factory health check - references/troubleshooting.md — common issues from real deployments - Slimmed CLAUDE.md pointing to the skill Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-28 11:13:24 +00:00
johba	ed43f9db11	docs: add CLAUDE.md skill file for factory setup and operations Comprehensive guide for AI coding agents (Claude Code, etc.) to: - Set up a new factory instance in an LXD container - Run disinto init and verify the stack - Configure mirrors to GitHub/Codeberg - Check on dev-agent, review-agent, and CI status - Unstick blocked issues and trigger manual polls - File issues for the factory to work on - Known workarounds for LXD nested Docker Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-28 11:08:55 +00:00