fix: docs: add factory interaction lessons to SKILL.md (#156)

Agent 2026-04-02 20:36:56 +00:00
parent a88544871f
commit ee6285ead9
4 changed files with 312 additions and 253 deletions


@ -1,268 +1,28 @@
---
name: disinto-factory
description: Set up and operate a disinto autonomous code factory. Use when bootstrapping a new factory instance, checking on agents and CI, managing the backlog, or troubleshooting the stack. description: Set up and operate a disinto autonomous code factory.
---
# Disinto Factory
You are helping the user set up and operate a **disinto autonomous code factory** — a system You are helping the user set up and operate a **disinto autonomous code factory**.
of bash scripts and Claude CLI that automates the full development lifecycle: picking up
issues, implementing via Claude, creating PRs, running CI, reviewing, merging, and mirroring.
This guide shows how to set up the factory to develop an **external project** (e.g., `johba/harb`). ## Guides
## First-time setup - **[Setup guide](setup.md)** — First-time factory setup: environment, init, verification, backlog seeding
- **[Operations guide](operations.md)** — Day-to-day: status checks, CI debugging, unsticking issues, Forgejo access
Walk the user through these steps interactively. Ask questions where marked with [ASK]. - **[Lessons learned](lessons-learned.md)** — Patterns for writing issues, debugging CI, retrying failures, vault operations, breaking down features
### 1. Environment
[ASK] Where will the factory run? Options:
- **LXD container** (recommended for isolation) — need Debian 12, Docker, nesting enabled
- **Bare VM or server** — need Debian/Ubuntu with Docker
- **Existing container** — check prerequisites
Verify prerequisites:
```bash
docker --version && git --version && jq --version && curl --version && tmux -V && python3 --version && claude --version
```
Any missing tool — help the user install it before continuing.
### 2. Clone disinto and choose a target project
Clone the disinto factory itself:
```bash
git clone https://codeberg.org/johba/disinto.git && cd disinto
```
[ASK] What repository should the factory develop? Provide the **remote repository URL** in one of these formats:
- Full URL: `https://github.com/johba/harb.git` or `https://codeberg.org/johba/harb.git`
- Short slug: `johba/harb` (uses local Forgejo as the primary remote)
The factory will clone from the remote URL (if provided) or from your local Forgejo, then mirror to the remote.
Then initialize the factory for that project:
```bash
bin/disinto init johba/harb --yes
# or with full URL:
bin/disinto init https://github.com/johba/harb.git --yes
```
The `init` command will:
- Create all bot users (dev-bot, review-bot, etc.) on the local Forgejo
- Generate and save `WOODPECKER_TOKEN`
- Start the stack containers
- Clone the target repo into the agent workspace
> **Note:** The `--repo-root` flag is optional and only needed if you want to customize
> where the cloned repo lives. By default, it goes under `/home/agent/repos/<name>`.
### 3. Post-init verification
Run this checklist — fix any failures before proceeding:
```bash
# Stack healthy?
docker ps --format "table {{.Names}}\t{{.Status}}"
# Expected: forgejo, woodpecker (healthy), woodpecker-agent (healthy), agents, edge, staging
# Token generated?
grep -Eq '^WOODPECKER_TOKEN=.+' .env && echo "OK" || echo "MISSING: see references/troubleshooting.md"
# Agent cron active?
docker exec -u agent disinto-agents-1 crontab -l
# Agent can reach Forgejo?
docker exec disinto-agents-1 bash -c "source /home/agent/disinto/.env && curl -sf http://forgejo:3000/api/v1/version | jq .version"
# Agent repo cloned?
docker exec -u agent disinto-agents-1 ls /home/agent/repos/
```
If the agent repo is missing, clone it:
```bash
docker exec disinto-agents-1 chown -R agent:agent /home/agent/repos
docker exec -u agent disinto-agents-1 bash -c "source /home/agent/disinto/.env && git clone http://dev-bot:\${FORGE_TOKEN}@forgejo:3000/<org>/<repo>.git /home/agent/repos/<name>"
```
### 4. Create the project configuration file
The factory uses a TOML file to configure how it manages your project. Create
`projects/<name>.toml` based on the template format:
```toml
# projects/harb.toml
name = "harb"
repo = "johba/harb"
forge_url = "http://localhost:3000"
repo_root = "/home/agent/repos/harb"
primary_branch = "master"
[ci]
woodpecker_repo_id = 0
stale_minutes = 60
[services]
containers = ["ponder"]
[monitoring]
check_prs = true
check_dev_agent = true
check_pipeline_stall = true
# [mirrors]
# github = "git@github.com:johba/harb.git"
# codeberg = "git@codeberg.org:johba/harb.git"
```
**Key fields:**
- `name`: Project identifier (used for file names, logs, etc.)
- `repo`: The source repo in `owner/name` format
- `forge_url`: URL of your local Forgejo instance
- `repo_root`: Where the agent clones the repo
- `primary_branch`: Default branch name (e.g., `main` or `master`)
- `woodpecker_repo_id`: Set to `0` initially; auto-populated on first CI run
- `containers`: List of Docker containers the factory should manage
- `mirrors`: Optional external forge URLs for backup/sync
### 5. Mirrors (optional)
[ASK] Should the factory mirror to external forges? If yes, which?
- GitHub: need repo URL and SSH key added to GitHub account
- Codeberg: need repo URL and SSH key added to Codeberg account
Show the user their public key:
```bash
cat ~/.ssh/id_ed25519.pub
```
Test SSH access:
```bash
ssh -T git@github.com 2>&1; ssh -T git@codeberg.org 2>&1
```
If SSH host keys are missing: `ssh-keyscan github.com codeberg.org >> ~/.ssh/known_hosts 2>/dev/null`
Edit `projects/<name>.toml` to uncomment and configure mirrors:
```toml
[mirrors]
github = "git@github.com:Org/repo.git"
codeberg = "git@codeberg.org:user/repo.git"
```
Test with a manual push:
```bash
source .env && source lib/env.sh && export PROJECT_TOML=projects/<name>.toml && source lib/load-project.sh && source lib/mirrors.sh && mirror_push
```
### 6. Seed the backlog
[ASK] What should the factory work on first? Brainstorm with the user.
Help them create issues on the local Forgejo. Each issue needs:
- A clear title prefixed with `fix:`, `feat:`, or `chore:`
- A body describing what to change, which files, and any constraints
- The `backlog` label (so the dev-agent picks it up)
```bash
source .env
BACKLOG_ID=$(curl -sf "http://localhost:3000/api/v1/repos/<org>/<repo>/labels" \
-H "Authorization: token $FORGE_TOKEN" | jq -r '.[] | select(.name=="backlog") | .id')
curl -sf -X POST "http://localhost:3000/api/v1/repos/<org>/<repo>/issues" \
-H "Authorization: token $FORGE_TOKEN" \
-H "Content-Type: application/json" \
-d "{\"title\": \"<title>\", \"body\": \"<body>\", \"labels\": [$BACKLOG_ID]}"
```
For issues with dependencies, add `Depends-on: #N` in the body — the dev-agent checks
these before starting.
Use labels:
- `backlog` — ready for the dev-agent
- `blocked` — parked, not for the factory
- No label — tracked but not for autonomous work
### 7. Watch it work
The dev-agent polls every 5 minutes. Trigger manually to see it immediately:
```bash
source .env
export PROJECT_TOML=projects/<name>.toml
docker exec -u agent disinto-agents-1 bash -c "cd /home/agent/disinto && bash dev/dev-poll.sh projects/<name>.toml"
```
Then monitor:
```bash
# Watch the agent work
docker exec disinto-agents-1 tail -f /home/agent/data/logs/dev/dev-agent.log
# Check for Claude running
docker exec disinto-agents-1 bash -c "for f in /proc/[0-9]*/cmdline; do cmd=\$(tr '\0' ' ' < \$f 2>/dev/null); echo \$cmd | grep -q 'claude.*-p' && echo 'Claude is running'; done"
```
## Ongoing operations
### Check factory status
```bash
source .env
# Issues
curl -sf "http://localhost:3000/api/v1/repos/<org>/<repo>/issues?state=open" \
-H "Authorization: token $FORGE_TOKEN" \
| jq -r '.[] | "#\(.number) [\(.labels | map(.name) | join(","))] \(.title)"'
# PRs
curl -sf "http://localhost:3000/api/v1/repos/<org>/<repo>/pulls?state=open" \
-H "Authorization: token $FORGE_TOKEN" \
| jq -r '.[] | "PR #\(.number) [\(.head.ref)] \(.title)"'
# Agent logs
docker exec disinto-agents-1 tail -20 /home/agent/data/logs/dev/dev-agent.log
```
### Check CI
```bash
source .env
WP_CSRF=$(curl -sf -b "user_sess=$WOODPECKER_TOKEN" http://localhost:8000/web-config.js \
| sed -n 's/.*WOODPECKER_CSRF = "\([^"]*\)".*/\1/p')
curl -sf -b "user_sess=$WOODPECKER_TOKEN" -H "X-CSRF-Token: $WP_CSRF" \
"http://localhost:8000/api/repos/1/pipelines?page=1&per_page=5" \
| jq '.[] | {number, status, event}'
```
### Unstick a blocked issue
When a dev-agent run fails (CI timeout, implementation error), the issue gets labeled `blocked`:
1. Close stale PR and delete the branch
2. `docker exec disinto-agents-1 rm -f /tmp/dev-agent-*.json /tmp/dev-agent-*.lock`
3. Relabel the issue to `backlog`
4. Update agent repo: `docker exec -u agent disinto-agents-1 bash -c "cd /home/agent/repos/<name> && git fetch origin && git reset --hard origin/<primary_branch>"` (use the `primary_branch` from `projects/<name>.toml`, e.g. `main` or `master`)
### Access Forgejo UI
If running in an LXD container with reverse tunnel:
```bash
# From your machine:
ssh -L 3000:localhost:13000 user@jump-host
# Open http://localhost:3000
```
Reset admin password if needed:
```bash
docker exec disinto-forgejo-1 su -c "forgejo admin user change-password --username disinto-admin --password <new-pw> --must-change-password=false" git
```
## Important context
- Read `AGENTS.md` for per-agent architecture and file-level docs
- Read `VISION.md` for project philosophy
- The factory uses a single internal Forgejo as its forge, regardless of where mirrors go
- Dev-agent uses `claude -p --resume` for session continuity across CI/review cycles - Dev-agent uses `claude -p` for one-shot implementation sessions
- Mirror pushes happen automatically after every merge (fire-and-forget) - Mirror pushes happen automatically after every merge
- Cron schedule: dev-poll every 5min, review-poll every 5min, gardener 4x/day
## References
- [Troubleshooting](references/troubleshooting.md)
- [Factory status script](scripts/factory-status.sh)


@ -0,0 +1,54 @@
# Working with the factory — lessons learned
## Writing issues for the dev agent
**Put everything in the issue body, not comments.** The dev agent reads the issue body when it starts work. It does not reliably read comments. If an issue fails and you need to add guidance for a retry, update the issue body.
**One approach per issue, no choices.** The dev agent cannot make design decisions. If there are multiple ways to solve a problem, decide before filing. Issues with "Option A or Option B" will confuse the agent.
**Issues must fit the templates.** Every backlog issue needs: affected files (max 3), acceptance criteria (max 5 checkboxes), and a clear proposed solution. If you cannot fill these fields, the issue is too big — label it `vision` and break it down first.
**Explicit dependencies prevent ordering bugs.** Add `Depends-on: #N` in the issue body. dev-poll checks these before pickup. Without explicit deps, the agent may attempt work on a stale codebase.
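A minimal sketch of how the `Depends-on: #N` lines in an issue body could be extracted, for example to review them before filing. The function name `extract_deps` is hypothetical, and this is not necessarily how dev-poll parses them; it only assumes the `Depends-on: #N` convention described above.

```bash
# Hypothetical helper: read an issue body on stdin and print the issue
# numbers named in "Depends-on: #N" lines, one per line.
extract_deps() {
  grep -oE 'Depends-on: #[0-9]+' | grep -oE '[0-9]+$'
}
```

Usage: `printf '%s\n' "$issue_body" | extract_deps`, where `issue_body` holds the text of the issue.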
## Debugging CI failures
**Check CI logs via Woodpecker SQLite when the API fails.** The Woodpecker v3 log API may return HTML instead of JSON. Reliable fallback:
```bash
sqlite3 /var/lib/docker/volumes/disinto_woodpecker-data/_data/woodpecker.sqlite \
"SELECT le.data FROM log_entries le \
JOIN steps s ON le.step_id = s.id \
JOIN workflows w ON s.pipeline_id = w.id \
JOIN pipelines p ON w.pipeline_id = p.id \
WHERE p.number = <N> AND s.name = '<step>' ORDER BY le.id"
```
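Because the API may answer with HTML, a caller can sniff the response before piping it to `jq` and fall back to the SQLite query otherwise. The sketch below is a heuristic only: `looks_like_json` is a hypothetical name, and it checks the first non-whitespace character rather than validating the document.

```bash
# Hypothetical helper: succeed if the response body starts like a JSON
# object or array, fail for HTML, empty bodies, or anything else.
looks_like_json() {
  local first
  first=$(printf '%s' "$1" | tr -d '[:space:]' | cut -c1)
  case "$first" in
    '{'|'[') return 0 ;;
    *) return 1 ;;
  esac
}
```

Usage: `looks_like_json "$response" || echo "API returned non-JSON, use the SQLite fallback"`.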
**When the agent fails repeatedly on CI, diagnose externally.** The dev agent cannot see CI log output (only pass/fail status). If the same step fails 3+ times, read the logs yourself and put the exact error and fix in the issue body.
## Retrying failed issues
**Clean up stale branches before retrying.** An old branch puts the agent into recovery mode, which inherits the stale code. Close the PR, delete the branch on Forgejo, then relabel the issue to `backlog`.
**After a dependency lands, stale branches miss the fix.** If issue B depends on A, and B's PR was created before A merged, B's branch is stale. Close the PR and delete the branch so the agent starts fresh from current main.
## Environment gotchas
**Alpine/BusyBox differs from Debian.** CI and edge containers use Alpine:
- `grep -P` (Perl regex) does not work — use `grep -E`
- `USER` variable is unset — set it explicitly: `USER=$(whoami); export USER`
- Network calls fail during `docker build` in LXD — download binaries on the host, COPY into images
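The first two points can be sketched as follows. The function `first_version` and its version-number pattern are illustrative assumptions, chosen only to show a `grep -E` equivalent of a typical `grep -P '\d+\.\d+'` call.

```bash
# USER can be unset in Alpine-based containers; fall back to whoami.
: "${USER:=$(whoami)}"
export USER

# grep -P '\d+\.\d+' is unavailable in BusyBox; the -E form is portable.
# Hypothetical helper: print the first version-like number found on stdin.
first_version() {
  grep -oE '[0-9]+\.[0-9]+(\.[0-9]+)?' | head -n 1
}
```

Usage: `git --version | first_version`.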
**The host repo drifts from Forgejo main.** If factory code is bind-mounted, the host checkout goes stale. Pull regularly or use versioned releases.
## Vault operations
**The human merging a vault PR must be a Forgejo site admin.** The dispatcher verifies `is_admin` on the merger. Promote your user via the Forgejo CLI or database if needed.
**Result files cache failures.** If a vault action fails, the dispatcher writes `.result.json` and skips that action on later runs. To retry, delete the result file inside the edge container.
## Breaking down large features
**Vision issues need structured decomposition.** When a feature touches multiple subsystems or has design forks, label it `vision`. Break it down by identifying what exists, what can be reused, where the design forks are, and resolve them before filing backlog issues.
**Prefer gluecode over greenfield.** Check if Forgejo API, Woodpecker, Docker, or existing lib/ functions can do the job before building new components.
**Max 7 sub-issues per sprint.** If a breakdown produces more, split into two sprints.


@ -0,0 +1,54 @@
# Ongoing operations
### Check factory status
```bash
source .env
# Issues
curl -sf "http://localhost:3000/api/v1/repos/<org>/<repo>/issues?state=open" \
-H "Authorization: token $FORGE_TOKEN" \
| jq -r '.[] | "#\(.number) [\(.labels | map(.name) | join(","))] \(.title)"'
# PRs
curl -sf "http://localhost:3000/api/v1/repos/<org>/<repo>/pulls?state=open" \
-H "Authorization: token $FORGE_TOKEN" \
| jq -r '.[] | "PR #\(.number) [\(.head.ref)] \(.title)"'
# Agent logs
docker exec disinto-agents-1 tail -20 /home/agent/data/logs/dev/dev-agent.log
```
### Check CI
```bash
source .env
# Repo ID 1 is assumed below; substitute the woodpecker_repo_id from projects/<name>.toml
WP_CSRF=$(curl -sf -b "user_sess=$WOODPECKER_TOKEN" http://localhost:8000/web-config.js \
| sed -n 's/.*WOODPECKER_CSRF = "\([^"]*\)".*/\1/p')
curl -sf -b "user_sess=$WOODPECKER_TOKEN" -H "X-CSRF-Token: $WP_CSRF" \
"http://localhost:8000/api/repos/1/pipelines?page=1&per_page=5" \
| jq '.[] | {number, status, event}'
```
### Unstick a blocked issue
When a dev-agent run fails (CI timeout, implementation error), the issue gets labeled `blocked`. To unstick it:
1. Close stale PR and delete the branch
2. `docker exec disinto-agents-1 rm -f /tmp/dev-agent-*.json /tmp/dev-agent-*.lock`
3. Relabel the issue to `backlog`
4. Update agent repo: `docker exec -u agent disinto-agents-1 bash -c "cd /home/agent/repos/<name> && git fetch origin && git reset --hard origin/<primary_branch>"` (use the `primary_branch` from `projects/<name>.toml`, e.g. `main` or `master`)
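Steps 1 and 3 can also be done through the Forgejo API instead of the UI. The following is a sketch under assumptions: `FORGE_URL`, `REPO`, `DRY_RUN`, `forge_api`, and `unstick_issue` are hypothetical names, the label IDs must be looked up first (as in the backlog seeding example), and the endpoints are the standard Forgejo/Gitea ones for editing a pull request, deleting a branch, and managing issue labels.

```bash
# Sketch only: all names below are illustrative, not part of disinto.
FORGE_URL="${FORGE_URL:-http://localhost:3000}"
REPO="${REPO:-<org>/<repo>}"

# Send one Forgejo API request, or just print it when DRY_RUN=1.
forge_api() {
  local method=$1 path=$2 data=${3:-}
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s %s %s\n' "$method" "$path" "$data"
    return 0
  fi
  curl -sf -X "$method" "$FORGE_URL/api/v1/repos/$REPO$path" \
    -H "Authorization: token $FORGE_TOKEN" \
    -H "Content-Type: application/json" \
    ${data:+-d "$data"}
}

# unstick_issue <pr> <branch> <issue> <backlog-label-id> <blocked-label-id>
unstick_issue() {
  forge_api PATCH "/pulls/$1" '{"state":"closed"}' &&   # close the stale PR
  forge_api DELETE "/branches/$2" &&                    # delete its branch
  forge_api DELETE "/issues/$3/labels/$5" &&            # drop `blocked`
  forge_api POST "/issues/$3/labels" "{\"labels\":[$4]}"  # add `backlog`
}
```

Running with `DRY_RUN=1` prints the four requests without sending them, which allows a review first. Steps 2 and 4 still run through `docker exec` as listed above.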
### Access Forgejo UI
If running in an LXD container with reverse tunnel:
```bash
# From your machine:
ssh -L 3000:localhost:13000 user@jump-host
# Open http://localhost:3000
```
Reset admin password if needed:
```bash
docker exec disinto-forgejo-1 su -c "forgejo admin user change-password --username disinto-admin --password <new-pw> --must-change-password=false" git
```

disinto-factory/setup.md Normal file

@ -0,0 +1,191 @@
# First-time setup
Walk the user through these steps interactively. Ask questions where marked with [ASK].
### 1. Environment
[ASK] Where will the factory run? Options:
- **LXD container** (recommended for isolation) — need Debian 12, Docker, nesting enabled
- **Bare VM or server** — need Debian/Ubuntu with Docker
- **Existing container** — check prerequisites
Verify prerequisites:
```bash
docker --version && git --version && jq --version && curl --version && tmux -V && python3 --version && claude --version
```
If any tool is missing, help the user install it before continuing.
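The `&&` chain above stops at the first missing tool. A small sketch that reports every missing tool in one pass may be more convenient; `missing_tools` is a hypothetical helper name, and it only checks that each command is on `PATH`, not its version.

```bash
# Hypothetical helper: print each named command that is not on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}
```

Usage: `missing_tools docker git jq curl tmux python3 claude`. Empty output means all prerequisites were found.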
### 2. Clone disinto and choose a target project
Clone the disinto factory itself:
```bash
git clone https://codeberg.org/johba/disinto.git && cd disinto
```
[ASK] What repository should the factory develop? Provide the **remote repository URL** in one of these formats:
- Full URL: `https://github.com/johba/harb.git` or `https://codeberg.org/johba/harb.git`
- Short slug: `johba/harb` (uses local Forgejo as the primary remote)
The factory will clone from the remote URL (if provided) or from your local Forgejo, then mirror to the remote.
Then initialize the factory for that project:
```bash
bin/disinto init johba/harb --yes
# or with full URL:
bin/disinto init https://github.com/johba/harb.git --yes
```
The `init` command will:
- Create all bot users (dev-bot, review-bot, etc.) on the local Forgejo
- Generate and save `WOODPECKER_TOKEN`
- Start the stack containers
- Clone the target repo into the agent workspace
> **Note:** The `--repo-root` flag is optional and only needed if you want to customize
> where the cloned repo lives. By default, it goes under `/home/agent/repos/<name>`.
### 3. Post-init verification
Run this checklist — fix any failures before proceeding:
```bash
# Stack healthy?
docker ps --format "table {{.Names}}\t{{.Status}}"
# Expected: forgejo, woodpecker (healthy), woodpecker-agent (healthy), agents, edge, staging
# Token generated?
grep -Eq '^WOODPECKER_TOKEN=.+' .env && echo "OK" || echo "MISSING: see references/troubleshooting.md"
# Agent cron active?
docker exec -u agent disinto-agents-1 crontab -l
# Agent can reach Forgejo?
docker exec disinto-agents-1 bash -c "source /home/agent/disinto/.env && curl -sf http://forgejo:3000/api/v1/version | jq .version"
# Agent repo cloned?
docker exec -u agent disinto-agents-1 ls /home/agent/repos/
```
If the agent repo is missing, clone it:
```bash
docker exec disinto-agents-1 chown -R agent:agent /home/agent/repos
docker exec -u agent disinto-agents-1 bash -c "source /home/agent/disinto/.env && git clone http://dev-bot:\${FORGE_TOKEN}@forgejo:3000/<org>/<repo>.git /home/agent/repos/<name>"
```
### 4. Create the project configuration file
The factory uses a TOML file to configure how it manages your project. Create
`projects/<name>.toml` based on the template format:
```toml
# projects/harb.toml
name = "harb"
repo = "johba/harb"
forge_url = "http://localhost:3000"
repo_root = "/home/agent/repos/harb"
primary_branch = "master"
[ci]
woodpecker_repo_id = 0
stale_minutes = 60
[services]
containers = ["ponder"]
[monitoring]
check_prs = true
check_dev_agent = true
check_pipeline_stall = true
# [mirrors]
# github = "git@github.com:johba/harb.git"
# codeberg = "git@codeberg.org:johba/harb.git"
```
**Key fields:**
- `name`: Project identifier (used for file names, logs, etc.)
- `repo`: The source repo in `owner/name` format
- `forge_url`: URL of your local Forgejo instance
- `repo_root`: Where the agent clones the repo
- `primary_branch`: Default branch name (e.g., `main` or `master`)
- `woodpecker_repo_id`: Set to `0` initially; auto-populated on first CI run
- `containers`: List of Docker containers the factory should manage
- `mirrors`: Optional external forge URLs for backup/sync
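For ad-hoc scripts it can help to read a single value, such as `primary_branch`, out of the project file instead of hard-coding it. The sketch below is an assumption-laden shortcut, not the factory's own loader (`lib/load-project.sh` handles that): `toml_get` is a hypothetical name, it ignores `[section]` headers, and it only handles double-quoted string values.

```bash
# toml_get <file> <key>: print the first double-quoted string value for <key>.
# Hypothetical helper; suitable only for keys that are unique in the file.
toml_get() {
  sed -n "s/^$2[[:space:]]*=[[:space:]]*\"\([^\"]*\)\".*/\1/p" "$1" | head -n 1
}
```

Usage: `branch=$(toml_get projects/<name>.toml primary_branch)`, after which `origin/$branch` can replace a hard-coded `origin/main` in manual git commands.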
### 5. Mirrors (optional)
[ASK] Should the factory mirror to external forges? If yes, which?
- GitHub: need repo URL and SSH key added to GitHub account
- Codeberg: need repo URL and SSH key added to Codeberg account
Show the user their public key:
```bash
cat ~/.ssh/id_ed25519.pub
```
Test SSH access:
```bash
ssh -T git@github.com 2>&1; ssh -T git@codeberg.org 2>&1
```
If SSH host keys are missing: `ssh-keyscan github.com codeberg.org >> ~/.ssh/known_hosts 2>/dev/null`
Edit `projects/<name>.toml` to uncomment and configure mirrors:
```toml
[mirrors]
github = "git@github.com:Org/repo.git"
codeberg = "git@codeberg.org:user/repo.git"
```
Test with a manual push:
```bash
source .env && source lib/env.sh && export PROJECT_TOML=projects/<name>.toml && source lib/load-project.sh && source lib/mirrors.sh && mirror_push
```
### 6. Seed the backlog
[ASK] What should the factory work on first? Brainstorm with the user.
Help them create issues on the local Forgejo. Each issue needs:
- A clear title prefixed with `fix:`, `feat:`, or `chore:`
- A body describing what to change, which files, and any constraints
- The `backlog` label (so the dev-agent picks it up)
```bash
source .env
BACKLOG_ID=$(curl -sf "http://localhost:3000/api/v1/repos/<org>/<repo>/labels" \
-H "Authorization: token $FORGE_TOKEN" | jq -r '.[] | select(.name=="backlog") | .id')
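# Build the JSON with jq so quotes and newlines in the title or body are escaped
jq -n --arg title "<title>" --arg body "<body>" --argjson label "$BACKLOG_ID" \
'{title: $title, body: $body, labels: [$label]}' \
| curl -sf -X POST "http://localhost:3000/api/v1/repos/<org>/<repo>/issues" \
-H "Authorization: token $FORGE_TOKEN" \
-H "Content-Type: application/json" \
-d @-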
```
For issues with dependencies, add `Depends-on: #N` in the body — the dev-agent checks
these before starting.
Use labels:
- `backlog` — ready for the dev-agent
- `blocked` — parked, not for the factory
- No label — tracked but not for autonomous work
### 7. Watch it work
The dev-agent polls every 5 minutes. Trigger manually to see it immediately:
```bash
source .env
export PROJECT_TOML=projects/<name>.toml
docker exec -u agent disinto-agents-1 bash -c "cd /home/agent/disinto && bash dev/dev-poll.sh projects/<name>.toml"
```
Then monitor:
```bash
# Watch the agent work
docker exec disinto-agents-1 tail -f /home/agent/data/logs/dev/dev-agent.log
# Check for Claude running
docker exec disinto-agents-1 bash -c "for f in /proc/[0-9]*/cmdline; do cmd=\$(tr '\0' ' ' < \$f 2>/dev/null); echo \$cmd | grep -q 'claude.*-p' && echo 'Claude is running'; done"
```