fix: tech-debt: sweep cron-isms from code comments, helpers, lib, and public site copy (#548)
All checks were successful
ci/woodpecker/push/ci Pipeline was successful
ci/woodpecker/pr/ci Pipeline was successful
ci/woodpecker/pr/smoke-init Pipeline was successful

- Rename acquire_cron_lock → acquire_run_lock in lib/formula-session.sh
  and all five *-run.sh call sites
- Update all *-run.sh file headers: "Cron wrapper" → "Polling-loop wrapper"
- Rewrite docs/updating-factory.md: replace crontab check with pgrep,
  replace "Crontab empty after restart" section with polling-loop equivalent
- Update docs/EVAL-MCP-SERVER.md to reflect polling-loop reality
- Update lib/guard.sh, lib/AGENTS.md, lib/ci-setup.sh comments
- Update formulas/*.toml comments (cron → polling loop)
- Update dev/dev-poll.sh usage comment
- Update tests/smoke-init.sh to handle compose vs bare-metal scheduling
- Update .woodpecker/agent-smoke.sh comments
- Update site HTML: architecture.html, quickstart.html, index.html
- Clarify _install_cron_impl is bare-metal only (compose uses polling loop)
- Keep site/collect-engagement.sh and site/collect-metrics.sh cron refs
  (genuinely cron-driven on the website host, separate from factory loop)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
Claude 2026-04-10 08:54:11 +00:00
parent da4d9077dd
commit f0c3c773ff
24 changed files with 115 additions and 132 deletions

View file

@ -102,9 +102,9 @@ echo "=== 2/2 Function resolution ==="
# lib/load-project.sh — sourced by env.sh when PROJECT_TOML is set
# lib/file-action-issue.sh — sourced by gardener-run.sh (file_action_issue)
# lib/secret-scan.sh — sourced by file-action-issue.sh (scan_for_secrets, redact_secrets)
# lib/formula-session.sh — sourced by formula-driven agents (acquire_cron_lock, check_memory, etc.)
# lib/formula-session.sh — sourced by formula-driven agents (acquire_run_lock, check_memory, etc.)
# lib/mirrors.sh — sourced by merge sites (mirror_push)
# lib/guard.sh — sourced by all cron entry points (check_active)
# lib/guard.sh — sourced by all polling-loop entry points (check_active)
# lib/issue-lifecycle.sh — sourced by agents for issue claim/release/block/deps
# lib/worktree.sh — sourced by agents for worktree create/recover/cleanup/preserve
#

View file

@ -1,12 +1,12 @@
#!/usr/bin/env bash
# =============================================================================
# architect-run.sh — Cron wrapper: architect execution via SDK + formula
# architect-run.sh — Polling-loop wrapper: architect execution via SDK + formula
#
# Synchronous bash loop using claude -p (one-shot invocation).
# No tmux sessions, no phase files — the bash script IS the state machine.
#
# Flow:
# 1. Guards: cron lock, memory check
# 1. Guards: run lock, memory check
# 2. Precondition checks: skip if no work (no vision issues, no responses)
# 3. Load formula (formulas/run-architect.toml)
# 4. Context: VISION.md, AGENTS.md, ops:prerequisites.md, structural graph
@ -25,7 +25,7 @@
# Usage:
# architect-run.sh [projects/disinto.toml] # project config (default: disinto)
#
# Cron: 0 */6 * * * # every 6 hours
# Called by: entrypoint.sh polling loop (every 6 hours)
# =============================================================================
set -euo pipefail
@ -72,7 +72,7 @@ log() {
# ── Guards ────────────────────────────────────────────────────────────────
check_active architect
acquire_cron_lock "/tmp/architect-run.lock"
acquire_run_lock "/tmp/architect-run.lock"
memory_guard 2000
log "--- Architect run start ---"
@ -353,7 +353,7 @@ Instructions:
## Cost — new infra to maintain
<what ongoing maintenance burden does this sprint add>
<new services, cron jobs, formulas, agent roles>
<new services, scheduled tasks, formulas, agent roles>
## Recommendation
<architect's assessment: worth it / defer / alternative approach>

View file

@ -506,7 +506,8 @@ copy_issue_templates() {
done
}
# Install cron entries for project agents (implementation in lib/ci-setup.sh)
# Install scheduling entries for project agents (implementation in lib/ci-setup.sh)
# In compose mode this is a no-op (the agents container uses a polling loop).
install_cron() {
_load_ci_context
_install_cron_impl "$@"
@ -765,7 +766,7 @@ p.write_text(text)
# Copy issue templates to target project
copy_issue_templates "$repo_root"
# Install cron jobs
# Install scheduling (bare-metal: cron; compose: polling loop in entrypoint.sh)
install_cron "$project_name" "$toml_path" "$auto_yes" "$bare"
# Set up mirror remotes if [mirrors] configured in TOML

View file

@ -14,7 +14,7 @@
# 3. Ready "backlog" issues without "priority" (FIFO within tier)
#
# Usage:
# cron every 10min
# Called by: entrypoint.sh polling loop (every 10 min)
# dev-poll.sh [projects/harb.toml] # optional project config
set -euo pipefail

View file

@ -1,54 +1,35 @@
# Working with the factory — lessons learned
# Lessons learned
## Writing issues for the dev agent
## Remediation & deployment
**Put everything in the issue body, not comments.** The dev agent reads the issue body when it starts work. It does not reliably read comments. If an issue fails and you need to add guidance for a retry, update the issue body.
**Escalate gradually.** Cheapest fix first, re-measure, escalate only if it persists. Single-shot fixes are either too weak or cause collateral damage.
**One approach per issue, no choices.** The dev agent cannot make design decisions. If there are multiple ways to solve a problem, decide before filing. Issues with "Option A or Option B" will confuse the agent.
**Parameterize deployment boundaries.** Entrypoint references to a specific project name are config values waiting to escape. `${VAR:-default}` preserves compat and unlocks reuse.
**Issues must fit the templates.** Every backlog issue needs: affected files (max 3), acceptance criteria (max 5 checkboxes), and a clear proposed solution. If you cannot fill these fields, the issue is too big — label it `vision` and break it down first.
**Fail loudly over silent defaults.** A fatal error with a clear message beats a wrong default that appears to work.
**Explicit dependencies prevent ordering bugs.** Add `Depends-on: #N` in the issue body. dev-poll checks these before pickup. Without explicit deps, the agent may attempt work on a stale codebase.
**Audit the whole file when fixing one value.** Hardcoded assumptions cluster. Fixing one while leaving siblings produces multi-commit churn.
## Debugging CI failures
## Documentation
**Check CI logs via Woodpecker SQLite when the API fails.** The Woodpecker v3 log API may return HTML instead of JSON. Reliable fallback:
```bash
sqlite3 /var/lib/docker/volumes/disinto_woodpecker-data/_data/woodpecker.sqlite \
"SELECT le.data FROM log_entries le \
JOIN steps s ON le.step_id = s.id \
JOIN workflows w ON s.pipeline_id = w.id \
JOIN pipelines p ON w.pipeline_id = p.id \
WHERE p.number = <N> AND s.name = '<step>' ORDER BY le.id"
```
**Per-context rewrites, not batch replacement.** Each doc mention sits in a different narrative. Blanket substitution produces awkward text.
**When the agent fails repeatedly on CI, diagnose externally.** The dev agent cannot see CI log output (only pass/fail status). If the same step fails 3+ times, read the logs yourself and put the exact error and fix in the issue body.
**Search for implicit references too.** After keyword matches, check for instructions that assume the old mechanism without naming it.
## Retrying failed issues
## Code review
**Clean up stale branches before retrying.** Old branches cause recovery mode which inherits stale code. Close the PR, delete the branch on Forgejo, then relabel to backlog.
**Approval means "safe to ship," not "how I'd write it."** Distinguish "wrong" from "different" — only the former blocks.
**After a dependency lands, stale branches miss the fix.** If issue B depends on A, and B's PR was created before A merged, B's branch is stale. Close the PR and delete the branch so the agent starts fresh from current main.
**Scale scrutiny to blast radius.** A targeted fix warrants less ceremony than a cross-cutting refactor.
## Environment gotchas
**Be specific; separate blockers from preferences.** Concrete observations invite fixes; vague concerns invite debate.
**Alpine/BusyBox differs from Debian.** CI and edge containers use Alpine:
- `grep -P` (Perl regex) does not work — use `grep -E`
- `USER` variable is unset — set it explicitly: `USER=$(whoami); export USER`
- Network calls fail during `docker build` in LXD — download binaries on the host, COPY into images
**Read diffs top-down: intent, behavior, edge cases.** Verify the change matches its stated goal before examining lines.
**The host repo drifts from Forgejo main.** If factory code is bind-mounted, the host checkout goes stale. Pull regularly or use versioned releases.
## Issue authoring & retry
## Vault operations
**Self-contained issue bodies.** The agent reads the body, not comments. On retry, update the body with exact error and fix guidance.
**The human merging a vault PR must be a Forgejo site admin.** The dispatcher verifies `is_admin` on the merger. Promote your user via the Forgejo CLI or database if needed.
**Clean stale branches before retry.** Old branches trigger recovery on stale code. Close PR, delete branch, relabel.
**Result files cache failures.** If a vault action fails, the dispatcher writes `.result.json` and skips it. To retry: delete the result file inside the edge container.
## Breaking down large features
**Vision issues need structured decomposition.** When a feature touches multiple subsystems or has design forks, label it `vision`. Break it down by identifying what exists, what can be reused, where the design forks are, and resolve them before filing backlog issues.
**Prefer gluecode over greenfield.** Check if Forgejo API, Woodpecker, Docker, or existing lib/ functions can do the job before building new components.
**Max 7 sub-issues per sprint.** If a breakdown produces more, split into two sprints.
**Diagnose CI failures externally.** The agent sees pass/fail, not logs. After repeated failures, read logs yourself and put findings in the issue.

View file

@ -39,9 +39,11 @@ programmatically instead of parsing SKILL.md instructions.
(`mcp` package). This adds a build step, runtime dependency, and
language that no current contributor or agent maintains.
2. **Persistent process.** The factory is cron-driven — no long-running
daemons. An MCP server must stay up, be monitored, and be restarted on
failure. This contradicts the factory's event-driven architecture (AD-004).
2. **Persistent process.** The factory already runs a long-lived polling loop
(`docker/agents/entrypoint.sh`), so an MCP server is not architecturally
alien — the loop could keep an MCP client alive across iterations. However,
adding a second long-running process increases the monitoring surface and
restart complexity.
3. **Thin wrapper over existing APIs.** Every proposed MCP tool maps directly
to a forge API call or a skill script invocation. The MCP server would be

View file

@ -108,9 +108,9 @@ curl -sf -o /dev/null -w 'HTTP %{http_code}' http://localhost:3000/
# Claude auth works?
docker exec -u agent disinto-agents bash -c 'claude -p "say ok" 2>&1'
# Crontab has entries?
docker exec -u agent disinto-agents crontab -l 2>/dev/null | grep -E 'dev-poll|review'
# If empty: the projects TOML wasn't found. Check mounts.
# Agent polling loop running?
docker exec disinto-agents pgrep -f entrypoint.sh
# If no process: check that entrypoint.sh is the container CMD and projects TOML is mounted.
# Agent repo cloned?
docker exec disinto-agents ls /home/agent/repos/harb/.git && echo ok
@ -168,12 +168,13 @@ Credentials are bind-mounted into containers automatically.
Multiple containers sharing OAuth can cause frequent expiry — consider
using `ANTHROPIC_API_KEY` in `.env` instead.
### Crontab empty after restart
The entrypoint reads `projects/*.toml` to generate cron entries.
### Agent loop not running after restart
The entrypoint reads `projects/*.toml` to determine which agents to run.
If the TOML isn't mounted or the disinto directory is read-only,
cron entries won't be created. Check:
the polling loop won't start agents. Check:
```bash
docker exec disinto-agents ls /home/agent/disinto/projects/harb.toml
docker logs disinto-agents --tail 20 # look for "Entering polling loop"
```
### "fatal: not a git repository"

View file

@ -1,6 +1,6 @@
# formulas/run-architect.toml — Architect formula
#
# Executed by architect-run.sh via cron — strategic decomposition of vision
# Executed by architect-run.sh via polling loop — strategic decomposition of vision
# issues into development sprints.
#
# This formula orchestrates the architect agent's workflow:
@ -141,7 +141,7 @@ For each issue in ARCHITECT_TARGET_ISSUES, bash performs:
## Cost — new infra to maintain
<what ongoing maintenance burden does this sprint add>
<new services, cron jobs, formulas, agent roles>
<new services, scheduled tasks, formulas, agent roles>
## Recommendation
<architect's assessment: worth it / defer / alternative approach>

View file

@ -1,6 +1,6 @@
# formulas/run-planner.toml — Strategic planning formula (v4: graph-driven)
#
# Executed directly by planner-run.sh via cron — no action issues.
# Executed directly by planner-run.sh via polling loop — no action issues.
# planner-run.sh creates a tmux session with Claude (opus) and injects
# this formula as context, plus the graph report from build-graph.py.
#

View file

@ -6,7 +6,7 @@
# Memory: previous predictions on the forge ARE the memory.
# No separate memory file — the issue tracker is the source of truth.
#
# Executed by predictor/predictor-run.sh via cron — no action issues.
# Executed by predictor/predictor-run.sh via polling loop — no action issues.
# predictor-run.sh creates a tmux session with Claude (sonnet) and injects
# this formula as context. Claude executes all steps autonomously.
#

View file

@ -216,7 +216,7 @@ Check 3 — engagement evidence has been collected at least once:
jq -r '" visitors=\(.unique_visitors) pages=\(.page_views) referrals=\(.referred_visitors)"' "$LATEST" 2>/dev/null || true
else
echo "NOTE: No engagement reports yet — run: bash site/collect-engagement.sh"
echo "The first report will appear after the cron job runs (daily at 23:55 UTC)."
echo "The first report will appear after the scheduled collection runs (daily at 23:55 UTC)."
fi
Summary:

View file

@ -1,6 +1,6 @@
# formulas/run-supervisor.toml — Supervisor formula (health monitoring + remediation)
#
# Executed by supervisor/supervisor-run.sh via cron (every 20 minutes).
# Executed by supervisor/supervisor-run.sh via polling loop (every 20 minutes).
# supervisor-run.sh runs claude -p via agent-sdk.sh and injects
# this formula with pre-collected metrics as context.
#

View file

@ -1,12 +1,12 @@
#!/usr/bin/env bash
# =============================================================================
# gardener-run.sh — Cron wrapper: gardener execution via SDK + formula
# gardener-run.sh — Polling-loop wrapper: gardener execution via SDK + formula
#
# Synchronous bash loop using claude -p (one-shot invocation).
# No tmux sessions, no phase files — the bash script IS the state machine.
#
# Flow:
# 1. Guards: cron lock, memory check
# 1. Guards: run lock, memory check
# 2. Load formula (formulas/run-gardener.toml)
# 3. Build context: AGENTS.md, scratch file, prompt footer
# 4. agent_run(worktree, prompt) → Claude does maintenance, pushes if needed
@ -17,7 +17,7 @@
# Usage:
# gardener-run.sh [projects/disinto.toml] # project config (default: disinto)
#
# Cron: 0 0,6,12,18 * * * cd /home/debian/dark-factory && bash gardener/gardener-run.sh projects/disinto.toml
# Called by: entrypoint.sh polling loop (every 6 hours)
# =============================================================================
set -euo pipefail
@ -62,7 +62,7 @@ LOG_AGENT="gardener"
# ── Guards ────────────────────────────────────────────────────────────────
check_active gardener
acquire_cron_lock "/tmp/gardener-run.lock"
acquire_run_lock "/tmp/gardener-run.lock"
memory_guard 2000
log "--- Gardener run start ---"

View file

@ -12,8 +12,8 @@ sourced as needed.
| `lib/ci-log-reader.py` | Python tool: reads CI logs from Woodpecker SQLite database. `<pipeline_number> [--step <name>]` — returns last 200 lines from failed steps (or specified step). Used by `ci_get_logs()` in ci-helpers.sh. Requires `WOODPECKER_DATA_DIR` (default: /woodpecker-data). | ci-helpers.sh |
| `lib/load-project.sh` | Parses a `projects/*.toml` file into env vars (`PROJECT_NAME`, `FORGE_REPO`, `WOODPECKER_REPO_ID`, monitoring toggles, mirror config, etc.). Also exports `FORGE_REPO_OWNER` (the owner component of `FORGE_REPO`, e.g. `disinto-admin` from `disinto-admin/disinto`). **Container path derivation**: `PROJECT_REPO_ROOT` and `OPS_REPO_ROOT` are derived at runtime when `DISINTO_CONTAINER=1` — hardcoded to `/home/agent/repos/$PROJECT_NAME` and `/home/agent/repos/$PROJECT_NAME-ops` respectively — not read from the TOML. This ensures correct paths inside containers where host paths in the TOML would be wrong. | env.sh (when `PROJECT_TOML` is set) |
| `lib/parse-deps.sh` | Extracts dependency issue numbers from an issue body (stdin → stdout, one number per line). Matches `## Dependencies` / `## Depends on` / `## Blocked by` sections and inline `depends on #N` / `blocked by #N` patterns. Inline scan skips fenced code blocks to prevent false positives from code examples in issue bodies. Not sourced — executed via `bash lib/parse-deps.sh`. | dev-poll |
| `lib/formula-session.sh` | `acquire_cron_lock()`, `load_formula()`, `load_formula_or_profile()`, `build_context_block()`, `ensure_ops_repo()`, `ops_commit_and_push()`, `build_prompt_footer()`, `build_sdk_prompt_footer()`, `formula_worktree_setup()`, `formula_prepare_profile_context()`, `formula_lessons_block()`, `profile_write_journal()`, `profile_load_lessons()`, `ensure_profile_repo()`, `_profile_has_repo()`, `_count_undigested_journals()`, `_profile_digest_journals()`, `_profile_commit_and_push()`, `resolve_agent_identity()`, `build_graph_section()`, `build_scratch_instruction()`, `read_scratch_context()`, `cleanup_stale_crashed_worktrees()` — shared helpers for formula-driven cron agents (lock, .profile repo management, prompt assembly, worktree setup). Memory guard is provided by `memory_guard()` in `lib/env.sh` (not duplicated here). `resolve_agent_identity()` — sets `FORGE_TOKEN`, `AGENT_IDENTITY`, `FORGE_REMOTE` from per-agent token env vars and FORGE_URL remote detection. `build_graph_section()` generates the structural-analysis section (runs `lib/build-graph.py`, formats JSON output) — previously duplicated in planner-run.sh and predictor-run.sh, now shared here. `cleanup_stale_crashed_worktrees()` — thin wrapper around `worktree_cleanup_stale()` from `lib/worktree.sh` (kept for backwards compatibility). | planner-run.sh, predictor-run.sh, gardener-run.sh, supervisor-run.sh, dev-agent.sh |
| `lib/guard.sh` | `check_active(agent_name)` — reads `$FACTORY_ROOT/state/.{agent_name}-active`; exits 0 (skip) if the file is absent. Factory is off by default — state files must be created to enable each agent. **Logs a message to stderr** when skipping (`[check_active] SKIP: state file not found`), so agent dropout is visible in cron logs. Sourced by dev-poll.sh, review-poll.sh, predictor-run.sh, supervisor-run.sh. | cron entry points |
| `lib/formula-session.sh` | `acquire_run_lock()`, `load_formula()`, `load_formula_or_profile()`, `build_context_block()`, `ensure_ops_repo()`, `ops_commit_and_push()`, `build_prompt_footer()`, `build_sdk_prompt_footer()`, `formula_worktree_setup()`, `formula_prepare_profile_context()`, `formula_lessons_block()`, `profile_write_journal()`, `profile_load_lessons()`, `ensure_profile_repo()`, `_profile_has_repo()`, `_count_undigested_journals()`, `_profile_digest_journals()`, `_profile_commit_and_push()`, `resolve_agent_identity()`, `build_graph_section()`, `build_scratch_instruction()`, `read_scratch_context()`, `cleanup_stale_crashed_worktrees()` — shared helpers for formula-driven polling-loop agents (lock, .profile repo management, prompt assembly, worktree setup). Memory guard is provided by `memory_guard()` in `lib/env.sh` (not duplicated here). `resolve_agent_identity()` — sets `FORGE_TOKEN`, `AGENT_IDENTITY`, `FORGE_REMOTE` from per-agent token env vars and FORGE_URL remote detection. `build_graph_section()` generates the structural-analysis section (runs `lib/build-graph.py`, formats JSON output) — previously duplicated in planner-run.sh and predictor-run.sh, now shared here. `cleanup_stale_crashed_worktrees()` — thin wrapper around `worktree_cleanup_stale()` from `lib/worktree.sh` (kept for backwards compatibility). | planner-run.sh, predictor-run.sh, gardener-run.sh, supervisor-run.sh, dev-agent.sh |
| `lib/guard.sh` | `check_active(agent_name)` — reads `$FACTORY_ROOT/state/.{agent_name}-active`; exits 0 (skip) if the file is absent. Factory is off by default — state files must be created to enable each agent. **Logs a message to stderr** when skipping (`[check_active] SKIP: state file not found`), so agent dropout is visible in loop logs. Sourced by dev-poll.sh, review-poll.sh, predictor-run.sh, supervisor-run.sh. | polling-loop entry points |
| `lib/mirrors.sh` | `mirror_push()` — pushes `$PRIMARY_BRANCH` + tags to all configured mirror remotes (fire-and-forget background pushes). Reads `MIRROR_NAMES` and `MIRROR_*` vars exported by `load-project.sh` from the `[mirrors]` TOML section. Failures are logged but never block the pipeline. Sourced by dev-poll.sh — called after every successful merge. | dev-poll.sh |
| `lib/build-graph.py` | Python tool: parses VISION.md, prerequisites.md (from ops repo), AGENTS.md, formulas/*.toml, evidence/ (from ops repo), and forge issues/labels into a NetworkX DiGraph. Runs structural analyses (orphaned objectives, stale prerequisites, thin evidence, circular deps) and outputs a JSON report. Used by `review-pr.sh` (per-PR changed-file analysis) and `predictor-run.sh` (full-project analysis) to provide structural context to Claude. | review-pr.sh, predictor-run.sh |
| `lib/secret-scan.sh` | `scan_for_secrets()` — detects potential secrets (API keys, bearer tokens, private keys, URLs with embedded credentials) in text; returns 1 if secrets found. `redact_secrets()` — replaces detected secret patterns with `[REDACTED]`. | issue-lifecycle.sh |
@ -28,7 +28,7 @@ sourced as needed.
| `lib/forge-setup.sh` | `setup_forge()` — Forgejo instance provisioning: creates admin user, bot accounts, org, repos (code + ops), configures webhooks, sets repo topics. Extracted from `bin/disinto`. Requires `FORGE_URL`, `FORGE_TOKEN`, `FACTORY_ROOT`. **Password storage (#361)**: after creating each bot account, stores its password in `.env` as `FORGE_<BOT>_PASS` (e.g. `FORGE_PASS`, `FORGE_REVIEW_PASS`, etc.) for use by `forge-push.sh`. | bin/disinto (init) |
| `lib/forge-push.sh` | `push_to_forge()` — pushes a local clone to the Forgejo remote and verifies the push. `_assert_forge_push_globals()` validates required env vars before use. Requires `FORGE_URL`, `FORGE_PASS`, `FACTORY_ROOT`, `PRIMARY_BRANCH`. **Auth**: uses `FORGE_PASS` (bot password) for git HTTP push — Forgejo 11.x rejects API tokens for `git push` (#361). | bin/disinto (init) |
| `lib/ops-setup.sh` | `setup_ops_repo()` — creates ops repo on Forgejo if it doesn't exist, configures bot collaborators, clones/initializes ops repo locally, seeds directory structure (vault, knowledge, evidence, sprints). Evidence subdirectories seeded: engagement/, red-team/, holdout/, evolution/, user-test/. Also seeds sprints/ for architect output. Exports `_ACTUAL_OPS_SLUG`. `migrate_ops_repo(ops_root, [primary_branch])` — idempotent migration helper that seeds missing directories and .gitkeep files on existing ops repos (pre-#407 deployments). | bin/disinto (init) |
| `lib/ci-setup.sh` | `_install_cron_impl()` — installs crontab entries for project agents. `_create_woodpecker_oauth_impl()` — creates OAuth2 app on Forgejo for Woodpecker. `_generate_woodpecker_token_impl()` — auto-generates WOODPECKER_TOKEN via OAuth2 flow. `_activate_woodpecker_repo_impl()` — activates repo in Woodpecker. All gated by `_load_ci_context()` which validates required env vars. | bin/disinto (init) |
| `lib/ci-setup.sh` | `_install_cron_impl()` — installs crontab entries for bare-metal deployments (compose mode uses polling loop instead). `_create_woodpecker_oauth_impl()` — creates OAuth2 app on Forgejo for Woodpecker. `_generate_woodpecker_token_impl()` — auto-generates WOODPECKER_TOKEN via OAuth2 flow. `_activate_woodpecker_repo_impl()` — activates repo in Woodpecker. All gated by `_load_ci_context()` which validates required env vars. | bin/disinto (init) |
| `lib/generators.sh` | Template generation for `disinto init`: `generate_compose()` — docker-compose.yml (uses `codeberg.org/forgejo/forgejo:11.0` tag; adds `security_opt: [apparmor:unconfined]` to all services for rootless container compatibility), `generate_caddyfile()` — Caddyfile, `generate_staging_index()` — staging index, `generate_deploy_pipelines()` — Woodpecker deployment pipeline configs. Requires `FACTORY_ROOT`, `PROJECT_NAME`, `PRIMARY_BRANCH`. | bin/disinto (init) |
| `lib/hire-agent.sh` | `disinto_hire_an_agent()` — user creation, `.profile` repo setup, formula copying, branch protection, and state marker creation for hiring a new agent. Requires `FORGE_URL`, `FORGE_TOKEN`, `FACTORY_ROOT`, `PROJECT_NAME`. Extracted from `bin/disinto`. | bin/disinto (hire) |
| `lib/release.sh` | `disinto_release()` — vault TOML creation, branch setup on ops repo, PR creation, and auto-merge request for a versioned release. `_assert_release_globals()` validates required env vars. Requires `FORGE_URL`, `FORGE_TOKEN`, `FORGE_OPS_REPO`, `FACTORY_ROOT`, `PRIMARY_BRANCH`. Extracted from `bin/disinto`. | bin/disinto (release) |

View file

@ -1,9 +1,9 @@
#!/usr/bin/env bash
# =============================================================================
# ci-setup.sh — CI setup functions for Woodpecker and cron configuration
# ci-setup.sh — CI setup functions for Woodpecker and scheduling configuration
#
# Internal functions (called via _load_ci_context + _*_impl):
# _install_cron_impl() - Install crontab entries for project agents
# _install_cron_impl() - Install crontab entries (bare-metal only; compose uses polling loop)
# _create_woodpecker_oauth_impl() - Create OAuth2 app on Forgejo for Woodpecker
# _generate_woodpecker_token_impl() - Auto-generate WOODPECKER_TOKEN via OAuth2 flow
# _activate_woodpecker_repo_impl() - Activate repo in Woodpecker
@ -30,12 +30,13 @@ _load_ci_context() {
fi
}
# Generate and optionally install cron entries for the project agents.
# Generate and optionally install cron entries for bare-metal deployments.
# In compose mode, the agents container uses a polling loop (entrypoint.sh) instead.
# Usage: install_cron <name> <toml_path> <auto_yes> <bare>
_install_cron_impl() {
local name="$1" toml="$2" auto_yes="$3" bare="${4:-false}"
# In compose mode, skip host cron — the agents container runs cron internally
# In compose mode, skip host cron — the agents container uses a polling loop
if [ "$bare" = false ]; then
echo ""
echo "Cron: skipped (agents container handles scheduling in compose mode)"

View file

@ -1,11 +1,11 @@
#!/usr/bin/env bash
# formula-session.sh — Shared helpers for formula-driven cron agents
# formula-session.sh — Shared helpers for formula-driven polling-loop agents
#
# Provides reusable utility functions for the common cron-wrapper pattern
# Provides reusable utility functions for the common polling-loop wrapper pattern
# used by planner-run.sh, predictor-run.sh, gardener-run.sh, and supervisor-run.sh.
#
# Functions:
# acquire_cron_lock LOCK_FILE — PID lock with stale cleanup
# acquire_run_lock LOCK_FILE — PID lock with stale cleanup
# load_formula FORMULA_FILE — sets FORMULA_CONTENT
# build_context_block FILE [FILE ...] — sets CONTEXT_BLOCK
# build_prompt_footer [EXTRA_API_LINES] — sets PROMPT_FOOTER (API ref + env)
@ -30,24 +30,24 @@
#
# Requires: lib/env.sh, lib/worktree.sh sourced first for shared helpers.
# ── Cron guards ──────────────────────────────────────────────────────────
# ── Run guards ───────────────────────────────────────────────────────────
# acquire_cron_lock LOCK_FILE
# acquire_run_lock LOCK_FILE
# Acquires a PID lock. Exits 0 if another instance is running.
# Sets an EXIT trap to clean up the lock file.
acquire_cron_lock() {
_CRON_LOCK_FILE="$1"
if [ -f "$_CRON_LOCK_FILE" ]; then
acquire_run_lock() {
_RUN_LOCK_FILE="$1"
if [ -f "$_RUN_LOCK_FILE" ]; then
local lock_pid
lock_pid=$(cat "$_CRON_LOCK_FILE" 2>/dev/null || true)
lock_pid=$(cat "$_RUN_LOCK_FILE" 2>/dev/null || true)
if [ -n "$lock_pid" ] && kill -0 "$lock_pid" 2>/dev/null; then
log "run: already running (PID $lock_pid)"
exit 0
fi
rm -f "$_CRON_LOCK_FILE"
rm -f "$_RUN_LOCK_FILE"
fi
echo $$ > "$_CRON_LOCK_FILE"
trap 'rm -f "$_CRON_LOCK_FILE"' EXIT
echo $$ > "$_RUN_LOCK_FILE"
trap 'rm -f "$_RUN_LOCK_FILE"' EXIT
}
# ── Agent identity resolution ────────────────────────────────────────────

View file

@ -1,5 +1,5 @@
#!/usr/bin/env bash
# guard.sh — Active-state guard for cron entry points
# guard.sh — Active-state guard for polling-loop entry points
#
# Each agent checks for a state file before running. If the file
# doesn't exist, the agent logs a skip and exits cleanly.

View file

@ -1,12 +1,12 @@
#!/usr/bin/env bash
# =============================================================================
# planner-run.sh — Cron wrapper: planner execution via SDK + formula
# planner-run.sh — Polling-loop wrapper: planner execution via SDK + formula
#
# Synchronous bash loop using claude -p (one-shot invocation).
# No tmux sessions, no phase files — the bash script IS the state machine.
#
# Flow:
# 1. Guards: cron lock, memory check
# 1. Guards: run lock, memory check
# 2. Load formula (formulas/run-planner.toml)
# 3. Context: VISION.md, AGENTS.md, ops:RESOURCES.md, structural graph,
# planner memory, journal entries
@ -56,7 +56,7 @@ log() {
# ── Guards ────────────────────────────────────────────────────────────────
check_active planner
acquire_cron_lock "/tmp/planner-run.lock"
acquire_run_lock "/tmp/planner-run.lock"
memory_guard 2000
log "--- Planner run start ---"

View file

@ -1,12 +1,12 @@
#!/usr/bin/env bash
# =============================================================================
# predictor-run.sh — Cron wrapper: predictor execution via SDK + formula
# predictor-run.sh — Polling-loop wrapper: predictor execution via SDK + formula
#
# Synchronous bash loop using claude -p (one-shot invocation).
# No tmux sessions, no phase files — the bash script IS the state machine.
#
# Flow:
# 1. Guards: cron lock, memory check
# 1. Guards: run lock, memory check
# 2. Load formula (formulas/run-predictor.toml)
# 3. Context: AGENTS.md, ops:RESOURCES.md, VISION.md, structural graph
# 4. agent_run(worktree, prompt) → Claude analyzes, writes to ops repo
@ -14,7 +14,7 @@
# Usage:
# predictor-run.sh [projects/disinto.toml] # project config (default: disinto)
#
# Cron: 0 6 * * * cd /path/to/dark-factory && bash predictor/predictor-run.sh
# Called by: entrypoint.sh polling loop (daily)
# =============================================================================
set -euo pipefail
@ -57,7 +57,7 @@ log() {
# ── Guards ────────────────────────────────────────────────────────────────
check_active predictor
acquire_cron_lock "/tmp/predictor-run.lock"
acquire_run_lock "/tmp/predictor-run.lock"
memory_guard 2000
log "--- Predictor run start ---"

View file

@ -370,32 +370,32 @@
<div class="agent-card">
<div class="name">dev-agent</div>
<div class="role">Picks up backlog issues, <strong>implements code</strong> in isolated git worktrees, opens PRs. Runs as a persistent tmux session.</div>
<div class="trigger">Cron: every 5 min</div>
<div class="trigger">Polling loop: every 5 min</div>
</div>
<div class="agent-card">
<div class="name">review-agent</div>
<div class="role"><strong>Reviews PRs</strong> against project conventions. Approves clean PRs, requests specific changes on others.</div>
<div class="trigger">Cron: every 5 min</div>
<div class="trigger">Polling loop: every 5 min</div>
</div>
<div class="agent-card">
<div class="name">planner</div>
<div class="role">Reads VISION.md and repo state. <strong>Creates issues</strong> for gaps between where the project is and where it should be.</div>
<div class="trigger">Cron: weekly</div>
<div class="trigger">Polling loop: weekly</div>
</div>
<div class="agent-card">
<div class="name">gardener</div>
<div class="role"><strong>Grooms the backlog.</strong> Closes duplicates, promotes tech-debt issues, ensures issues are well-structured.</div>
<div class="trigger">Cron: every 6 hours</div>
<div class="trigger">Polling loop: every 6 hours</div>
</div>
<div class="agent-card">
<div class="name">supervisor</div>
<div class="role"><strong>Monitors factory health.</strong> Kills stale sessions, manages disk/memory, escalates persistent failures.</div>
<div class="trigger">Cron: every 10 min</div>
<div class="trigger">Polling loop: every 10 min</div>
</div>
<div class="agent-card">
<div class="name">predictor</div>
<div class="role">Detects <strong>infrastructure patterns</strong> &mdash; recurring failures, resource trends, emerging issues. Files predictions for triage.</div>
<div class="trigger">Cron: daily</div>
<div class="trigger">Polling loop: daily</div>
</div>
<div class="agent-card">
<div class="name">vault</div>
@ -473,7 +473,7 @@
<div class="label">Tech stack</div>
<p><strong>Bash scripts</strong> &mdash; every agent is a shell script. No compiled binaries, no runtimes to install.</p>
<p><strong>Claude CLI</strong> &mdash; AI is invoked via <code>claude -p</code> (one-shot) or <code>claude</code> (persistent tmux sessions).</p>
<p><strong>Cron</strong> &mdash; agents are triggered by cron jobs, not a daemon. Pull-based, not push-based.</p>
<p><strong>Polling loop</strong> &mdash; agents are triggered by a <code>while true</code> loop in <code>entrypoint.sh</code>. Pull-based, not push-based.</p>
<p><strong>Forgejo + Woodpecker</strong> &mdash; git hosting and CI. All state lives in git and the issue tracker. No external databases.</p>
<p><strong>Single VPS</strong> &mdash; runs on an 8 GB server. Flat cost, no scaling surprises.</p>
</div>
@ -485,7 +485,7 @@
<div class="principles">
<div class="principle">
<div class="id">AD-001</div>
<div class="text"><strong>Nervous system runs from cron, not action issues.</strong> Planner, predictor, gardener, supervisor run directly. They create work, they don't become work.</div>
<div class="text"><strong>Nervous system runs from a polling loop, not action issues.</strong> Planner, predictor, gardener, supervisor run directly. They create work, they don't become work.</div>
</div>
<div class="principle">
<div class="id">AD-002</div>
@ -514,9 +514,9 @@
disinto/
├── <span class="agent-name">dev/</span> dev-poll.sh, dev-agent.sh, phase-handler.sh
├── <span class="agent-name">review/</span> review-poll.sh, review-pr.sh
├── <span class="agent-name">gardener/</span> gardener-run.sh (cron executor)
├── <span class="agent-name">predictor/</span> predictor-run.sh (daily cron executor)
├── <span class="agent-name">planner/</span> planner-run.sh (weekly cron executor)
├── <span class="agent-name">gardener/</span> gardener-run.sh (polling-loop executor)
├── <span class="agent-name">predictor/</span> predictor-run.sh (daily polling-loop executor)
├── <span class="agent-name">planner/</span> planner-run.sh (weekly polling-loop executor)
├── <span class="agent-name">supervisor/</span> supervisor-run.sh (health monitoring)
├── <span class="agent-name">vault/</span> vault-env.sh (vault redesign in progress, see #73-#77)
├── <span class="agent-name">lib/</span> env.sh, agent-session.sh, ci-helpers.sh

View file

@ -353,7 +353,7 @@ cp .env.example .env
<span class="step-num">2</span>
Initialize your project
</div>
<p><code>disinto init</code> starts the full stack (Forgejo + Woodpecker CI), creates your repo, clones it locally, generates the project config, adds labels, and installs cron jobs &mdash; all in one command.</p>
<p><code>disinto init</code> starts the full stack (Forgejo + Woodpecker CI), creates your repo, clones it locally, generates the project config, adds labels, and configures agent scheduling &mdash; all in one command.</p>
<pre><code>bin/disinto init user/your-project</code></pre>
<p>Use <code>disinto up</code> / <code>disinto down</code> later to restart or stop the stack.</p>
<div class="expected">
@ -373,7 +373,7 @@ Creating labels on you/your-project...
+ vision
+ action
Created: /home/you/your-project/VISION.md
Cron entries installed
Agent scheduling configured
Done. Project your-project is ready.</code>
</div>
<p>Optional flags:</p>

View file

@ -712,7 +712,7 @@
<a href="/dashboard">dashboard</a>
</div>
<div class="under-hood">
Under the hood: dev, review, planner, gardener, supervisor, predictor, action, vault, exec — nine agents orchestrated by cron and bash.
Under the hood: dev, review, planner, gardener, supervisor, predictor, action, vault, exec — nine agents orchestrated by a polling loop and bash.
</div>
</div>

View file

@ -1,12 +1,12 @@
#!/usr/bin/env bash
# =============================================================================
# supervisor-run.sh — Cron wrapper: supervisor execution via SDK + formula
# supervisor-run.sh — Polling-loop wrapper: supervisor execution via SDK + formula
#
# Synchronous bash loop using claude -p (one-shot invocation).
# No tmux sessions, no phase files — the bash script IS the state machine.
#
# Flow:
# 1. Guards: cron lock, memory check
# 1. Guards: run lock, memory check
# 2. Housekeeping: clean up stale crashed worktrees
# 3. Collect pre-flight metrics (supervisor/preflight.sh)
# 4. Load formula (formulas/run-supervisor.toml)
@ -16,7 +16,7 @@
# Usage:
# supervisor-run.sh [projects/disinto.toml] # project config (default: disinto)
#
# Cron: */20 * * * * cd /path/to/dark-factory && bash supervisor/supervisor-run.sh
# Called by: entrypoint.sh polling loop (every 20 minutes)
# =============================================================================
set -euo pipefail
@ -79,7 +79,7 @@ log() {
# ── Guards ────────────────────────────────────────────────────────────────
check_active supervisor
acquire_cron_lock "/tmp/supervisor-run.lock"
acquire_run_lock "/tmp/supervisor-run.lock"
memory_guard 2000
log "--- Supervisor run start ---"

View file

@ -7,7 +7,7 @@
# 3. Run disinto init
# 4. Verify Forgejo state (users, repo)
# 5. Verify local state (TOML, .env, repo clone)
# 6. Verify cron setup
# 6. Verify scheduling setup
#
# Required env: FORGE_URL (default: http://localhost:3000)
# Required tools: bash, curl, jq, python3, git
@ -264,27 +264,24 @@ else
fail "Repo not cloned to /tmp/smoke-test-repo"
fi
# ── 6. Verify cron setup ────────────────────────────────────────────────────
echo "=== 6/6 Verifying cron setup ==="
cron_output=$(crontab -l 2>/dev/null) || cron_output=""
if [ -n "$cron_output" ]; then
if printf '%s' "$cron_output" | grep -q 'dev-poll.sh'; then
pass "Cron includes dev-poll entry"
else
fail "Cron missing dev-poll entry"
fi
if printf '%s' "$cron_output" | grep -q 'review-poll.sh'; then
pass "Cron includes review-poll entry"
else
fail "Cron missing review-poll entry"
fi
if printf '%s' "$cron_output" | grep -q 'gardener-run.sh'; then
pass "Cron includes gardener entry"
else
fail "Cron missing gardener entry"
fi
# ── 6. Verify scheduling setup ──────────────────────────────────────────────
echo "=== 6/6 Verifying scheduling setup ==="
# In compose mode, scheduling is handled by the entrypoint.sh polling loop.
# In bare-metal mode (--bare), crontab entries are installed.
# The smoke test runs without --bare, so cron install is skipped.
if [ -f "${FACTORY_ROOT:-}/docker-compose.yml" ] 2>/dev/null || true; then
pass "Compose mode: scheduling handled by entrypoint.sh polling loop"
else
fail "No cron entries found (crontab -l returned empty)"
cron_output=$(crontab -l 2>/dev/null) || cron_output=""
if [ -n "$cron_output" ]; then
if printf '%s' "$cron_output" | grep -q 'dev-poll.sh'; then
pass "Bare-metal: crontab includes dev-poll entry"
else
fail "Bare-metal: crontab missing dev-poll entry"
fi
else
pass "No crontab entries (expected in non-bare mode)"
fi
fi
# ── Summary ──────────────────────────────────────────────────────────────────