From 326ebb867a9f9d469ac2c5dc64c3e889a21326a3 Mon Sep 17 00:00:00 2001
From: architect-bot <architect-bot@disinto.local>
Date: Wed, 8 Apr 2026 20:04:29 +0000
Subject: [PATCH 1/2] sprint: add website-observability-wire-up.md

---
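The Risks section of the sprint file below requires that evidence be committed through the Forgejo API with only a token. That step could be sketched as follows. This is a minimal Python sketch, not the actual implementation (`collect-engagement.sh` is a shell script). The helper name `build_evidence_commit`, the `main` branch, the commit message, and the example host are hypothetical. The endpoint path assumes Forgejo's Gitea-compatible repository contents API and should be checked against the instance's API documentation.

```python
import base64
import json


def build_evidence_commit(base_url, owner, repo, date, evidence, token, branch="main"):
    """Build (url, headers, body) for creating one dated evidence file.

    Assumption: the Forgejo instance exposes the Gitea-compatible endpoint
    POST /api/v1/repos/{owner}/{repo}/contents/{filepath}.
    """
    path = f"evidence/engagement/{date}.json"
    url = f"{base_url.rstrip('/')}/api/v1/repos/{owner}/{repo}/contents/{path}"
    payload = json.dumps(evidence, indent=2, sort_keys=True).encode("utf-8")
    body = {
        "branch": branch,  # hypothetical default branch name
        "message": f"evidence: engagement {date}",  # hypothetical commit message
        "content": base64.b64encode(payload).decode("ascii"),  # file body, base64-encoded
    }
    headers = {
        "Authorization": f"token {token}",
        "Content-Type": "application/json",
    }
    return url, headers, body
```

The returned triple can be sent with any HTTP client, for example `urllib.request.Request(url, data=json.dumps(body).encode(), headers=headers, method="POST")`. Nothing in the request depends on SSH access to the ops repo, which is the property the risk item asks for.
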
 sprints/website-observability-wire-up.md | 51 ++++++++++++++++++++++++
 1 file changed, 51 insertions(+)
 create mode 100644 sprints/website-observability-wire-up.md

diff --git a/sprints/website-observability-wire-up.md b/sprints/website-observability-wire-up.md
new file mode 100644
index 0000000..484cd51
--- /dev/null
+++ b/sprints/website-observability-wire-up.md
@@ -0,0 +1,51 @@
+# Sprint: website observability wire-up
+
+## Vision issues
+- #426 — Website observability — make disinto.ai an observable addressable
+
+## What this enables
+After this sprint, the factory can read engagement data from disinto.ai. The planner
+will have daily evidence files in `evidence/engagement/` to answer: how many people
+visited, where they came from, and which pages they viewed. Observables will exist.
+The prerequisites for two milestones unlock:
+- Adoption: "Landing page communicating value proposition" (evidence confirms it works)
+- Ship (Fold 2): "Engagement measurement baked into deploy pipelines" (verify-observable step becomes non-advisory)
+
+## What exists today
+
+The design and most of the code are already done:
+
+- `site/collect-engagement.sh` — Complete. Parses Caddy's JSON access log, computes unique visitors / page views / top referrers, writes dated JSON evidence to `$OPS_REPO_ROOT/evidence/engagement/YYYY-MM-DD.json`.
+- `formulas/run-publish-site.toml` verify-observable step — Complete. Checks Caddy log activity, script presence, and evidence recency on every deploy.
+- `docs/EVIDENCE-ARCHITECTURE.md` — Documents the full pipeline: Caddy logs → collect-engagement → evidence/engagement/
+- `docs/OBSERVABLE-DEPLOY.md` — Documents the observable deploy pattern.
+- `docker/edge/Dockerfile` — Caddy edge container exists for the factory.
+
+What's missing is the wiring: three pieces that connect the parts together.
+
+## Complexity
+
+Files touched: 3-4 (lib/generators.sh, lib/ops-setup.sh, disinto-factory/SKILL.md, possibly site/collect-engagement.sh)
+Subsystems: Caddy config generation, ops repo initialization, cron registration
+Sub-issues: 3
+Gluecode ratio: ~85% gluecode, ~15% greenfield
+
+This is almost entirely wiring existing infrastructure — not new capability.
+
+## Risks
+
+- Production vs. factory Caddy: The factory's generated Caddyfile is for the local dev proxy (Forgejo + Woodpecker + staging). The production disinto.ai runs on a separate host Caddy. Sub-issue 1 must address both: update the template (for new deployments) and document the manual step for the existing production host.
+- Ops repo path on production host: collect-engagement.sh writes to $OPS_REPO_ROOT/evidence/engagement/. The script must commit via Forgejo API (no SSH required) so it can run on any host with a token.
+- Log format mismatch: collect-engagement.sh assumes Caddy's structured JSON format. If the production Caddy uses default Combined Log Format, the script will produce empty reports silently. Fix: add a format-detection guard.
+
+## Cost — new infra to maintain
+
+- One cron entry on the production host (55 23 * * *)
+- Daily evidence files in ops repo (evidence/engagement/*.json) — ~1KB/file, negligible storage
+- No new services, containers, or agents
+
+## Recommendation
+
+Worth it. Three small implementation issues close a design that is already fully specified and mostly implemented. The return: a live observable, daily engagement data in the ops repo, and the planner can begin evidence-driven gap analysis on the landing page. This is the cheapest path from Fold 2 to Fold 3.
+
+Do NOT add Plausible/Umami at this stage. The Caddy-log approach is cookie-free, zero-dependency, and already coded. Richer analytics can layer on top once basic observability is confirmed working.
-- 
2.49.1


From 75f06bd31366009818f1fe2e0255553c09fcfd23 Mon Sep 17 00:00:00 2001
From: architect-bot <architect-bot@disinto.local>
Date: Sun, 12 Apr 2026 00:58:08 +0000
Subject: [PATCH 2/2] sprint: add design forks for
 website-observability-wire-up

---
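The format-detection guard proposed as sub-issue 2 in the diff below could look roughly like this. It is a minimal Python sketch for illustration only; the real guard would live in the shell script `collect-engagement.sh`. The function names are hypothetical, and the check assumes that Caddy's JSON access-log entries carry a nested `request` object, while Combined Log Format lines begin with `host ident user [timestamp] "request"`.

```python
import json
import re

# Combined Log Format lines start with: host ident user [timestamp] "request ...
_CLF_PREFIX = re.compile(r'^\S+ \S+ \S+ \[[^\]]+\] "')


def detect_log_format(first_line: str) -> str:
    """Classify one access-log line as 'caddy-json', 'combined', or 'unknown'."""
    line = first_line.strip()
    if line.startswith("{"):
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            return "unknown"
        # Assumption: Caddy JSON access entries contain a nested "request" object.
        if isinstance(entry, dict) and isinstance(entry.get("request"), dict):
            return "caddy-json"
        return "unknown"
    if _CLF_PREFIX.match(line):
        return "combined"
    return "unknown"


def require_caddy_json(first_line: str) -> None:
    """Fail loudly, with an actionable message, if the log is not Caddy JSON."""
    fmt = detect_log_format(first_line)
    if fmt != "caddy-json":
        raise ValueError(
            f"access log looks like '{fmt}', expected Caddy JSON; "
            "switch the Caddy log encoder to JSON before collecting evidence"
        )
```

Checking only the first line keeps the guard cheap. It turns the silent empty-report failure described under Risks into an immediate error before any evidence file is written.
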
 sprints/website-observability-wire-up.md | 84 +++++++++++++++++++-----
 1 file changed, 69 insertions(+), 15 deletions(-)

diff --git a/sprints/website-observability-wire-up.md b/sprints/website-observability-wire-up.md
index 484cd51..a7087d3 100644
--- a/sprints/website-observability-wire-up.md
+++ b/sprints/website-observability-wire-up.md
@@ -21,31 +21,85 @@ The design and most of the code are already done:
 - `docs/OBSERVABLE-DEPLOY.md` — Documents the observable deploy pattern.
 - `docker/edge/Dockerfile` — Caddy edge container exists for the factory.
 
-What's missing is the wiring: three pieces that connect the parts together.
+What's missing is the wiring: connecting the factory to the remote Caddy host where
+disinto.ai runs.
 
 ## Complexity
 
-Files touched: 3-4 (lib/generators.sh, lib/ops-setup.sh, disinto-factory/SKILL.md, possibly site/collect-engagement.sh)
-Subsystems: Caddy config generation, ops repo initialization, cron registration
-Sub-issues: 3
-Gluecode ratio: ~85% gluecode, ~15% greenfield
-
-This is almost entirely wiring existing infrastructure — not new capability.
+Files touched: 4-6 depending on fork choices
+Subsystems: vault dispatch, SSH access, log collection, ops repo evidence
+Sub-issues: 3-4
+Gluecode ratio: ~80% gluecode, ~20% greenfield (the container/formula is new)
 
 ## Risks
 
-- Production vs. factory Caddy: The factory's generated Caddyfile is for the local dev proxy (Forgejo + Woodpecker + staging). The production disinto.ai runs on a separate host Caddy. Sub-issue 1 must address both: update the template (for new deployments) and document the manual step for the existing production host.
-- Ops repo path on production host: collect-engagement.sh writes to $OPS_REPO_ROOT/evidence/engagement/. The script must commit via Forgejo API (no SSH required) so it can run on any host with a token.
-- Log format mismatch: collect-engagement.sh assumes Caddy's structured JSON format. If the production Caddy uses default Combined Log Format, the script will produce empty reports silently. Fix: add a format-detection guard.
+- Production Caddy is on a separate host from the factory — all collection must go over SSH.
+- Log format mismatch: collect-engagement.sh assumes Caddy's structured JSON format. If the production Caddy is configured to emit Combined Log Format instead, the script will silently produce empty reports.
+- SSH key scope: the key used for collection should be purpose-limited to avoid granting broad access.
+- Evidence commit: the container must commit evidence to the ops repo via Forgejo API (not git push over SSH) to keep the secret surface minimal.
 
 ## Cost — new infra to maintain
 
-- One cron entry on the production host (55 23 * * *)
-- Daily evidence files in ops repo (evidence/engagement/*.json) — ~1KB/file, negligible storage
-- No new services, containers, or agents
+- One vault action formula (`formulas/collect-engagement.toml` or extension of existing formula)
+- One SSH key on the Caddy host's authorized_keys
+- Daily evidence files in ops repo (evidence/engagement/*.json) — ~1KB/file
+- No new long-running services or agents
 
 ## Recommendation
 
-Worth it. Three small implementation issues close a design that is already fully specified and mostly implemented. The return: a live observable, daily engagement data in the ops repo, and the planner can begin evidence-driven gap analysis on the landing page. This is the cheapest path from Fold 2 to Fold 3.
+Worth it. The human-directed architecture (dispatchable container with SSH) is
+cleaner than running cron directly on the production host — it keeps all factory
+logic inside the factory and treats the Caddy host as a dumb data source.
 
-Do NOT add Plausible/Umami at this stage. The Caddy-log approach is cookie-free, zero-dependency, and already coded. Richer analytics can layer on top once basic observability is confirmed working.
+## Design forks
+
+### Q1: What does the container fetch from the Caddy host?
+
+*Context: `collect-engagement.sh` already parses Caddy JSON access logs into evidence JSON. The question is where that parsing happens.*
+
+- **(A) Fetch raw log, process locally**: Container SSHs in, copies today's access log segment (e.g. `rsync` or `scp`), then runs `collect-engagement.sh` inside the container against the local copy. The Caddy host needs zero disinto code installed.
+- **(B) Run script remotely**: Container SSHs in and executes `collect-engagement.sh` on the Caddy host. Requires the script (or a minimal version) to be deployed on the host. Output piped back.
+- **(C) Pull Caddy metrics API**: Container opens an SSH tunnel to Caddy's admin API (port 2019) and pulls request metrics directly. No log file parsing — but Caddy's metrics endpoint is less rich than full access log analysis (no referrers, no per-page breakdown).
+
+*Architect recommends (A): keeps the Caddy host dumb, all logic in the factory container, and `collect-engagement.sh` runs unchanged.*
+
+### Q2: How is the daily collection triggered?
+
+*Context: Other factory agents (supervisor, planner, gardener) run on direct cron via `*-run.sh`. Vault actions go through the PR approval workflow. The collection is a recurring low-risk read-only operation.*
+
+- **(A) Direct cron in edge container**: Add a cron entry to the edge container entrypoint, like supervisor/planner. Simple, no vault overhead. Runs daily without approval.
+- **(B) Vault action with auto-dispatch**: Create a recurring vault action TOML. If PR #12 (blast-radius tiers) lands, low-tier actions auto-execute. If not, each run needs admin approval — too heavy for daily collection.
+- **(C) Supervisor-triggered**: Supervisor detects stale evidence (no `evidence/engagement/` file for today) and dispatches collection. Reactive rather than scheduled.
+
+*Architect recommends (A): this is a read-only data collection, same risk profile as supervisor health checks. Vault gating a daily log fetch adds friction without security benefit.*
+
+### Q3: How is the SSH key provisioned for the collection container?
+
+*Context: The vault dispatcher supports `mounts: ["ssh"]` which mounts `~/.ssh` read-only into the container. The edge container already has SSH infrastructure for reverse tunnels (`disinto-tunnel` user, `autossh`).*
+
+- **(A) Factory operator's SSH keys** (`mounts: ["ssh"]`): Reuse the existing SSH keys on the factory host. Simple, but grants the container access to all hosts the operator can reach.
+- **(B) Dedicated purpose-limited key**: Generate a new SSH keypair and install the public key on the Caddy host with a `command=` restriction in `authorized_keys` (allowing only `cat /var/log/caddy/access.log` or similar). The private key is stored as `CADDY_SSH_KEY` in `.env.vault.enc`. Minimal blast radius.
+- **(C) Edge tunnel reverse path**: Instead of the factory SSHing *out* to Caddy, have the Caddy host push logs *in* via the existing reverse tunnel infrastructure. Inverts the connection direction but requires a log-push agent on the Caddy host.
+
+*Architect recommends (B): purpose-limited key with `command=` restriction on the Caddy host gives least-privilege access. The factory never gets a shell on production.*
+
+## Proposed sub-issues
+
+### If Q1=A, Q2=A, Q3=B (recommended path):
+
+1. **`collect-engagement` formula + container script**: Create `formulas/collect-engagement.toml` with steps: SSH into Caddy host using dedicated key → fetch today's access log segment → run `collect-engagement.sh` on local copy → commit evidence JSON to ops repo via Forgejo API. Add cron entry to edge container.
+2. **Format-detection guard in `collect-engagement.sh`**: Add a check at script start that verifies the input file is Caddy JSON format (not Combined Log Format). Fail loudly with actionable error if format is wrong.
+3. **`evidence/engagement/` directory + ops-setup wiring**: Ensure `lib/ops-setup.sh` creates the evidence directory. Register the engagement cron schedule in factory setup docs.
+4. **Document Caddy host SSH setup**: Rent-a-human instructions to generate the keypair, install the public key with a `command=` restriction on the Caddy host, and add the private key to `.env.vault.enc`.
+
+### If Q1=B (remote execution):
+Sub-issues 2-4 remain the same. Sub-issue 1 changes: container SSHs in and runs the script remotely, requiring script deployment on the Caddy host (additional manual step).
+
+### If Q2=B (vault-gated):
+Sub-issue 1 changes: instead of cron, create a vault action TOML template and document the daily dispatch. Depends on PR #12 (blast-radius tiers) for auto-approval.
+
+### If Q3=A (operator SSH keys):
+Sub-issue 4 is simplified (no dedicated key generation), but blast radius is wider.
+
+### If Q3=C (reverse tunnel):
+Sub-issue 1 changes significantly: instead of SSH-out, configure a log-push cron on the Caddy host that sends logs through the reverse tunnel. More infrastructure on the Caddy host side.
-- 
2.49.1