bug: disinto-edge hard-fails on missing age key / secrets even when collect-engagement feature is not configured #1038
Reference: disinto-admin/disinto#1038
Problem
`disinto-edge` crashloops on any deployment that hasn't opted into the age-encrypted secret store (#777), because the edge entrypoint treats four secrets as unconditionally required.
Observed on `disinto-dev-box` (container `disinto-edge`, restarting every ~30s), which blocks PR #1033 (edge-subpath smoke test) and any other work that depends on a running edge.
Root cause
`docker/edge/entrypoint-edge.sh:176-205` requires:
- `~/.config/sops/age/keys.txt`
- `/opt/disinto/secrets/` with `.enc` files for `CADDY_SSH_KEY`, `CADDY_SSH_HOST`, `CADDY_SSH_USER`, `CADDY_ACCESS_LOG`.
These four secrets feed exactly one feature: the daily 23:50 UTC `collect-engagement.sh` cron (#745), which SCPs Caddy access logs from a remote production edge host for engagement parsing. On a local factory box (`disinto-dev-box`) or any deployment that hasn't set up a remote edge, this code path has no target to fetch from, yet the missing secrets kill the whole edge container.
Pre-#777 this was soft-degrade. #777 turned it into a hard-fail at startup as part of the broader "single granular secrets store" consolidation. The hard-fail fits the general secrets-required model but is wrong for this specific, optional feature.
Fix
Make the secrets block optional. When the age key or secrets dir is missing, or any of the four `CADDY_` secrets fails to decrypt, log a warning and skip the `collect-engagement` cron loop. Caddy itself does not depend on these secrets and should start normally.
Concrete edit to `docker/edge/entrypoint-edge.sh` (around lines 176-205):
Then gate the collect-engagement cron loop on `EDGE_ENGAGEMENT_READY`:
Acceptance criteria
- `disinto-edge` starts and stays healthy on a box with no age key and no `secrets/*.enc` files
- … `curl -fsS http://localhost:2019/config/` in the container)
- `shellcheck docker/edge/entrypoint-edge.sh` clean
Test
Manual verification on disinto-dev-box:
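As a sketch of what such a check might cover, the gating logic described under Fix can be exercised in isolation before touching the container. The helper name `edge_engagement_ready` is hypothetical and does not come from `docker/edge/entrypoint-edge.sh`; the `<NAME>.enc` file naming is an assumption based on the Root cause description, and decryption itself is not modeled. Only the presence of the age key, the secrets dir, and the four encrypted files is checked.

```shell
#!/usr/bin/env bash
# Sketch only: models the soft-degrade check, not the real entrypoint code.
# edge_engagement_ready is a hypothetical helper; <NAME>.enc naming is assumed.

edge_engagement_ready() {
  local age_key="$1"      # e.g. ~/.config/sops/age/keys.txt
  local secrets_dir="$2"  # e.g. /opt/disinto/secrets
  local name

  if [ ! -f "$age_key" ]; then
    echo "WARN: age key not found at $age_key; skipping collect-engagement" >&2
    return 1
  fi
  if [ ! -d "$secrets_dir" ]; then
    echo "WARN: secrets dir $secrets_dir missing; skipping collect-engagement" >&2
    return 1
  fi
  for name in CADDY_SSH_KEY CADDY_SSH_HOST CADDY_SSH_USER CADDY_ACCESS_LOG; do
    if [ ! -f "$secrets_dir/$name.enc" ]; then
      echo "WARN: $secrets_dir/$name.enc missing; skipping collect-engagement" >&2
      return 1
    fi
  done
  return 0
}

# Exercise the helper against a throwaway directory with no key and no secrets.
tmp="$(mktemp -d)"
if edge_engagement_ready "$tmp/keys.txt" "$tmp/secrets"; then
  EDGE_ENGAGEMENT_READY=1
else
  EDGE_ENGAGEMENT_READY=0
fi
echo "EDGE_ENGAGEMENT_READY=$EDGE_ENGAGEMENT_READY"
rm -rf "$tmp"
```

With an empty temporary directory the helper warns and the flag ends up `0`, which is the path a box without the secret store would take. On the box itself, the equivalent outcome would be the `disinto-edge` container staying up after a restart and the `curl` from the acceptance criteria succeeding inside it.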
Non-goals
- `EDGE_REQUIRE_SECRETS=1` opt-in strict mode (can be a follow-up; production edges currently work because they set up the secrets anyway).
Affected files
- `docker/edge/entrypoint-edge.sh`: soften secrets check, gate cron on ready flag
Related