fix: [nomad-prep] P3 — add load_secret() abstraction to lib/env.sh (#793)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
parent 1d4e28843e
commit 9dbc43ab23
3 changed files with 225 additions and 1 deletions
@@ -6,7 +6,7 @@ sourced as needed.

| File | What it provides | Sourced by |
|---|---|---|
| `lib/env.sh` | Loads `.env`, sets `FACTORY_ROOT`, exports project config (`FORGE_REPO`, `PROJECT_NAME`, etc.), defines `log()`, `forge_api()`, `forge_api_all()` (paginates all pages; accepts optional second TOKEN parameter, defaults to `$FORGE_TOKEN`; handles invalid/empty JSON responses gracefully — returns empty on parse error instead of crashing), `woodpecker_api()`, `wpdb()`, `memory_guard()` (skips agent if RAM < threshold). Auto-loads project TOML if `PROJECT_TOML` is set. Exports per-agent tokens (`FORGE_PLANNER_TOKEN`, `FORGE_GARDENER_TOKEN`, `FORGE_VAULT_TOKEN`, `FORGE_SUPERVISOR_TOKEN`, `FORGE_PREDICTOR_TOKEN`) — each falls back to `$FORGE_TOKEN` if not set. **Vault-only token guard (AD-006)**: `unset GITHUB_TOKEN CLAWHUB_TOKEN` so agents never hold external-action tokens — only the runner container receives them. **Container note**: when `DISINTO_CONTAINER=1`, `.env` is NOT re-sourced — compose already injects env vars (including `FORGE_URL=http://forgejo:3000`) and re-sourcing would clobber them. **Save/restore scope (#364)**: only `FORGE_URL` is preserved across `.env` re-sourcing (compose injects `http://forgejo:3000`, `.env` has `http://localhost:3000`). `FORGE_TOKEN` is NOT preserved so refreshed tokens in `.env` take effect immediately. **Per-agent token override (#762)**: agent run scripts export `FORGE_TOKEN_OVERRIDE=<agent-specific-token>` BEFORE sourcing `env.sh`; `env.sh` applies this override at lines 98-100, ensuring the correct identity survives any re-sourcing of `env.sh` by nested shells or `claude -p` invocations. **Required env var**: `FORGE_PASS` — bot password for git HTTP push (Forgejo 11.x rejects API tokens for `git push`, #361). **Hard preconditions (#674)**: `USER` and `HOME` must be exported by the entrypoint before sourcing. When `PROJECT_TOML` is set, `PROJECT_REPO_ROOT`, `PRIMARY_BRANCH`, and `OPS_REPO_ROOT` must also be set (by entrypoint or TOML). | Every agent |
| `lib/env.sh` | Loads `.env`, sets `FACTORY_ROOT`, exports project config (`FORGE_REPO`, `PROJECT_NAME`, etc.), defines `log()`, `forge_api()`, `forge_api_all()` (paginates all pages; accepts optional second TOKEN parameter, defaults to `$FORGE_TOKEN`; handles invalid/empty JSON responses gracefully — returns empty on parse error instead of crashing), `woodpecker_api()`, `wpdb()`, `memory_guard()` (skips agent if RAM < threshold), `load_secret()` (secret-source abstraction — see below). Auto-loads project TOML if `PROJECT_TOML` is set. Exports per-agent tokens (`FORGE_PLANNER_TOKEN`, `FORGE_GARDENER_TOKEN`, `FORGE_VAULT_TOKEN`, `FORGE_SUPERVISOR_TOKEN`, `FORGE_PREDICTOR_TOKEN`) — each falls back to `$FORGE_TOKEN` if not set. **Vault-only token guard (AD-006)**: `unset GITHUB_TOKEN CLAWHUB_TOKEN` so agents never hold external-action tokens — only the runner container receives them. **Container note**: when `DISINTO_CONTAINER=1`, `.env` is NOT re-sourced — compose already injects env vars (including `FORGE_URL=http://forgejo:3000`) and re-sourcing would clobber them. **Save/restore scope (#364)**: only `FORGE_URL` is preserved across `.env` re-sourcing (compose injects `http://forgejo:3000`, `.env` has `http://localhost:3000`). `FORGE_TOKEN` is NOT preserved so refreshed tokens in `.env` take effect immediately. **Per-agent token override (#762)**: agent run scripts export `FORGE_TOKEN_OVERRIDE=<agent-specific-token>` BEFORE sourcing `env.sh`; `env.sh` applies this override at lines 98-100, ensuring the correct identity survives any re-sourcing of `env.sh` by nested shells or `claude -p` invocations. **Required env var**: `FORGE_PASS` — bot password for git HTTP push (Forgejo 11.x rejects API tokens for `git push`, #361). **Hard preconditions (#674)**: `USER` and `HOME` must be exported by the entrypoint before sourcing. When `PROJECT_TOML` is set, `PROJECT_REPO_ROOT`, `PRIMARY_BRANCH`, and `OPS_REPO_ROOT` must also be set (by entrypoint or TOML). **`load_secret NAME [DEFAULT]` (#793)**: backend-agnostic secret resolution. Precedence: (1) `/secrets/<NAME>.env` — Nomad-rendered template, (2) current environment — already set by `.env.enc` / compose, (3) `secrets/<NAME>.enc` — age-encrypted per-key file (decrypted on demand, cached in process env), (4) DEFAULT or empty. Consumers call `$(load_secret GITHUB_TOKEN)` instead of `${GITHUB_TOKEN}` — identical behavior whether secrets come from Docker compose injection or Nomad Vault templates. | Every agent |
| `lib/ci-helpers.sh` | `ci_passed()` — returns 0 if CI state is "success" (or no CI configured). `ci_required_for_pr()` — returns 0 if PR has code files (CI required), 1 if non-code only (CI not required). `is_infra_step()` — returns 0 if a single CI step failure matches infra heuristics (clone/git exit 128, any exit 137, log timeout patterns). `classify_pipeline_failure()` — returns "infra \<reason>" if any failed Woodpecker step matches infra heuristics via `is_infra_step()`, else "code". `ensure_priority_label()` — looks up (or creates) the `priority` label and returns its ID; caches in `_PRIORITY_LABEL_ID`. `ci_commit_status <sha>` — queries Woodpecker directly for CI state, falls back to forge commit status API. `ci_pipeline_number <sha>` — returns the Woodpecker pipeline number for a commit, falls back to parsing forge status `target_url`. `ci_promote <repo_id> <pipeline_num> <environment>` — promotes a pipeline to a named Woodpecker environment (vault-gated deployment: vault approves, vault-fire calls this — vault redesign in progress, see #73-#77). `ci_get_logs <pipeline_number> [--step <name>]` — reads CI logs from Woodpecker SQLite database via `lib/ci-log-reader.py`; outputs last 200 lines to stdout. Requires mounted woodpecker-data volume at /woodpecker-data. | dev-poll, review-poll, review-pr |
| `lib/ci-debug.sh` | CLI tool for Woodpecker CI: `list`, `status`, `logs`, `failures` subcommands. Not sourced — run directly. | Humans / dev-agent (tool access) |
| `lib/ci-log-reader.py` | Python tool: reads CI logs from Woodpecker SQLite database. `<pipeline_number> [--step <name>]` — returns last 200 lines from failed steps (or specified step). Used by `ci_get_logs()` in ci-helpers.sh. Requires `WOODPECKER_DATA_DIR` (default: /woodpecker-data). | ci-helpers.sh |
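The precedence chain that the table describes for `load_secret` can be tried in isolation with a small stand-in. The sketch below is an illustration under stated assumptions, not the code from `lib/env.sh`: `load_secret_stub` and `DEMO_TOKEN` are hypothetical names, and only the environment and default steps are modeled so that it runs without Nomad or `age`.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for load_secret(): models only step 2 (environment)
# and step 4 (default) so the example runs without lib/env.sh.
load_secret_stub() {
  local name="$1"
  local default="${2:-}"
  # Indirect expansion reads the variable whose name is stored in $name.
  if [ -n "${!name:-}" ]; then
    printf '%s' "${!name}"
    return 0
  fi
  printf '%s' "$default"
}

# Consumer pattern: resolve at the point of use instead of reading the
# variable directly.
unset DEMO_TOKEN
from_default=$(load_secret_stub DEMO_TOKEN "fallback-value")

export DEMO_TOKEN="from-environment"
from_env=$(load_secret_stub DEMO_TOKEN "fallback-value")

printf 'default: %s\nenv: %s\n' "$from_default" "$from_env"
```

In the real function, the `/secrets/<NAME>.env` lookup runs before the environment check and the `secrets/<NAME>.enc` lookup runs after it.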
62
lib/env.sh
@@ -313,6 +313,68 @@ memory_guard() {
  fi
}

# =============================================================================
# SECRET LOADING ABSTRACTION
# =============================================================================
# load_secret NAME [DEFAULT]
#
# Resolves a secret value using the following precedence:
#   1. /secrets/<NAME>.env — Nomad-rendered template (future)
#   2. Current environment — already set by .env.enc, compose, etc.
#   3. secrets/<NAME>.enc — age-encrypted per-key file (decrypted on demand)
#   4. DEFAULT (or empty)
#
# Prints the resolved value to stdout. Caches age-decrypted values in the
# process environment so subsequent calls are free.
# =============================================================================
load_secret() {
  local name="$1"
  local default="${2:-}"

  # 1. Nomad-rendered template (future: Nomad writes /secrets/<NAME>.env)
  local nomad_path="/secrets/${name}.env"
  if [ -f "$nomad_path" ]; then
    # Source into a subshell to extract just the value
    local _nomad_val
    _nomad_val=$(
      set -a
      # shellcheck source=/dev/null
      source "$nomad_path"
      set +a
      printf '%s' "${!name:-}"
    )
    if [ -n "$_nomad_val" ]; then
      export "$name=$_nomad_val"
      printf '%s' "$_nomad_val"
      return 0
    fi
  fi

  # 2. Already in environment (set by .env.enc, compose injection, etc.)
  if [ -n "${!name:-}" ]; then
    printf '%s' "${!name}"
    return 0
  fi

  # 3. Age-encrypted per-key file: secrets/<NAME>.enc (#777)
  local _age_key="${HOME}/.config/sops/age/keys.txt"
  local _enc_path="${FACTORY_ROOT}/secrets/${name}.enc"
  if [ -f "$_enc_path" ] && [ -f "$_age_key" ] && command -v age &>/dev/null; then
    local _dec_val
    if _dec_val=$(age -d -i "$_age_key" "$_enc_path" 2>/dev/null) && [ -n "$_dec_val" ]; then
      export "$name=$_dec_val"
      printf '%s' "$_dec_val"
      return 0
    fi
  fi

  # 4. Default (or empty)
  if [ -n "$default" ]; then
    printf '%s' "$default"
  fi
  return 0
}

# Source tea helpers (available when tea binary is installed)
if command -v tea &>/dev/null; then
# shellcheck source=tea-helpers.sh
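The header comment of `load_secret` says decrypted values are cached in the process environment. One consequence may be worth illustrating: an `export` made inside `$(...)` happens in a subshell and does not reach the caller, so the cache only takes effect when the function is called directly. The sketch below uses a hypothetical `cache_demo` function and `DEMO_CACHE` variable in place of `load_secret` to show the difference.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in: exports NAME and prints a value, mimicking the
# export-then-print shape of load_secret's age branch.
cache_demo() {
  export "$1=decrypted-value"
  printf '%s' "decrypted-value"
}

unset DEMO_CACHE

# Command substitution runs cache_demo in a subshell: the export is lost.
val=$(cache_demo DEMO_CACHE)
after_substitution="${DEMO_CACHE:-unset}"

# A direct call runs in the current shell: the export persists.
cache_demo DEMO_CACHE >/dev/null
after_direct="${DEMO_CACHE:-unset}"

printf 'value=%s after_substitution=%s after_direct=%s\n' \
  "$val" "$after_substitution" "$after_direct"
```

This matches the smoke test in this commit, which verifies caching with a direct call rather than through `$(...)`.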
162
tests/smoke-load-secret.sh
Normal file
@@ -0,0 +1,162 @@
#!/usr/bin/env bash
# tests/smoke-load-secret.sh — Unit tests for load_secret() precedence chain
#
# Covers the 4 precedence cases:
#   1. /secrets/<NAME>.env (Nomad template)
#   2. Current environment
#   3. secrets/<NAME>.enc (age-encrypted per-key file)
#   4. Default / empty fallback
#
# Required tools: bash, age (for case 3)

set -euo pipefail

FACTORY_ROOT="$(cd "$(dirname "$0")/.." && pwd)"
FAILED=0

fail() { printf 'FAIL: %s\n' "$*" >&2; FAILED=1; }
pass() { printf 'PASS: %s\n' "$*"; }

# Set up a temp workspace and fake HOME so age key paths work
test_dir=$(mktemp -d)
fake_home=$(mktemp -d)
trap 'rm -rf "$test_dir" "$fake_home"' EXIT

# Minimal env for sourcing env.sh's load_secret function without the full boot
# We source the function definition directly to isolate the unit under test.
# shellcheck disable=SC2034
export USER="${USER:-test}"
export HOME="$fake_home"

# Source env.sh to get load_secret (and FACTORY_ROOT)
source "${FACTORY_ROOT}/lib/env.sh"

# ── Case 4: Default / empty fallback ────────────────────────────────────────
echo "=== 1/5 Case 4: default fallback ==="

unset TEST_SECRET_FALLBACK 2>/dev/null || true
val=$(load_secret TEST_SECRET_FALLBACK "my-default")
if [ "$val" = "my-default" ]; then
  pass "load_secret returns default when nothing is set"
else
  fail "Expected 'my-default', got '${val}'"
fi

val=$(load_secret TEST_SECRET_FALLBACK)
if [ -z "$val" ]; then
  pass "load_secret returns empty when no default and nothing set"
else
  fail "Expected empty, got '${val}'"
fi

# ── Case 2: Environment variable already set ────────────────────────────────
echo "=== 2/5 Case 2: environment variable ==="

export TEST_SECRET_ENV="from-environment"
val=$(load_secret TEST_SECRET_ENV "ignored-default")
if [ "$val" = "from-environment" ]; then
  pass "load_secret returns env value over default"
else
  fail "Expected 'from-environment', got '${val}'"
fi
unset TEST_SECRET_ENV

# ── Case 3: Age-encrypted per-key file ──────────────────────────────────────
echo "=== 3/5 Case 3: age-encrypted secret ==="

if command -v age &>/dev/null && command -v age-keygen &>/dev/null; then
  # Generate a test age key
  age_key_dir="${fake_home}/.config/sops/age"
  mkdir -p "$age_key_dir"
  age-keygen -o "${age_key_dir}/keys.txt" 2>/dev/null
  pub_key=$(age-keygen -y "${age_key_dir}/keys.txt")

  # Create encrypted secret
  secrets_dir="${FACTORY_ROOT}/secrets"
  mkdir -p "$secrets_dir"
  printf 'age-test-value' | age -r "$pub_key" -o "${secrets_dir}/TEST_SECRET_AGE.enc"

  unset TEST_SECRET_AGE 2>/dev/null || true
  val=$(load_secret TEST_SECRET_AGE "fallback")
  if [ "$val" = "age-test-value" ]; then
    pass "load_secret decrypts age-encrypted secret"
  else
    fail "Expected 'age-test-value', got '${val}'"
  fi

  # Verify caching: call load_secret directly (not in subshell) so export propagates
  unset TEST_SECRET_AGE 2>/dev/null || true
  load_secret TEST_SECRET_AGE >/dev/null
  if [ "${TEST_SECRET_AGE:-}" = "age-test-value" ]; then
    pass "load_secret caches decrypted value in environment (direct call)"
  else
    fail "Decrypted value not cached in environment"
  fi

  # Clean up test secret
  rm -f "${secrets_dir}/TEST_SECRET_AGE.enc"
  rmdir "$secrets_dir" 2>/dev/null || true
  unset TEST_SECRET_AGE
else
  echo "SKIP: age/age-keygen not found — skipping age decryption test"
fi

# ── Case 1: Nomad template path ────────────────────────────────────────────
echo "=== 4/5 Case 1: Nomad template (/secrets/<NAME>.env) ==="

nomad_dir="/secrets"
if [ -w "$(dirname "$nomad_dir")" ] 2>/dev/null || [ -w "$nomad_dir" ] 2>/dev/null; then
  mkdir -p "$nomad_dir"
  printf 'TEST_SECRET_NOMAD=from-nomad-template\n' > "${nomad_dir}/TEST_SECRET_NOMAD.env"

  # Even with env set, Nomad path takes precedence
  export TEST_SECRET_NOMAD="from-env-should-lose"
  val=$(load_secret TEST_SECRET_NOMAD "default")
  if [ "$val" = "from-nomad-template" ]; then
    pass "load_secret prefers Nomad template over env"
  else
    fail "Expected 'from-nomad-template', got '${val}'"
  fi

  rm -f "${nomad_dir}/TEST_SECRET_NOMAD.env"
  rmdir "$nomad_dir" 2>/dev/null || true
  unset TEST_SECRET_NOMAD
else
  echo "SKIP: /secrets not writable — skipping Nomad template test (needs root or container)"
fi

# ── Precedence: env beats age ────────────────────────────────────────────
echo "=== 5/5 Precedence: env beats age-encrypted ==="

if command -v age &>/dev/null && command -v age-keygen &>/dev/null; then
  age_key_dir="${fake_home}/.config/sops/age"
  mkdir -p "$age_key_dir"
  [ -f "${age_key_dir}/keys.txt" ] || age-keygen -o "${age_key_dir}/keys.txt" 2>/dev/null
  pub_key=$(age-keygen -y "${age_key_dir}/keys.txt")

  secrets_dir="${FACTORY_ROOT}/secrets"
  mkdir -p "$secrets_dir"
  printf 'age-value-should-lose' | age -r "$pub_key" -o "${secrets_dir}/TEST_SECRET_PREC.enc"

  export TEST_SECRET_PREC="env-value-wins"
  val=$(load_secret TEST_SECRET_PREC "default")
  if [ "$val" = "env-value-wins" ]; then
    pass "load_secret prefers env over age-encrypted file"
  else
    fail "Expected 'env-value-wins', got '${val}'"
  fi

  rm -f "${secrets_dir}/TEST_SECRET_PREC.enc"
  rmdir "$secrets_dir" 2>/dev/null || true
  unset TEST_SECRET_PREC
else
  echo "SKIP: age not found — skipping precedence test"
fi

# ── Summary ───────────────────────────────────────────────────────────────
echo ""
if [ "$FAILED" -ne 0 ]; then
  echo "=== SMOKE-LOAD-SECRET TEST FAILED ==="
  exit 1
fi
echo "=== SMOKE-LOAD-SECRET TEST PASSED ==="
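Case 1 is skipped when `/secrets` is not writable. As a rough illustration of what that case exercises, the sketch below reproduces the subshell-sourcing step against a temporary directory. The `DEMO_SECRET` name and the temp directory are assumptions made for the example; `load_secret` itself hard-codes `/secrets` in this commit.

```shell
#!/usr/bin/env bash
# Hypothetical setup: a temp dir stands in for /secrets.
tmp_secrets=$(mktemp -d)
trap 'rm -rf "$tmp_secrets"' EXIT
printf 'DEMO_SECRET=from-template\n' > "${tmp_secrets}/DEMO_SECRET.env"

name="DEMO_SECRET"
unset DEMO_SECRET

# Source the KEY=value file inside a subshell and print only the named
# variable, so nothing else in the file leaks into the current shell.
val=$(
  # shellcheck source=/dev/null
  source "${tmp_secrets}/${name}.env"
  printf '%s' "${!name:-}"
)
leaked="${DEMO_SECRET:-unset}"

printf 'val=%s leaked=%s\n' "$val" "$leaked"
```

Because the file is sourced inside `$(...)`, the variable stays unset in the calling shell until something exports it explicitly.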