refactor: make all scripts multi-project via env vars

Replace hardcoded harb references across the entire codebase:
- HARB_REPO_ROOT → PROJECT_REPO_ROOT (with deprecated alias)
- Derive PROJECT_NAME from CODEBERG_REPO slug
- Add PRIMARY_BRANCH (master/main), WOODPECKER_REPO_ID env vars
- Parameterize worktree prefixes, docker container names, branch refs
- Genericize agent prompts (gardener, factory supervisor)
- Update best-practices docs to use $-vars, prefix harb lessons

All project-specific values now flow from .env → lib/env.sh → scripts.
Backward-compatible: existing harb setups work without .env changes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
parent f16df6c53e
commit 90ef03a304
16 changed files with 117 additions and 116 deletions
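
The `.env` → `lib/env.sh` → scripts flow described in the commit message could look roughly like the sketch below. It is not the actual `lib/env.sh`: the helper name `derive_project_name` and every default value are assumptions, chosen to mirror the values that were hardcoded before this commit.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the derivation described in the commit message.
# Not the real lib/env.sh; all defaults below are illustrative assumptions.

# Derive a project name from an "owner/repo" slug by keeping the last component.
derive_project_name() {
  local slug="$1"
  printf '%s\n' "${slug##*/}"
}

# Assumed default slug; a real .env would normally set CODEBERG_REPO.
CODEBERG_REPO="${CODEBERG_REPO:-johba/harb}"
PROJECT_NAME="${PROJECT_NAME:-$(derive_project_name "$CODEBERG_REPO")}"

# Prefer the new variable, then the deprecated alias, then an assumed default path.
PROJECT_REPO_ROOT="${PROJECT_REPO_ROOT:-${HARB_REPO_ROOT:-/home/debian/${PROJECT_NAME}}}"
HARB_REPO_ROOT="$PROJECT_REPO_ROOT"   # deprecated alias kept for older scripts

# Assumed defaults matching the previously hardcoded values.
PRIMARY_BRANCH="${PRIMARY_BRANCH:-master}"
WOODPECKER_REPO_ID="${WOODPECKER_REPO_ID:-2}"

export CODEBERG_REPO PROJECT_NAME PROJECT_REPO_ROOT HARB_REPO_ROOT
export PRIMARY_BRANCH WOODPECKER_REPO_ID
```

With defaults of this kind, an existing setup that sets none of the new variables would still resolve to the old paths, which is one way the stated backward compatibility could be achieved.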
@@ -4,15 +4,15 @@
 - Woodpecker CI at localhost:8000 (Docker backend)
 - Postgres DB: use `wpdb` helper from env.sh
 - Woodpecker API: use `woodpecker_api` helper from env.sh
-- CI images: pre-built at `registry.niovi.voyage/harb/*:latest`
+- Example (harb): CI images pre-built at `registry.niovi.voyage/harb/*:latest`
 
 ## Safe Fixes
 - Retrigger CI: push empty commit to PR branch
 ```bash
-cd /tmp/harb-worktree-<issue> && git commit --allow-empty -m "ci: retrigger" --no-verify && git push origin <branch> --force
+cd /tmp/${PROJECT_NAME}-worktree-<issue> && git commit --allow-empty -m "ci: retrigger" --no-verify && git push origin <branch> --force
 ```
 - Restart woodpecker-agent: `sudo systemctl restart woodpecker-agent`
-- View pipeline status: `wpdb -c "SELECT number, status FROM pipelines WHERE repo_id=2 ORDER BY number DESC LIMIT 5;"`
+- View pipeline status: `wpdb -c "SELECT number, status FROM pipelines WHERE repo_id=$WOODPECKER_REPO_ID ORDER BY number DESC LIMIT 5;"`
 - View failed steps: `bash ${FACTORY_ROOT}/lib/ci-debug.sh failures <pipeline-number>`
 - View step logs: `bash ${FACTORY_ROOT}/lib/ci-debug.sh logs <pipeline-number> <step-name>`
 
@@ -23,7 +23,7 @@
 ## Known Issues
 - Codeberg rate-limits SSH clones. `git` step fails with exit 128. Retrigger usually works.
 - `log_entries` table grows fast (was 5.6GB once). Truncate periodically.
-- Running CI + harb stack = 14+ containers on 8GB. Memory pressure is real.
+- Example (harb): Running CI + harb stack = 14+ containers on 8GB. Memory pressure is real.
 - CI images take hours to rebuild. Never run `docker system prune -a`.
 
 ## Lessons Learned
@@ -31,10 +31,10 @@
 - Exit code 137 = OOM kill. Check memory, kill stale processes, retrigger.
 - `node-quality` step fails on eslint/typescript errors — these need code fixes, not CI fixes.
 
-### FEE_DEST address must match DeployLocal.sol
+### Example (harb): FEE_DEST address must match DeployLocal.sol
 When DeployLocal.sol changes the feeDest address, bootstrap-common.sh must also be updated.
 Current feeDest = keccak256('harb.local.feeDest') = 0x8A9145E1Ea4C4d7FB08cF1011c8ac1F0e10F9383.
 Symptom: bootstrap step exits 1 after 'Granting recenter access to deployer' with no error — setRecenterAccess reverts because wrong address is impersonated.
 
-### keccak-derived FEE_DEST requires anvil_setBalance before impersonation
+### Example (harb): keccak-derived FEE_DEST requires anvil_setBalance before impersonation
 When FEE_DEST is a keccak-derived address (e.g. keccak256('harb.local.feeDest')), it has zero ETH balance. Any function that calls `anvil_impersonateAccount` then `cast send --from $FEE_DEST --unlocked` will fail silently (output redirected to LOG_FILE) but exit 1 due to gas deduction failure. Fix: add `cast rpc anvil_setBalance "$FEE_DEST" "0xDE0B6B3A7640000"` before impersonation. Applied in both bootstrap-common.sh and red-team.sh.
@@ -10,8 +10,8 @@ Codeberg rate-limits SSH and HTTPS clones. Symptoms:
 - **Do NOT retrigger** during a rate-limit storm. Wait 10-15 minutes.
 - Check if multiple pipelines failed on `git` step recently:
 ```bash
-wpdb -c "SELECT number, status, to_timestamp(started) FROM pipelines WHERE repo_id=2 AND status='failure' ORDER BY number DESC LIMIT 5;"
-wpdb -c "SELECT s.name, s.exit_code FROM steps s JOIN pipelines p ON s.pipeline_id=p.id WHERE p.number=<N> AND p.repo_id=2 AND s.state='failure';"
+wpdb -c "SELECT number, status, to_timestamp(started) FROM pipelines WHERE repo_id=$WOODPECKER_REPO_ID AND status='failure' ORDER BY number DESC LIMIT 5;"
+wpdb -c "SELECT s.name, s.exit_code FROM steps s JOIN pipelines p ON s.pipeline_id=p.id WHERE p.number=<N> AND p.repo_id=$WOODPECKER_REPO_ID AND s.state='failure';"
 ```
 - If multiple `git` failures with exit 128 in the last 15 min → it's rate limiting. Wait.
 - Only retrigger after 15+ minutes of no CI activity.
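
The rate-limit rule in the hunk above (several `git` failures with exit 128 means wait; retrigger only after 15+ quiet minutes) can be sketched as a small decision helper. The function name, its two arguments, and the choice of "two or more" as the meaning of "multiple" are assumptions for illustration; the counts themselves would come from `wpdb` queries like the ones shown in the doc.

```bash
#!/usr/bin/env bash
# Hypothetical helper; not part of the commit.

RATE_LIMIT_MIN_FAILURES=2     # assumption: "multiple" means two or more
RATE_LIMIT_QUIET_MINUTES=15   # from the doc: retrigger only after 15+ quiet minutes

# retrigger_decision <recent_git_exit128_failures> <minutes_since_last_ci_activity>
# Prints "wait" during a suspected rate-limit storm, otherwise "retrigger".
retrigger_decision() {
  local git_failures="$1" idle_minutes="$2"
  if [ "$git_failures" -ge "$RATE_LIMIT_MIN_FAILURES" ] \
     && [ "$idle_minutes" -lt "$RATE_LIMIT_QUIET_MINUTES" ]; then
    echo "wait"
  else
    echo "retrigger"
  fi
}
```

Keeping the thresholds in named variables makes the assumed cut-offs easy to adjust if "multiple" should mean something stricter.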
@@ -5,13 +5,13 @@
 - `dev-agent.sh` uses `claude -p` for implementation, runs in git worktree
 - Lock file: `/tmp/dev-agent.lock` (contains PID)
 - Status file: `/tmp/dev-agent-status`
-- Worktrees: `/tmp/harb-worktree-<issue-number>/`
+- Worktrees: `/tmp/${PROJECT_NAME}-worktree-<issue-number>/`
 
 ## Safe Fixes
 - Remove stale lock: `rm -f /tmp/dev-agent.lock` (only if PID is dead)
 - Kill stuck agent: `kill <pid>` then clean lock
 - Restart on derailed PR: `bash ${FACTORY_ROOT}/dev/dev-agent.sh <issue-number> &`
-- Clean worktree: `cd /home/debian/harb && git worktree remove /tmp/harb-worktree-<N> --force`
+- Clean worktree: `cd $PROJECT_REPO_ROOT && git worktree remove /tmp/${PROJECT_NAME}-worktree-<N> --force`
 - Remove `in-progress` label if agent died without cleanup:
 ```bash
 codeberg_api DELETE "/issues/<N>/labels/in-progress"
@@ -38,7 +38,7 @@
 
 ## Dependency Resolution
 
-**Trust closed state.** If a dependency issue is closed, the code is on master. Period.
+**Trust closed state.** If a dependency issue is closed, the code is on the primary branch. Period.
 
 DO NOT try to find the specific PR that closed an issue. This is over-engineering that causes false negatives:
 - Codeberg shares issue/PR numbering — no guaranteed relationship
@@ -3,14 +3,14 @@
 ## Safe Fixes
 - Docker cleanup: `sudo docker system prune -f` (keeps images, removes stopped containers + dangling layers)
 - Truncate factory logs >5MB: `truncate -s 0 <file>`
-- Remove stale worktrees: check `/tmp/harb-worktree-*`, only if dev-agent not running on them
+- Remove stale worktrees: check `/tmp/${PROJECT_NAME}-worktree-*`, only if dev-agent not running on them
 - Woodpecker log_entries: `DELETE FROM log_entries WHERE id < (SELECT max(id) - 100000 FROM log_entries);` then `VACUUM;`
-- Node module caches in worktrees: `rm -rf /tmp/harb-worktree-*/node_modules/`
-- Git garbage collection: `cd /home/debian/harb && git gc --prune=now`
+- Node module caches in worktrees: `rm -rf /tmp/${PROJECT_NAME}-worktree-*/node_modules/`
+- Git garbage collection: `cd $PROJECT_REPO_ROOT && git gc --prune=now`
 
 ## Dangerous (escalate)
 - `docker system prune -a --volumes` — deletes ALL images including CI build cache
-- Deleting anything in `/home/debian/harb/` that's tracked by git
+- Deleting anything in `$PROJECT_REPO_ROOT/` that's tracked by git
 - Truncating Woodpecker DB tables other than log_entries
 
 ## Known Disk Hogs
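
The "Truncate factory logs >5MB" fix in the hunk above is given per file. A batch version might look like the sketch below. The function name, the `*.log` filename pattern, and the idea of passing a log directory are assumptions; the doc does not say where factory logs live or how they are named.

```bash
#!/usr/bin/env bash
# Hypothetical helper; not part of the commit.

# truncate_large_logs <log-dir> [find-size-expression]
# Empties (does not delete) every *.log file above the threshold, so any
# process that still holds the file open keeps a valid handle.
truncate_large_logs() {
  local log_dir="$1" threshold="${2:-+5M}"
  find "$log_dir" -type f -name '*.log' -size "$threshold" \
    -exec truncate -s 0 {} +
}
```

Truncating in place rather than removing matches the `truncate -s 0 <file>` fix listed in the doc.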
@@ -1,39 +1,39 @@
 # Git Best Practices
 
 ## Environment
-- Repo: `/home/debian/harb`, remote: `codeberg.org/johba/harb`
-- Branch: `master` (protected — no direct push, PRs only)
-- Worktrees: `/tmp/harb-worktree-<issue>/`
+- Repo: `$PROJECT_REPO_ROOT`, remote: `$PROJECT_REMOTE`
+- Branch: `$PRIMARY_BRANCH` (protected — no direct push, PRs only)
+- Worktrees: `/tmp/${PROJECT_NAME}-worktree-<issue>/`
 
 ## Safe Fixes
-- Abort stale rebase: `cd /home/debian/harb && git rebase --abort`
-- Switch to master: `git checkout master`
+- Abort stale rebase: `cd $PROJECT_REPO_ROOT && git rebase --abort`
+- Switch to $PRIMARY_BRANCH: `git checkout $PRIMARY_BRANCH`
 - Prune worktrees: `git worktree prune`
 - Reset dirty state: `git checkout -- .` (only uncommitted changes)
-- Fetch latest: `git fetch origin master`
+- Fetch latest: `git fetch origin $PRIMARY_BRANCH`
 
 ## Auto-fixable by Supervisor
-- **Merge conflict on approved PR**: rebase onto master and force-push
+- **Merge conflict on approved PR**: rebase onto $PRIMARY_BRANCH and force-push
 ```bash
-cd /tmp/harb-worktree-<issue> || git worktree add /tmp/harb-worktree-<issue> <branch>
-cd /tmp/harb-worktree-<issue>
-git fetch origin master
-git rebase origin/master
+cd /tmp/${PROJECT_NAME}-worktree-<issue> || git worktree add /tmp/${PROJECT_NAME}-worktree-<issue> <branch>
+cd /tmp/${PROJECT_NAME}-worktree-<issue>
+git fetch origin $PRIMARY_BRANCH
+git rebase origin/$PRIMARY_BRANCH
 # If conflict is trivial (NatSpec, comments): resolve and continue
 # If conflict is code logic: escalate to Clawy
 git push origin <branch> --force
 ```
-- **Stale rebase**: `git rebase --abort && git checkout master`
-- **Wrong branch**: `git checkout master`
+- **Stale rebase**: `git rebase --abort && git checkout $PRIMARY_BRANCH`
+- **Wrong branch**: `git checkout $PRIMARY_BRANCH`
 
 ## Dangerous (escalate)
 - `git reset --hard` on any branch with unpushed work
 - Deleting remote branches
 - Force-pushing to any branch
-- Anything on the master branch directly
+- Anything on the $PRIMARY_BRANCH branch directly
 
 ## Known Issues
-- Main repo MUST be on master at all times. Dev work happens in worktrees.
+- Main repo MUST be on $PRIMARY_BRANCH at all times. Dev work happens in worktrees.
 - Stale rebases (detached HEAD) break all worktree creation — silent factory stall.
 - `git worktree add` fails if target directory exists (even empty). Remove first.
 - Many old branches exist locally (100+). Normal — don't bulk-delete.
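
The "Stale rebase" fix in the hunk above assumes something has already noticed that a rebase is in progress. One way to detect that without running git is to check for the state directories git creates during a rebase (`rebase-merge` or `rebase-apply` under the git directory). The helper name is hypothetical, and the sketch assumes the main repository, where `.git` is a directory rather than the file used by linked worktrees.

```bash
#!/usr/bin/env bash
# Hypothetical helper; not part of the commit.

# rebase_in_progress <repo-dir>
# Succeeds (exit 0) when git's rebase state directories exist in <repo-dir>/.git.
# Assumes a main repository checkout, where .git is a directory.
rebase_in_progress() {
  local git_dir="$1/.git"
  [ -d "$git_dir/rebase-merge" ] || [ -d "$git_dir/rebase-apply" ]
}

# Possible use, following the "Stale rebase" fix from the doc:
#   if rebase_in_progress "$PROJECT_REPO_ROOT"; then
#     (cd "$PROJECT_REPO_ROOT" && git rebase --abort && git checkout "$PRIMARY_BRANCH")
#   fi
```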
@@ -7,12 +7,12 @@
 ## Safe Fixes (no permission needed)
 - Kill stale `claude` processes (>3h old): `pgrep -f "claude" --older 10800 | xargs kill`
 - Drop filesystem caches: `sync && echo 3 | sudo tee /proc/sys/vm/drop_caches`
-- Restart bloated Anvil: `sudo docker restart harb-anvil-1` (grows to 12GB+ over hours)
+- Restart bloated Anvil: `sudo docker restart ${PROJECT_NAME}-anvil-1` (grows to 12GB+ over hours)
 - Kill orphan node processes from dead worktrees
 
 ## Dangerous (escalate)
 - `docker system prune -a --volumes` — kills CI images, hours to rebuild
-- Stopping harb stack containers — breaks dev environment
+- Stopping project stack containers — breaks dev environment
 - OOM that survives all safe fixes — needs human decision on what to kill
 
 ## Known Memory Hogs
@@ -26,4 +26,4 @@
 ## Lessons Learned
 - After killing processes, always `sync && echo 3 | sudo tee /proc/sys/vm/drop_caches`
 - Swap doesn't drain from dropping caches alone — it's actual paged-out process memory
-- Running CI + full harb stack = 14+ containers on 8GB. Only one pipeline at a time.
+- Running CI + full project stack = 14+ containers on 8GB. Only one pipeline at a time.