disinto/formulas/run-gardener.toml
openhands: fix: gardener enforces AGENTS.md size limit + progressive disclosure split (#480)
Update the agents-update step in run-gardener.toml to enforce the ~200-line
size limit on root AGENTS.md. When exceeded, the gardener now splits
per-directory sections into {dir}/AGENTS.md files with watermarks,
replacing verbose sections in root with a summary table of pointers.
Root keeps: overview, directory map, architecture decisions, key conventions.
Per-directory files get: role, trigger, key files, env vars, lifecycle.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 12:25:55 +00:00

# formulas/run-gardener.toml — Gardener housekeeping formula
#
# Defines the gardener's complete run: grooming (Claude session via
# gardener-agent.sh) + blocked-review + AGENTS.md maintenance + final
# commit-and-pr.
#
# No memory, no journal. The gardener does mechanical housekeeping
# based on current state — it doesn't need to remember past runs.
#
# Steps: preflight → grooming → dust-bundling → blocked-review → agents-update → commit-and-pr
name = "run-gardener"
description = "Mechanical housekeeping: grooming, blocked review, docs update"
version = 1
[context]
files = ["AGENTS.md", "VISION.md", "README.md"]
# ─────────────────────────────────────────────────────────────────────
# Step 1: preflight
# ─────────────────────────────────────────────────────────────────────
[[steps]]
id = "preflight"
title = "Pull latest code"
description = """
Set up the working environment for this gardener run.
1. Change to the project repository:
cd "$PROJECT_REPO_ROOT"
2. Pull the latest code:
git fetch origin "$PRIMARY_BRANCH" --quiet
git checkout "$PRIMARY_BRANCH" --quiet
git pull --ff-only origin "$PRIMARY_BRANCH" --quiet
3. Record the current HEAD SHA for AGENTS.md watermarks:
HEAD_SHA=$(git rev-parse HEAD)
echo "$HEAD_SHA" > /tmp/gardener-head-sha
"""
# ─────────────────────────────────────────────────────────────────────
# Step 2: grooming — Claude-driven backlog grooming
# ─────────────────────────────────────────────────────────────────────
[[steps]]
id = "grooming"
title = "Backlog grooming — triage all open issues"
description = """
Groom the open issue backlog. This step is the core Claude-driven analysis
(currently implemented in gardener-agent.sh with bash pre-checks).
Pre-checks (bash, zero tokens — detect problems before invoking Claude):
1. Fetch all open issues:
curl -sf -H "Authorization: token $CODEBERG_TOKEN" \
"$CODEBERG_API/issues?state=open&type=issues&limit=50&sort=updated&direction=desc"
2. Duplicate detection: compare issue titles pairwise. Normalize
(lowercase, strip prefixes like feat:/fix:/refactor:, collapse whitespace)
and flag pairs with >60% word overlap as possible duplicates.
3. Missing acceptance criteria: flag issues with body < 100 chars and
no checkboxes (- [ ] or - [x]).
4. Stale issues: flag issues with no update in 14+ days.
5. Blockers starving the factory (HIGHEST PRIORITY): find issues that
block backlog items but are NOT themselves labeled backlog. These
starve the dev-agent completely. Extract deps from ## Dependencies /
## Depends on / ## Blocked by sections of backlog issues and check
if each dependency is open + not backlog-labeled.
6. Tech-debt promotion: list all tech-debt labeled issues — goal is to
process them all (promote to backlog or classify as dust).
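The normalization and overlap test in pre-check 2 can be sketched as below (hypothetical helper names; the exact implementation is left to the agent):
```shell
# Lowercase, strip conventional-commit prefixes, collapse whitespace.
normalize_title() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -E 's/^(feat|fix|refactor|chore|docs):[[:space:]]*//' \
    | tr -s '[:space:]' ' '
}

# Integer percent of shared words between two normalized titles
# (shared words / distinct union), compared against the 60% threshold.
overlap_pct() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    na = split(a, aw); nb = split(b, bw)
    for (i = 1; i <= na; i++) { ina[aw[i]] = 1; uni[aw[i]] = 1 }
    for (i = 1; i <= nb; i++) {
      if (ina[bw[i]] && !counted[bw[i]]) { common++; counted[bw[i]] = 1 }
      uni[bw[i]] = 1
    }
    total = 0
    for (w in uni) total++
    print (total ? int(common * 100 / total) : 0)
  }'
}
```
Pairs scoring above 60 are flagged as possible duplicates; the older issue wins.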
For each issue, choose ONE action and write to result file:
ACTION (substantial — promote, close duplicate, add acceptance criteria):
echo "ACTION: promoted #NNN to backlog — <reason>" >> "$RESULT_FILE"
echo "ACTION: closed #NNN as duplicate of #OLDER" >> "$RESULT_FILE"
DUST (trivial single-line edit, rename, comment, style, whitespace):
echo 'DUST: {"issue": NNN, "group": "<file-or-subsystem>", "title": "...", "reason": "..."}' >> "$RESULT_FILE"
Group by file or subsystem (e.g. "gardener", "lib/env.sh", "dev-poll").
Do NOT close dust issues; the dust-bundling step auto-bundles groups
of 3+ into one backlog issue.
ESCALATE (needs human decision):
printf 'ESCALATE\n1. #NNN "title" — reason (a) option1 (b) option2\n' >> "$RESULT_FILE"
CLEAN (only if truly nothing to do):
echo 'CLEAN' >> "$RESULT_FILE"
Dust vs ore rules:
Dust: comment fix, variable rename, whitespace/formatting, single-line edit, trivial cleanup with no behavior change
Ore: multi-file changes, behavioral fixes, architectural improvements, security/correctness issues
Sibling dependency rule (CRITICAL):
Issues from the same PR review or code audit are SIBLINGS: independent work items.
NEVER add bidirectional ## Dependencies between siblings (creates deadlocks).
Use ## Related for cross-references: "## Related\n- #NNN (sibling)"
7. Architecture decision alignment check (AD check):
For each open issue labeled 'backlog', check whether the issue
contradicts any architecture decision listed in the
## Architecture Decisions section of AGENTS.md.
Read AGENTS.md and extract the AD table. For each backlog issue,
compare the issue title and body against each AD. If an issue
clearly violates an AD:
a. Post a comment explaining the violation:
curl -sf -X POST -H "Authorization: token $CODEBERG_TOKEN" \
-H "Content-Type: application/json" \
"$CODEBERG_API/issues/<number>/comments" \
-d '{"body":"Closing: violates AD-NNN (<decision summary>). See AGENTS.md § Architecture Decisions."}'
b. Close the issue:
curl -sf -X PATCH -H "Authorization: token $CODEBERG_TOKEN" \
-H "Content-Type: application/json" \
"$CODEBERG_API/issues/<number>" \
-d '{"state":"closed"}'
c. Log to the result file:
echo "ACTION: closed #NNN — violates AD-NNN" >> "$RESULT_FILE"
Only close for clear, unambiguous violations. If the issue is
borderline or could be interpreted as compatible, leave it open
and ESCALATE instead.
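The AD-table extraction in pre-check 7 can be sketched as follows (assumes the section heading appears verbatim in AGENTS.md):
```shell
# Print everything between "## Architecture Decisions" and the next
# level-2 heading; the grooming agent compares issues against this text.
extract_ad_section() {
  awk '/^## Architecture Decisions/ { grab = 1; next }
       grab && /^## / { exit }
       grab' "$1"
}
```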
Processing order:
1. Handle PRIORITY_blockers_starving_factory first: promote or resolve
2. AD alignment check: close backlog issues that violate architecture decisions
3. Process tech-debt issues by score (impact/effort)
4. Classify remaining items as dust or escalate
Do NOT bundle dust yourself; the dust-bundling step handles accumulation,
dedup, TTL expiry, and bundling into backlog issues.
CRITICAL: If this step fails for any reason, log the failure and move on.
"""
needs = ["preflight"]
# ─────────────────────────────────────────────────────────────────────
# Step 3: dust-bundling — accumulate, expire, and bundle dust items
# ─────────────────────────────────────────────────────────────────────
[[steps]]
id = "dust-bundling"
title = "Accumulate dust, expire stale entries, and bundle groups"
description = """
Process DUST items emitted during grooming. This step maintains the
persistent dust accumulator at $PROJECT_REPO_ROOT/gardener/dust.jsonl.
IMPORTANT: Use $PROJECT_REPO_ROOT/gardener/dust.jsonl (the main repo
checkout), NOT the worktree copy: the worktree is destroyed after the
session, so changes there would be lost.
1. Collect DUST JSON lines emitted during grooming (from the result file
or your notes). Each has: {"issue": NNN, "group": "...", "title": "...", "reason": "..."}
2. Deduplicate: read existing dust.jsonl and skip any issue numbers that
are already staged:
DUST_FILE="$PROJECT_REPO_ROOT/gardener/dust.jsonl"
touch "$DUST_FILE"
EXISTING=$(jq -r '.issue' "$DUST_FILE" 2>/dev/null | sort -nu || true)
For each new dust item, check if its issue number is in EXISTING.
Add new entries with a timestamp:
echo '{"issue":NNN,"group":"...","title":"...","reason":"...","ts":"YYYY-MM-DDTHH:MM:SSZ"}' >> "$DUST_FILE"
3. Expire stale entries (30-day TTL):
CUTOFF=$(date -u -d '30 days ago' +%Y-%m-%dT%H:%M:%SZ)
jq -c --arg c "$CUTOFF" 'select(.ts >= $c)' "$DUST_FILE" > "${DUST_FILE}.tmp" && mv "${DUST_FILE}.tmp" "$DUST_FILE"
4. Bundle groups with 3+ distinct issues:
a. Count distinct issues per group:
jq -r '[.group, (.issue | tostring)] | join("\\t")' "$DUST_FILE" | sort -u | cut -f1 | sort | uniq -c | sort -rn
b. For each group with count >= 3:
- Collect issue details and distinct issue numbers for the group
- Look up the backlog label ID:
BACKLOG_LABEL_ID=$(curl -sf -H "Authorization: token $CODEBERG_TOKEN" \
"$CODEBERG_API/labels" | jq -r '.[] | select(.name == "backlog") | .id')
- Create a bundled backlog issue:
curl -sf -X POST -H "Authorization: token $CODEBERG_TOKEN" \
-H "Content-Type: application/json" "$CODEBERG_API/issues" \
-d '{"title":"fix: bundled dust cleanup — GROUP","body":"...","labels":[LABEL_ID]}'
- Close each source issue with a cross-reference comment:
curl ... "$CODEBERG_API/issues/NNN/comments" -d '{"body":"Bundled into #NEW"}'
curl ... "$CODEBERG_API/issues/NNN" -d '{"state":"closed"}'
- Remove bundled items from dust.jsonl:
jq -c --arg g "GROUP" 'select(.group != $g)' "$DUST_FILE" > "${DUST_FILE}.tmp" && mv "${DUST_FILE}.tmp" "$DUST_FILE"
5. If no DUST items were emitted and no groups are ripe, skip this step.
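The ripeness check in step 4 can be sketched as a small filter over "group issue" pairs on stdin (e.g. produced from dust.jsonl with the jq line above; hypothetical function name):
```shell
# Print each group with 3+ distinct staged issues, one per line.
# sort -u first so duplicate (group, issue) pairs count only once.
ripe_groups() {
  sort -u | cut -d ' ' -f 1 | sort | uniq -c | awk '$1 >= 3 { print $2 }'
}
```
Groups it prints are ready to bundle; everything else stays staged for a later run.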
CRITICAL: If this step fails, log the failure and move on to blocked-review.
"""
needs = ["grooming"]
# ─────────────────────────────────────────────────────────────────────
# Step 4: blocked-review — triage blocked issues
# ─────────────────────────────────────────────────────────────────────
[[steps]]
id = "blocked-review"
title = "Review issues labeled blocked"
description = """
Review all issues labeled 'blocked' and decide their fate.
(See issue #352 for the blocked label convention.)
1. Look up the 'blocked' label ID (Gitea needs integer IDs for label removal):
BLOCKED_LABEL_ID=$(curl -sf -H "Authorization: token $CODEBERG_TOKEN" \
"$CODEBERG_API/labels" | jq -r '.[] | select(.name == "blocked") | .id')
If the lookup fails, skip label removal and just post comments.
2. Fetch all blocked issues:
curl -sf -H "Authorization: token $CODEBERG_TOKEN" \
"$CODEBERG_API/issues?state=open&type=issues&labels=blocked&limit=50"
3. For each blocked issue, read the full body and comments:
curl -sf -H "Authorization: token $CODEBERG_TOKEN" \
"$CODEBERG_API/issues/<number>"
curl -sf -H "Authorization: token $CODEBERG_TOKEN" \
"$CODEBERG_API/issues/<number>/comments"
4. Check dependencies: extract issue numbers from ## Dependencies /
## Depends on / ## Blocked by sections. For each dependency:
curl -sf -H "Authorization: token $CODEBERG_TOKEN" \
"$CODEBERG_API/issues/<dep_number>"
Check if the dependency is now closed.
5. For each blocked issue, choose ONE action:
UNBLOCK (all dependencies are now closed or the blocking condition resolved):
a. Remove the 'blocked' label (using ID from step 1):
curl -sf -X DELETE -H "Authorization: token $CODEBERG_TOKEN" \
"$CODEBERG_API/issues/<number>/labels/$BLOCKED_LABEL_ID"
b. Add context comment explaining what changed:
curl -sf -X POST -H "Authorization: token $CODEBERG_TOKEN" \
-H "Content-Type: application/json" \
"$CODEBERG_API/issues/<number>/comments" \
-d '{"body":"Unblocked: <explanation of what resolved the blocker>"}'
NEEDS HUMAN (blocking condition is ambiguous, requires an architectural
decision, or involves external factors):
a. Post a diagnostic comment explaining what you found and what
decision is needed
b. Leave the 'blocked' label in place
CLOSE (issue is stale: blocked 30+ days with no progress on the blocker,
the blocker is wontfix, or the issue is no longer relevant):
a. Post a comment explaining why:
curl -sf -X POST -H "Authorization: token $CODEBERG_TOKEN" \
-H "Content-Type: application/json" \
"$CODEBERG_API/issues/<number>/comments" \
-d '{"body":"Closing: <reason — stale blocker, no longer relevant, etc.>"}'
b. Close the issue:
curl -sf -X PATCH -H "Authorization: token $CODEBERG_TOKEN" \
-H "Content-Type: application/json" \
"$CODEBERG_API/issues/<number>" \
-d '{"state":"closed"}'
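The dependency extraction and unblock test in steps 4-5 can be sketched as below (hypothetical helper names; the open/closed state of each dependency still comes from the API):
```shell
# Issue numbers referenced under ## Dependencies / ## Depends on /
# ## Blocked by, deduplicated and sorted numerically.
extract_deps() {
  awk '/^## (Dependencies|Depends on|Blocked by)/ { grab = 1; next }
       /^## / { grab = 0 }
       grab' "$1" | grep -oE '#[0-9]+' | tr -d '#' | sort -nu
}

# args: one "open" or "closed" state per dependency. UNBLOCK only when
# every dependency is closed (vacuously true when there are none).
all_deps_closed() {
  for state in "$@"; do
    [ "$state" = "closed" ] || return 1
  done
  return 0
}
```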
CRITICAL: If this step fails, log the failure and move on.
"""
needs = ["dust-bundling"]
# ─────────────────────────────────────────────────────────────────────
# Step 5: agents-update — AGENTS.md watermark staleness + size enforcement
# ─────────────────────────────────────────────────────────────────────
[[steps]]
id = "agents-update"
title = "Check AGENTS.md watermarks, update stale files, enforce size limit"
description = """
Check all AGENTS.md files for staleness, update any that are outdated, and
enforce the ~200-line size limit via progressive disclosure splitting.
This keeps documentation fresh; the gardener runs 2x/day, so drift stays small.
## Part A: Watermark staleness check and update
1. Read the HEAD SHA from preflight:
HEAD_SHA=$(cat /tmp/gardener-head-sha)
2. Find all AGENTS.md files:
find "$PROJECT_REPO_ROOT" -name "AGENTS.md" -not -path "*/.git/*"
3. For each file, read the watermark from line 1:
<!-- last-reviewed: <sha> -->
4. Check for changes since the watermark:
git log --oneline <watermark>..HEAD -- <directory>
If there are zero changes, the file is current; skip it.
5. For stale files:
- Read the AGENTS.md and the source files in that directory
- Update the documentation to reflect code changes since the watermark
- Set the watermark to the HEAD SHA from the preflight step
- Conventions: document architecture and the WHY, not implementation details
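Part A's staleness test can be sketched as below (hypothetical helper names; is_stale must run from inside the repo checkout):
```shell
# Pull the sha out of a line-1 watermark: <!-- last-reviewed: <sha> -->
watermark_sha() {
  head -n 1 "$1" | grep 'last-reviewed:' | sed 's/.*last-reviewed: //; s/ .*//'
}

# Stale when the file's directory has commits after the recorded
# watermark, or when the watermark is missing entirely.
is_stale() {
  wm=$(watermark_sha "$1")
  [ -n "$wm" ] || return 0
  [ -n "$(git log --oneline "$wm"..HEAD -- "$(dirname "$1")")" ]
}
```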
## Part B: Size limit enforcement (progressive disclosure split)
After all updates are done, count lines in the root AGENTS.md:
wc -l < "$PROJECT_REPO_ROOT/AGENTS.md"
If the root AGENTS.md exceeds 200 lines, perform a progressive disclosure
split. The principle: agent reads the map, drills into detail only when
needed. You wouldn't dump a 500-page wiki on a new hire's first morning.
6. Identify per-directory sections to extract. Each agent section under
"## Agents" (e.g. "### Dev (`dev/`)", "### Review (`review/`)") and
each helper section (e.g. "### Shared helpers (`lib/`)") is a candidate.
Also extract verbose subsections like "## Issue lifecycle and label
conventions" and "## Phase-Signaling Protocol" into docs/ or the
relevant directory.
7. For each section to extract, create a `{dir}/AGENTS.md` file with:
- Line 1: watermark <!-- last-reviewed: <HEAD_SHA> -->
- The full section content (role, trigger, key files, env vars, lifecycle)
- Keep the same markdown structure and detail level
Example for dev/:
```
<!-- last-reviewed: abc123 -->
# Dev Agent
**Role**: Implement issues autonomously ...
**Trigger**: dev-poll.sh runs every 10 min ...
**Key files**: ...
**Environment variables consumed**: ...
**Lifecycle**: ...
```
8. Replace extracted sections in the root AGENTS.md with a concise
directory map table. The root file keeps ONLY:
- Watermark (line 1)
- ## What this repo is (brief overview)
- ## Directory layout (existing tree)
- ## Tech stack
- ## Coding conventions
- ## How to lint and test
- ## Agents — replaced with a summary table pointing to per-dir files:
## Agents
| Agent | Directory | Role | Guide |
|-------|-----------|------|-------|
| Dev | dev/ | Issue implementation | [dev/AGENTS.md](dev/AGENTS.md) |
| Review | review/ | PR review | [review/AGENTS.md](review/AGENTS.md) |
| Gardener | gardener/ | Backlog grooming | [gardener/AGENTS.md](gardener/AGENTS.md) |
| ... | ... | ... | ... |
- ## Shared helpers — replaced with a brief pointer:
"See [lib/AGENTS.md](lib/AGENTS.md) for the full helper reference."
Keep the summary table if it fits, or move it to lib/AGENTS.md.
- ## Issue lifecycle and label conventions — keep a brief summary
(labels table + dependency convention) or move verbose parts to
docs/PHASE-PROTOCOL.md
- ## Architecture Decisions — keep in root (humans write, agents enforce)
- ## Phase-Signaling Protocol — keep a brief summary with pointer:
"See [docs/PHASE-PROTOCOL.md](docs/PHASE-PROTOCOL.md) for the full spec."
9. Verify the root AGENTS.md is now under 200 lines:
LINE_COUNT=$(wc -l < "$PROJECT_REPO_ROOT/AGENTS.md")
if [ "$LINE_COUNT" -gt 200 ]; then
echo "WARNING: root AGENTS.md still $LINE_COUNT lines after split"
fi
If still over 200, trim further: move more detail into per-directory
files. The root should read like a table of contents, not an encyclopedia.
10. Each new per-directory AGENTS.md must have a watermark on line 1.
The gardener maintains freshness for ALL AGENTS.md files, root and
per-directory, using the same watermark mechanism from Part A.
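Step 7 can be sketched for a single section as below (hypothetical helper; the level-3 heading pattern follows the "### Dev (`dev/`)" style named above, and the heading word is an example):
```shell
# usage: split_section <root-agents-md> <heading-word> <dest-dir> <head-sha>
# Writes {dir}/AGENTS.md: watermark on line 1, then the section body up
# to the next level-2 or level-3 heading.
split_section() {
  mkdir -p "$3"
  {
    printf '<!-- last-reviewed: %s -->\n' "$4"
    awk -v h="$2" '$0 ~ "^### " h { grab = 1; next }
                   grab && (/^### / || /^## /) { exit }
                   grab' "$1"
  } > "$3/AGENTS.md"
}
```
The root-side replacement (summary table row pointing at the new file) is a separate edit.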
## Staging
11. Stage ALL AGENTS.md files you created or changed; do NOT commit yet.
All git writes happen in the commit-and-pr step at the end:
find . -name "AGENTS.md" -not -path "./.git/*" -exec git add {} +
12. If no AGENTS.md files need updating AND root is under 200 lines,
skip this step entirely.
CRITICAL: If this step fails for any reason, log the failure and move on.
Do NOT let an AGENTS.md failure prevent the commit-and-pr step.
"""
needs = ["blocked-review"]
# ─────────────────────────────────────────────────────────────────────
# Step 6: commit-and-pr — single commit with all file changes
# ─────────────────────────────────────────────────────────────────────
[[steps]]
id = "commit-and-pr"
title = "One commit with all file changes, push, create PR"
description = """
Collect all file changes from this run (AGENTS.md updates) into a single commit.
API calls (issue creation, PR comments, closures) already happened during the
run; only file changes need the PR.
1. Check for staged or unstaged changes:
cd "$PROJECT_REPO_ROOT"
git status --porcelain
If there are no file changes, skip this entire step: no commit, no PR.
2. If there are changes:
a. Create a branch:
BRANCH="chore/gardener-$(date -u +%Y%m%d-%H%M)"
git checkout -B "$BRANCH"
b. Stage all modified AGENTS.md files:
find . -name "AGENTS.md" -not -path "./.git/*" -exec git add {} +
c. Also stage any other files the gardener modified (if any):
git add -u
d. Commit:
git commit -m "chore: gardener housekeeping $(date -u +%Y-%m-%d)"
e. Push:
git push -u origin "$BRANCH"
f. Create a PR:
curl -sf -X POST \
-H "Authorization: token $CODEBERG_TOKEN" \
-H "Content-Type: application/json" \
"$CODEBERG_API/pulls" \
-d '{"title":"chore: gardener housekeeping",
"head":"<branch>","base":"<primary-branch>",
"body":"Automated gardener housekeeping — AGENTS.md updates.\n\nReview-agent fast-tracks doc-only PRs."}'
g. Return to primary branch:
git checkout "$PRIMARY_BRANCH"
3. If the PR creation fails (e.g. no changes after staging), log and continue.
"""
needs = ["agents-update"]