Merge pull request 'fix: fix: predictor should dispatch actions through vault, not by filing action-labeled issues (#434)' (#447) from fix/issue-434 into main
All checks were successful
ci/woodpecker/push/ci Pipeline was successful

dev-qwen 2026-04-08 19:23:47 +00:00
commit f278e8fb14

@@ -119,27 +119,24 @@ For each weakness you identify, choose one:
 **Suggested action:** <what the planner should consider>
 **EXPLOIT** high confidence, have a theory you can test:
-File a prediction/unreviewed issue AND an action issue that dispatches
-a formula to generate evidence.
-The prediction explains the theory. The action generates the proof.
-When the planner runs next, evidence is already there.
-Action issue body format (label: action):
-Dispatched by predictor to test theory in #<prediction_number>.
-## Task
-Run <formula name> with focus on <specific test>.
-## Expected evidence
-Results in evidence/<dir>/<date>-<name>.json
-## Acceptance criteria
-- [ ] Formula ran to completion
-- [ ] Evidence file written with structured results
-## Affected files
-- evidence/<dir>/
+File a prediction/unreviewed issue AND a vault PR that dispatches
+a formula to generate evidence (AD-006: external actions go through vault).
+The prediction explains the theory. The vault PR triggers the proof
+after human approval. When the planner runs next, evidence is already there.
+Vault dispatch (requires lib/vault.sh):
+source "$PROJECT_REPO_ROOT/lib/vault.sh"
+TOML_CONTENT="id = \"predict-<prediction_number>-<formula>\"
+context = \"Test prediction #<prediction_number>: <theory summary> — focus: <specific test>\"
+formula = \"<formula-name>\"
+secrets = []
+# Unblocks: #<prediction_number>
+# Expected evidence: evidence/<dir>/<date>-<name>.json
+"
+PR_NUM=$(vault_request "predict-<prediction_number>-<formula>" "$TOML_CONTENT")
+echo "Vault PR #${PR_NUM} filed to test prediction #<prediction_number>"
 Available formulas (check $PROJECT_REPO_ROOT/formulas/*.toml for current list):
 cat "$PROJECT_REPO_ROOT/formulas/"*.toml | grep '^name' | head -10
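The hunk above lists formulas by grepping for `name` lines, and the rules require that a dispatch reference an existing formula. A minimal sketch of a pre-dispatch check follows. `formula_exists` is a hypothetical helper, not part of the repository, and it assumes each formula file declares itself with a line of the form `name = "<formula-name>"`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the repo): succeeds if some *.toml file in the
# formulas directory declares the given name. Assumes a `name = "..."` line.
# Note: the name is interpolated into a regex, so keep it to plain characters.
formula_exists() {
  local name="$1"
  local dir="${2:-$PROJECT_REPO_ROOT/formulas}"
  grep -qsE "^name[[:space:]]*=[[:space:]]*\"${name}\"" "$dir"/*.toml
}

# Demo against a throwaway directory so the sketch runs anywhere.
demo_dir=$(mktemp -d)
printf 'name = "example-formula"\n' > "$demo_dir/example-formula.toml"

formula_exists "example-formula" "$demo_dir" && echo "example-formula: found"
formula_exists "made-up-formula" "$demo_dir" || echo "made-up-formula: not found, do not dispatch"

rm -rf "$demo_dir"
```

Running a check like this before calling `vault_request` would catch an invented formula name before a vault PR is filed.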
@@ -156,10 +153,10 @@ tea is pre-configured with login "$TEA_LOGIN" and repo "$FORGE_REPO".
    tea issues create --login "$TEA_LOGIN" --repo "$FORGE_REPO" \
      --title "<title>" --body "<body>" --labels "prediction/unreviewed"
-2. File action dispatches (if exploiting):
-   tea issues create --login "$TEA_LOGIN" --repo "$FORGE_REPO" \
-     --title "action: test prediction #NNN — <formula> <focus>" \
-     --body "<body>" --labels "action"
+2. Dispatch formula via vault (if exploiting):
+   source "$PROJECT_REPO_ROOT/lib/vault.sh"
+   PR_NUM=$(vault_request "predict-NNN-<formula>" "$TOML_CONTENT")
+   # See EXPLOIT section above for TOML_CONTENT format
 3. Close superseded predictions:
    tea issues close <number> --login "$TEA_LOGIN" --repo "$FORGE_REPO"
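The `TOML_CONTENT` string in the new prompt relies on backslash-escaped quotes inside a double-quoted shell string, which is easy to get wrong. One possible alternative, sketched here, renders the same fields with a here-document. `build_vault_toml` and the sample values are hypothetical. The field names (`id`, `context`, `formula`, `secrets`) are taken from the diff, the `context` wording is simplified, and the sketch does not call `vault_request`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the repo): renders the vault request TOML for a
# prediction. Values containing double quotes or backslashes would need TOML
# escaping, which this sketch does not attempt.
build_vault_toml() {
  local prediction="$1" formula="$2" summary="$3" focus="$4"
  cat <<EOF
id = "predict-${prediction}-${formula}"
context = "Test prediction #${prediction}: ${summary}; focus: ${focus}"
formula = "${formula}"
secrets = []
# Unblocks: #${prediction}
EOF
}

# Sample values are placeholders for illustration only.
TOML_CONTENT=$(build_vault_toml 123 "example-formula" "example theory" "example test")
printf '%s\n' "$TOML_CONTENT"
```

The resulting `TOML_CONTENT` could then be passed to `vault_request` as in step 2.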
@@ -173,11 +170,11 @@ tea is pre-configured with login "$TEA_LOGIN" and repo "$FORGE_REPO".
 ## Rules
-- Max 5 actions total (predictions + action dispatches combined)
-- Each exploit counts as 2 (prediction + action dispatch)
+- Max 5 actions total (predictions + vault dispatches combined)
+- Each exploit counts as 2 (prediction + vault dispatch)
 - So: 5 explores, or 2 exploits + 1 explore, or 1 exploit + 3 explores
 - Never re-file a dismissed prediction without new evidence
-- Action issues must reference existing formulas; don't invent formulas
+- Vault dispatches must reference existing formulas; don't invent formulas
 - Be specific: name the file, the metric, the threshold, the formula
 - If no weaknesses found, file nothing; that's a strong signal the project is healthy
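The budget rule above is simple arithmetic: an explore costs 1 action, an exploit costs 2 (prediction plus vault dispatch), and the total may not exceed 5. A minimal sketch of that check, using a hypothetical `within_budget` helper that is not part of the repository:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the repo): succeeds when the planned mix of
# explores and exploits fits the 5-action budget stated in the rules.
within_budget() {
  local explores="$1" exploits="$2"
  local max_actions=5
  # An explore files one issue; an exploit files a prediction and a vault dispatch.
  local cost=$(( explores + 2 * exploits ))
  (( cost <= max_actions ))
}

# The three combinations named in the rules all cost exactly 5.
within_budget 5 0 && echo "5 explores: ok"
within_budget 1 2 && echo "2 exploits + 1 explore: ok"
within_budget 3 1 && echo "1 exploit + 3 explores: ok"
within_budget 0 3 || echo "3 exploits: over budget"
```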