fix: duplicate-detection CI step fails on pre-existing duplicates unrelated to PR (#296)

Add baseline comparison to detect-duplicates.py: when DIFF_BASE is set
(via CI_COMMIT_TARGET_BRANCH for PRs), the script compares findings
against the base branch and only fails on new duplicates introduced by
the PR. Pre-existing duplicates are reported as informational.
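
The baseline comparison described above can be sketched as a set difference. This is a minimal illustration, not the actual code in detect-duplicates.py: the function name `split_findings` is hypothetical, and it assumes each finding can be reduced to a stable string key (for example a hash of the duplicated block), which the commit message does not specify.

```python
from typing import Iterable, Set, Tuple


def split_findings(
    current: Iterable[str], baseline: Iterable[str]
) -> Tuple[Set[str], Set[str]]:
    """Partition current findings into (new, pre_existing).

    Assumption: each finding is represented by a stable string key,
    so the same duplicate on the base branch and on the PR branch
    compares equal.
    """
    current_set = set(current)
    baseline_set = set(baseline)
    # Findings present on the PR branch but not on the base branch.
    new = current_set - baseline_set
    # Findings that already existed on the base branch.
    pre_existing = current_set & baseline_set
    return new, pre_existing
```

Only the first set would cause a failure; the second would be printed as informational output.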

For push events (no DIFF_BASE), the script reports all findings but
exits 0 (informational only). Remove failure: ignore from the CI step
so that new duplicates block PRs.
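
The exit-code rule above might look like the following sketch. The helper name `exit_code` and its parameters are hypothetical, and treating an empty DIFF_BASE the same as an unset one is an assumption (the CI variable may expand to an empty string on push events).

```python
import os
from typing import Mapping, Optional


def exit_code(new_count: int, env: Optional[Mapping[str, str]] = None) -> int:
    """Return the process exit code for a duplicate-detection run.

    Assumption: an empty or missing DIFF_BASE means a push event,
    where findings are informational only.
    """
    env = os.environ if env is None else env
    diff_base = env.get("DIFF_BASE", "").strip()
    if not diff_base:
        # Push event: report findings but never fail the step.
        return 0
    # Pull request: fail only when the PR introduces new duplicates.
    return 1 if new_count > 0 else 0
```

A caller would pass the number of new findings and hand the result to `sys.exit`.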

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
openhands 2026-03-19 22:03:18 +00:00
parent 8d39ce79e7
commit 5e4b00f9a3
2 changed files with 183 additions and 34 deletions

@@ -3,7 +3,7 @@
 #
 # Steps:
 # 1. shellcheck — lint all .sh files (warnings+errors)
-# 2. duplicate-detection — report copy-pasted code blocks (non-blocking)
+# 2. duplicate-detection — report copy-pasted code blocks (fails only on new duplicates for PRs)
 when:
   event: [push, pull_request]
@@ -23,5 +23,7 @@ steps:
   - name: duplicate-detection
     image: python:3-alpine
     commands:
+      - apk add --no-cache git
       - python3 .woodpecker/detect-duplicates.py
-    failure: ignore
+    environment:
+      DIFF_BASE: ${CI_COMMIT_TARGET_BRANCH}