Add coverage validation: 90% main-source line coverage gate (py_binary + py_test)#356
Merged
Merged
Conversation
Bazel already produced an LCOV combined report on the test-jre21 jobs
(`bazel coverage --combined_report=lcov //cli/...`) and uploaded it as an
artifact, but nothing read it -- a PR could land that dropped coverage
arbitrarily. This wires up:
* `tools/coverage_check.py` + `:coverage-check` py_binary -- parses the
LCOV report, prints a per-file table sorted by coverage ascending
and an overall summary, exits non-zero if main-source line coverage
is below `$COVERAGE_THRESHOLD` (default 90). Test sources and files
with `LF:0` (no instrumentation data) are filtered out so thin
production-code coverage can't hide behind well-covered tests or
placeholder records.
* `tools/coverage_check_test.py` + `:coverage_check_test` py_test --
19 unit tests against parse_lcov, filter_main_source, format_report,
and main(), covering: per-file extraction, non-line LCOV fields
skipped, malformed payloads tolerated, multi-prefix include union,
zero-line records dropped, threshold boundary inclusive, missing
LCOV file => exit 2, env-var threshold override.
* `make coverage` / `make coverage-check` / `make coverage-test` --
one-line invocations. `make coverage` runs `bazel coverage` then the
binary; `make coverage-check` runs only the binary against an
existing LCOV; `make coverage-test` runs the py_test.
* CI: the `Run bazel-diff tests` step now coverage-instruments
`//cli/... //tools:coverage_check_test`, and a new `Enforce coverage
threshold` step invokes the py_binary on every test-jre21 matrix
entry. The threshold step is placed AFTER the artifact upload so a
failed threshold check still preserves the LCOV report.
The threshold is 90 per the issue. Coverage on this PR's test-jre21
matrix run is the source of truth for the current repo state.
Implementation notes:
- The enforcement logic is a `py_library` plus `py_binary` and `py_test`
that depend on it via `imports = ["."]`, so the test imports the
library by its module name without needing a workspace-rooted path.
- The `BUILD_WORKING_DIRECTORY` chdir hook in coverage_check.main lets
`bazel run //tools:coverage-check -- bazel-out/_coverage/...` resolve
the relative LCOV path against the user's cwd instead of the runfiles
directory.
- Files with `LF:0` are dropped from the per-file table and the totals,
not treated as 0% covered. Today bazel's Python coverage emits exactly
that for `tools/coverage_check.py` (no `coverage_tool` configured on
the Python toolchain) -- so the .py file's coverage is enforced via
the py_test passing, not via LCOV numbers. Configuring the toolchain
with `configure_coverage_tool = True` in MODULE.bazel would surface
numerical Python coverage and the file would slot in automatically.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
d7c4b85 to
007a9cb
Compare
CI on PR #356 reported main-source coverage at 88.76% (1429 / 1610) -- 1.24 points short of the 90% threshold introduced in the previous commit. These five focused unit tests close the gap by covering the smallest files that were near-zero on coverage, all of which were thin enough to test in isolation without mocking out infrastructure: BazelTargetTypeTest enum constants exhaustively touched (0/6 -> 6/6) VersionProviderTest happy-path resource load (1/9 -> 6/9) BazelDiffTest run() throws + isVerbose() truth (2/6 -> 6/6) StderrLoggerTest all 4 overloads x verbose true+false (8/12 -> 12/12) BazelTargetTest three subclass shapes + every type discriminator branch (21/29 -> 28/29) Each test file is registered as a `kt_jvm_test` in cli/BUILD; they all run in <1s and join the existing `:cli-test-lib`. Notes: - Main.kt (0/4) is not addressed here: its only behaviour is `exitProcess(...)`, which kills the JVM and so cannot be exercised from in-process unit tests without a SecurityManager hack or a refactor to extract a testable `run(args): Int`. The E2E suite already invokes the built jar as a subprocess, but JaCoCo doesn't instrument across that boundary, so Main.kt stays at 0% for now. The threshold gate accommodates this: 6 + 5 + 4 + 4 + 7 = 26 lines added, projecting CI to (1429+26)/1610 = 90.37%, a hair past the bar. - VersionProvider's `throw IllegalArgumentException(...)` branch stays uncovered (3 unreachable lines) because the production code resolves the classloader from `this::class.java`, with no injection seam to swap it for a loader that doesn't contain `cli/version`. Exercising it would require either reflection-mutating the classloader or refactoring the resource lookup to take a `ClassLoader` parameter -- both out of scope here. - BazelTarget's last uncovered line is the `else -> BazelTargetType.UNKNOWN` branch, only reachable when `Build.Target.Discriminator` adds a new enum value the production code doesn't recognise. Triggering that today would require a hand-forged proto with a reserved discriminator number, which the protobuf builder rejects. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The threshold step invokes `bazelisk run //tools:coverage-check`, which on
macos-latest x bazel 9.x failed with:
error: LCOV report not found at 'bazel-out/_coverage/_coverage_report.dat'.
Hint: run 'bazel coverage --combined_report=lcov //cli/... //tools/...' first
Root cause: USE_BAZEL_VERSION was set on the `Run bazel-diff tests` step
but not propagated to the threshold step, so bazelisk fell back to the
8.5.1 pin in .bazelversion and started a fresh server. That second server
re-symlinked bazel-out into its own output base, and the LCOV report
produced by the 9.x server became unreachable through the
bazel-out/_coverage/... path.
Ubuntu 8.x/9.x didn't trip because the runners kept the same bazel
server warm between steps -- but the dual-version dance is a real
mismatch, so the macOS flake is the canonical bug. Setting
USE_BAZEL_VERSION on the threshold step keeps both invocations on the
same server.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…vement
Two additive changes to the coverage gate that landed in the prior commits.
1. `tools/coverage_check.py --html DIR` (and `make coverage-html`)
When passed `--html DIR`, the threshold check additionally shells out to
`genhtml` and writes an annotated per-file HTML report there. The
threshold gate still runs and still drives the exit code -- HTML is an
extra artifact, not a replacement -- and it's emitted even when the
gate fails so the below-threshold case (when the report is most useful
for interactive investigation) is covered.
Fails fast with a clear install hint if `genhtml` is missing from PATH
(`brew install lcov` / `apt-get install lcov`); fails with the
subprocess exit code if genhtml itself errors. Five new py_test cases
(`HtmlReportTest`) mock `shutil.which` and `subprocess.run` so the
tests don't require lcov to be installed in the test environment:
- correct genhtml argv (LCOV file, --output-directory, --quiet)
- clear error when genhtml is absent (exit 2, install-hint surfaced)
- subprocess error code surfaced (exit 2, genhtml exit code reported)
- HTML emitted even when the threshold check fails (exit 1 + HTML
path printed)
Output is added to .gitignore (`coverage-html/`).
Total py_test count: 19 -> 24.
2. `.claude/skills/{coverage-status,improve-coverage}/SKILL.md`
Two repo-local skill files so agents (Claude Code, etc.) know how to:
- inspect current coverage state (`coverage-status`: what's the
overall %, which files are below threshold, how to read the
sorted table, how to produce an HTML report)
- raise coverage to clear the gate (`improve-coverage`: how to
pick high-leverage files, how to write a new kt_jvm_test, what
won't work and why -- Main.kt's exitProcess, throw branches with
no DI seam, unreachable enum else-branches)
The improve-coverage skill cross-references the worked example from
#356 itself (88.76% -> 90.37% by adding five small unit tests).
.gitignore extended to keep .claude/settings.local.json and
.claude/worktrees/ out of git while letting .claude/skills/ in --
precise patterns rather than a blanket `.claude/` ignore.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Wires up a coverage gate using a Python
py_binary+py_testso the enforcement logic itself is covered by tests. Bazel already produces an LCOV combined report on thetest-jre21jobs (bazel coverage --combined_report=lcov //cli/...) and uploads it as an artifact, but nothing read it — a PR could land that dropped coverage arbitrarily. This PR adds:tools/coverage_check.py+//tools:coverage-checkpy_binary — parses the LCOV report, prints a per-file table sorted by coverage ascending, an overall summary, and exits non-zero if main-source line coverage is below$COVERAGE_THRESHOLD(default 90). Test sources and files withLF:0(no instrumentation data) are filtered out so thin production-code coverage can't hide behind well-covered tests or placeholder records.tools/coverage_check_test.py+//tools:coverage_check_testpy_test — 19 unit tests againstparse_lcov,filter_main_source,format_report, andmain(), covering per-file extraction, non-line LCOV fields skipped, malformed payloads tolerated, multi-prefix include union, zero-line records dropped, threshold boundary inclusive, missing-file → exit 2, env-var threshold override, and stdin/stderr expectations of the main entry point.make coverage/make coverage-check/make coverage-test— one-line invocations.make coveragerunsbazel coveragethen the binary;make coverage-checkruns only the binary against an existing LCOV;make coverage-testruns the py_test.Run bazel-diff testsstep to coverage-instrument//cli/... //tools:coverage_check_test, plus a newEnforce coverage thresholdstep that invokes the py_binary on everytest-jre21matrix entry (ubuntu/macos× bazel8.x/9.x). The threshold step is placed AFTER the artifact upload so a failed threshold check still preserves the LCOV report.Threshold set to 90 per the goal.
Why py_binary + py_test (revised from the prior bash-script version)
You asked for the enforcement to be Python via
py_binaryand to add apy_testso the enforcement logic itself has coverage. The earlier draft of this PR usedbash + awk, which couldn't be cleanly unit-tested. Replacing it with Python gave us:parse_lcov,filter_main_source,format_report,main) that the py_test exercises directly without spawning subprocesses.py_libraryshared between:coverage-check(the runnable binary) and:coverage_check_test(the test target). Withimports = ["."]on the library, the test importscoverage_checkby its module name without needing a workspace-rooted path.BUILD_WORKING_DIRECTORYchdir hook inmain()sobazel run //tools:coverage-check -- bazel-out/_coverage/...resolves the relative LCOV path against the user's cwd instead of the runfiles directory.Why draft, still
CI on this PR will tell us the actual current main-source coverage — the data point I couldn't measure cleanly locally (
E2ETesthangs on my machine due to a JDK-env sandbox issue, unrelated to this PR; I can run the rest, which shows main-source line coverage of 47.20% without E2E across//cli/... //tools:coverage_check_test). The CI run includesE2ETest, which exercises a lot of the integration paths (BazelQueryService.ktis 0.33% locally without E2E and likely much higher with it), so the real number will be higher — but very likely below 90%.When the
Enforce coverage thresholdstep fires on this PR, the log will print the per-file table and the overall % before exiting 1. From there we have three reasonable paths:$COVERAGE_THRESHOLDin the workflow to whatever CI reports today (rounded down a hair), get the gate green, then raise the number as tests land.Main.kt, picocli command shells, DI wiring) from the denominator via--include.I'd lean toward (2) for the merge, then (1) for the long game. Happy to do whichever you prefer.
Known limitation: Python coverage instrumentation
bazel coverage //tools:coverage_check_testemitsLF:0/LH:0fortools/coverage_check.pytoday — Bazel detects the file but doesn't gather data because the Python toolchain inMODULE.bazeldoesn't haveconfigure_coverage_tool = Trueset on therules_pythonextension. The py_test still runs and asserts behavior of every public function, so the file IS covered by tests — just not numerically reported in LCOV. The threshold check filters out zero-line records so this doesn't tank the gate.The day we configure the Python toolchain with
coverage_tool, the file will start contributing real numerical coverage to the gate without further code changes; the default--includealready liststools/coverage_check.py.Verification
bazel test //tools:coverage_check_test— 19/19 PASS.bazel run //tools:coverage-check -- bazel-out/_coverage/_coverage_report.dat --threshold 40— exit 0, prints overall 47.20%, "PASS".bazel run //tools:coverage-check -- bazel-out/_coverage/_coverage_report.dat --threshold 90— exit 1, prints "FAIL: coverage 47.20% is below threshold 90.00%."bazel run //tools:generate-readme— regenerates README from template, in-sync output.MODULE.bazel.lock; reverted incidental churn before commit.Test plan
/test/) and zero-line records excluded from numerator and denominator.bazel run //tools:coverage-checkhonorsBUILD_WORKING_DIRECTORY.E2ETest.🤖 Generated with Claude Code