Fix/ace 198/mtree diff max rows info message #134
Conversation
Up to standards ✅🟢 Issues
|
| Metric | Results |
|---|---|
| Complexity | 7 |
| Duplication | 0 |
NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.
ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughAdds a configurable per-node-pair cap (MaxDiffRows) on collected differing rows during mtree table-diff. The engine enforces the cap in worker fetch loops and appendDiffs, validates negative values, resolves defaults from config, tracks truncation, logs warnings, and includes corresponding unit tests. ChangesMax diff rows enforcement
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/consistency/mtree/merkle.go (1)
2137-2180: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick winNegative
max_diff_rowsfrom config is silently ignored instead of rejected.The early guard at Line 2139 only checks the caller-supplied
m.MaxDiffRows(defaults to 0). Config resolution at Lines 2167-2171 only takes the config valueif cfg.MTree.Diff.MaxDiffRows > 0, so a negative value inace.yaml/default_config.yamlis silently dropped and the diff falls back to unbounded mode instead of erroring — defeating both the "reject negative values" and the OOM-prevention goals of this PR.🐛 Proposed fix: validate after resolving from config
- // Reject a negative cap up front so a bad value can't silently disable the - // row bound (mirrors table_diff's validation). - if m.MaxDiffRows < 0 { - return fmt.Errorf("max_diff_rows must be >= 0, got %d", m.MaxDiffRows) - } - if err = m.UpdateMtree(true); err != nil { return fmt.Errorf("failed to update merkle tree before diff: %w", err) } ... if m.MaxDiffRows == 0 { - if cfg := config.Get(); cfg != nil && cfg.MTree.Diff.MaxDiffRows > 0 { + if cfg := config.Get(); cfg != nil { m.MaxDiffRows = cfg.MTree.Diff.MaxDiffRows } } + // Reject a negative cap (caller- or config-supplied) so a bad value can't + // silently disable the row bound (mirrors table_diff's validation). + if m.MaxDiffRows < 0 { + return fmt.Errorf("max_diff_rows must be >= 0, got %d", m.MaxDiffRows) + }Please also add a test covering a negative value sourced from config (not just a directly-set negative struct field).
#!/bin/bash # Check for any other validation of MaxDiffRows in config.go or elsewhere rg -n -C3 'MaxDiffRows' --type=go🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/consistency/mtree/merkle.go` around lines 2137 - 2180, The negative max_diff_rows validation in the mtree diff path only checks the pre-config value, so a negative value from config can still be ignored and fall back to unbounded diffing. In the diff setup logic in merkle.go, validate m.MaxDiffRows again after resolving the config value from config.Get().MTree.Diff.MaxDiffRows, and return an error for any value below zero before proceeding with UpdateMtree or building DiffResult. Add a test around the mtree diff flow that loads a negative max_diff_rows from config and asserts it is rejected, not silently skipped.
🧹 Nitpick comments (1)
internal/consistency/mtree/merkle.go (1)
568-624: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winPer-pair warning only logs once globally, not once per pair.
diffLimitWarnedis a single bool for the whole task, sonotePairLimitReachedLocked's warning names only the first node pair that hits the cap. If several pairs are independently truncated, operators only see one pair name in the logs (the summary-levelDiffRowLimitReachedflag/final warning still fire correctly for all cases).♻️ Suggested fix: track warned state per pair
- diffRowCounts map[string]int64 - diffLimitWarned bool + diffRowCounts map[string]int64 + diffPairsWarned map[string]struct{}func (m *MerkleTreeTask) notePairLimitReachedLocked(pairKey string) { m.DiffResult.Summary.DiffRowLimitReached = true - if !m.diffLimitWarned { - m.diffLimitWarned = true + if _, warned := m.diffPairsWarned[pairKey]; !warned { + if m.diffPairsWarned == nil { + m.diffPairsWarned = make(map[string]struct{}) + } + m.diffPairsWarned[pairKey] = struct{}{} logger.Warn("mtree table-diff: %s reached max_diff_rows limit (%d); enumeration for this pair stops (other pairs continue)", pairKey, m.MaxDiffRows) } }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/consistency/mtree/merkle.go` around lines 568 - 624, The per-pair cap warning in MerkleTreeTask is tracked with a single diffLimitWarned flag, so notePairLimitReachedLocked only logs the first pair that hits MaxDiffRows. Update the warning tracking to be per pair in MerkleTreeTask (for example alongside diffRowCounts), and make notePairLimitReachedLocked/notePairLimitReached consult that per-pair state so each truncated pair can emit its own warning while still preserving the shared DiffRowLimitReached summary behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@internal/consistency/mtree/merkle_test.go`:
- Around line 100-109: Add coverage for config-based negative max_diff_rows
handling in DiffMtree: the current TestMtreeDiffRejectsNegativeMaxDiffRows only
exercises MerkleTreeTask.MaxDiffRows set directly, so add a case that leaves
m.MaxDiffRows at zero, sets config.MTree.Diff.MaxDiffRows to a negative value,
and verifies DiffMtree still returns the same validation error. Use the existing
DiffMtree and MerkleTreeTask symbols so the test targets the config-resolution
path in merkle.go.
---
Outside diff comments:
In `@internal/consistency/mtree/merkle.go`:
- Around line 2137-2180: The negative max_diff_rows validation in the mtree diff
path only checks the pre-config value, so a negative value from config can still
be ignored and fall back to unbounded diffing. In the diff setup logic in
merkle.go, validate m.MaxDiffRows again after resolving the config value from
config.Get().MTree.Diff.MaxDiffRows, and return an error for any value below
zero before proceeding with UpdateMtree or building DiffResult. Add a test
around the mtree diff flow that loads a negative max_diff_rows from config and
asserts it is rejected, not silently skipped.
---
Nitpick comments:
In `@internal/consistency/mtree/merkle.go`:
- Around line 568-624: The per-pair cap warning in MerkleTreeTask is tracked
with a single diffLimitWarned flag, so notePairLimitReachedLocked only logs the
first pair that hits MaxDiffRows. Update the warning tracking to be per pair in
MerkleTreeTask (for example alongside diffRowCounts), and make
notePairLimitReachedLocked/notePairLimitReached consult that per-pair state so
each truncated pair can emit its own warning while still preserving the shared
DiffRowLimitReached summary behavior.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 9e2f9d39-4b8d-4188-971d-cdb244ffa0e3
📒 Files selected for processing (6)
ace.yamldocs/commands/mtree/mtree-table-diff.mdinternal/cli/default_config.yamlinternal/consistency/mtree/merkle.gointernal/consistency/mtree/merkle_test.gopkg/config/config.go
…198) mtree table-diff stayed silent when the collected diffs equalled max_diff_rows: it only flagged truncation when a further row was refused, and it had no final-summary warning. table-diff flags at >= the cap and prints "additional differences may exist", so the two paths disagreed. - recordPairRowLocked now flags truncation once a pair reaches the cap, mirroring table-diff's incrementPairDiffRowsLocked (>= semantics). - Emit a final "mtree table-diff stopped after reaching max_diff_rows=N; additional differences may exist" warning before writing the report, the analog of table_diff.go's completion warning. - Update the boundary unit test to expect DiffRowLimitReached at the exact cap.
1befd21 to
b26ace6
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@internal/consistency/mtree/merkle.go`:
- Around line 2194-2202: The MaxDiffRows fallback in merkle.go only accepts
values greater than zero, so a negative mtree.diff.max_diff_rows silently leaves
the cap unset and unbounded. Update the config-loading path for MTree diff
settings, or the MaxDiffRows initialization logic in the merkle config code, to
validate and reject/clamp non-positive values instead of ignoring them. Use the
existing symbols config.Get, cfg.MTree.Diff.MaxDiffRows, and m.MaxDiffRows to
keep the OOM guard always enabled when the caller leaves it unset.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 8ace95ec-eae3-4f0c-8afa-166bd9c14220
📒 Files selected for processing (2)
internal/consistency/mtree/merkle.gointernal/consistency/mtree/merkle_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- internal/consistency/mtree/merkle_test.go
There was a problem hiding this comment.
Caution
Inline review comments failed to post. This is likely due to GitHub's internal server error or limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@internal/consistency/mtree/merkle.go`:
- Around line 2194-2202: The MaxDiffRows fallback in merkle.go only accepts
values greater than zero, so a negative mtree.diff.max_diff_rows silently leaves
the cap unset and unbounded. Update the config-loading path for MTree diff
settings, or the MaxDiffRows initialization logic in the merkle config code, to
validate and reject/clamp non-positive values instead of ignoring them. Use the
existing symbols config.Get, cfg.MTree.Diff.MaxDiffRows, and m.MaxDiffRows to
keep the OOM guard always enabled when the caller leaves it unset.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 8ace95ec-eae3-4f0c-8afa-166bd9c14220
📒 Files selected for processing (2)
internal/consistency/mtree/merkle.gointernal/consistency/mtree/merkle_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- internal/consistency/mtree/merkle_test.go
🛑 Comments failed to post (1)
internal/consistency/mtree/merkle.go (1)
2194-2202: 🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash # Find the mtree diff default(s) and any load-time validation for max_diff_rows. rg -nP 'max_diff_rows|MaxDiffRows' --type=go -g '!**/*_test.go' -C3 fd -e yaml -e yml | xargs rg -nP 'max_diff_rows' -C2 2>/dev/nullRepository: pgEdge/ace
Length of output: 215
🏁 Script executed:
#!/bin/bash set -euo pipefail # Locate the mtree config and the relevant source around the referenced lines. git ls-files | rg 'internal/consistency/mtree/merkle\.go|config|yaml|yml|toml|json|.*\.go$' >/tmp/files.txt echo "== merkle.go around referenced lines ==" sed -n '2175,2210p' internal/consistency/mtree/merkle.go echo echo "== mtree config definitions and defaults ==" rg -n --hidden --glob '!**/*_test.go' 'MaxDiffRows|max_diff_rows|Diff struct|type Diff|MTree struct|config.Get\(\)' internal config . echo echo "== likely config load/validation paths ==" rg -n --hidden --glob '!**/*_test.go' 'MaxDiffRows|max_diff_rows|Diff\.' internal config .Repository: pgEdge/ace
Length of output: 18461
🏁 Script executed:
#!/bin/bash set -euo pipefail echo "== pkg/config/config.go ==" sed -n '1,220p' pkg/config/config.go echo echo "== default configs ==" sed -n '1,120p' ace.yaml echo sed -n '1,120p' internal/cli/default_config.yaml echo echo "== mtree config and any validation hooks ==" rg -n --hidden --glob '!**/*_test.go' 'Validate|validation|MaxDiffRows|max_diff_rows' pkg/config internal/cli internal/consistency/mtree | sed -n '1,220p'Repository: pgEdge/ace
Length of output: 16196
Reject non-positive
mtree.diff.max_diff_rowsvalues The shipped defaults keep this cap on, but a negative YAML value is still ignored by the> 0check and leavesMaxDiffRowsat 0/unbounded. Clamp or reject invalid config values on load so a bad config can’t disable the OOM guard.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/consistency/mtree/merkle.go` around lines 2194 - 2202, The MaxDiffRows fallback in merkle.go only accepts values greater than zero, so a negative mtree.diff.max_diff_rows silently leaves the cap unset and unbounded. Update the config-loading path for MTree diff settings, or the MaxDiffRows initialization logic in the merkle config code, to validate and reject/clamp non-positive values instead of ignoring them. Use the existing symbols config.Get, cfg.MTree.Diff.MaxDiffRows, and m.MaxDiffRows to keep the OOM guard always enabled when the caller leaves it unset.
Fix/ace 198/mtree diff max rows info message
Older output:
./ace mtree table-diff --dbname postgres demo public.n1
..
Updating Merkle tree on node: n2
No updates needed for n2
2026/07/02 07:45:10 INFO Comparing merkle trees between n1 and n2
2026/07/02 07:45:10 INFO Trees differ - traversing to find mismatched leaf nodes...
2026/07/02 07:45:10 INFO Found 271 mismatched blocks
Comparing ranges: 99 / 271
....
2026/07/02 07:45:40 WARN ✘ TABLES DO NOT MATCH
2026/07/02 07:45:40 WARN Found 1000000 differences between n1/n2
2026/07/02 07:45:40 INFO Diff report written to public_n1_diffs-20260702074531.json
New output:
./ace mtree table-diff --dbname postgres demo public.n1
...
2026/07/02 08:43:57 INFO Comparing merkle trees between n1 and n2
2026/07/02 08:43:57 INFO Trees differ - traversing to find mismatched leaf nodes...
2026/07/02 08:43:57 INFO Found 396 mismatched blocks
Comparing ranges: 302 / 396
....
2026/07/02 08:44:17 WARN mtree table-diff stopped after reaching max_diff_rows=1000000; additional differences may exist
2026/07/02 08:44:35 WARN ✘ TABLES DO NOT MATCH
2026/07/02 08:44:35 WARN Found 1000000 differences between n1/n2
2026/07/02 08:44:35 INFO Diff report written to public_n1_diffs-20260702084425.json