Skip to content

Fix/ace 198/mtree diff max rows info message #134

Merged
mason-sharp merged 1 commit into
mainfrom
fix/ACE-198/mtree-diff-max-rows
Jul 2, 2026
Merged

Fix/ace 198/mtree diff max rows info message #134
mason-sharp merged 1 commit into
mainfrom
fix/ACE-198/mtree-diff-max-rows

Conversation

@zaidshabbir25

Copy link
Copy Markdown
Contributor

Fix/ace 198/mtree diff max rows info message

Older output:

./ace mtree table-diff --dbname postgres demo public.n1
..
Updating Merkle tree on node: n2
No updates needed for n2
2026/07/02 07:45:10 INFO Comparing merkle trees between n1 and n2
2026/07/02 07:45:10 INFO Trees differ - traversing to find mismatched leaf nodes...
2026/07/02 07:45:10 INFO Found 271 mismatched blocks
Comparing ranges: 99 / 271
....
2026/07/02 07:45:40 WARN ✘ TABLES DO NOT MATCH
2026/07/02 07:45:40 WARN Found 1000000 differences between n1/n2
2026/07/02 07:45:40 INFO Diff report written to public_n1_diffs-20260702074531.json

New output:
./ace mtree table-diff --dbname postgres demo public.n1
...
2026/07/02 08:43:57 INFO Comparing merkle trees between n1 and n2
2026/07/02 08:43:57 INFO Trees differ - traversing to find mismatched leaf nodes...
2026/07/02 08:43:57 INFO Found 396 mismatched blocks
Comparing ranges: 302 / 396
....
2026/07/02 08:44:17 WARN mtree table-diff stopped after reaching max_diff_rows=1000000; additional differences may exist
2026/07/02 08:44:35 WARN ✘ TABLES DO NOT MATCH
2026/07/02 08:44:35 WARN Found 1000000 differences between n1/n2
2026/07/02 08:44:35 INFO Diff report written to public_n1_diffs-20260702084425.json

@zaidshabbir25 zaidshabbir25 requested a review from mason-sharp July 2, 2026 09:15
@codacy-production

codacy-production Bot commented Jul 2, 2026

Copy link
Copy Markdown

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

View in Codacy

🟢 Metrics 7 complexity · 0 duplication

Metric Results
Complexity 7
Duplication 0

View in Codacy

NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.

@coderabbitai

coderabbitai Bot commented Jul 2, 2026

Copy link
Copy Markdown

Review Change Stack

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 8ace95ec-eae3-4f0c-8afa-166bd9c14220

📥 Commits

Reviewing files that changed from the base of the PR and between 572c7c2 and b26ace6.

📒 Files selected for processing (2)
  • internal/consistency/mtree/merkle.go
  • internal/consistency/mtree/merkle_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • internal/consistency/mtree/merkle_test.go

📝 Walkthrough

Walkthrough

Adds a configurable per-node-pair cap (MaxDiffRows) on collected differing rows during mtree table-diff. The engine enforces the cap in worker fetch loops and appendDiffs, validates negative values, resolves defaults from config, tracks truncation, logs warnings, and includes corresponding unit tests.

Changes

Max diff rows enforcement

Layer / File(s) Summary
Diff row cap enforcement in merkle engine
internal/consistency/mtree/merkle.go
MerkleTreeTask gains MaxDiffRows, diffRowCounts, and diffLimitWarned; worker fetch loops and appendDiffs stop collecting rows per node pair once the cap is reached, recordPairRowLocked flags DiffRowLimitReached at the cap, DiffMtree rejects negative caps, resolves unset caps from config, includes the cap in the summary, and logs a warning on truncation.
Tests for cap enforcement
internal/consistency/mtree/merkle_test.go
New/renamed tests verify cap enforcement beyond the limit, truncation at the exact cap boundary, unlimited collection when unset, and rejection of negative MaxDiffRows, using new helper functions for constructing diff state and generating rows.

Poem

A rabbit counts each mismatched row,
"Enough!" it says, "no more shall flow!"
A cap now guards the diffing spree,
With warnings logged for all to see. 🐇
Hop, hop, truncated — tidy and neat! 🥕

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title is concise and matches the main change: improving mtree diff max-rows handling and its warning message.
Description check ✅ Passed The description directly explains the mtree diff max-rows warning change and shows the before/after output.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/ACE-198/mtree-diff-max-rows

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
internal/consistency/mtree/merkle.go (1)

2137-2180: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Negative max_diff_rows from config is silently ignored instead of rejected.

The early guard at Line 2139 only checks the caller-supplied m.MaxDiffRows (defaults to 0). Config resolution at Lines 2167-2171 only takes the config value if cfg.MTree.Diff.MaxDiffRows > 0, so a negative value in ace.yaml/default_config.yaml is silently dropped and the diff falls back to unbounded mode instead of erroring — defeating both the "reject negative values" and the OOM-prevention goals of this PR.

🐛 Proposed fix: validate after resolving from config
-	// Reject a negative cap up front so a bad value can't silently disable the
-	// row bound (mirrors table_diff's validation).
-	if m.MaxDiffRows < 0 {
-		return fmt.Errorf("max_diff_rows must be >= 0, got %d", m.MaxDiffRows)
-	}
-
 	if err = m.UpdateMtree(true); err != nil {
 		return fmt.Errorf("failed to update merkle tree before diff: %w", err)
 	}
 	...
 	if m.MaxDiffRows == 0 {
-		if cfg := config.Get(); cfg != nil && cfg.MTree.Diff.MaxDiffRows > 0 {
+		if cfg := config.Get(); cfg != nil {
 			m.MaxDiffRows = cfg.MTree.Diff.MaxDiffRows
 		}
 	}
+	// Reject a negative cap (caller- or config-supplied) so a bad value can't
+	// silently disable the row bound (mirrors table_diff's validation).
+	if m.MaxDiffRows < 0 {
+		return fmt.Errorf("max_diff_rows must be >= 0, got %d", m.MaxDiffRows)
+	}

Please also add a test covering a negative value sourced from config (not just a directly-set negative struct field).

#!/bin/bash
# Check for any other validation of MaxDiffRows in config.go or elsewhere
rg -n -C3 'MaxDiffRows' --type=go
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/consistency/mtree/merkle.go` around lines 2137 - 2180, The negative
max_diff_rows validation in the mtree diff path only checks the pre-config
value, so a negative value from config can still be ignored and fall back to
unbounded diffing. In the diff setup logic in merkle.go, validate m.MaxDiffRows
again after resolving the config value from config.Get().MTree.Diff.MaxDiffRows,
and return an error for any value below zero before proceeding with UpdateMtree
or building DiffResult. Add a test around the mtree diff flow that loads a
negative max_diff_rows from config and asserts it is rejected, not silently
skipped.
🧹 Nitpick comments (1)
internal/consistency/mtree/merkle.go (1)

568-624: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Per-pair warning only logs once globally, not once per pair.

diffLimitWarned is a single bool for the whole task, so notePairLimitReachedLocked's warning names only the first node pair that hits the cap. If several pairs are independently truncated, operators only see one pair name in the logs (the summary-level DiffRowLimitReached flag/final warning still fire correctly for all cases).

♻️ Suggested fix: track warned state per pair
-	diffRowCounts   map[string]int64
-	diffLimitWarned bool
+	diffRowCounts    map[string]int64
+	diffPairsWarned  map[string]struct{}
 func (m *MerkleTreeTask) notePairLimitReachedLocked(pairKey string) {
 	m.DiffResult.Summary.DiffRowLimitReached = true
-	if !m.diffLimitWarned {
-		m.diffLimitWarned = true
+	if _, warned := m.diffPairsWarned[pairKey]; !warned {
+		if m.diffPairsWarned == nil {
+			m.diffPairsWarned = make(map[string]struct{})
+		}
+		m.diffPairsWarned[pairKey] = struct{}{}
 		logger.Warn("mtree table-diff: %s reached max_diff_rows limit (%d); enumeration for this pair stops (other pairs continue)", pairKey, m.MaxDiffRows)
 	}
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/consistency/mtree/merkle.go` around lines 568 - 624, The per-pair
cap warning in MerkleTreeTask is tracked with a single diffLimitWarned flag, so
notePairLimitReachedLocked only logs the first pair that hits MaxDiffRows.
Update the warning tracking to be per pair in MerkleTreeTask (for example
alongside diffRowCounts), and make
notePairLimitReachedLocked/notePairLimitReached consult that per-pair state so
each truncated pair can emit its own warning while still preserving the shared
DiffRowLimitReached summary behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/consistency/mtree/merkle_test.go`:
- Around line 100-109: Add coverage for config-based negative max_diff_rows
handling in DiffMtree: the current TestMtreeDiffRejectsNegativeMaxDiffRows only
exercises MerkleTreeTask.MaxDiffRows set directly, so add a case that leaves
m.MaxDiffRows at zero, sets config.MTree.Diff.MaxDiffRows to a negative value,
and verifies DiffMtree still returns the same validation error. Use the existing
DiffMtree and MerkleTreeTask symbols so the test targets the config-resolution
path in merkle.go.

---

Outside diff comments:
In `@internal/consistency/mtree/merkle.go`:
- Around line 2137-2180: The negative max_diff_rows validation in the mtree diff
path only checks the pre-config value, so a negative value from config can still
be ignored and fall back to unbounded diffing. In the diff setup logic in
merkle.go, validate m.MaxDiffRows again after resolving the config value from
config.Get().MTree.Diff.MaxDiffRows, and return an error for any value below
zero before proceeding with UpdateMtree or building DiffResult. Add a test
around the mtree diff flow that loads a negative max_diff_rows from config and
asserts it is rejected, not silently skipped.

---

Nitpick comments:
In `@internal/consistency/mtree/merkle.go`:
- Around line 568-624: The per-pair cap warning in MerkleTreeTask is tracked
with a single diffLimitWarned flag, so notePairLimitReachedLocked only logs the
first pair that hits MaxDiffRows. Update the warning tracking to be per pair in
MerkleTreeTask (for example alongside diffRowCounts), and make
notePairLimitReachedLocked/notePairLimitReached consult that per-pair state so
each truncated pair can emit its own warning while still preserving the shared
DiffRowLimitReached summary behavior.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 9e2f9d39-4b8d-4188-971d-cdb244ffa0e3

📥 Commits

Reviewing files that changed from the base of the PR and between f085feb and 572c7c2.

📒 Files selected for processing (6)
  • ace.yaml
  • docs/commands/mtree/mtree-table-diff.md
  • internal/cli/default_config.yaml
  • internal/consistency/mtree/merkle.go
  • internal/consistency/mtree/merkle_test.go
  • pkg/config/config.go

Comment thread internal/consistency/mtree/merkle_test.go
…198)

mtree table-diff stayed silent when the collected diffs equalled
max_diff_rows: it only flagged truncation when a further row was refused,
and it had no final-summary warning. table-diff flags at >= the cap and
prints "additional differences may exist", so the two paths disagreed.

- recordPairRowLocked now flags truncation once a pair reaches the cap,
  mirroring table-diff's incrementPairDiffRowsLocked (>= semantics).
- Emit a final "mtree table-diff stopped after reaching max_diff_rows=N;
  additional differences may exist" warning before writing the report,
  the analog of table_diff.go's completion warning.
- Update the boundary unit test to expect DiffRowLimitReached at the
  exact cap.
@mason-sharp mason-sharp force-pushed the fix/ACE-198/mtree-diff-max-rows branch from 1befd21 to b26ace6 Compare July 2, 2026 20:24

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/consistency/mtree/merkle.go`:
- Around line 2194-2202: The MaxDiffRows fallback in merkle.go only accepts
values greater than zero, so a negative mtree.diff.max_diff_rows silently leaves
the cap unset and unbounded. Update the config-loading path for MTree diff
settings, or the MaxDiffRows initialization logic in the merkle config code, to
validate and reject/clamp non-positive values instead of ignoring them. Use the
existing symbols config.Get, cfg.MTree.Diff.MaxDiffRows, and m.MaxDiffRows to
keep the OOM guard always enabled when the caller leaves it unset.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 8ace95ec-eae3-4f0c-8afa-166bd9c14220

📥 Commits

Reviewing files that changed from the base of the PR and between 572c7c2 and b26ace6.

📒 Files selected for processing (2)
  • internal/consistency/mtree/merkle.go
  • internal/consistency/mtree/merkle_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • internal/consistency/mtree/merkle_test.go

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Inline review comments failed to post. This is likely due to GitHub's internal server error or limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/consistency/mtree/merkle.go`:
- Around line 2194-2202: The MaxDiffRows fallback in merkle.go only accepts
values greater than zero, so a negative mtree.diff.max_diff_rows silently leaves
the cap unset and unbounded. Update the config-loading path for MTree diff
settings, or the MaxDiffRows initialization logic in the merkle config code, to
validate and reject/clamp non-positive values instead of ignoring them. Use the
existing symbols config.Get, cfg.MTree.Diff.MaxDiffRows, and m.MaxDiffRows to
keep the OOM guard always enabled when the caller leaves it unset.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 8ace95ec-eae3-4f0c-8afa-166bd9c14220

📥 Commits

Reviewing files that changed from the base of the PR and between 572c7c2 and b26ace6.

📒 Files selected for processing (2)
  • internal/consistency/mtree/merkle.go
  • internal/consistency/mtree/merkle_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • internal/consistency/mtree/merkle_test.go
🛑 Comments failed to post (1)
internal/consistency/mtree/merkle.go (1)

2194-2202: 🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find the mtree diff default(s) and any load-time validation for max_diff_rows.
rg -nP 'max_diff_rows|MaxDiffRows' --type=go -g '!**/*_test.go' -C3
fd -e yaml -e yml | xargs rg -nP 'max_diff_rows' -C2 2>/dev/null

Repository: pgEdge/ace

Length of output: 215


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Locate the mtree config and the relevant source around the referenced lines.
git ls-files | rg 'internal/consistency/mtree/merkle\.go|config|yaml|yml|toml|json|.*\.go$' >/tmp/files.txt

echo "== merkle.go around referenced lines =="
sed -n '2175,2210p' internal/consistency/mtree/merkle.go

echo
echo "== mtree config definitions and defaults =="
rg -n --hidden --glob '!**/*_test.go' 'MaxDiffRows|max_diff_rows|Diff struct|type Diff|MTree struct|config.Get\(\)' internal config .

echo
echo "== likely config load/validation paths =="
rg -n --hidden --glob '!**/*_test.go' 'MaxDiffRows|max_diff_rows|Diff\.' internal config .

Repository: pgEdge/ace

Length of output: 18461


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== pkg/config/config.go =="
sed -n '1,220p' pkg/config/config.go

echo
echo "== default configs =="
sed -n '1,120p' ace.yaml
echo
sed -n '1,120p' internal/cli/default_config.yaml

echo
echo "== mtree config and any validation hooks =="
rg -n --hidden --glob '!**/*_test.go' 'Validate|validation|MaxDiffRows|max_diff_rows' pkg/config internal/cli internal/consistency/mtree | sed -n '1,220p'

Repository: pgEdge/ace

Length of output: 16196


Reject non-positive mtree.diff.max_diff_rows values The shipped defaults keep this cap on, but a negative YAML value is still ignored by the > 0 check and leaves MaxDiffRows at 0/unbounded. Clamp or reject invalid config values on load so a bad config can’t disable the OOM guard.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/consistency/mtree/merkle.go` around lines 2194 - 2202, The
MaxDiffRows fallback in merkle.go only accepts values greater than zero, so a
negative mtree.diff.max_diff_rows silently leaves the cap unset and unbounded.
Update the config-loading path for MTree diff settings, or the MaxDiffRows
initialization logic in the merkle config code, to validate and reject/clamp
non-positive values instead of ignoring them. Use the existing symbols
config.Get, cfg.MTree.Diff.MaxDiffRows, and m.MaxDiffRows to keep the OOM guard
always enabled when the caller leaves it unset.

@mason-sharp mason-sharp merged commit e8a5f1a into main Jul 2, 2026
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants