
[WIP]#5143

Draft
lidezhu wants to merge 2 commits into master from ldz/fix-gc-ts

Conversation

@lidezhu
Collaborator

@lidezhu lidezhu commented May 26, 2026

What problem does this PR solve?

Issue Number: close #xxx

What is changed and how it works?

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Questions

Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?

Release note

Please refer to [Release Notes Language Style Guide](https://pingcap.github.io/tidb-dev-guide/contribute-to-tidb/release-notes-style-guide.html) to write a quality release note.

If you don't think this PR needs a release note then fill it with `None`.

@ti-chi-bot ti-chi-bot Bot added the do-not-merge/needs-linked-issue, release-note (denotes a PR that will be considered when it comes time to generate release notes), and do-not-merge/work-in-progress (indicates that a PR should not merge because it is a work in progress) labels May 26, 2026
@ti-chi-bot

ti-chi-bot Bot commented May 26, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@ti-chi-bot

ti-chi-bot Bot commented May 26, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign wlwilliamx for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@coderabbitai
Contributor

coderabbitai Bot commented May 26, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c2ef09c9-0325-4492-80b3-90dd2f6bdf5c

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

@ti-chi-bot ti-chi-bot Bot added the size/L label (denotes a PR that changes 100-499 lines, ignoring generated files) May 26, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request modifies the GC safety checks to allow the changefeed startTs (or checkpointTs) to be equal to the GC safepoint, rather than requiring it to be strictly greater. It introduces getGCSafepointUpperBound to set the GC barrier at startTs - 1 and updates tests accordingly. The review feedback highlights a critical issue in next-gen mode where setting startTs == gcSafepoint could leave the changefeed unprotected if the barrier is not explicitly set to startTs, and suggests adding assertions in tests to verify the barrier is correctly set at this boundary.
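The boundary change summarized above can be sketched in a few lines. This is only an illustration: `safepointUpperBound` and `startTsIsSafe` are hypothetical stand-ins, since the PR's actual `getGCSafepointUpperBound` is not shown on this page, and the guard for `startTs == 0` is an assumption added here to avoid unsigned underflow.

```go
package main

import "fmt"

// safepointUpperBound is a hypothetical stand-in for the PR's
// getGCSafepointUpperBound: the barrier is placed at startTs - 1.
// The zero guard is an assumption to avoid uint64 underflow.
func safepointUpperBound(startTs uint64) uint64 {
	if startTs == 0 {
		return 0
	}
	return startTs - 1
}

// startTsIsSafe models the relaxed check: startTs may now be equal to
// the GC safepoint instead of having to be strictly greater.
func startTsIsSafe(startTs, gcSafepoint uint64) bool {
	return startTs >= gcSafepoint
}

func main() {
	const gcSafepoint = 60
	for _, startTs := range []uint64{59, 60, 61} {
		fmt.Printf("startTs=%d upperBound=%d safe=%v\n",
			startTs, safepointUpperBound(startTs), startTsIsSafe(startTs, gcSafepoint))
	}
}
```

Under the old strict check, the middle case (startTs equal to the safepoint) would have been rejected; under the relaxed check it is accepted, which is exactly the boundary the review comments below focus on.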

Comment on lines 87 to 98
 	if err != nil {
-		return errors.ErrStartTsBeforeGC.GenWithStackByArgs(startTs)
+		if !errors.IsGCBarrierTSBehindTxnSafePointError(err) {
+			return errors.ErrStartTsBeforeGC.GenWithStackByArgs(startTs, gcSafepointUpperBound)
+		}
+		gcSafepoint, getErr := UnifyGetServiceGCSafepoint(ctx, pdCli, keyspaceID, gcServiceID)
+		if getErr != nil {
+			return getErr
+		}
+		if startTs > 0 && startTs < gcSafepoint {
+			return errors.ErrStartTsBeforeGC.GenWithStackByArgs(startTs, gcSafepoint)
+		}
 	}

critical

In next-gen mode, if SetGCBarrier fails because gcSafepointUpperBound (startTs - 1) is behind the current GC safepoint, and startTs == gcSafepoint, the current implementation simply returns nil (success) without setting any barrier. This leaves the changefeed unprotected, and a subsequent GC run could advance the safepoint past startTs before the changefeed starts and sets its own barrier.

To prevent this potential data loss, we should explicitly set the GC barrier to startTs when startTs == gcSafepoint.

	if err != nil {
		if !errors.IsGCBarrierTSBehindTxnSafePointError(err) {
			return errors.ErrStartTsBeforeGC.GenWithStackByArgs(startTs, gcSafepointUpperBound)
		}
		gcSafepoint, getErr := UnifyGetServiceGCSafepoint(ctx, pdCli, keyspaceID, gcServiceID)
		if getErr != nil {
			return getErr
		}
		if startTs > 0 && startTs < gcSafepoint {
			return errors.ErrStartTsBeforeGC.GenWithStackByArgs(startTs, gcSafepoint)
		}
		if startTs == gcSafepoint {
			_, err = SetGCBarrier(ctx, gcCli, gcServiceID, startTs, time.Duration(ttl)*time.Second)
			if err != nil {
				return errors.Trace(err)
			}
		}
	}

Comment thread pkg/txnutil/gc/gc_service_test.go Outdated
Comment on lines +122 to +126
	err = EnsureChangefeedStartTsSafety(ctx, pdCli,
		"ticdc-creating-",
		0,
		common.NewChangeFeedIDWithName("changefeed-boundary", "default"), TTL, 60)
	require.NoError(t, err)

medium

We should assert that the GC barrier is actually set and active in PD when startTs == gcSafepoint (boundary condition), rather than just checking that no error is returned.

Suggested change
-	err = EnsureChangefeedStartTsSafety(ctx, pdCli,
-		"ticdc-creating-",
-		0,
-		common.NewChangeFeedIDWithName("changefeed-boundary", "default"), TTL, 60)
-	require.NoError(t, err)
+	err = EnsureChangefeedStartTsSafety(ctx, pdCli,
+		"ticdc-creating-",
+		0,
+		common.NewChangeFeedIDWithName("changefeed-boundary", "default"), TTL, 60)
+	require.NoError(t, err)
+	require.Equal(t, uint64(60), pdCli.gcBarriers["ticdc-creating-default_changefeed-boundary"])

@ti-chi-bot

ti-chi-bot Bot commented May 26, 2026

[FORMAT CHECKER NOTIFICATION]

Notice: To remove the do-not-merge/needs-linked-issue label, please provide the linked issue number on one line in the PR body, for example: Issue Number: close #123 or Issue Number: ref #456.

📖 For more info, you can check the "Contribute Code" section in the development guide.

Labels

  • do-not-merge/needs-linked-issue
  • do-not-merge/work-in-progress: Indicates that a PR should not merge because it is a work in progress.
  • release-note: Denotes a PR that will be considered when it comes time to generate release notes.
  • size/L: Denotes a PR that changes 100-499 lines, ignoring generated files.
