Skip to content

Comments

Change empty-match exit from break 2 to break#46

Merged
JanTvrdik merged 1 commit intomainfrom
test/empty-match-fail-early
Feb 24, 2026
Merged

Change empty-match exit from break 2 to break#46
JanTvrdik merged 1 commit intomainfrom
test/empty-match-fail-early

Conversation

@JanTvrdik
Copy link
Member

@JanTvrdik JanTvrdik commented Feb 24, 2026

Summary

  • Changes break 2 to break on empty match in PatternIterator::getIterator() for simpler control flow that is easier to reason about.
  • Adds a regression test (testZeroLengthMatchMidBufferLoadsMoreData) demonstrating a case where break 2 would incorrectly abort: when an empty match occurs mid-buffer because a fixed-width alternative needs more characters than remain, loading the next chunk resolves it into a positive-length match.

Context

With break 2, an empty match mid-buffer with the stream still active would immediately exit both loops and throw. With break, the iterator returns to the outer loop, loads the next chunk, and retries — which can turn the empty match into a successful one when more data is available.

Since the library's SQL tokenizer patterns never produce empty matches in practice, the break 2 fail-fast offered no real benefit while being harder to reason about.

Test plan

  • vendor/bin/tester tests/ — all 65 tests pass

Copilot AI review requested due to automatic review settings February 24, 2026 07:08
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds a regression test for the fail-fast behavior of PatternIterator when encountering an empty regex match. The test verifies that the break 2 statement (which exits both inner and outer loops) prevents unnecessary stream consumption when an empty match occurs mid-buffer, since loading more data cannot help resolve the situation.

Changes:

  • Adds testEmptyMatchFailsEarlyWithoutConsumingEntireStream() to verify that only 2 chunks are touched (1 processed + 1 prefetched) when an empty match triggers early exit, rather than draining all 4 chunks.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

The break 2 was harder to reason about and could incorrectly abort parsing
when an empty match occurred mid-buffer with more stream data available.
With break (single), the iterator loads the next chunk, which can resolve
the empty match into a positive-length one (e.g. when a fixed-width
alternative needs more characters than remain in the current buffer).
@JanTvrdik JanTvrdik force-pushed the test/empty-match-fail-early branch from f744c98 to 1f1dc07 Compare February 24, 2026 07:21
@JanTvrdik JanTvrdik changed the title Add test for empty-match fail-early behavior in PatternIterator Change empty-match exit from break 2 to break Feb 24, 2026
@JanTvrdik JanTvrdik requested a review from Copilot February 24, 2026 07:26
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@JanTvrdik JanTvrdik merged commit ca40970 into main Feb 24, 2026
16 checks passed
@JanTvrdik JanTvrdik deleted the test/empty-match-fail-early branch February 24, 2026 07:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant