fix(test): loosen fragile equality assertions in astream incremental tests #1069
Open
planetf1 wants to merge 1 commit into
…tests
Two consecutive streaming chunks can legitimately contain identical content (repeated tokens, whitespace), so asserting `chunk2 != chunk1` or `not chunk2.startswith(chunk1)` produced spurious failures under load or with slow models.
Replace both assertions with structural reassembly checks: verify that accumulated chunks are a prefix of (or equal to) the final value, which is the invariant that actually matters.
Closes generative-computing#628
Assisted-by: Claude Code
markstur
approved these changes
May 14, 2026
Contributor
I approve merging this now as a targeted fix.
But I think the tests could be improved by using a loop for accumulation testing instead of comparing only chunk1 and chunk2. Also, if a loop is added, the 2 tests could probably be consolidated into 1.
Also see one nit about a comment, but it's very optional.
# two consecutive identical tokens are valid).
final_val = await mot.avalue()
accumulated = chunk1 + chunk2
assert final_val.startswith(accumulated) or accumulated.startswith(final_val), (
Contributor
The right operand of the `or` could use a comment. To me it is not the obvious thing we want to test here; it looks more like a separate truncation case.
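To make the distinction between the two operands concrete, here is a minimal sketch that names each case separately. The helper `classify_reassembly` and its labels are hypothetical and not part of the PR; which real scenario the right operand is meant to cover is not stated in the source, so the "overrun" label is only an assumption.

```python
def classify_reassembly(accumulated: str, final_val: str) -> str:
    """Label how the accumulated chunks relate to the final value.

    Hypothetical helper for illustration only; not part of the PR.
    """
    if accumulated == final_val:
        # Both operands of the `or` are true: chunks cover the whole value.
        return "complete"
    if final_val.startswith(accumulated):
        # Left operand only: the chunks read so far are a proper prefix
        # of the final value (more content arrived afterwards).
        return "partial"
    if accumulated.startswith(final_val):
        # Right operand only: the accumulated text is longer than the
        # final value (assumed here to be some truncation of the final value).
        return "overrun"
    # Neither operand holds: the assertion in the PR would fail.
    return "mismatch"


print(classify_reassembly("1, 2", "1, 2, 3"))
print(classify_reassembly("1, 2, 3", "1, 2"))
```

Separating the cases this way would let the test comment say explicitly whether "overrun" is an accepted outcome or something that should fail.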
assert not chunk2.startswith(chunk1), (
    "Second chunk should not start with first chunk (should be incremental)"
)
if chunk2 and len(chunk2) > 0 and chunk1:
Contributor
I think this is OK given the expected answer, but in general this accumulation test would make more sense as a loop instead of working only with chunk1 and chunk2.
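A loop-based version of the accumulation check might look roughly like the sketch below. `FakeThunk` is a hypothetical stand-in written for this example: its `astream()`, `avalue()`, and `is_done()` methods are assumptions about the shape of the object under test, not the project's real API, and the chunk values are made up.

```python
import asyncio


class FakeThunk:
    """Hypothetical stand-in for the streamed object (not the real class)."""

    def __init__(self, chunks):
        self._chunks = list(chunks)
        self._pos = 0

    def is_done(self) -> bool:
        return self._pos >= len(self._chunks)

    async def astream(self) -> str:
        # Assumed behavior: each call returns only the next new chunk.
        await asyncio.sleep(0)
        chunk = self._chunks[self._pos]
        self._pos += 1
        return chunk

    async def avalue(self) -> str:
        # Assumed behavior: returns the full final value.
        self._pos = len(self._chunks)
        return "".join(self._chunks)


async def check_accumulation(mot: FakeThunk) -> str:
    accumulated = ""
    snapshots = []
    # Accumulate every chunk instead of only chunk1 and chunk2.
    while not mot.is_done():
        accumulated += await mot.astream()
        snapshots.append(accumulated)

    final_val = await mot.avalue()
    # Every intermediate accumulation must be a prefix of the final value.
    for snap in snapshots:
        assert final_val.startswith(snap), f"{snap!r} is not a prefix of {final_val!r}"
    # Once the stream is exhausted, the chunks must reassemble exactly.
    assert accumulated == final_val
    return accumulated


# Two identical consecutive chunks ("1", "1") are accepted by this check.
result = asyncio.run(check_accumulation(FakeThunk(["1", "1", ", ", "2"])))
```

Because the loop covers any number of chunks, a single test written this way could plausibly replace both of the two-chunk tests.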
Misc PR
Type of PR
Description
Two assertions in `test/core/test_astream_incremental.py` compared streaming chunk content for inequality, which caused intermittent test failures.
Two consecutive streaming chunks can legitimately contain identical content (repeated tokens, whitespace, short common substrings), so these assertions fail spuriously under load or with slow models.
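The failure mode described above can be reproduced without a model. The sketch below uses made-up chunk values and two hypothetical helpers: `old_assertions_pass` mirrors the inequality-style assertions, and `reassembly_passes` is a simplified prefix-only form of a reassembly check, not the exact assertion in the PR.

```python
def old_assertions_pass(chunk1: str, chunk2: str) -> bool:
    """Mirror of the inequality-style assertions (hypothetical helper)."""
    return chunk2 != chunk1 and not chunk2.startswith(chunk1)


def reassembly_passes(chunk1: str, chunk2: str, final_val: str) -> bool:
    """Simplified reassembly check: the first two chunks form a prefix."""
    return final_val.startswith(chunk1 + chunk2)


# (chunk1, chunk2, final value); all values are invented for illustration.
cases = [
    ("ha", "ha", "haha!"),      # repeated token
    (" ", " ", "  indented"),   # repeated whitespace
    ("1", "10", "110"),         # second chunk happens to start with the first
]

old_results = [old_assertions_pass(c1, c2) for c1, c2, _ in cases]
new_results = [reassembly_passes(c1, c2, final) for c1, c2, final in cases]

print(old_results)
print(new_results)
```

All three streams are valid incremental output, yet the inequality-style check rejects each of them, while the prefix check accepts them.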
Both were replaced with structural reassembly checks: verify that the accumulated chunks form a valid prefix of (or equal) the final streamed value. This is the invariant that actually matters, and is the same pattern already used correctly in `test_astream_multiple_calls_accumulate_correctly`.
Testing
Attribution