Skip to content

fix(tests): align 3 stale OSS test assertions with intentional defaults#164

Merged
yilu331 merged 1 commit into
mainfrom
ci/check-and-test-nightly-20260616
Jun 16, 2026
Merged

fix(tests): align 3 stale OSS test assertions with intentional defaults#164
yilu331 merged 1 commit into
mainfrom
ci/check-and-test-nightly-20260616

Conversation

@yilu331

@yilu331 yilu331 commented Jun 16, 2026

Copy link
Copy Markdown
Collaborator

Summary

Nightly /check-and-test run found 3 OSS unit-test failures that were stale assertions lagging intentional behavior changes on main. The application code is correct; the tests are fixed to match.

Test Was Now Why
test_unified_search_floor::test_floor_applied_per_arm RetrievalFloorConfig(user_playbook_floor=-5.0) adds enabled=True The enabled default flipped TrueFalse; the floor-applied test must opt in.
test_litellm_client::test_config_defaults max_retries == 1 max_retries == 3 #124 adopted native litellm num_retries; default is now 3.
test_embedding_service_concurrency::test_embed_texts_caps_and_queues peak >= 2 peak == 1 #153 added _MODEL_ENCODE_LOCK serializing inference (tensor-race fix); encode peak is always 1.

Also includes pre-existing ruff-format drift in billing_meter.py and extraction/outcome.py (formatting only).

Verification

Full OSS unit + integration suite: 3061 passed, 0 failed (pytest tests/ --ignore=tests/e2e_tests/, mocked LLM).

Summary by CodeRabbit

  • Tests
    • Updated test assertions for concurrency behavior and configuration defaults
  • Refactor
    • Improved internal code structure and formatting for maintainability

@coderabbitai

coderabbitai Bot commented Jun 16, 2026

Copy link
Copy Markdown

Review Change Stack

Warning

Review limit reached

@yilu331, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 25 minutes and 31 seconds. Learn how PR review limits work.

Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file).

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 918f93a5-2ff6-4c25-a2c0-181d39658f1b

📥 Commits

Reviewing files that changed from the base of the PR and between 7eda9b5 and 49cb14d.

📒 Files selected for processing (1)
  • tests/server/llm/test_litellm_client.py
📝 Walkthrough

Walkthrough

Five small, independent edits: _INTERNAL in billing_meter.py is changed from a str to a tuple; ExtractionOutcome.completed is reformatted with no logic change; three test assertions are updated to match current runtime behavior (recorder.peak == 1, max_retries == 3, enabled=True).

Changes

Miscellaneous Fixes and Test Alignment

Layer / File(s) Summary
Source constant and classmethod reformat
reflexio/server/billing_meter.py, reflexio/server/services/extraction/outcome.py
_INTERNAL is redefined as a tuple wrapping "internal" instead of a plain string; ExtractionOutcome.completed return cls(...) is split across multiple lines with identical arguments.
Test assertion alignment
tests/server/llm/test_embedding_service_concurrency.py, tests/server/llm/test_litellm_client.py, tests/server/services/test_unified_search_floor.py
Concurrency test expects recorder.peak == 1 to match _MODEL_ENCODE_LOCK serialization; LiteLLMConfig.max_retries default expectation changed from 1 to 3; RetrievalFloorConfig in search floor test gains explicit enabled=True.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

  • ReflexioAI/reflexio#153: Introduced _MODEL_ENCODE_LOCK to serialize model inference in embedding_service.py, directly motivating the recorder.peak == 1 assertion update in this PR.

Poem

🐇 A tuple here, a peak of one,
Three retries where there once was one,
enabled=True makes the floor run,
Multi-line returns, neatly done —
Small hops forward, every one! 🌿

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately captures the main objective of fixing three stale test assertions to align with intentional defaults.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch ci/check-and-test-nightly-20260616

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@yilu331 yilu331 marked this pull request as ready for review June 16, 2026 16:28
…rmat

Three OSS test failures were stale assertions that lagged intentional
behavior changes on main; fix the tests (the application behavior is correct):

- test_unified_search_floor: RetrievalFloorConfig.enabled default flipped
  True->False, so the floor-applied test must opt in with enabled=True.
- test_litellm_client::test_config_defaults: max_retries default is now 3
  (native litellm num_retries adoption, #124), not 1.
- test_embedding_service_concurrency: model inference is now serialized by
  _MODEL_ENCODE_LOCK (#153, fixes a tensor race), so recorder.peak is
  always 1; the old `peak >= 2` assertion contradicted the new design.

Also applies pre-existing ruff-format drift to billing_meter.py and
extraction/outcome.py (formatting only, no behavior change).
@yilu331 yilu331 force-pushed the ci/check-and-test-nightly-20260616 branch from 7eda9b5 to 49cb14d Compare June 16, 2026 16:32
@yilu331 yilu331 merged commit 6e85810 into main Jun 16, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant