Skip to content

speed ci#922

Open
hh-space-invader wants to merge 21 commits intodevfrom
speed-ci
Open

speed ci#922
hh-space-invader wants to merge 21 commits intodevfrom
speed-ci

Conversation

@hh-space-invader
Copy link
Copy Markdown
Contributor

improve: Updated vectors with smaller ones

All Submissions:

  • Contributions should target the dev branch. Did you create your branch from dev?
  • Have you followed the guidelines in our Contributing document?
  • Have you checked to ensure there aren't other open Pull Requests for the same update/change?

New Feature Submissions:

  1. Does your submission pass tests?
  2. Have you installed pre-commit with pip3 install pre-commit and set up hooks with pre-commit install?

Changes to Core Features:

  • Have you added an explanation of what your changes do and why you'd like us to include them?
  • Have you written new tests for your core changes, as applicable?
  • Have you successfully ran tests with your changes locally?

improve: Updated vectors with smaller ones
@netlify
Copy link
Copy Markdown

netlify Bot commented Mar 11, 2025

Deploy Preview for poetic-froyo-8baba7 ready!

Name Link
🔨 Latest commit 0f2b5a4
🔍 Latest deploy log https://app.netlify.com/sites/poetic-froyo-8baba7/deploys/67dbeab5e6aa64000845072b
😎 Deploy Preview https://deploy-preview-922--poetic-froyo-8baba7.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Mar 11, 2025

📝 Walkthrough

Walkthrough

The changes modify several test files to standardize and parameterize vector counts and model configuration values. In multiple test files, the previous hardcoded vector numbers (generally 1000) are replaced with constants such as NUM_VECTORS set to 100 or QUERY_VECTORS_NUM set to 1000, ensuring consistent test setups across functions that generate fixtures. Additionally, some tests have removed redundant parameters to streamline fixture generation calls. The updates include adjustments to model names and dimensions in inference tests, specifically changing model configurations, and minor formatting improvements for readability, such as condensing multi-line indexing. These modifications focus on improving consistency, maintainability, and clarity in how test data and configurations are defined and used.

✨ Finishing Touches
  • 📝 Generate Docstrings

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (14)
tests/congruence_tests/test_group_search.py (2)

306-306: Reduced test iterations for better performance

Reducing iterations from 100 to 50 is a reasonable compromise that balances test thoroughness with execution time.

Consider renaming the unused loop variable i to _ to indicate intentional non-use:

-    for i in range(50):
+    for _ in range(50):
🧰 Tools
🪛 Ruff (0.8.2)

306-306: Loop control variable i not used within loop body

Rename unused i to _i

(B007)


339-339: Reduced test iterations for better performance

Similar to the previous loop, reducing iterations from 100 to 50 helps improve test execution time.

Consider renaming the unused loop variable i to _ to indicate intentional non-use:

-    for i in range(50):
+    for _ in range(50):
🧰 Tools
🪛 Ruff (0.8.2)

339-339: Loop control variable i not used within loop body

Rename unused i to _i

(B007)

tests/congruence_tests/test_count.py (2)

41-41: Reduced test iterations for better performance

Lowering iterations from 100 to 50 in this test is consistent with changes in other test files and improves execution time.

Consider renaming the unused loop variable i to _ to indicate intentional non-use:

-    for i in range(50):
+    for _ in range(50):
🧰 Tools
🪛 Ruff (0.8.2)

41-41: Loop control variable i not used within loop body

Rename unused i to _i

(B007)


63-63: Reduced test iterations for better performance

This change is consistent with the pattern of reducing test iterations across the codebase.

Consider renaming the unused loop variable i to _ to indicate intentional non-use:

-    for i in range(50):
+    for _ in range(50):
🧰 Tools
🪛 Ruff (0.8.2)

63-63: Loop control variable i not used within loop body

Rename unused i to _i

(B007)

tests/congruence_tests/test_query.py (5)

862-862: Optimization: Loop iterations reduced for faster test execution.

The iteration count has been reduced from 100 to 50, which will reduce test execution time while still providing sufficient test coverage.

Consider renaming the unused loop variable i to _ to follow Python conventions for unused variables:

-for i in range(50):
+for _ in range(50):

1004-1004: Optimization: Loop iterations reduced for faster test execution.

The iteration count has been reduced from 100 to 50, which will reduce test execution time while still providing sufficient test coverage.

Consider renaming the unused loop variable i to _ to follow Python conventions for unused variables:

-for i in range(50):
+for _ in range(50):
🧰 Tools
🪛 Ruff (0.8.2)

1004-1004: Loop control variable i not used within loop body

Rename unused i to _i

(B007)


1134-1134: Optimization: Loop iterations reduced for faster test execution.

The iteration count has been reduced from 100 to 50, which will reduce test execution time while still providing sufficient test coverage.

Consider renaming the unused loop variable i to _ to follow Python conventions for unused variables:

-for i in range(50):
+for _ in range(50):
🧰 Tools
🪛 Ruff (0.8.2)

1134-1134: Loop control variable i not used within loop body

Rename unused i to _i

(B007)


1170-1170: Optimization: Loop iterations reduced for faster test execution.

The iteration count has been reduced from 100 to 50, which will reduce test execution time while still providing sufficient test coverage.

Consider renaming the unused loop variable i to _ to follow Python conventions for unused variables:

-for i in range(50):
+for _ in range(50):
🧰 Tools
🪛 Ruff (0.8.2)

1170-1170: Loop control variable i not used within loop body

Rename unused i to _i

(B007)


1473-1473: Optimization: Loop iterations reduced for faster test execution.

The iteration count has been reduced from 100 to 50, which will reduce test execution time while still providing sufficient test coverage.

Consider renaming the unused loop variable i to _ to follow Python conventions for unused variables:

-for i in range(50):
+for _ in range(50):
🧰 Tools
🪛 Ruff (0.8.2)

1473-1473: Loop control variable i not used within loop body

Rename unused i to _i

(B007)

tests/congruence_tests/test_sparse_search.py (2)

164-164: Optimization: Loop iterations reduced for faster test execution.

The iteration count has been reduced from 100 to 50, which will reduce test execution time while still providing sufficient test coverage.

Consider renaming the unused loop variable i to _ to follow Python conventions for unused variables:

-for i in range(50):
+for _ in range(50):
🧰 Tools
🪛 Ruff (0.8.2)

164-164: Loop control variable i not used within loop body

Rename unused i to _i

(B007)


195-195: Optimization: Loop iterations reduced for faster test execution.

The iteration count has been reduced from 100 to 50, which will reduce test execution time while still providing sufficient test coverage.

Consider renaming the unused loop variable i to _ to follow Python conventions for unused variables:

-for i in range(50):
+for _ in range(50):
🧰 Tools
🪛 Ruff (0.8.2)

195-195: Loop control variable i not used within loop body

Rename unused i to _i

(B007)

tests/congruence_tests/test_search.py (3)

169-169: Reducing the test loop iterations is good for performance.

Reducing the number of test iterations from 100 to 50 will improve test execution speed while still maintaining sufficient coverage.

The loop control variable i is unused within the loop body. Consider renaming it to _ or _i to indicate it's deliberately unused:

-    for i in range(50):
+    for _ in range(50):
🧰 Tools
🪛 Ruff (0.8.2)

169-169: Loop control variable i not used within loop body

Rename unused i to _i

(B007)


201-201: Reducing the test loop iterations is good for performance.

Reducing the number of test iterations from 100 to 50 will improve test execution speed while still maintaining sufficient coverage.

The loop control variable i is unused within the loop body. Consider renaming it to _ or _i to indicate it's deliberately unused:

-    for i in range(50):
+    for _ in range(50):
🧰 Tools
🪛 Ruff (0.8.2)

201-201: Loop control variable i not used within loop body

Rename unused i to _i

(B007)


228-228: Reducing the test loop iterations is good for performance.

Reducing the number of test iterations from 100 to 50 will improve test execution speed while still maintaining sufficient coverage.

The loop control variable i is unused within the loop body. Consider renaming it to _ or _i to indicate it's deliberately unused:

-    for i in range(50):
+    for _ in range(50):
🧰 Tools
🪛 Ruff (0.8.2)

228-228: Loop control variable i not used within loop body

Rename unused i to _i

(B007)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d8951c5 and 7cfedd1.

📒 Files selected for processing (10)
  • tests/congruence_tests/test_common.py (1 hunks)
  • tests/congruence_tests/test_count.py (2 hunks)
  • tests/congruence_tests/test_group_search.py (2 hunks)
  • tests/congruence_tests/test_query.py (5 hunks)
  • tests/congruence_tests/test_search.py (3 hunks)
  • tests/congruence_tests/test_sparse_search.py (2 hunks)
  • tests/embed_tests/test_local_inference.py (1 hunks)
  • tests/test_async_qdrant_client.py (2 hunks)
  • tests/test_local_persistence.py (2 hunks)
  • tests/test_migrate.py (3 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
tests/congruence_tests/test_group_search.py

306-306: Loop control variable i not used within loop body

Rename unused i to _i

(B007)


339-339: Loop control variable i not used within loop body

Rename unused i to _i

(B007)

tests/congruence_tests/test_sparse_search.py

164-164: Loop control variable i not used within loop body

Rename unused i to _i

(B007)


195-195: Loop control variable i not used within loop body

Rename unused i to _i

(B007)

tests/congruence_tests/test_count.py

41-41: Loop control variable i not used within loop body

Rename unused i to _i

(B007)


63-63: Loop control variable i not used within loop body

Rename unused i to _i

(B007)

tests/congruence_tests/test_search.py

169-169: Loop control variable i not used within loop body

Rename unused i to _i

(B007)


201-201: Loop control variable i not used within loop body

Rename unused i to _i

(B007)


228-228: Loop control variable i not used within loop body

Rename unused i to _i

(B007)

tests/congruence_tests/test_query.py

1004-1004: Loop control variable i not used within loop body

Rename unused i to _i

(B007)


1134-1134: Loop control variable i not used within loop body

Rename unused i to _i

(B007)


1170-1170: Loop control variable i not used within loop body

Rename unused i to _i

(B007)


1473-1473: Loop control variable i not used within loop body

Rename unused i to _i

(B007)

⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: Redirect rules - poetic-froyo-8baba7
  • GitHub Check: Header rules - poetic-froyo-8baba7
  • GitHub Check: Pages changed - poetic-froyo-8baba7
  • GitHub Check: Python 3.13.x on ubuntu-latest test
  • GitHub Check: Python 3.12.x on ubuntu-latest test
  • GitHub Check: Python 3.11.x on ubuntu-latest test
  • GitHub Check: Python 3.10.x on ubuntu-latest test
  • GitHub Check: Python 3.9.x on ubuntu-latest test
  • GitHub Check: Python 3.13 test
🔇 Additional comments (9)
tests/embed_tests/test_local_inference.py (1)

17-23: Model changes align with PR objective to use smaller models

The changes to model names and dimensions reflect the PR goal of updating fastembed models with smaller alternatives:

  1. DENSE_MODEL_NAME changes from sentence-transformers model to smaller BAAI model
  2. COLBERT_MODEL_NAME changes to a smaller version with dimension reduction from 128 to 96
  3. DENSE_IMAGE_MODEL_NAME changes from ResNet50 to CLIP ViT-B-32 with dimension reduction from 2048 to 512

These changes should reduce memory usage and potentially improve performance in tests while maintaining functionality.

tests/test_local_persistence.py (2)

40-40: Reduced ID range for random sampling

Reducing the range for random ID sampling from 100 to 50 is consistent with the overall PR objective of using smaller test data. This change aligns with similar modifications in other test files.


70-70: Reduced ID range for random sampling

Similar to the previous change, this reduction in ID range helps streamline the test data while maintaining test effectiveness.

tests/test_migrate.py (3)

19-19: Optimization: Reduced vector count for faster test execution.

Reducing VECTOR_NUMBER from 1000 to 100 will significantly decrease test execution time while still providing sufficient coverage. This is consistent with other similar changes across the test suite.


481-481: Code style improvement: Simplified initialization to a single line.

The multi-line initialization has been condensed to a single line, improving readability for this simple configuration object.


493-493: Code style improvement: Added trailing comma for consistency.

Adding a trailing comma is good practice in Python as it makes future additions to the parameter list cleaner in version control diffs.

tests/congruence_tests/test_common.py (3)

17-19: Appropriate reduction in dense vector sizes.

Reducing vector dimensions will decrease memory usage and improve test execution speed, aligning well with the PR objective of updating fastembed models with smaller alternatives.


22-24: Significant reduction in sparse vector sizes.

The sparse vector size reductions (particularly from 10,000 to 1,000 for code vectors) will substantially decrease memory requirements and improve test performance.


27-27: Significant reduction in number of test vectors.

Reducing NUM_VECTORS from 1,000 to 100 represents a 90% decrease, which will greatly improve test execution speed while still maintaining enough vectors to test functionality properly.

Comment thread tests/test_async_qdrant_client.py Outdated
Comment thread tests/test_async_qdrant_client.py Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
tests/congruence_tests/test_group_search.py (2)

306-306: Consider using underscore for unused loop variable.

The loop variable i is not used within the loop body. Consider renaming it to _ to indicate that it's intentionally unused, following Python conventions.

-    for i in range(50):
+    for _ in range(50):
🧰 Tools
🪛 Ruff (0.8.2)

306-306: Loop control variable i not used within loop body

Rename unused i to _i

(B007)


339-339: Consider using underscore for unused loop variable.

The loop variable i is not used within the loop body. Consider renaming it to _ to indicate that it's intentionally unused, following Python conventions.

-        for i in range(50):
+        for _ in range(50):
🧰 Tools
🪛 Ruff (0.8.2)

339-339: Loop control variable i not used within loop body

Rename unused i to _i

(B007)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 30ae574 and 2c7b58d.

📒 Files selected for processing (6)
  • tests/congruence_tests/test_group_search.py (2 hunks)
  • tests/test_async_qdrant_client.py (2 hunks)
  • tests/test_async_qdrant_client.py (4 hunks)
  • tests/test_async_qdrant_client.py (2 hunks)
  • tests/test_async_qdrant_client.py (4 hunks)
  • tests/congruence_tests/test_group_search.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
  • tests/test_async_qdrant_client.py
  • tests/test_async_qdrant_client.py
  • tests/congruence_tests/test_group_search.py
  • tests/test_async_qdrant_client.py
  • tests/test_async_qdrant_client.py
🧰 Additional context used
🪛 Ruff (0.8.2)
tests/congruence_tests/test_group_search.py

306-306: Loop control variable i not used within loop body

Rename unused i to _i

(B007)


339-339: Loop control variable i not used within loop body

Rename unused i to _i

(B007)

⏰ Context from checks skipped due to timeout of 90000ms (6)
  • GitHub Check: Python 3.13.x on ubuntu-latest test
  • GitHub Check: Python 3.12.x on ubuntu-latest test
  • GitHub Check: Python 3.11.x on ubuntu-latest test
  • GitHub Check: Python 3.10.x on ubuntu-latest test
  • GitHub Check: Python 3.9.x on ubuntu-latest test
  • GitHub Check: Python 3.13 test
🔇 Additional comments (1)
tests/congruence_tests/test_group_search.py (1)

306-318: Reduced test iteration count looks good.

Reducing the number of iterations from 100 to 50 is a good improvement that will make the tests run faster while still maintaining adequate test coverage. This change is consistent with similar changes in other test files in this PR.

Also applies to: 339-351

🧰 Tools
🪛 Ruff (0.8.2)

306-306: Loop control variable i not used within loop body

Rename unused i to _i

(B007)

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (14)
tests/congruence_tests/test_group_search.py (2)

306-306: Reduced test iterations for better performance

Reducing the loop iterations from 100 to 50 aligns with the PR objective of using smaller models, helping tests run faster while maintaining adequate test coverage.

Consider renaming the unused loop variable to follow Python conventions:

-for i in range(50):
+for _ in range(50):
🧰 Tools
🪛 Ruff (0.8.2)

306-306: Loop control variable i not used within loop body

Rename unused i to _i

(B007)


339-339: Reduced test iterations for better performance

Reducing the loop iterations from 100 to 50 aligns with the PR objective of using smaller models, helping tests run faster while maintaining adequate test coverage.

Consider renaming the unused loop variable to follow Python conventions:

-for i in range(50):
+for _ in range(50):
🧰 Tools
🪛 Ruff (0.8.2)

339-339: Loop control variable i not used within loop body

Rename unused i to _i

(B007)

tests/congruence_tests/test_count.py (2)

41-41: Reduced test iterations for better performance

Reducing the loop iterations from 100 to 50 aligns with the PR objective of using smaller models, helping tests run faster while maintaining adequate test coverage.

Consider renaming the unused loop variable to follow Python conventions:

-for i in range(50):
+for _ in range(50):
🧰 Tools
🪛 Ruff (0.8.2)

41-41: Loop control variable i not used within loop body

Rename unused i to _i

(B007)


63-63: Reduced test iterations for better performance

Reducing the loop iterations from 100 to 50 aligns with the PR objective of using smaller models, helping tests run faster while maintaining adequate test coverage.

Consider renaming the unused loop variable to follow Python conventions:

-for i in range(50):
+for _ in range(50):
🧰 Tools
🪛 Ruff (0.8.2)

63-63: Loop control variable i not used within loop body

Rename unused i to _i

(B007)

tests/congruence_tests/test_query.py (5)

862-862: Reduced test iterations for better performance

Reducing the loop iterations from 100 to 50 aligns with the PR objective of using smaller models, helping tests run faster while maintaining adequate test coverage.

Consider renaming the unused loop variable to follow Python conventions:

-for i in range(50):
+for _ in range(50):

1004-1004: Reduced test iterations for better performance

Reducing the loop iterations from 100 to 50 aligns with the PR objective of using smaller models, helping tests run faster while maintaining adequate test coverage.

Consider renaming the unused loop variable to follow Python conventions:

-for i in range(50):
+for _ in range(50):
🧰 Tools
🪛 Ruff (0.8.2)

1004-1004: Loop control variable i not used within loop body

Rename unused i to _i

(B007)


1134-1134: Reduced test iterations for better performance

Reducing the loop iterations from 100 to 50 aligns with the PR objective of using smaller models, helping tests run faster while maintaining adequate test coverage.

Consider renaming the unused loop variable to follow Python conventions:

-for i in range(50):
+for _ in range(50):
🧰 Tools
🪛 Ruff (0.8.2)

1134-1134: Loop control variable i not used within loop body

Rename unused i to _i

(B007)


1170-1170: Reduced test iterations for better performance

Reducing the loop iterations from 100 to 50 aligns with the PR objective of using smaller models, helping tests run faster while maintaining adequate test coverage.

Consider renaming the unused loop variable to follow Python conventions:

-for i in range(50):
+for _ in range(50):
🧰 Tools
🪛 Ruff (0.8.2)

1170-1170: Loop control variable i not used within loop body

Rename unused i to _i

(B007)


1473-1473: Reduced test iterations for better performance

Reducing the loop iterations from 100 to 50 aligns with the PR objective of using smaller models, helping tests run faster while maintaining adequate test coverage.

Consider renaming the unused loop variable to follow Python conventions:

-for i in range(50):
+for _ in range(50):
🧰 Tools
🪛 Ruff (0.8.2)

1473-1473: Loop control variable i not used within loop body

Rename unused i to _i

(B007)

tests/congruence_tests/test_sparse_search.py (2)

164-164: Reduced test iterations for better performance

Reducing the loop iterations from 100 to 50 aligns with the PR objective of using smaller models, helping tests run faster while maintaining adequate test coverage.

Consider renaming the unused loop variable to follow Python conventions:

-for i in range(50):
+for _ in range(50):
🧰 Tools
🪛 Ruff (0.8.2)

164-164: Loop control variable i not used within loop body

Rename unused i to _i

(B007)


195-195: Reduced test iterations for better performance

Reducing the loop iterations from 100 to 50 aligns with the PR objective of using smaller models, helping tests run faster while maintaining adequate test coverage.

Consider renaming the unused loop variable to follow Python conventions:

-for i in range(50):
+for _ in range(50):
🧰 Tools
🪛 Ruff (0.8.2)

195-195: Loop control variable i not used within loop body

Rename unused i to _i

(B007)

tests/congruence_tests/test_search.py (3)

169-169: Prefer using _ for unused loop variables.

The loop control variable i is not used within the loop body. Consider using _ to indicate an intentionally unused variable.

-    for i in range(50):
+    for _ in range(50):
🧰 Tools
🪛 Ruff (0.8.2)

169-169: Loop control variable i not used within loop body

Rename unused i to _i

(B007)


201-201: Prefer using _ for unused loop variables.

The loop control variable i is not used within the loop body. Consider using _ to indicate an intentionally unused variable.

-    for i in range(50):
+    for _ in range(50):
🧰 Tools
🪛 Ruff (0.8.2)

201-201: Loop control variable i not used within loop body

Rename unused i to _i

(B007)


228-228: Prefer using _ for unused loop variables.

The loop control variable i is not used within the loop body. Consider using _ to indicate an intentionally unused variable.

-    for i in range(50):
+    for _ in range(50):
🧰 Tools
🪛 Ruff (0.8.2)

228-228: Loop control variable i not used within loop body

Rename unused i to _i

(B007)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2c7b58d and 752abce.

📒 Files selected for processing (18)
  • tests/congruence_tests/test_count.py (2 hunks)
  • tests/congruence_tests/test_group_search.py (2 hunks)
  • tests/congruence_tests/test_query.py (5 hunks)
  • tests/congruence_tests/test_search.py (3 hunks)
  • tests/congruence_tests/test_sparse_search.py (2 hunks)
  • tests/test_async_qdrant_client.py (2 hunks)
  • tests/test_local_persistence.py (2 hunks)
  • tests/test_async_qdrant_client.py (4 hunks)
  • tests/test_async_qdrant_client.py (2 hunks)
  • tests/test_async_qdrant_client.py (4 hunks)
  • tests/congruence_tests/test_group_search.py (1 hunks)
  • tests/congruence_tests/test_count.py (2 hunks)
  • tests/congruence_tests/test_group_search.py (1 hunks)
  • tests/congruence_tests/test_query.py (5 hunks)
  • tests/congruence_tests/test_search.py (3 hunks)
  • tests/congruence_tests/test_sparse_search.py (2 hunks)
  • tests/test_async_qdrant_client.py (1 hunks)
  • tests/test_local_persistence.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (13)
  • tests/test_local_persistence.py
  • tests/congruence_tests/test_count.py
  • tests/congruence_tests/test_group_search.py
  • tests/test_async_qdrant_client.py
  • tests/test_async_qdrant_client.py
  • tests/test_async_qdrant_client.py
  • tests/test_local_persistence.py
  • tests/congruence_tests/test_group_search.py
  • tests/test_async_qdrant_client.py
  • tests/congruence_tests/test_sparse_search.py
  • tests/test_async_qdrant_client.py
  • tests/congruence_tests/test_query.py
  • tests/congruence_tests/test_search.py
🧰 Additional context used
🪛 Ruff (0.8.2)
tests/congruence_tests/test_count.py

41-41: Loop control variable i not used within loop body

Rename unused i to _i

(B007)


63-63: Loop control variable i not used within loop body

Rename unused i to _i

(B007)

tests/congruence_tests/test_group_search.py

306-306: Loop control variable i not used within loop body

Rename unused i to _i

(B007)


339-339: Loop control variable i not used within loop body

Rename unused i to _i

(B007)

tests/congruence_tests/test_query.py

1004-1004: Loop control variable i not used within loop body

Rename unused i to _i

(B007)


1134-1134: Loop control variable i not used within loop body

Rename unused i to _i

(B007)


1170-1170: Loop control variable i not used within loop body

Rename unused i to _i

(B007)


1473-1473: Loop control variable i not used within loop body

Rename unused i to _i

(B007)

tests/congruence_tests/test_search.py

169-169: Loop control variable i not used within loop body

Rename unused i to _i

(B007)


201-201: Loop control variable i not used within loop body

Rename unused i to _i

(B007)


228-228: Loop control variable i not used within loop body

Rename unused i to _i

(B007)

tests/congruence_tests/test_sparse_search.py

164-164: Loop control variable i not used within loop body

Rename unused i to _i

(B007)


195-195: Loop control variable i not used within loop body

Rename unused i to _i

(B007)

⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: Redirect rules - poetic-froyo-8baba7
  • GitHub Check: Header rules - poetic-froyo-8baba7
  • GitHub Check: Pages changed - poetic-froyo-8baba7
  • GitHub Check: Python 3.13 test
  • GitHub Check: Python 3.13.x on ubuntu-latest test
  • GitHub Check: Python 3.12.x on ubuntu-latest test
  • GitHub Check: Python 3.11.x on ubuntu-latest test
  • GitHub Check: Python 3.10.x on ubuntu-latest test
  • GitHub Check: Python 3.9.x on ubuntu-latest test
🔇 Additional comments (1)
tests/congruence_tests/test_search.py (1)

169-177: The iteration reduction aligns with PR objectives.

Reducing the number of iterations from 100 to 50 across multiple test loops aligns with the PR objective of using smaller models. This change will help decrease test execution time while maintaining sufficient test coverage.

Also applies to: 201-209, 228-239

🧰 Tools
🪛 Ruff (0.8.2)

169-169: Loop control variable i not used within loop body

Rename unused i to _i

(B007)

@hh-space-invader hh-space-invader changed the title improve: Updated fastembed models with smaller models speed ci Mar 12, 2025
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 22c311b and 6fe422e.

📒 Files selected for processing (4)
  • tests/congruence_tests/test_common.py (1 hunks)
  • tests/congruence_tests/test_sparse_search.py (6 hunks)
  • tests/congruence_tests/test_sparse_updates.py (1 hunks)
  • tests/test_qdrant_client.py (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • tests/congruence_tests/test_sparse_updates.py
  • tests/congruence_tests/test_common.py
  • tests/test_qdrant_client.py
⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: Redirect rules - poetic-froyo-8baba7
  • GitHub Check: Header rules - poetic-froyo-8baba7
  • GitHub Check: Pages changed - poetic-froyo-8baba7
  • GitHub Check: Python 3.13.x on ubuntu-latest test
  • GitHub Check: Python 3.12.x on ubuntu-latest test
  • GitHub Check: Python 3.13 test
  • GitHub Check: Python 3.11.x on ubuntu-latest test
  • GitHub Check: Python 3.10.x on ubuntu-latest test
  • GitHub Check: Python 3.9.x on ubuntu-latest test
🔇 Additional comments (1)
tests/congruence_tests/test_sparse_search.py (1)

147-147: Good refactoring to use a constant

Using the NUM_VECTORS constant instead of hardcoded values is a good practice that improves maintainability. This change will make it easier to adjust the number of vectors used in tests in the future.

Also applies to: 178-178, 211-211, 249-249, 293-293

Comment thread tests/congruence_tests/test_sparse_search.py Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
tests/integration-tests.sh (2)

41-43: Clarify the updated pytest command for backward compatibility tests.

The modification on lines 41–43 now appends the --durations=0 option and skips the congruence tests when either the QDRANT_VERSION is not "dev" or IGNORE_CONGRUENCE_TESTS is set to true. This explicitly standardizes the reporting of test durations and ensures that incompatibility scenarios won’t run congruence tests. Please confirm that:

  • The environment variable assignment (QDRANT_VERSION=$QDRANT_VERSION) is intentional to pass the current value into the pytest command.
  • The decision to omit congruence tests in these cases meets the intended testing strategy.

43-45: Review the default pytest command execution for standard cases.

On lines 43–45, in the “else” branch, the pytest command now always includes --durations=0. This change helps in uniform reporting of test durations. As with the previous branch, ensure that setting the environment variable QDRANT_VERSION inline is necessary for the tests downstream. If this pattern is used elsewhere, it might be worth adding a short inline comment for clarity.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d357cef and efabfe9.

📒 Files selected for processing (7)
  • tests/congruence_tests/test_count.py (2 hunks)
  • tests/congruence_tests/test_delete_points.py (4 hunks)
  • tests/congruence_tests/test_multivector_updates.py (1 hunks)
  • tests/congruence_tests/test_sparse_search.py (6 hunks)
  • tests/embed_tests/test_local_inference.py (1 hunks)
  • tests/integration-tests.sh (1 hunks)
  • tests/test_migrate.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (6)
  • tests/congruence_tests/test_count.py
  • tests/test_migrate.py
  • tests/congruence_tests/test_multivector_updates.py
  • tests/congruence_tests/test_delete_points.py
  • tests/congruence_tests/test_sparse_search.py
  • tests/embed_tests/test_local_inference.py
⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: Redirect rules - poetic-froyo-8baba7
  • GitHub Check: Header rules - poetic-froyo-8baba7
  • GitHub Check: Pages changed - poetic-froyo-8baba7
  • GitHub Check: Python 3.13.x on ubuntu-latest test
  • GitHub Check: Python 3.12.x on ubuntu-latest test
  • GitHub Check: Python 3.11.x on ubuntu-latest test
  • GitHub Check: Python 3.10.x on ubuntu-latest test
  • GitHub Check: Python 3.9.x on ubuntu-latest test
  • GitHub Check: Python 3.13 test

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between dd8085d and 0f2b5a4.

📒 Files selected for processing (1)
  • tests/congruence_tests/test_facet.py (1 hunks)
🧰 Additional context used
🧬 Code Definitions (1)
tests/congruence_tests/test_facet.py (1)
tests/congruence_tests/test_common.py (1) (1)
  • generate_fixtures (117-132)
⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: Redirect rules - poetic-froyo-8baba7
  • GitHub Check: Header rules - poetic-froyo-8baba7
  • GitHub Check: Pages changed - poetic-froyo-8baba7
  • GitHub Check: Python 3.13.x on ubuntu-latest test
  • GitHub Check: Python 3.12.x on ubuntu-latest test
  • GitHub Check: Python 3.11.x on ubuntu-latest test
  • GitHub Check: Python 3.10.x on ubuntu-latest test
  • GitHub Check: Python 3.9.x on ubuntu-latest test
  • GitHub Check: Python 3.13 test

Comment on lines 30 to +31
def fixture_points() -> list[models.PointStruct]:
return generate_fixtures()
return generate_fixtures(1000)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

Consider the impact of increasing the vector count.

You've explicitly specified 1000 vectors here, which is 10x more than the default NUM_VECTORS (100) defined in test_common.py. This change appears to contradict the PR objective of "updating vectors with smaller ones".

While this change aligns with the QUERY_VECTORS_NUM constant used in other test files, increasing the vector count will:

  1. Increase test execution time
  2. Consume more memory during tests

🏁 Script executed:

#!/bin/bash
# Check if other test files are using the same approach
echo "Checking other test files for similar patterns:"
rg --type py "generate_fixtures\([0-9]" tests/congruence_tests
echo ""
echo "Checking for NUM_VECTORS constant usage:"
rg --type py "NUM_VECTORS" tests/congruence_tests

Length of output: 8680


Action Required: Reevaluate Vector Count in Test Fixture

The usage of 1000 vectors in tests/congruence_tests/test_facet.py is considerably higher than the typical count (e.g., 100 defined in tests/congruence_tests/test_common.py and lower counts in other test files). This discrepancy, while mirroring the QUERY_VECTORS_NUM usage in some tests, could lead to prolonged test execution times and increased memory consumption. Please review whether this higher count is intentional under the new PR objectives of "updating vectors with smaller ones," or if it should be adjusted to maintain consistency across the tests.

@joein joein self-requested a review March 25, 2025 21:42
Copy link
Copy Markdown
Member

@joein joein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR might decrease the amount of cases we cover in tests with payload filters, which is undesirable
I've created another PR which updates the amount of vectors only in those tests which do not run queries with payload filters

Also, it seems that test which consumes the biggest amount of time is tests/embed_tests/test_local_inference

I think it makes sense to optimize it
Most of the tests there takes more than 5 seconds to run
These tests were the most time consuming on my machine:
test_upsert 17.83
test_upload 38.70
test_propagate_options 21.31
test_inference_object 14.38
test_upload_mixed_batches_upload_points 34.44
test_upload_mixed_batches_upload_collection 33.15

I think it might worth it to add some caches to those tests
E.g., add cache to fastembed classes via mocking, so we would not need to compute embeddings for the same documents when possible (might not be possible when we're using parallel in some of the methods)
Moreover, we might even store some of the model instances in cache and reuse them if options we are initializing them with are the same

This was referenced Mar 25, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants