Update supported model documentation in docs #609
wilyJ80 wants to merge 3 commits into qdrant:main
📝 Walkthrough
This PR updates the Supported_Models.ipynb notebook by re-executing cells to populate execution metadata, output results, and kernel specifications. It also updates the dependency constraints in pyproject.toml:
- Python minimum version raised from >=3.10.0 to >=3.11.0
- onnxruntime upgraded from >1.20.0 to ^1.24.2 (for Python >=3.13)
- pandas dependency added at ^3.0.1
- numpy version constraints expanded with additional Python-version-specific entries
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pyproject.toml (1)
14-24: ⚠️ Potential issue | 🟡 Minor
Remove the dead numpy dependency branch for Python <3.11.
With the global Python floor set to >=3.11.0 at line 14, the numpy entry with marker python = ">=3.10,<3.11" (line 16) is unreachable and should be removed. This keeps the dependency resolution logic clean.
Note: The onnxruntime marker >=3.10,<3.13 includes an overlapping 3.10-3.11 range, but the 3.11-3.13 portion remains valid, so it's not a dead branch.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pyproject.toml` around lines 14 - 24, The pyproject.toml sets a global python floor ">=3.11.0", so the numpy dependency entry with the marker python = ">=3.10,<3.11" is dead code and should be removed; locate the numpy array in pyproject.toml and delete the object whose version marker is ">=1.21,<2.3.0" paired with python = ">=3.10,<3.11", leaving the remaining numpy entries intact (including the ">=1.21" for ">=3.11,<3.12", ">=1.26" for ">=3.12,<3.13", and ">=2.1.0"/">=2.3.0" branches) so dependency resolution remains correct.
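The reachability argument in this comment can be checked mechanically. The following is a minimal sketch, not part of the PR: `parse_range` and `is_dead_branch` are hypothetical helpers, and they assume markers built only from `>=` and `<` clauses, as in the entries quoted above.

```python
def parse_range(spec: str) -> tuple:
    """Parse a marker such as '>=3.10,<3.11' into (low, high) version tuples.

    Only '>=' and '<' clauses are handled (an assumption of this sketch).
    A missing bound is returned as None.
    """
    low, high = None, None
    for clause in spec.split(","):
        clause = clause.strip()
        if clause.startswith(">="):
            low = _to_version(clause[2:])
        elif clause.startswith("<") and not clause.startswith("<="):
            high = _to_version(clause[1:])
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
    return low, high


def _to_version(text: str) -> tuple:
    """Turn '3.11' into (3, 11, 0) so versions of different lengths compare."""
    parts = [int(p) for p in text.strip().split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def is_dead_branch(marker: str, project_floor: str) -> bool:
    """Return True if no Python version satisfies both marker and floor."""
    _, marker_high = parse_range(marker)
    floor_low, _ = parse_range(project_floor)
    if marker_high is None or floor_low is None:
        return False
    # The marker admits versions strictly below marker_high, the floor admits
    # versions at or above floor_low: disjoint when marker_high <= floor_low.
    return marker_high <= floor_low


# The numpy marker is unreachable, the onnxruntime marker is not.
numpy_branch_dead = is_dead_branch(">=3.10,<3.11", ">=3.11.0")
onnxruntime_branch_dead = is_dead_branch(">=3.10,<3.13", ">=3.11.0")
```

With the values from the comment, the numpy branch is reported as dead and the onnxruntime branch as still reachable, which matches the note above.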
🧹 Nitpick comments (1)
docs/examples/Supported_Models.ipynb (1)
953-967: Avoid committing machine-specific notebook kernel metadata.
display_name and the exact local Python patch version are environment-specific and tend to create unnecessary churn in docs diffs. Prefer a stable, generic kernelspec label.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/examples/Supported_Models.ipynb` around lines 953 - 967, The notebook contains machine-specific kernelspec metadata (keys "display_name" and "language_info.version") that cause noisy diffs; update the kernelspec to a stable generic label (e.g., set "display_name" to "Python 3" or remove it) and remove or generalize the exact patch version in "language_info.version" (e.g., "3.14" or omit the patch component) so the kernelspec/name remains "python3" and the notebook metadata is environment-agnostic; edit the notebook JSON entries for "display_name" and "language_info.version" accordingly.
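A possible way to apply this nitpick is to normalize the metadata on the parsed notebook JSON before committing. The sketch below is an assumption about how one might do it, not something the PR contains: `normalize_kernel_metadata` is a hypothetical helper, and the sample `display_name` and version values are made up for illustration.

```python
import copy


def normalize_kernel_metadata(notebook: dict) -> dict:
    """Return a copy of a parsed notebook with environment-agnostic metadata.

    - kernelspec.display_name is set to the generic label "Python 3"
    - language_info.version keeps only major.minor (patch component dropped)
    """
    cleaned = copy.deepcopy(notebook)
    metadata = cleaned.setdefault("metadata", {})

    kernelspec = metadata.get("kernelspec")
    if kernelspec is not None:
        kernelspec["display_name"] = "Python 3"

    language_info = metadata.get("language_info")
    if language_info and "version" in language_info:
        major_minor = language_info["version"].split(".")[:2]
        language_info["version"] = ".".join(major_minor)

    return cleaned


# Hypothetical metadata, standing in for lines 953-967 of the notebook.
sample = {
    "cells": [],
    "metadata": {
        "kernelspec": {"display_name": "my-local-venv", "name": "python3"},
        "language_info": {"name": "python", "version": "3.14.2"},
    },
}
normalized = normalize_kernel_metadata(sample)
```

In practice the dict would come from `json.load` on the `.ipynb` file and be written back with `json.dump`. The kernelspec `name` is left untouched so it stays "python3".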
⛔ Files ignored due to path filters (2)
- poetry.lock is excluded by !**/*.lock
- uv.lock is excluded by !**/*.lock
📒 Files selected for processing (2)
- docs/examples/Supported_Models.ipynb
- pyproject.toml
" <tr>\n",
" <th>4</th>\n",
" <td>prithivida/Splade_PP_en_v1</td>\n",
" <td>Independent Implementation of SPLADE++ Model f...</td>\n",
" <td>apache-2.0</td>\n",
" <td>0.532</td>\n",
" <td>None</td>\n",
" <td>30522</td>\n",
" </tr>\n",
Remove/resolve duplicated SPLADE entry with conflicting model IDs.
The table currently publishes both prithvida/... and prithivida/... variants for what appears to be the same model, which is confusing in user-facing docs and likely a typo/duplicate source entry.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/examples/Supported_Models.ipynb` around lines 535 - 543, The table
contains a duplicated SPLADE entry with conflicting model IDs
("prithivida/Splade_PP_en_v1" vs "prithvida/..."); pick the canonical model ID
(confirm which spelling is correct), remove the duplicate row for the incorrect
ID, and ensure the remaining row has consistent fields (model name, description,
license, score, tokenizer vocab size) and any internal references in the
notebook updated to use that canonical ID (look for the table row strings
containing "Splade_PP_en_v1", "prithivida", and "prithvida" to locate and fix).
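To locate rows like this before editing the notebook, one option is to group model IDs by the part after the slash and flag names that appear under more than one owner. This is only a sketch: `find_conflicting_owners` is a hypothetical helper, and the IDs in the example are placeholders rather than the actual table entries.

```python
from collections import defaultdict


def find_conflicting_owners(model_ids):
    """Map each model name to its owners when it appears under several owners.

    IDs are expected in "owner/name" form; IDs without a slash are skipped.
    """
    owners_by_name = defaultdict(set)
    for model_id in model_ids:
        owner, slash, name = model_id.partition("/")
        if not slash:
            continue  # no owner prefix, nothing to compare
        owners_by_name[name].add(owner)
    return {
        name: sorted(owners)
        for name, owners in owners_by_name.items()
        if len(owners) > 1
    }


# Placeholder IDs: the same model name under two owner spellings.
conflicts = find_conflicting_owners(
    ["acme/Model_A", "acmee/Model_A", "acme/Model_B", "standalone-model"]
)
```

A flagged name still needs a manual check to confirm which owner spelling is canonical, since two different owners can legitimately publish models with the same name.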
]
mmh3 = ">=4.1.0,<6.0.0"
py-rust-stemmers = "^0.1.0"
pandas = "^3.0.1"
🧩 Analysis chain
🏁 Script executed:
rg -nP '^\s*(import pandas\b|from pandas\b)' -g '!**/.venv/**' -g '!**/site-packages/**'
Repository: qdrant/fastembed
Length of output: 42

🏁 Script executed:
# Search for pandas usage anywhere in the codebase
rg -i 'pandas' --type python -g '!**/.venv/**' -g '!**/site-packages/**'
# Also check pyproject.toml structure around the pandas dependency
head -50 pyproject.toml
Repository: qdrant/fastembed
Length of output: 1690

🏁 Script executed:
# Search for pandas usage more broadly without type restriction
rg -i 'pandas' -g '!**/.venv/**' -g '!**/site-packages/**'
# Also check the complete pyproject.toml
cat pyproject.toml
Repository: qdrant/fastembed
Length of output: 11675
Move pandas to the docs dependency group.
Pandas is imported only in notebook documentation (docs/examples/ and docs/qdrant/), not in any fastembed/ package code. It should be in [tool.poetry.group.docs.dependencies] rather than [tool.poetry.dependencies] to avoid unnecessarily increasing the install surface for users who only need the core library.
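The import search in the analysis chain uses regular expressions. A roughly equivalent check can be sketched with Python's `ast` module, which ignores comments and strings. `imports_module` is a hypothetical helper, not part of fastembed, and it only looks at absolute imports.

```python
import ast


def imports_module(source: str, module: str) -> bool:
    """Return True if the source imports `module` or one of its submodules."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # import pandas / import pandas.testing as t
            if any(alias.name.split(".")[0] == module for alias in node.names):
                return True
        elif isinstance(node, ast.ImportFrom):
            # from pandas import DataFrame (relative imports are ignored)
            if node.level == 0 and node.module:
                if node.module.split(".")[0] == module:
                    return True
    return False


uses_pandas = imports_module("import pandas as pd\n", "pandas")
only_mentions_pandas = imports_module("import numpy\n# import pandas\n", "pandas")
```

Running this over every `.py` file under the package directory and finding no match would support the claim that pandas is needed only by the notebooks.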
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pyproject.toml` at line 38, The pyproject currently lists pandas = "^3.0.1"
under the main dependencies; remove that entry from [tool.poetry.dependencies]
and add the same pandas = "^3.0.1" line under the docs group section (e.g.
[tool.poetry.group.docs.dependencies]) so pandas is only installed for
documentation builds; update any lockfile or run poetry lock/poetry install as
needed after moving the dependency.
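After moving the entry, a small check over the parsed pyproject data can confirm that docs-only packages no longer sit in the main dependency table. The sketch below assumes the file has already been parsed into a dict (for example with the standard-library `tomllib` on Python 3.11+). `misplaced_dependencies` is a hypothetical helper, and the dicts are trimmed stand-ins for the real file.

```python
def misplaced_dependencies(pyproject: dict, docs_only: set) -> list:
    """List docs-only packages still declared under [tool.poetry.dependencies]."""
    poetry = pyproject.get("tool", {}).get("poetry", {})
    main_deps = poetry.get("dependencies", {})
    return sorted(name for name in docs_only if name in main_deps)


# Trimmed stand-in for the layout flagged in the review.
before = {
    "tool": {
        "poetry": {
            "dependencies": {"python": ">=3.11.0", "pandas": "^3.0.1"},
            "group": {"docs": {"dependencies": {}}},
        }
    }
}
# The same data after moving pandas to the docs group.
after = {
    "tool": {
        "poetry": {
            "dependencies": {"python": ">=3.11.0"},
            "group": {"docs": {"dependencies": {"pandas": "^3.0.1"}}},
        }
    }
}

flagged_before = misplaced_dependencies(before, {"pandas"})
flagged_after = misplaced_dependencies(after, {"pandas"})
```

The first call reports pandas as misplaced and the second reports nothing, which is the state the review asks for.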
All Submissions:
New Feature Submissions:
pre-commit with pip3 install pre-commit and set up hooks with pre-commit install?