LCORE-1617 Centralize Llama Stack Vector Store #1473
JslYoon wants to merge 9 commits into lightspeed-core:main from
Conversation
Signed-off-by: Lucas <lyoon@redhat.com>
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in your CodeRabbit settings.
Walkthrough
Adds a new FastAPI router.
Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client as Client (HTTP)
    participant Auth as Auth Dependency
    participant Config as Configuration
    participant API as Vector Stores API
    participant Llama as AsyncLlamaStackClient
    Client->>API: HTTP request (e.g., POST /v1/vector-stores or /files)
    API->>Auth: authorize(Action.*) / get_auth_dependency()
    Auth-->>API: auth tuple
    API->>Config: check_configuration_loaded(configuration)
    Config-->>API: ok / error
    API->>Llama: AsyncLlamaStackClientHolder().get_client()
    Llama-->>API: client
    API->>Llama: client.vector_stores.* or client.files.* (create/add/list/get/delete)
    alt add_file lock error
        Llama-->>API: LockError
        API->>Llama: retry (exponential backoff, up to 3 attempts)
    end
    Llama-->>API: success / other error
    API-->>Client: HTTP response (mapped model or error)
```
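The lock-error branch in the diagram describes a bounded retry with exponential backoff. A minimal sketch of that pattern (the helper name is hypothetical, and `RuntimeError` stands in for the Llama Stack client's `LockError`):

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def call_with_retry(
    fn: Callable[[], Awaitable[T]],
    attempts: int = 3,
    base_delay: float = 0.5,
) -> T:
    """Retry an async call with exponential backoff (base_delay, 2x, 4x, ...)."""
    for attempt in range(attempts):
        try:
            return await fn()
        except RuntimeError:  # stand-in for the client's LockError
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error to the caller
            await asyncio.sleep(base_delay * (2**attempt))
    raise AssertionError("unreachable")
```

In the endpoint this would wrap the add-file call, so transient lock contention resolves without surfacing a 5xx to the client.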
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 3 passed
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Actionable comments posted: 7
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/app/endpoints/vector_stores.py`:
- Around line 129-144: Remove the ad-hoc debugging call to await
client.models.list() and any plain print() statements in vector store request
handling (including the print at the start of the create flow and the other
print occurrences noted) and either delete them or convert them to structured
logger.debug() calls; additionally, change the logger.info that currently logs
the full request payload (the logger.info referencing "Creating vector store -
body_dict: %s, extra_body: %s") to either logger.debug or redact
sensitive/session/document fields before logging so no full request body is
emitted at info level; locate the code by the body.model_dump(exclude_none=True)
usage and the extra_body construction to apply these changes.
- Around line 366-369: Update both delete route decorators (e.g.,
`@router.delete`("/vector-stores/{vector_store_id}") and the other delete route
for entries) to include status_code=status.HTTP_204_NO_CONTENT so the actual
response matches the OpenAPI responses={"204": ...}; also fix the imports by
adding status and importing Depends directly from fastapi (e.g., from fastapi
import Depends, status) instead of importing Depends from fastapi.params and
ensure any existing references to status.HTTP_204_NO_CONTENT resolve.
- Around line 65-73: file_responses currently advertises a 503 via
ServiceUnavailableResponse for client-side upload validation failures; replace
that with a 400 Bad Request mapping and ensure BadRequestError is translated to
an HTTP 400 response. Update the file_responses dict to include a
BadRequestResponse/OpenAPI 400 entry instead of ServiceUnavailableResponse, and
in the handler where BadRequestError is caught (reference BadRequestError and
any exception handling code around the upload endpoint) raise or return FastAPI
HTTPException(status_code=400) so the OpenAPI and runtime behavior consistently
reflect a 400 Bad Request for malformed uploads.
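The intended status mapping can be sketched independently of FastAPI (the `BadRequestError` class here is a local stand-in, not the client library's import):

```python
class BadRequestError(Exception):
    """Stand-in for the client library's bad-request error."""


def upload_error_status(exc: Exception) -> int:
    """Map upload failures to HTTP status codes.

    Client-side validation failures surface as 400 so callers fix the
    request; only connectivity problems should look like a 503 outage.
    """
    if isinstance(exc, BadRequestError):
        return 400
    if isinstance(exc, ConnectionError):
        return 503
    return 500
```

In the handler, the 400 branch would become `raise HTTPException(status_code=400, ...)`, and `file_responses` would advertise the matching OpenAPI entry.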
In `@src/models/requests.py`:
- Around line 1009-1050: The VectorStoreUpdateRequest model currently allows an
empty payload because all fields are optional; add a Pydantic `@model_validator`
(e.g., on VectorStoreUpdateRequest) that runs after parsing and checks that at
least one of name, expires_at, or metadata is not None (and treat an empty
metadata dict as “empty”); if all are absent/empty, raise a ValueError so
requests return 422. Use `@field_validator` only if you need per-field
normalization (e.g., converting empty metadata to None) and reference the class
name VectorStoreUpdateRequest and the field names name, expires_at, metadata
when implementing the validators.
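A minimal sketch of the suggested validator, assuming Pydantic v2 and the field set described above (the class body is illustrative, not the project's actual model):

```python
from typing import Any, Optional

from pydantic import BaseModel, model_validator


class VectorStoreUpdateRequest(BaseModel):
    """Illustrative update request that rejects empty payloads."""

    name: Optional[str] = None
    expires_at: Optional[int] = None
    metadata: Optional[dict[str, Any]] = None

    @model_validator(mode="after")
    def require_at_least_one_field(self) -> "VectorStoreUpdateRequest":
        # Treat an empty metadata dict the same as an absent one.
        if self.name is None and self.expires_at is None and not self.metadata:
            raise ValueError(
                "at least one of name, expires_at or metadata must be provided"
            )
        return self
```

Because the validator raises a `ValueError`, FastAPI would translate the resulting `ValidationError` into the 422 the comment asks for.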
- Around line 1053-1097: The VectorStoreFileCreateRequest model does not enforce
the documented limits on the attributes field (max 16 pairs, keys max 64 chars,
string values max 512 chars, values may also be bool/number), so add validation:
implement a `@field_validator`("attributes", mode="before" or "after") to ensure
the field is a dict when present, has at most 16 entries, that each key is a
string <=64 chars, and each value is either a string (<=512 chars), bool, or
number; then implement a `@model_validator` to enforce any cross-field invariants
if needed and raise clear ValueError messages on violations. Reference the
VectorStoreFileCreateRequest class and the attributes field when adding these
Pydantic validators; keep model_config as-is (extra: forbid).
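The attribute limits could be enforced along these lines (Pydantic v2 assumed; field names beyond `attributes` are illustrative):

```python
from typing import Optional, Union

from pydantic import BaseModel, field_validator

AttrValue = Union[str, bool, int, float]


class VectorStoreFileCreateRequest(BaseModel):
    """Illustrative file-create request enforcing documented attribute limits."""

    file_id: str
    attributes: Optional[dict[str, AttrValue]] = None

    @field_validator("attributes")
    @classmethod
    def check_attribute_limits(
        cls, value: Optional[dict[str, AttrValue]]
    ) -> Optional[dict[str, AttrValue]]:
        if value is None:
            return value
        if len(value) > 16:
            raise ValueError("attributes may hold at most 16 key/value pairs")
        for key, item in value.items():
            if len(key) > 64:
                raise ValueError("attribute keys are limited to 64 characters")
            if isinstance(item, str) and len(item) > 512:
                raise ValueError("string attribute values are limited to 512 characters")
        return value
```

The `AttrValue` union lets the type system reject non-string/bool/number values, so the validator only has to check the counts and lengths.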
In `@tests/unit/app/endpoints/test_vector_stores.py`:
- Around line 33-52: The test fakes lack the optional fields the real handler
expects; update the VectorStore __init__ to accept metadata: Optional[dict]=None
and attributes: Optional[dict]=None (defaulting to None) and assign them to
self.metadata and self.attributes, and likewise add those optional
parameters/attributes to the other mock class in the same file (the file handler
fake around lines 78-90) so create_vector_store() and file handlers can read
metadata/attributes without error.
- Around line 169-172: The assertions are inside the pytest.raises context and
never run after the awaited create_vector_store raises; move the assertions so
they execute after the with block. Specifically, in
tests/unit/app/endpoints/test_vector_stores.py surrounding the call to
create_vector_store(request=request, auth=auth, body=body), keep the "with
pytest.raises(HTTPException) as e:" only wrapping the await, then after exiting
that block assert e.value.status_code == status.HTTP_500_INTERNAL_SERVER_ERROR
and assert e.value.detail["response"] == "Configuration is not loaded" (and make
the same change for the other occurrence around lines 224-227).
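A stripped-down illustration of the pattern (pytest assumed; the raising function is a stand-in for the awaited `create_vector_store` call):

```python
import pytest


def load_configuration():
    """Stand-in for the endpoint call that raises when config is missing."""
    raise ValueError("Configuration is not loaded")


def test_error_detail():
    # Only the raising call belongs inside the context manager. Assertions
    # placed inside the with-block after the call that raises never execute;
    # moved after the block, they actually run against the captured exception.
    with pytest.raises(ValueError) as e:
        load_configuration()
    assert str(e.value) == "Configuration is not loaded"
```

The same shape applies to the async tests: keep only the `await` inside `pytest.raises` and assert on `e.value` afterwards.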
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: b8a0a1c8-d00b-4162-a0c2-b9dac912b21d
📒 Files selected for processing (6)
- src/app/endpoints/vector_stores.py
- src/app/routers.py
- src/models/config.py
- src/models/requests.py
- src/models/responses.py
- tests/unit/app/endpoints/test_vector_stores.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build-pr
🧰 Additional context used
📓 Path-based instructions (6)
src/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/**/*.py: Use absolute imports for internal modules (e.g., `from authentication import get_auth_dependency`)
Use `from llama_stack_client import AsyncLlamaStackClient` for Llama Stack imports
Check `constants.py` for shared constants before defining new ones
All modules start with descriptive docstrings explaining purpose
Use `logger = get_logger(__name__)` from `log.py` for module logging
Type aliases defined at module level for clarity
All functions require docstrings with brief descriptions
Use complete type annotations for function parameters and return types
Use union types with modern syntax: `str | int` instead of `Union[str, int]`
Use `Optional[Type]` for optional types in type annotations
Use snake_case with descriptive, action-oriented names for functions (`get_`, `validate_`, `check_`)
Avoid in-place parameter modification anti-patterns: return new data structures instead of modifying parameters
Use `async def` for I/O operations and external API calls
Handle `APIConnectionError` from Llama Stack in error handling
Use `logger.debug()` for detailed diagnostic information
Use `logger.info()` for general information about program execution
Use `logger.warning()` for unexpected events or potential problems
Use `logger.error()` for serious problems that prevented function execution
All classes require descriptive docstrings explaining purpose
Use PascalCase for class names with descriptive names and standard suffixes: `Configuration`, `Error`/`Exception`, `Resolver`, `Interface`
Use complete type annotations for all class attributes; avoid using `Any`
Follow Google Python docstring conventions for all modules, classes, and functions
Include `Parameters:`, `Returns:`, `Raises:` sections in function docstrings as needed
Files:
src/app/routers.py, src/models/config.py, src/models/requests.py, src/models/responses.py, src/app/endpoints/vector_stores.py
src/app/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/app/**/*.py: Use `from fastapi import APIRouter, HTTPException, Request, status, Depends` for FastAPI dependencies
Use FastAPI `HTTPException` with appropriate status codes for API endpoint error handling
Files:
src/app/routers.py, src/app/endpoints/vector_stores.py
src/models/config.py
📄 CodeRabbit inference engine (AGENTS.md)
src/models/config.py: All config uses Pydantic models extending `ConfigurationBase`
Base class sets `extra="forbid"` to reject unknown fields in Pydantic configuration models
Use type hints: `Optional[FilePath]`, `PositiveInt`, `SecretStr` for configuration fields
Pydantic configuration models should extend `ConfigurationBase`
Files:
src/models/config.py
src/models/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/models/**/*.py: Use `@field_validator` and `@model_validator` for custom validation in Pydantic models
Use `typing_extensions.Self` for model validators in type annotations
Pydantic data models should extend `BaseModel`
Include `Attributes:` section in Pydantic model docstrings
Files:
src/models/config.py, src/models/requests.py, src/models/responses.py
tests/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Use `pytest.mark.asyncio` marker for async unit tests
Files:
tests/unit/app/endpoints/test_vector_stores.py
tests/unit/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Use `pytest-mock` for AsyncMock objects in unit tests
Files:
tests/unit/app/endpoints/test_vector_stores.py
🧠 Learnings (3)
📚 Learning: 2026-01-12T10:58:40.230Z
Learnt from: blublinsky
Repo: lightspeed-core/lightspeed-stack PR: 972
File: src/models/config.py:459-513
Timestamp: 2026-01-12T10:58:40.230Z
Learning: In lightspeed-core/lightspeed-stack, for Python files under src/models, when a user claims a fix is done but the issue persists, verify the current code state before accepting the fix. Steps: review the diff, fetch the latest changes, run relevant tests, reproduce the issue, search the codebase for lingering references to the original problem, confirm the fix is applied and not undone by subsequent commits, and validate with local checks to ensure the issue is resolved.
Applied to files:
src/models/config.py, src/models/requests.py, src/models/responses.py
📚 Learning: 2026-02-25T07:46:33.545Z
Learnt from: asimurka
Repo: lightspeed-core/lightspeed-stack PR: 1211
File: src/models/responses.py:8-16
Timestamp: 2026-02-25T07:46:33.545Z
Learning: In the Python codebase, requests.py should use OpenAIResponseInputTool as Tool while responses.py uses OpenAIResponseTool as Tool. This difference is intentional due to differing schemas for input vs output tools in llama-stack-api. Apply this distinction consistently to other models under src/models (e.g., ensure request-related tools use the InputTool variant and response-related tools use the ResponseTool variant). If adding new tools, choose the corresponding InputTool or Tool class based on whether the tool represents input or output, and document the rationale in code comments.
Applied to files:
src/models/config.py, src/models/requests.py, src/models/responses.py
📚 Learning: 2026-04-06T20:18:07.852Z
Learnt from: major
Repo: lightspeed-core/lightspeed-stack PR: 1463
File: src/app/endpoints/rlsapi_v1.py:266-271
Timestamp: 2026-04-06T20:18:07.852Z
Learning: In the lightspeed-stack codebase, within `src/app/endpoints/` inference/MCP endpoints, treat `tools: Optional[list[Any]]` in MCP tool definitions as an intentional, consistent typing pattern (used across `query`, `responses`, `streaming_query`, `rlsapi_v1`). Do not raise or suggest this as a typing issue during code review; changing it in isolation could break endpoint typing consistency across the codebase.
Applied to files:
src/app/endpoints/vector_stores.py
🪛 GitHub Actions: Unit tests
src/app/endpoints/vector_stores.py
[error] 159-159: Test_create_vector_store_success failed: AttributeError in create_vector_store when building VectorStoreResponse: 'VectorStore' object has no attribute 'metadata'
[error] 214-214: Test_list_vector_stores_success failed: AttributeError in list_vector_stores when building VectorStoreResponse: 'VectorStore' object has no attribute 'metadata'
[error] 275-275: Test_get_vector_store_success failed: AttributeError in get_vector_store when building VectorStoreResponse: 'VectorStore' object has no attribute 'metadata'
[error] 345-345: Test_update_vector_store_success failed: AttributeError in update_vector_store when building VectorStoreResponse: 'VectorStore' object has no attribute 'metadata'
[error] 590-590: Test_add_file_to_vector_store_success failed: AttributeError in add_file_to_vector_store when building VectorStoreFileResponse: 'VectorStoreFile' object has no attribute 'attributes'
[error] 661-661: Test_list_vector_store_files_success failed: AttributeError in list_vector_store_files when building VectorStoreFileResponse: 'VectorStoreFile' object has no attribute 'attributes'
[error] 736-736: Test_get_vector_store_file_success failed: AttributeError in get_vector_store_file when building VectorStoreFileResponse: 'VectorStoreFile' object has no attribute 'attributes'
Signed-off-by: Lucas <lyoon@redhat.com>
Actionable comments posted: 4
♻️ Duplicate comments (4)
tests/unit/app/endpoints/test_vector_stores.py (1)
173-176: ⚠️ Potential issue | 🟠 Major
Move the assertions outside `pytest.raises`. Once the awaited call raises, the remaining lines inside the context manager never run. These tests currently verify only that some `HTTPException` was raised.
Suggested fix:

```diff
 with pytest.raises(HTTPException) as e:
     await create_vector_store(request=request, auth=auth, body=body)
-    assert e.value.status_code == status.HTTP_500_INTERNAL_SERVER_ERROR
-    assert e.value.detail["response"] == "Configuration is not loaded"  # type: ignore
+assert e.value.status_code == status.HTTP_500_INTERNAL_SERVER_ERROR
+assert e.value.detail["response"] == "Configuration is not loaded"  # type: ignore
@@
 with pytest.raises(HTTPException) as e:
     await create_vector_store(request=request, auth=auth, body=body)
-    assert e.value.status_code == status.HTTP_503_SERVICE_UNAVAILABLE
-    assert e.value.detail["response"] == "Unable to connect to Llama Stack"  # type: ignore
+assert e.value.status_code == status.HTTP_503_SERVICE_UNAVAILABLE
+assert e.value.detail["response"] == "Unable to connect to Llama Stack"  # type: ignore
```

Also applies to: 228-231
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/app/endpoints/test_vector_stores.py` around lines 173 - 176, the assertions are inside the pytest.raises context so they never run after await create_vector_store(...) raises; move the two assert lines out of the `with pytest.raises(HTTPException) as e:` block so you capture the exception into `e` and then assert `e.value.status_code` and `e.value.detail["response"]` after the context. Apply the same change for the second occurrence of the pattern (the other test block at 228-231) and keep references to create_vector_store, pytest.raises, e.value.status_code and e.value.detail["response"] when updating the tests.

src/app/endpoints/vector_stores.py (3)
139-143: ⚠️ Potential issue | 🟠 Major
Stop logging the full create payload at info. `body_dict` and `extra_body` can carry notebook/session metadata. Emitting the whole request at info level is noisy and leaks more context than this endpoint needs. Log stable identifiers only, and keep payload-level diagnostics at debug.
Suggested fix:

```diff
-    logger.info(
-        "Creating vector store - body_dict: %s, extra_body: %s",
-        body_dict,
-        extra_body,
-    )
+    logger.debug(
+        "Creating vector store '%s' with provider '%s'",
+        body.name,
+        extra_body.get("provider_id"),
+    )
```

As per coding guidelines, use `logger.debug()` for detailed diagnostic information.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/app/endpoints/vector_stores.py` around lines 139 - 143, Change the noisy info-level logging that prints the full payload to log only stable identifiers at info and move payload diagnostics to debug: replace the current logger.info call that references body_dict and extra_body with an info log that emits only stable identifiers (e.g., request id, user_id, notebook_id or session_id if available) and use logger.debug to record the full body_dict/extra_body for detailed diagnostics; update references in the surrounding function (the logger usage in this module, e.g., the create vector store endpoint in vector_stores.py) so sensitive/session metadata is not logged at info level.
65-73: ⚠️ Potential issue | 🟠 Major
Return 400 for upload bad requests, not 503. `BadRequestError` here is a client-side upload failure, but the route currently advertises and returns `ServiceUnavailableResponse`. That makes malformed uploads look like transient backend outages and encourages retries instead of request fixes. The matching bad-request unit test will need to flip with this change.
Based on learnings, use FastAPI `HTTPException` with appropriate status codes for API endpoint error handling.
Also applies to: 487-490
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/app/endpoints/vector_stores.py` around lines 65 - 73, The route currently advertises and returns ServiceUnavailableResponse (503) for client-side BadRequestError; change the response mapping to return a 400 BadRequestResponse and raise FastAPI HTTPException(status_code=400) when catching BadRequestError so uploads are treated as client errors; update file_responses to include BadRequestResponse.openapi_response() instead of ServiceUnavailableResponse.openapi_response() and adjust the error handling in the same handler (and the similar handler referenced around the code marked 487-490) so unit tests expect 400 instead of 503.
9-10: ⚠️ Potential issue | 🟠 Major
Declare 204 on both delete routes and fix the FastAPI imports. `responses={"204": ...}` only changes the OpenAPI schema. Without `status_code=status.HTTP_204_NO_CONTENT`, both handlers still return 200 at runtime. The import also bypasses the repo's required `fastapi` import form for `Depends`/`status`.
Suggested fix:

```diff
-from fastapi import APIRouter, File, HTTPException, Request, UploadFile
-from fastapi.params import Depends
+from fastapi import APIRouter, Depends, File, HTTPException, Request, UploadFile, status
@@
 @router.delete(
     "/vector-stores/{vector_store_id}",
+    status_code=status.HTTP_204_NO_CONTENT,
     responses={"204": {"description": "Vector store deleted"}},
 )
@@
 @router.delete(
     "/vector-stores/{vector_store_id}/files/{file_id}",
+    status_code=status.HTTP_204_NO_CONTENT,
     responses={"204": {"description": "File deleted from vector store"}},
 )
```

As per coding guidelines, use `from fastapi import APIRouter, HTTPException, Request, status, Depends` for FastAPI dependencies.
Also applies to: 365-368, 760-763
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/app/endpoints/vector_stores.py` around lines 9 - 10, The FastAPI import and delete handlers are incorrect: change the import to use the repo style "from fastapi import APIRouter, HTTPException, Request, status, Depends" (remove the separate fastapi.params import) and update both delete route handlers that currently only set responses={"204": ...} to also set status_code=status.HTTP_204_NO_CONTENT so they return 204 at runtime; locate the routes in src/app/endpoints/vector_stores.py that declare a 204 response (the two DELETE handlers) and add status_code=status.HTTP_204_NO_CONTENT to their decorator arguments.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/app/endpoints/vector_stores.py`:
- Around line 86-94: The list_vector_store_files() handler currently only maps
APIConnectionError and generic Exception to responses; update
vector_store_files_list_responses to include a 404 entry (use the appropriate
NotFound/BadRequest response helper used elsewhere) and add a specific except
BadRequestError branch in list_vector_store_files() that returns the 404
response, mirroring how get_vector_store() handles BadRequestError; also update
the function docstring to mention the 404/BadRequest response.
- Around line 450-468: The code double-buffers uploaded files by calling await
file.read() then wrapping into BytesIO(content); change to pass the payload
directly to the Llama client as a tuple (filename, content) or (filename,
file.file) to avoid copying (use client.files.create(file=(filename, content),
purpose="assistants") or client.files.create(file=(filename, file.file),
purpose="assistants")), remove the BytesIO usage (file_bytes and its .name
assignment), and update logging to use the filename and size without forcing a
second copy; also add a pre-read size check using the request's Content-Length
header (and raise HTTPException when over the allowed limit) so you validate
large uploads before materializing them into memory.
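The pre-read size check could look like this (the limit and helper name are hypothetical, and `ValueError` stands in for the `HTTPException` a real endpoint would raise):

```python
from typing import Optional

MAX_UPLOAD_BYTES = 25 * 1024 * 1024  # hypothetical limit, not from the repo


def check_content_length(
    header_value: Optional[str], limit: int = MAX_UPLOAD_BYTES
) -> None:
    """Reject oversized uploads before reading the body into memory.

    Raises ValueError when the declared Content-Length exceeds the limit;
    a missing or malformed header is left for downstream streaming checks.
    """
    if header_value is None:
        return
    try:
        declared = int(header_value)
    except ValueError:
        return
    if declared > limit:
        raise ValueError(f"upload of {declared} bytes exceeds {limit}-byte limit")
```

Running this against `request.headers.get("content-length")` before `await file.read()` means an over-limit upload is rejected without ever materializing the payload, and passing `(filename, content)` straight to the client avoids the second `BytesIO` copy.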
In `@tests/unit/app/endpoints/test_vector_stores.py`:
- Around line 184-205: Add a test case that verifies the adapter moves
provider_id, embedding_model, and embedding_dimension from the main request into
extra_body when calling the client: use the existing test setup
(get_test_request, get_test_auth, VectorStoreCreateRequest) but include
provider_id, embedding_model, and embedding_dimension in the body; patch
AsyncLlamaStackClientHolder.get_client to return mock_client and assert
mock_client.vector_stores.create was called with extra_body containing those
three keys (and that the top-level body forwarded to create does not contain
them) while still returning a VectorStore("vs_123","test_store") and asserting
the response fields as in the current test.
---
Duplicate comments:
In `@src/app/endpoints/vector_stores.py`:
- Around line 139-143: Change the noisy info-level logging that prints the full
payload to log only stable identifiers at info and move payload diagnostics to
debug: replace the current logger.info call that references body_dict and
extra_body with an info log that emits only stable identifiers (e.g., request
id, user_id, notebook_id or session_id if available) and use logger.debug to
record the full body_dict/extra_body for detailed diagnostics; update references
in the surrounding function (the logger usage in this module, e.g., the create
vector store endpoint in vector_stores.py) so sensitive/session metadata is not
logged at info level.
- Around line 65-73: The route currently advertises and returns
ServiceUnavailableResponse (503) for client-side BadRequestError; change the
response mapping to return a 400 BadRequestResponse and raise FastAPI
HTTPException(status_code=400) when catching BadRequestError so uploads are
treated as client errors; update file_responses to include
BadRequestResponse.openapi_response() instead of
ServiceUnavailableResponse.openapi_response() and adjust the error handling in
the same handler (and the similar handler referenced around the code marked
487-490) so unit tests expect 400 instead of 503.
- Around line 9-10: The FastAPI import and delete handlers are incorrect: change
the import to use the repo style "from fastapi import APIRouter, HTTPException,
Request, status, Depends" (remove the separate fastapi.params import) and update
both delete route handlers that currently only set responses={"204": ...} to
also set status_code=status.HTTP_204_NO_CONTENT so they return 204 at runtime;
locate the routes in src/app/endpoints/vector_stores.py that declare a 204
response (the two DELETE handlers) and add
status_code=status.HTTP_204_NO_CONTENT to their decorator arguments.
In `@tests/unit/app/endpoints/test_vector_stores.py`:
- Around line 173-176: The assertions are inside the pytest.raises context so
they never run after await create_vector_store(...) raises; move the two assert
lines out of the with pytest.raises(HTTPException) as e: block so you capture
the exception into e and then assert e.value.status_code and
e.value.detail["response"] after the context. Apply the same change for the
second occurrence around lines referencing the same pattern (the other test
block at 228-231) and keep references to create_vector_store, pytest.raises,
e.value.status_code and e.value.detail["response"] when updating the tests.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: eea96eaa-fe50-41d6-b442-dd1b7c8bd69a
📒 Files selected for processing (3)
- src/app/endpoints/vector_stores.py
- tests/unit/app/endpoints/test_vector_stores.py
- tests/unit/app/test_routers.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: build-pr
- GitHub Check: E2E: library mode / ci
- GitHub Check: E2E Tests for Lightspeed Evaluation job
- GitHub Check: E2E: server mode / ci
🧰 Additional context used
📓 Path-based instructions (4)
tests/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Use `pytest.mark.asyncio` marker for async unit tests
Files:
tests/unit/app/test_routers.py, tests/unit/app/endpoints/test_vector_stores.py
tests/unit/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Use `pytest-mock` for AsyncMock objects in unit tests
Files:
tests/unit/app/test_routers.py, tests/unit/app/endpoints/test_vector_stores.py
src/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/**/*.py: same guidelines as listed for this path above
Files:
src/app/endpoints/vector_stores.py
src/app/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/app/**/*.py: Use `from fastapi import APIRouter, HTTPException, Request, status, Depends` for FastAPI dependencies
Use FastAPI `HTTPException` with appropriate status codes for API endpoint error handling
Files:
src/app/endpoints/vector_stores.py
🧠 Learnings (3)
📚 Learning: 2026-04-05T12:19:36.009Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-05T12:19:36.009Z
Learning: Applies to src/app/**/*.py : Use FastAPI `HTTPException` with appropriate status codes for API endpoint error handling
Applied to files:
src/app/endpoints/vector_stores.py
📚 Learning: 2026-04-05T12:19:36.009Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-05T12:19:36.009Z
Learning: Applies to src/app/**/*.py : Use `from fastapi import APIRouter, HTTPException, Request, status, Depends` for FastAPI dependencies
Applied to files:
src/app/endpoints/vector_stores.py
📚 Learning: 2026-04-06T20:18:07.852Z
Learnt from: major
Repo: lightspeed-core/lightspeed-stack PR: 1463
File: src/app/endpoints/rlsapi_v1.py:266-271
Timestamp: 2026-04-06T20:18:07.852Z
Learning: In the lightspeed-stack codebase, within `src/app/endpoints/` inference/MCP endpoints, treat `tools: Optional[list[Any]]` in MCP tool definitions as an intentional, consistent typing pattern (used across `query`, `responses`, `streaming_query`, `rlsapi_v1`). Do not raise or suggest this as a typing issue during code review; changing it in isolation could break endpoint typing consistency across the codebase.
Applied to files:
src/app/endpoints/vector_stores.py
src/app/endpoints/vector_stores.py
Outdated
```python
# Read file content
content = await file.read()
filename = file.filename or "uploaded_file"

# Add .txt extension if no extension present
# (since parsed PDFs/URLs are sent as plain text)
if not os.path.splitext(filename)[1]:
    filename = f"{filename}.txt"

logger.info(
    "Uploading file - filename: %s, size: %d bytes",
    filename,
    len(content),
)

# Convert to BytesIO for Llama Stack client
# The client expects bytes, io.IOBase, PathLike, or a tuple
file_bytes = BytesIO(content)
file_bytes.name = filename  # Set the filename attribute
```
🧩 Analysis chain
🏁 Scripts executed:

```shell
# First, let's look at the full function context around lines 450-468
cat -n src/app/endpoints/vector_stores.py | sed -n '420,480p'

# Check if there's a size limit check or any validation before reading
cat -n src/app/endpoints/vector_stores.py | sed -n '410,475p'

# Look for any size limit constants or config
rg -n "size|SIZE|limit|LIMIT" src/app/endpoints/vector_stores.py | head -20

# Check if UploadFile.file is used anywhere else in the codebase
rg -n "UploadFile" src/ -A 5 -B 2

# Check if there are other file upload endpoints in the codebase to see patterns
find src -name "*.py" -type f | xargs grep -l "UploadFile" | head -5

# Look for how file.file is used or if it's mentioned anywhere
rg -n "file\.file" src/ --type py

# Check the imports and FastAPI version to understand UploadFile better
cat src/app/endpoints/vector_stores.py | head -30

# Check the Llama Stack client usage - what does files.create() expect?
rg -n "client.files.create\|files\.create" src/ -B 2 -A 3 --type py

# Check if there's any documentation or type hints about what the file parameter accepts
rg -n "files\.create" src/ -B 5 -A 5

# Search for any Python documentation or comments about UploadFile usage
rg -n "SpooledTemporaryFile\|\.file\|file object" src/ --type py | head -10

# Check if there's a requirements.txt to see the llama-stack-client version
cat requirements.txt | grep -i llama

# Look for requirements files with different names
find . -maxdepth 2 -name "*.txt" -o -name "pyproject.toml" -o -name "setup.py" | grep -E "req|setup|pyproject"

# Check for pyproject.toml or setup.py with dependencies
ls -la | grep -E "requirements|setup|pyproject"

# Look for any configuration about file size limits
rg -n "max.*size\|MAX.*SIZE\|file.*limit\|upload.*limit" src/ --type py -i | head -20

# Check pyproject.toml for the llama-stack-client version
cat pyproject.toml | grep -A 5 -B 5 "llama"

# FastAPI's UploadFile has a .file property that is a SpooledTemporaryFile;
# check whether any related patterns appear in the code
rg -n "SpooledTemporaryFile\|BinaryIO\|AsyncBinaryIO" src/ --type py

# Check the full function signature and see what happens with the file after the read
cat -n src/app/endpoints/vector_stores.py | sed -n '417,490p'

# Look at the error handling to see what status code is returned for large files
cat -n src/app/endpoints/vector_stores.py | sed -n '490,510p'

# Search for any MAX_FILE_SIZE or similar constants
rg -n "MAX\|LIMIT\|SIZE" src/ --type py | grep -i "file\|upload\|size" | head -15
```
🌐 Web query:

> llama-stack-client files.create() file parameter accept bytes BytesIO SpooledTemporaryFile

💡 Result:

`client.files.create(file=...)` accepts `FileTypes`, where `FileContent = IO[bytes] | bytes | PathLike[...]`. So yes: `bytes`, `io.BytesIO`, and `tempfile.SpooledTemporaryFile` (opened in binary mode) are valid inputs. [1] [2]

What you typically want in practice is to also provide a filename (and optionally a MIME type), because that's how the client/server can infer the file type (especially for `BytesIO`, which has no `.name`). The supported "rich" forms are tuples like `(filename, contents)` or `(filename, contents, content_type)` (and optionally headers). [1]

Examples:

```python
from io import BytesIO
from llama_stack_client import LlamaStackClient

client = LlamaStackClient()
bio = BytesIO(b"%PDF-...")  # or any bytes
client.files.create(
    file=("doc.pdf", bio, "application/pdf"),
    purpose="assistants",
)
```

```python
import tempfile
from llama_stack_client import LlamaStackClient

client = LlamaStackClient()
with tempfile.SpooledTemporaryFile(mode="w+b") as f:
    f.write(b"hello")
    f.seek(0)
    client.files.create(
        file=("hello.txt", f, "text/plain"),
        purpose="assistants",
    )
```

If you already have raw bytes and don't care about name/type, you can pass them directly:

```python
client.files.create(file=b"hello", purpose="assistants")
```

But supplying a filename is safer for type detection. [1]
Sources:
[1] FileTypes / FileContent definitions in _types.py (union includes IO[bytes], bytes, tuple variants) (github.com)
[2] FileCreateParams.file: Required[FileTypes] (github.com)
[3] README “File uploads” (bytes, PathLike, or (filename, contents, media type) tuple) (github.com)
Citations:
- 1: https://github.com/llamastack/llama-stack-client-python/blob/main/src/llama_stack_client/_types.py
- 2: https://github.com/llamastack/llama-stack-client-python/blob/main/src/llama_stack_client/types/file_create_params.py
- 3: https://github.com/meta-llama/llama-stack-client-python
**Avoid double-buffering the file into memory.**

`await file.read()` loads the full payload into bytes, and `BytesIO(content)` copies it again. For large files this doubles memory consumption. Instead, pass the bytes directly using a tuple with the filename: `await client.files.create(file=(filename, content), purpose="assistants")`. Alternatively, pass the underlying spooled file object directly: `await client.files.create(file=(filename, file.file), purpose="assistants")`.

Additionally, the docstring claims to raise `HTTPException` for large files, but there is no size validation before reading. Add a check using the Content-Length header before materializing the payload, or configure a file size limit upstream.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/app/endpoints/vector_stores.py` around lines 450 - 468, The code
double-buffers uploaded files by calling await file.read() then wrapping into
BytesIO(content); change to pass the payload directly to the Llama client as a
tuple (filename, content) or (filename, file.file) to avoid copying (use
client.files.create(file=(filename, content), purpose="assistants") or
client.files.create(file=(filename, file.file), purpose="assistants")), remove
the BytesIO usage (file_bytes and its .name assignment), and update logging to
use the filename and size without forcing a second copy; also add a pre-read
size check using the request's Content-Length header (and raise HTTPException
when over the allowed limit) so you validate large uploads before materializing
them into memory.
Actionable comments posted: 4
♻️ Duplicate comments (2)
src/models/requests.py (2)
**1029-1070**: ⚠️ Potential issue | 🟠 Major

Add `@model_validator` to reject empty update payloads.

`VectorStoreUpdateRequest` accepts `{}` because all fields are optional. This allows no-op updates to reach the backend instead of failing fast with a 422.

🛠️ Proposed fix

```diff
 class VectorStoreUpdateRequest(BaseModel):
     ...
     model_config = {
         "extra": "forbid",
         ...
     }
+
+    @model_validator(mode="after")
+    def validate_non_empty_update(self) -> Self:
+        """Ensure at least one updatable field is provided."""
+        if self.name is None and self.expires_at is None and self.metadata is None:
+            raise ValueError(
+                "At least one of 'name', 'expires_at', or 'metadata' must be provided"
+            )
+        return self
```

As per coding guidelines, `src/models/**/*.py` should use `@model_validator` for custom validation in Pydantic models.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/models/requests.py` around lines 1029 - 1070, VectorStoreUpdateRequest currently allows an empty payload because all fields are optional; add a Pydantic v2 model validator to reject no-op updates by raising a validation error when every updatable field is missing or None. Implement an `@model_validator`(mode="after") on VectorStoreUpdateRequest that inspects the instance (checking name, expires_at, metadata) and raises a ValueError (or ValidationError) with a clear message if all three are None/absent, otherwise returns the model.
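The intended behavior of the validator can be demonstrated with a stdlib-only stand-in; this dataclass is a hypothetical sketch, not the real Pydantic model:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VectorStoreUpdateSketch:
    """Hypothetical stand-in for VectorStoreUpdateRequest, mirroring the check."""

    name: Optional[str] = None
    expires_at: Optional[int] = None
    metadata: Optional[dict[str, str]] = None

    def __post_init__(self) -> None:
        # Mirrors the proposed @model_validator(mode="after") logic.
        if self.name is None and self.expires_at is None and self.metadata is None:
            raise ValueError(
                "At least one of 'name', 'expires_at', or 'metadata' must be provided"
            )
```

An empty payload now fails at construction time, which is what surfaces to the caller as the 422 the review asks for.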
**1089-1098**: ⚠️ Potential issue | 🟠 Major

Enforce the documented `attributes` limits with a `@field_validator`.

The docstring specifies max 16 pairs, keys max 64 chars, and string values max 512 chars, but there's no validation. Invalid payloads will pass through to the backend.

🛠️ Proposed fix

```diff
+    @field_validator("attributes")
+    @classmethod
+    def validate_attributes(
+        cls, value: Optional[dict[str, str | float | bool]]
+    ) -> Optional[dict[str, str | float | bool]]:
+        """Validate attribute count and key/value sizes."""
+        if value is None:
+            return value
+        if len(value) > 16:
+            raise ValueError("A maximum of 16 attributes is allowed")
+        for key, attr_value in value.items():
+            if len(key) > 64:
+                raise ValueError(f"Attribute key '{key}' exceeds 64 characters")
+            if isinstance(attr_value, str) and len(attr_value) > 512:
+                raise ValueError(
+                    f"Attribute value for '{key}' exceeds 512 characters"
+                )
+        return value
```

As per coding guidelines, `src/models/**/*.py` should use `@field_validator` for custom validation in Pydantic models.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/models/requests.py` around lines 1089 - 1098, The attributes Field lacks validation against the documented limits; add a Pydantic `@field_validator` for "attributes" (on the model class containing the attributes field) that enforces: at most 16 key/value pairs, each key is a string ≤64 chars, values are only str|float|bool, and if a value is a string it is ≤512 chars; raise ValueError with a clear message on violation. Implement the validator using `@field_validator`("attributes", mode="after") (or appropriate mode for your Pydantic version) and return the validated dict unchanged when all checks pass.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/models/requests.py`:
- Around line 1100-1104: The chunking_strategy Field in the model currently uses
dict[str, Any] which bypasses validation; replace it with a shared Pydantic
model (e.g., ChunkingStrategy) and reuse that same model used by
VectorStoreCreateRequest so validation and docs are consistent. Update the Field
type from Optional[dict[str, Any]] to Optional[ChunkingStrategy], import or move
the ChunkingStrategy class into a common module, and adjust any example/defaults
to match the ChunkingStrategy schema (keeping same description and examples).
- Around line 962-1026: VectorStoreCreateRequest uses dict[str, Any] for
chunking_strategy and metadata which violates the no-Any rule; replace these
with explicit Pydantic models/typed structures (e.g., a ChunkingStrategy model
with fields like type, chunk_size, chunk_overlap and a Metadata model or
TypedDict for allowed keys) and update the VectorStoreCreateRequest definitions
to use those types (chunking_strategy: Optional[ChunkingStrategy], metadata:
Optional[Metadata]) and adjust model_config json_schema_extra examples to match
the new schemas; ensure the new models live in the same module (or imported) and
include validation constraints (types, ranges) instead of Any.
In `@src/models/responses.py`:
- Around line 2659-2668: Move the metadata field declaration so it appears
before the model_config attribute to match the file's convention of declaring
model fields first and class config last; specifically, relocate the
Optional[dict[str, Any]] Field(...) named metadata (with its description and
examples) above the model_config = {"extra": "forbid"} line in the same class,
leaving the metadata Field signature and contents unchanged and retaining
model_config as the final class attribute.
In `@tests/unit/app/test_routers.py`:
- Line 118: Replace the hard-coded router count literal with a single shared
constant (e.g., ROUTER_COUNT) and use it wherever the literal 23 is asserted;
update the assertions that reference app.routers (the one using assert
len(app.routers) == 23 and the other at the second occurrence) to assert
len(app.routers) == ROUTER_COUNT and define ROUTER_COUNT once at the top of the
test module or local to the test file so future router additions only require
one change.
---
Duplicate comments:
In `@src/models/requests.py`:
- Around line 1029-1070: VectorStoreUpdateRequest currently allows an empty
payload because all fields are optional; add a Pydantic v2 model validator to
reject no-op updates by raising a validation error when every updatable field is
missing or None. Implement an `@model_validator`(mode="after") on
VectorStoreUpdateRequest that inspects the instance (checking name, expires_at,
metadata) and raises a ValueError (or ValidationError) with a clear message if
all three are None/absent, otherwise returns the model.
- Around line 1089-1098: The attributes Field lacks validation against the
documented limits; add a Pydantic `@field_validator` for "attributes" (on the
model class containing the attributes field) that enforces: at most 16 key/value
pairs, each key is a string ≤64 chars, values are only str|float|bool, and if a
value is a string it is ≤512 chars; raise ValueError with a clear message on
violation. Implement the validator using `@field_validator`("attributes",
mode="after") (or appropriate mode for your Pydantic version) and return the
validated dict unchanged when all checks pass.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 1b09d28c-01cc-4b1d-93ea-0b3dfb0bde6f
📒 Files selected for processing (5)
- src/app/routers.py
- src/models/config.py
- src/models/requests.py
- src/models/responses.py
- tests/unit/app/test_routers.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: build-pr
- GitHub Check: E2E: library mode / ci
- GitHub Check: E2E: server mode / ci
- GitHub Check: E2E Tests for Lightspeed Evaluation job
🧰 Additional context used
📓 Path-based instructions (6)
`src/**/*.py`

📄 CodeRabbit inference engine (AGENTS.md)

- Use absolute imports for internal modules (e.g., `from authentication import get_auth_dependency`)
- Use `from llama_stack_client import AsyncLlamaStackClient` for Llama Stack imports
- Check `constants.py` for shared constants before defining new ones
- All modules start with descriptive docstrings explaining purpose
- Use `logger = get_logger(__name__)` from `log.py` for module logging
- Type aliases defined at module level for clarity
- All functions require docstrings with brief descriptions
- Use complete type annotations for function parameters and return types
- Use union types with modern syntax: `str | int` instead of `Union[str, int]`
- Use `Optional[Type]` for optional types in type annotations
- Use snake_case with descriptive, action-oriented names for functions (`get_`, `validate_`, `check_`)
- Avoid in-place parameter modification anti-patterns: return new data structures instead of modifying parameters
- Use `async def` for I/O operations and external API calls
- Handle `APIConnectionError` from Llama Stack in error handling
- Use `logger.debug()` for detailed diagnostic information
- Use `logger.info()` for general information about program execution
- Use `logger.warning()` for unexpected events or potential problems
- Use `logger.error()` for serious problems that prevented function execution
- All classes require descriptive docstrings explaining purpose
- Use PascalCase for class names with descriptive names and standard suffixes: `Configuration`, `Error`/`Exception`, `Resolver`, `Interface`
- Use complete type annotations for all class attributes; avoid using `Any`
- Follow Google Python docstring conventions for all modules, classes, and functions
- Include `Parameters:`, `Returns:`, `Raises:` sections in function docstrings as needed

Files: src/app/routers.py, src/models/config.py, src/models/requests.py, src/models/responses.py

`src/app/**/*.py`

📄 CodeRabbit inference engine (AGENTS.md)

- Use `from fastapi import APIRouter, HTTPException, Request, status, Depends` for FastAPI dependencies
- Use FastAPI `HTTPException` with appropriate status codes for API endpoint error handling

Files: src/app/routers.py

`tests/**/*.py`

📄 CodeRabbit inference engine (AGENTS.md)

- Use pytest for all unit and integration tests; do not use unittest
- Use the `pytest.mark.asyncio` marker for async unit tests

Files: tests/unit/app/test_routers.py

`tests/unit/**/*.py`

📄 CodeRabbit inference engine (AGENTS.md)

- Use `pytest-mock` for AsyncMock objects in unit tests

Files: tests/unit/app/test_routers.py

`src/models/config.py`

📄 CodeRabbit inference engine (AGENTS.md)

- All config uses Pydantic models extending `ConfigurationBase`
- Base class sets `extra="forbid"` to reject unknown fields in Pydantic configuration models
- Use type hints: `Optional[FilePath]`, `PositiveInt`, `SecretStr` for configuration fields
- Pydantic configuration models should extend `ConfigurationBase`

Files: src/models/config.py

`src/models/**/*.py`

📄 CodeRabbit inference engine (AGENTS.md)

- Use `@field_validator` and `@model_validator` for custom validation in Pydantic models
- Use `typing_extensions.Self` for model validators in type annotations
- Pydantic data models should extend `BaseModel`
- Include an `Attributes:` section in Pydantic model docstrings

Files: src/models/config.py, src/models/requests.py, src/models/responses.py
🧠 Learnings (3)
📚 Learning: 2026-01-12T10:58:40.230Z
Learnt from: blublinsky
Repo: lightspeed-core/lightspeed-stack PR: 972
File: src/models/config.py:459-513
Timestamp: 2026-01-12T10:58:40.230Z
Learning: In lightspeed-core/lightspeed-stack, for Python files under src/models, when a user claims a fix is done but the issue persists, verify the current code state before accepting the fix. Steps: review the diff, fetch the latest changes, run relevant tests, reproduce the issue, search the codebase for lingering references to the original problem, confirm the fix is applied and not undone by subsequent commits, and validate with local checks to ensure the issue is resolved.
Applied to files:
src/models/config.py, src/models/requests.py, src/models/responses.py
📚 Learning: 2026-02-25T07:46:33.545Z
Learnt from: asimurka
Repo: lightspeed-core/lightspeed-stack PR: 1211
File: src/models/responses.py:8-16
Timestamp: 2026-02-25T07:46:33.545Z
Learning: In the Python codebase, requests.py should use OpenAIResponseInputTool as Tool while responses.py uses OpenAIResponseTool as Tool. This difference is intentional due to differing schemas for input vs output tools in llama-stack-api. Apply this distinction consistently to other models under src/models (e.g., ensure request-related tools use the InputTool variant and response-related tools use the ResponseTool variant). If adding new tools, choose the corresponding InputTool or Tool class based on whether the tool represents input or output, and document the rationale in code comments.
Applied to files:
src/models/config.py, src/models/requests.py, src/models/responses.py
📚 Learning: 2026-02-23T14:11:46.950Z
Learnt from: asimurka
Repo: lightspeed-core/lightspeed-stack PR: 1198
File: src/utils/responses.py:184-188
Timestamp: 2026-02-23T14:11:46.950Z
Learning: The query request validator in the Responses API flow requires that `query_request.model` and `query_request.provider` must either both be specified or both be absent. The concatenated format (e.g., `model="provider/model"` with `provider=None`) is not permitted by the validator.
Applied to files:
src/models/requests.py
🔇 Additional comments (8)
src/app/routers.py (2)
**31-31**: Import update is consistent with router composition.

Adding `vector_stores` to the endpoint import list is correct and matches its later registration.

**58-58**: Vector stores router is mounted correctly under `/v1`.

This keeps the new endpoints aligned with existing v1 routing conventions.

tests/unit/app/test_routers.py (1)

**141-141**: New router assertions are accurate and valuable.

The added checks validate both registration and `/v1` prefixing for `vector_stores.router`.

Also applies to: 182-182

src/models/config.py (1)

**1036-1039**: RBAC action additions are consistent and correctly integrated.

These enum values align with the new vector store/file authorization decorators and fit the existing `Action` pattern.

src/models/responses.py (4)

**2686-2714**: LGTM! Model structure, docstrings, and type annotations are correct. The `openapi_response` method signature is consistent with other new models.

**2717-2751**: LGTM! Model fields align correctly with the explicit field mapping in `src/app/endpoints/vector_stores.py` (context snippet 3). Type annotations and docstrings are complete.

**2754-2796**: LGTM! Model structure is consistent with the endpoint construction pattern shown in context snippet 2. The `attributes` field type annotation correctly uses modern union syntax (`str | float | bool`).

**2799-2827**: LGTM! Model follows the established list response pattern consistent with `VectorStoresListResponse`. Type annotations and docstrings are complete.
jrobertboos
left a comment
Overall looks good, but CodeRabbit does bring up some good points that should be addressed.
PR Reviewer Guide 🔍

Here are some key observations to aid the review process:
Signed-off-by: Lucas <lyoon@redhat.com>
♻️ Duplicate comments (4)
src/app/endpoints/vector_stores.py (2)
**64-71**: ⚠️ Potential issue | 🟠 Major

Return HTTP 400 for upload validation failures.

`BadRequestError` is a client-side bad request, but this branch turns it into `503 Service Unavailable`, and `file_responses` doesn't document a 400 response. Callers will misclassify malformed uploads as transient backend outages.

Based on learnings, `src/app/**/*.py` should use FastAPI `HTTPException` with appropriate status codes for API endpoint error handling.

Also applies to: 488-491
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/app/endpoints/vector_stores.py` around lines 64 - 71, The endpoint currently maps client validation failures to ServiceUnavailableResponse and doesn't document HTTP 400; update the file_responses dict (the file_responses variable in src/app/endpoints/vector_stores.py) to include a 400 entry (e.g., BadRequest/BadRequestResponse.openapi_response()) and change the error handling where BadRequestError is converted into a 503 to instead raise a FastAPI HTTPException(status_code=400) or return the 400 response; apply the same fix to the other similar branch(s) in this file that currently return 503 for BadRequestError (the branches around the other ServiceUnavailableResponse usages).
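The intended status mapping can be illustrated independently of FastAPI. The exception classes below are local stand-ins for the llama-stack-client errors named above, and `status_for` is a hypothetical helper, not code from the PR:

```python
class BadRequestError(Exception):
    """Stand-in for the client-side validation error from llama-stack-client."""


class APIConnectionError(Exception):
    """Stand-in for a transient backend connectivity failure."""


def status_for(exc: Exception) -> int:
    """Map upload failures to HTTP status codes the way the review suggests."""
    if isinstance(exc, BadRequestError):
        return 400  # caller sent a malformed upload; retrying will not help
    if isinstance(exc, APIConnectionError):
        return 503  # backend outage; caller may retry later
    return 500  # anything else is an unexpected server error
```

With this split, `file_responses` would document 400 alongside the existing 503 entry.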
**451-473**: ⚠️ Potential issue | 🟠 Major

Reject oversized uploads before materializing them in memory.

This path loads the full body into `content` and then copies it again into `BytesIO`, so a single large upload can spike worker memory twice before the backend rejects it. Enforce a max upload size before `await file.read()`, and pass `(filename, file.file)` or `(filename, content)` directly to the client instead of allocating a second buffer.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/app/endpoints/vector_stores.py` around lines 451 - 473, The code currently reads the entire upload into memory via await file.read() and then copies it into BytesIO, which can double memory usage for large uploads; instead enforce a MAX_UPLOAD_SIZE constant before materializing the body (e.g., check file.spooled and/or file.headers/content-length or read in chunks up to the limit and abort if exceeded) and reject with an error if the size is over the limit, and when calling client.files.create use a zero-copy API by passing a tuple with the filename and the underlying file stream (e.g., (filename, file.file)) or the already-read content only once (e.g., (filename, content)) so you don't allocate a second buffer; update the logic around filename, BytesIO, and client.files.create to avoid creating file_bytes when using the stream tuple and ensure the error path returns before any large allocation.tests/unit/app/endpoints/test_vector_stores.py (1)
**180-206**: 🧹 Nitpick | 🔵 Trivial

Add coverage for the `extra_body` adapter path.

This success case only exercises `name`, so it never verifies the new wrapper logic that moves `provider_id`, `embedding_model`, and `embedding_dimension` into `extra_body`. A regression there would still leave the core behavior of this PR untested.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/app/endpoints/test_vector_stores.py` around lines 180 - 206, Update the success test to exercise the extra_body adapter path by providing provider_id, embedding_model, and embedding_dimension on the incoming VectorStoreCreateRequest so the wrapper moves them into extra_body; e.g., in test_create_vector_store_success (or a new test) set body = VectorStoreCreateRequest(name="test_store", provider_id="p1", embedding_model="m1", embedding_dimension=128), then assert the mocked mock_client.vector_stores.create was called with an extra_body containing those keys/values and that the returned VectorStore is still handled (response.id/name/status remain correct), keeping the existing mock setup for AsyncLlamaStackClientHolder.get_client and configuration.src/models/requests.py (1)
**995-1007**: 🛠️ Refactor suggestion | 🟠 Major

Replace `Any`-typed request bags with typed models.

`chunking_strategy` and `metadata` still accept arbitrary payloads here, so invalid shapes bypass validation at the API boundary and fail later against the backend. Reuse a shared `ChunkingStrategy` model and a bounded metadata type/dedicated model across these request schemas.

As per coding guidelines, `src/models/**/*.py` should "use complete type annotations for all class attributes; avoid using `Any`".

Also applies to: 1053-1057, 1116-1120
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/models/requests.py` around lines 995 - 1007, The chunking_strategy and metadata fields are typed as Optional[dict[str, Any]] which bypasses validation; replace them with concrete Pydantic models and bounded types: use the shared ChunkingStrategy model (e.g., replace chunking_strategy: Optional[dict[str, Any]] with Optional[ChunkingStrategy]) and introduce or reuse a Metadata model/typed mapping (e.g., Metadata or Dict[str, str] / a Pydantic model) for the metadata field so shapes are validated at the API boundary; update all occurrences of these fields (including the other similar definitions referenced around the later ranges) to reference the typed models and adjust imports accordingly.
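As a sketch of what a typed replacement for the `dict[str, Any]` field could look like, here is a stdlib stand-in; the field names and bounds are assumptions modeled on an OpenAI-style static chunking strategy, not a confirmed backend schema:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkingStrategySketch:
    """Hypothetical typed stand-in for the chunking_strategy request field."""

    type: str = "static"
    max_chunk_size_tokens: int = 800
    chunk_overlap_tokens: int = 400

    def __post_init__(self) -> None:
        # Basic bounds checks that a dict[str, Any] field cannot enforce.
        if self.max_chunk_size_tokens <= 0:
            raise ValueError("max_chunk_size_tokens must be positive")
        if not 0 <= self.chunk_overlap_tokens < self.max_chunk_size_tokens:
            raise ValueError(
                "chunk_overlap_tokens must be non-negative and smaller "
                "than max_chunk_size_tokens"
            )
```

In the real codebase this would be a Pydantic `BaseModel` with `@field_validator`s, shared between the create and add-file request schemas.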
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@src/app/endpoints/vector_stores.py`:
- Around line 64-71: The endpoint currently maps client validation failures to
ServiceUnavailableResponse and doesn't document HTTP 400; update the
file_responses dict (the file_responses variable in
src/app/endpoints/vector_stores.py) to include a 400 entry (e.g.,
BadRequest/BadRequestResponse.openapi_response()) and change the error handling
where BadRequestError is converted into a 503 to instead raise a FastAPI
HTTPException(status_code=400) or return the 400 response; apply the same fix to
the other similar branch(s) in this file that currently return 503 for
BadRequestError (the branches around the other ServiceUnavailableResponse
usages).
- Around line 451-473: The code currently reads the entire upload into memory
via await file.read() and then copies it into BytesIO, which can double memory
usage for large uploads; instead enforce a MAX_UPLOAD_SIZE constant before
materializing the body (e.g., check file.spooled and/or
file.headers/content-length or read in chunks up to the limit and abort if
exceeded) and reject with an error if the size is over the limit, and when
calling client.files.create use a zero-copy API by passing a tuple with the
filename and the underlying file stream (e.g., (filename, file.file)) or the
already-read content only once (e.g., (filename, content)) so you don't allocate
a second buffer; update the logic around filename, BytesIO, and
client.files.create to avoid creating file_bytes when using the stream tuple and
ensure the error path returns before any large allocation.
In `@src/models/requests.py`:
- Around line 995-1007: The chunking_strategy and metadata fields are typed as
Optional[dict[str, Any]] which bypasses validation; replace them with concrete
Pydantic models and bounded types: use the shared ChunkingStrategy model (e.g.,
replace chunking_strategy: Optional[dict[str, Any]] with
Optional[ChunkingStrategy]) and introduce or reuse a Metadata model/typed
mapping (e.g., Metadata or Dict[str, str] / a Pydantic model) for the metadata
field so shapes are validated at the API boundary; update all occurrences of
these fields (including the other similar definitions referenced around the
later ranges) to reference the typed models and adjust imports accordingly.
In `@tests/unit/app/endpoints/test_vector_stores.py`:
- Around line 180-206: Update the success test to exercise the extra_body
adapter path by providing provider_id, embedding_model, and embedding_dimension
on the incoming VectorStoreCreateRequest so the wrapper moves them into
extra_body; e.g., in test_create_vector_store_success (or a new test) set body =
VectorStoreCreateRequest(name="test_store", provider_id="p1",
embedding_model="m1", embedding_dimension=128), then assert the mocked
mock_client.vector_stores.create was called with an extra_body containing those
keys/values and that the returned VectorStore is still handled
(response.id/name/status remain correct), keeping the existing mock setup for
AsyncLlamaStackClientHolder.get_client and configuration.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: ed9adc5f-0cdf-4dc7-93e2-1881d261d61e
📒 Files selected for processing (4)
- src/app/endpoints/vector_stores.py
- src/models/requests.py
- tests/unit/app/endpoints/test_vector_stores.py
- tests/unit/models/requests/test_vector_store_requests.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: build-pr
- GitHub Check: E2E: server mode / ci
- GitHub Check: E2E Tests for Lightspeed Evaluation job
- GitHub Check: E2E: library mode / ci
🧰 Additional context used
📓 Path-based instructions (5)
tests/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Usepytest.mark.asynciomarker for async unit tests
Files:
tests/unit/models/requests/test_vector_store_requests.pytests/unit/app/endpoints/test_vector_stores.py
tests/unit/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Use
pytest-mockfor AsyncMock objects in unit tests
Files:
tests/unit/models/requests/test_vector_store_requests.pytests/unit/app/endpoints/test_vector_stores.py
src/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/**/*.py: Use absolute imports for internal modules (e.g.,from authentication import get_auth_dependency)
Usefrom llama_stack_client import AsyncLlamaStackClientfor Llama Stack imports
Checkconstants.pyfor shared constants before defining new ones
All modules start with descriptive docstrings explaining purpose
Uselogger = get_logger(__name__)fromlog.pyfor module logging
Type aliases defined at module level for clarity
All functions require docstrings with brief descriptions
Use complete type annotations for function parameters and return types
Use union types with modern syntax: `str | int` instead of `Union[str, int]`
Use `Optional[Type]` for optional types in type annotations
Use snake_case with descriptive, action-oriented names for functions (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns: return new data structures instead of modifying parameters
Use `async def` for I/O operations and external API calls
Handle `APIConnectionError` from Llama Stack in error handling
Use `logger.debug()` for detailed diagnostic information
Use `logger.info()` for general information about program execution
Use `logger.warning()` for unexpected events or potential problems
Use `logger.error()` for serious problems that prevented function execution
All classes require descriptive docstrings explaining purpose
Use PascalCase for class names with descriptive names and standard suffixes: `Configuration`, `Error`/`Exception`, `Resolver`, `Interface`
Use complete type annotations for all class attributes; avoid usingAny
Follow Google Python docstring conventions for all modules, classes, and functions
Include `Parameters:`, `Returns:`, `Raises:` sections in function docstrings as needed
Files:
src/app/endpoints/vector_stores.py
src/models/requests.py
src/app/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/app/**/*.py: Use `from fastapi import APIRouter, HTTPException, Request, status, Depends` for FastAPI dependencies
Use FastAPI `HTTPException` with appropriate status codes for API endpoint error handling
Files:
src/app/endpoints/vector_stores.py
src/models/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/models/**/*.py: Use `@field_validator` and `@model_validator` for custom validation in Pydantic models
Use `typing_extensions.Self` for model validators in type annotations
Pydantic data models should extend `BaseModel`
Include `Attributes:` section in Pydantic model docstrings
Files:
src/models/requests.py
🧠 Learnings (8)
📚 Learning: 2026-04-05T12:19:36.009Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-05T12:19:36.009Z
Learning: Applies to src/app/**/*.py : Use FastAPI `HTTPException` with appropriate status codes for API endpoint error handling
Applied to files:
src/app/endpoints/vector_stores.py
📚 Learning: 2026-04-05T12:19:36.009Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-05T12:19:36.009Z
Learning: Applies to src/app/**/*.py : Use `from fastapi import APIRouter, HTTPException, Request, status, Depends` for FastAPI dependencies
Applied to files:
src/app/endpoints/vector_stores.py
📚 Learning: 2026-04-06T20:18:07.852Z
Learnt from: major
Repo: lightspeed-core/lightspeed-stack PR: 1463
File: src/app/endpoints/rlsapi_v1.py:266-271
Timestamp: 2026-04-06T20:18:07.852Z
Learning: In the lightspeed-stack codebase, within `src/app/endpoints/` inference/MCP endpoints, treat `tools: Optional[list[Any]]` in MCP tool definitions as an intentional, consistent typing pattern (used across `query`, `responses`, `streaming_query`, `rlsapi_v1`). Do not raise or suggest this as a typing issue during code review; changing it in isolation could break endpoint typing consistency across the codebase.
Applied to files:
src/app/endpoints/vector_stores.py
📚 Learning: 2026-04-05T12:19:36.009Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-05T12:19:36.009Z
Learning: Applies to src/models/**/*.py : Use `field_validator` and `model_validator` for custom validation in Pydantic models
Applied to files:
src/models/requests.py
📚 Learning: 2026-02-23T14:11:46.950Z
Learnt from: asimurka
Repo: lightspeed-core/lightspeed-stack PR: 1198
File: src/utils/responses.py:184-188
Timestamp: 2026-02-23T14:11:46.950Z
Learning: The query request validator in the Responses API flow requires that `query_request.model` and `query_request.provider` must either both be specified or both be absent. The concatenated format (e.g., `model="provider/model"` with `provider=None`) is not permitted by the validator.
Applied to files:
src/models/requests.py
📚 Learning: 2026-04-05T12:19:36.009Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-05T12:19:36.009Z
Learning: Applies to src/**/*.py : Use complete type annotations for all class attributes; avoid using `Any`
Applied to files:
src/models/requests.py
📚 Learning: 2026-01-12T10:58:40.230Z
Learnt from: blublinsky
Repo: lightspeed-core/lightspeed-stack PR: 972
File: src/models/config.py:459-513
Timestamp: 2026-01-12T10:58:40.230Z
Learning: In lightspeed-core/lightspeed-stack, for Python files under src/models, when a user claims a fix is done but the issue persists, verify the current code state before accepting the fix. Steps: review the diff, fetch the latest changes, run relevant tests, reproduce the issue, search the codebase for lingering references to the original problem, confirm the fix is applied and not undone by subsequent commits, and validate with local checks to ensure the issue is resolved.
Applied to files:
src/models/requests.py
📚 Learning: 2026-02-25T07:46:33.545Z
Learnt from: asimurka
Repo: lightspeed-core/lightspeed-stack PR: 1211
File: src/models/responses.py:8-16
Timestamp: 2026-02-25T07:46:33.545Z
Learning: In the Python codebase, requests.py should use OpenAIResponseInputTool as Tool while responses.py uses OpenAIResponseTool as Tool. This difference is intentional due to differing schemas for input vs output tools in llama-stack-api. Apply this distinction consistently to other models under src/models (e.g., ensure request-related tools use the InputTool variant and response-related tools use the ResponseTool variant). If adding new tools, choose the corresponding InputTool or Tool class based on whether the tool represents input or output, and document the rationale in code comments.
Applied to files:
src/models/requests.py
🔇 Additional comments (1)
tests/unit/models/requests/test_vector_store_requests.py (1)
45-52: No changes needed; the current exception handling is correct.
`pytest.raises(ValueError)` will successfully catch Pydantic's `ValidationError` exceptions because in Pydantic v2, `ValidationError` is a subclass of `ValueError`. The validators in both `VectorStoreUpdateRequest` and `VectorStoreFileCreateRequest` raise `ValueError`, which Pydantic wraps in `ValidationError`; since `ValidationError` inherits from `ValueError`, the tests will pass as written.
While using the more explicit `pytest.raises(ValidationError)` would be consistent with other tests in the file (lines 152, 157) and the already-imported `ValidationError` class, the current implementation is functionally correct and will not cause test failures.
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/app/endpoints/vector_stores.py`:
- Around line 605-637: The retry loop around client.vector_stores.files.create
can re-raise a lock-related exception which gets caught by an outer
BadRequestError handler and incorrectly mapped to a 404; instead, detect when a
lock error occurs on the final attempt (use the is_lock_error and
is_last_attempt flags inside the except block for the create call) and raise a
proper HTTPException with your InternalServerErrorResponse (e.g., build
InternalServerErrorResponse(response="Failed to create vector store file",
cause="All retry attempts failed to create the vector store file") and raise
HTTPException(**response.model_dump())) rather than re-raising the original
exception; apply the same pattern to the similar block later (lines handling the
vector store file creation result around vs_file and the outer BadRequestError
mapping).
- Around line 338-346: The update response currently sets
metadata=vector_store.metadata or None which converts an empty dict to null;
change this to pass the metadata through unchanged
(metadata=vector_store.metadata) in the VectorStoreResponse returned by the PUT
handler in vector_stores.py so empty dicts remain {} and the response shape
matches other vector-store handlers.
- Around line 515-517: The local import of BytesIO inside the request path
causes pylint C0415; move the import to module scope by adding "from io import
BytesIO" at the top of the file (module imports) and remove the inline import
inside the handler; update any references to BytesIO in the function (e.g., in
the upload/handler function where the comment and inline import appear) to use
the module-level symbol so the request path no longer performs a local import.
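The lock-error retry fix described in the first comment above can be sketched roughly as follows. `LockError`, `InternalServerError`, and the backoff parameters are hypothetical stand-ins; the point is that the final failed attempt raises a 500-style error instead of re-raising the lock exception, so an outer `BadRequestError` handler cannot map it to a 404.

```python
import asyncio

class LockError(Exception):
    """Stand-in for the backend's lock error on files.create."""

class InternalServerError(Exception):
    """Stand-in for raising HTTPException(status_code=500, ...)."""

async def add_file_with_retry(create_file, attempts: int = 3, base_delay: float = 0.01):
    """Retry create_file on LockError with exponential backoff.

    On the final failed attempt, raise a 500-style error rather than
    re-raising LockError, so outer handlers cannot misclassify it.
    """
    for attempt in range(attempts):
        try:
            return await create_file()
        except LockError:
            if attempt == attempts - 1:  # last attempt: map to 500
                raise InternalServerError(
                    "All retry attempts failed to create the vector store file"
                )
            await asyncio.sleep(base_delay * 2**attempt)
    return None  # unreachable; keeps type checkers happy

# Usage: a create() stub whose index is always locked.
async def always_locked():
    raise LockError("index is locked")

try:
    asyncio.run(add_file_with_retry(always_locked))
except InternalServerError as exc:
    print(exc)  # All retry attempts failed to create the vector store file
```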
In `@tests/unit/app/endpoints/test_vector_stores.py`:
- Around line 882-886: The test currently allocates a 200MB bytes object for
mock_file.read.return_value which is unnecessary; keep mock_file.size = 200 *
1024 * 1024 to trigger the size guard but change mock_file.read to raise if
invoked (e.g., set mock_file.read.side_effect to an exception or AssertionError)
so the test verifies the guard without allocating the large payload and fails if
the code incorrectly calls read().
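The suggested test change, keeping the reported size to trigger the guard while making `read()` fail loudly, can be sketched like this; `check_upload_size` and `MAX_UPLOAD_BYTES` are hypothetical stand-ins for the endpoint's actual guard.

```python
from unittest.mock import MagicMock

MAX_UPLOAD_BYTES = 100 * 1024 * 1024  # hypothetical size limit for the guard

def check_upload_size(upload):
    """Reject oversized uploads before any bytes are read."""
    if upload.size > MAX_UPLOAD_BYTES:
        raise ValueError("file too large")
    return upload.read()

# A 200 MB *reported* size triggers the guard; read() raises instead of
# returning a real 200 MB payload, so the test fails loudly if the code
# under test ever calls it.
mock_file = MagicMock()
mock_file.size = 200 * 1024 * 1024
mock_file.read.side_effect = AssertionError("read() must not be called")

try:
    check_upload_size(mock_file)
except ValueError as exc:
    print(exc)  # file too large
mock_file.read.assert_not_called()
```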
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: d5b2653e-3339-4655-9839-47fea3d81e43
📒 Files selected for processing (3)
src/app/endpoints/vector_stores.py
src/constants.py
tests/unit/app/endpoints/test_vector_stores.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: build-pr
- GitHub Check: E2E: server mode / ci
- GitHub Check: E2E Tests for Lightspeed Evaluation job
- GitHub Check: E2E: library mode / ci
🧰 Additional context used
📓 Path-based instructions (4)
src/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/**/*.py: Use absolute imports for internal modules (e.g., `from authentication import get_auth_dependency`)
Use `from llama_stack_client import AsyncLlamaStackClient` for Llama Stack imports
Check `constants.py` for shared constants before defining new ones
All modules start with descriptive docstrings explaining purpose
Use `logger = get_logger(__name__)` from `log.py` for module logging
Type aliases defined at module level for clarity
All functions require docstrings with brief descriptions
Use complete type annotations for function parameters and return types
Use union types with modern syntax: `str | int` instead of `Union[str, int]`
Use `Optional[Type]` for optional types in type annotations
Use snake_case with descriptive, action-oriented names for functions (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns: return new data structures instead of modifying parameters
Use `async def` for I/O operations and external API calls
Handle `APIConnectionError` from Llama Stack in error handling
Use `logger.debug()` for detailed diagnostic information
Use `logger.info()` for general information about program execution
Use `logger.warning()` for unexpected events or potential problems
Use `logger.error()` for serious problems that prevented function execution
All classes require descriptive docstrings explaining purpose
Use PascalCase for class names with descriptive names and standard suffixes: `Configuration`, `Error`/`Exception`, `Resolver`, `Interface`
Use complete type annotations for all class attributes; avoid usingAny
Follow Google Python docstring conventions for all modules, classes, and functions
Include `Parameters:`, `Returns:`, `Raises:` sections in function docstrings as needed
Files:
src/constants.py
src/app/endpoints/vector_stores.py
src/app/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/app/**/*.py: Use `from fastapi import APIRouter, HTTPException, Request, status, Depends` for FastAPI dependencies
Use FastAPI `HTTPException` with appropriate status codes for API endpoint error handling
Files:
src/app/endpoints/vector_stores.py
tests/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Use `pytest.mark.asyncio` marker for async unit tests
Files:
tests/unit/app/endpoints/test_vector_stores.py
tests/unit/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Use `pytest-mock` for AsyncMock objects in unit tests
Files:
tests/unit/app/endpoints/test_vector_stores.py
🧠 Learnings (3)
📚 Learning: 2026-04-05T12:19:36.009Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-05T12:19:36.009Z
Learning: Central `constants.py` for shared constants with descriptive comments
Applied to files:
src/constants.py
📚 Learning: 2026-04-05T12:19:36.009Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-05T12:19:36.009Z
Learning: Applies to src/app/**/*.py : Use FastAPI `HTTPException` with appropriate status codes for API endpoint error handling
Applied to files:
src/app/endpoints/vector_stores.py
📚 Learning: 2026-04-06T20:18:07.852Z
Learnt from: major
Repo: lightspeed-core/lightspeed-stack PR: 1463
File: src/app/endpoints/rlsapi_v1.py:266-271
Timestamp: 2026-04-06T20:18:07.852Z
Learning: In the lightspeed-stack codebase, within `src/app/endpoints/` inference/MCP endpoints, treat `tools: Optional[list[Any]]` in MCP tool definitions as an intentional, consistent typing pattern (used across `query`, `responses`, `streaming_query`, `rlsapi_v1`). Do not raise or suggest this as a typing issue during code review; changing it in isolation could break endpoint typing consistency across the codebase.
Applied to files:
src/app/endpoints/vector_stores.py
🪛 GitHub Actions: Python linter
src/app/endpoints/vector_stores.py
[error] 517-517: pylint failed with C0415: Import outside toplevel (io.BytesIO) (import-outside-toplevel).
return VectorStoreResponse(
    id=vector_store.id,
    name=vector_store.name,
    created_at=vector_store.created_at,
    last_active_at=vector_store.last_active_at,
    expires_at=vector_store.expires_at,
    status=vector_store.status or "unknown",
    usage_bytes=vector_store.usage_bytes or 0,
    metadata=vector_store.metadata or None,
Don't collapse empty metadata to null in the update response.
metadata=vector_store.metadata or None turns a legitimate empty dict into None. The other vector-store handlers return metadata unchanged, so PUT /vector-stores/{id} ends up with a different response shape for the same value.
Proposed fix
return VectorStoreResponse(
    id=vector_store.id,
    name=vector_store.name,
    created_at=vector_store.created_at,
    last_active_at=vector_store.last_active_at,
    expires_at=vector_store.expires_at,
    status=vector_store.status or "unknown",
    usage_bytes=vector_store.usage_bytes or 0,
-   metadata=vector_store.metadata or None,
+   metadata=vector_store.metadata,
)
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
return VectorStoreResponse(
    id=vector_store.id,
    name=vector_store.name,
    created_at=vector_store.created_at,
    last_active_at=vector_store.last_active_at,
    expires_at=vector_store.expires_at,
    status=vector_store.status or "unknown",
    usage_bytes=vector_store.usage_bytes or 0,
    metadata=vector_store.metadata,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/app/endpoints/vector_stores.py` around lines 338 - 346, The update
response currently sets metadata=vector_store.metadata or None which converts an
empty dict to null; change this to pass the metadata through unchanged
(metadata=vector_store.metadata) in the VectorStoreResponse returned by the PUT
handler in vector_stores.py so empty dicts remain {} and the response shape
matches other vector-store handlers.
Signed-off-by: Lucas <lyoon@redhat.com>
a14ac62 to 2bd58de (Compare)
Signed-off-by: Lucas <lyoon@redhat.com>
f684423 to ffcc389 (Compare)
Signed-off-by: Lucas <lyoon@redhat.com>
tisnik
left a comment
It looks good in overall, thank you. We need to have a task to add integration tests and another to add end to end tests, and yet another to add documentation. I'll take care of it.
@router.delete(
    "/vector-stores/{vector_store_id}",
    responses={"204": {"description": "Vector store deleted"}},
I would prefer this endpoint’s behavior to align with the existing DELETE v1/conversations/{conversation_id}. Namely:
- Do not return 404 at all,
- Return 200 with a body indicating whether the resource was actually deleted (see `ConversationDeleteResponse`),
- Continue returning the standard error responses 401, 403, 500, 503 as usual (update the responses dict accordingly).
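A rough sketch of the suggested behavior; `NotFoundError` and the response field names are assumptions mirroring the `ConversationDeleteResponse` pattern, not the actual client exception or model.

```python
import asyncio
from dataclasses import dataclass
from unittest.mock import AsyncMock

class NotFoundError(Exception):
    """Stand-in for the client's not-found error."""

@dataclass
class VectorStoreDeleteResponse:
    """Assumed shape, mirroring ConversationDeleteResponse."""
    vector_store_id: str
    success: bool
    response: str

async def delete_vector_store_endpoint(client, vector_store_id: str) -> VectorStoreDeleteResponse:
    """Always return a 200-style body stating whether deletion happened."""
    try:
        await client.vector_stores.delete(vector_store_id=vector_store_id)
        return VectorStoreDeleteResponse(vector_store_id, True, "Vector store deleted successfully")
    except NotFoundError:
        # No 404: report the miss in the response body instead.
        return VectorStoreDeleteResponse(vector_store_id, False, "Vector store not found")

# Usage with a mocked client: deletion succeeds.
client = AsyncMock()
print(asyncio.run(delete_vector_store_endpoint(client, "vs_1")).success)  # True
```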
Signed-off-by: Lucas <lyoon@redhat.com>
Description
The proposed architectural update aims to decouple the Red Hat Developer Hub (RHDH) plugin from its current tight dependency on the Llama Stack by centralizing all vector store operations within lightspeed-core. Currently, the RHDH plugin directly manages sessions and documents through multiple Llama Stack API calls, making it difficult to switch to alternative vector stores like ChromaDB or Pinecone without significant refactoring. By implementing a proxy/wrapper pattern, lightspeed-core will host domain-focused REST endpoints and handle all business logic, while the RHDH plugin is reduced to a thin HTTP client. This transition ensures a single source of truth for notebook logic, improves maintainability, and allows for backend infrastructure changes without impacting the frontend plugin.
Type of change
Tools used to create PR
Identify any AI code assistants used in this PR (for transparency and review context)
Related Tickets & Documents
Checklist before requesting a review
Testing
Summary by CodeRabbit
New Features
API / Schemas
Tests
Chores