feat(supabase): add async support to SupabaseGroongaDocumentStore and SupabaseGroongaBM25Retriever#3380
Conversation
… SupabaseGroongaBM25Retriever Implements the full async interface for the Groonga document store using supabase-py's AsyncClient and acreate_client. SupabaseGroongaBM25Retriever.run_async() previously delegated to the synchronous run() path instead of using real async I/O. Closes deepset-ai#3379
|
@Aftabbs thanks for your contribution - can you please make the CI green so that I can start the review |
|
Regarding the tests, since we are testing against a local Docker instance no need to mock up; please convert them to integration tests |
- Fix ruff format violation in test_groonga_document_store.py: collapse 3-line MagicMock call to 1 line to match 120-char line length limit - Add async integration tests to test_groonga_integration.py: TestGroongaDocumentStoreAsyncIntegration covers warm_up_async(), write_documents_async(), count_documents_async(), filter_documents_async(), delete_all_documents_async(), delete_documents_async(), and _groonga_retrieval_async() against a live Docker stack TestGroongaBM25RetrieverAsyncIntegration covers run_async() with real data
Coverage report (supabase)Click to see where and how coverage changed
This report was generated by python-coverage-comment-action |
||||||||||||||||||||||||||||||||||||
|
Hey @Aftabbs, thanks for the fix to The new Looking at others, none of them expose a async def _initialize_async_client(self) -> None:
if self._async_client is None:
self._async_client = qdrant_client.AsyncQdrantClient(...)
await self._set_up_collection_async()
async def count_documents_async(self) -> int:
await self._initialize_async_client() |
davidsbatista
left a comment
There was a problem hiding this comment.
I have handled the remaining issues. Thanks for your contribution
Summary
warm_up_async()and a full set of*_async()methods toSupabaseGroongaDocumentStoreusingsupabase-py'sAsyncClientandacreate_clientSupabaseGroongaBM25Retriever.run_async(), which was a stub that delegated to the synchronousrun()instead of using real async I/OCloses #3379
Root Cause
supabase-pyships two parallel clients:create_client(sync) andacreate_client(async). The original implementation only wired up the sync client.run_async()on the retriever simply calledself.run(), blocking the event loop on every invocation — makingAsyncPipelineusage effectively synchronous.Changes
groonga_document_store.py_async_clientfield;warm_up_async(),_setup_table_async(), and async counterparts for all public methods (count_documents_async,filter_documents_async,write_documents_async,delete_documents_async,delete_all_documents_async,delete_by_filter_async,update_by_filter_async,_groonga_retrieval_async)groonga_bm25_retriever.pyrun_async()to call_groonga_retrieval_async(); removed stale docstring note saying async is unsupportedtest_groonga_document_store.pyTestDocumentStoreAsyncclass with 12 async unit tests usingAsyncMockfor the async clienttest_groonga_retriever.pytest_run_asyncto verify_groonga_retrieval_asyncis actually awaited (not the sync path)Testing
TestDocumentStoreAsync— 12 tests covering warm_up, count, filter, write (OVERWRITE/SKIP/FAIL policies), delete, delete_all, retrieval, and uninitialized-client guardtest_run_async_uses_async_retrieval— asserts_groonga_retrieval_asyncis awaitedruff checkpasses with no errorsImpact
Users running
SupabaseGroongaBM25Retrieverinside anAsyncPipelinenow get true non-blocking I/O instead of a blocking call that defeats the purpose of async execution.