perf: use deque for iterator object cache #1968
To avoid any confusion in the future about your contribution to Weaviate, we work with a Contributor License Agreement. If you agree, you can simply add a comment to this PR that you agree with the CLA so that we can merge.
Both _ObjectIterator and _ObjectAIterator consume cached objects front-to-back via list.pop(0), which is O(n) per removal. With the default ITERATOR_CACHE_SIZE (typically 100), each batch drain is O(n²). Switch to collections.deque with popleft() for O(1) front removal.
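The complexity claim in this commit message can be checked with a small timing sketch. The function names `drain_list` and `drain_deque` are hypothetical and not part of the Weaviate client; the sizes are arbitrary, and at a cache size of 100 the absolute difference is likely to be small.

```python
import timeit
from collections import deque


def drain_list(n: int) -> list:
    """Drain a list front-to-back with pop(0), as the old cache did."""
    cache = list(range(n))
    out = []
    while cache:
        # pop(0) shifts every remaining element left: O(n) per call.
        out.append(cache.pop(0))
    return out


def drain_deque(n: int) -> list:
    """Drain a deque front-to-back with popleft(): O(1) per call."""
    cache = deque(range(n))
    out = []
    while cache:
        out.append(cache.popleft())
    return out


if __name__ == "__main__":
    # Illustrative sizes only; 100 mirrors the cache size mentioned above.
    for n in (100, 10_000):
        t_list = timeit.timeit(lambda: drain_list(n), number=20)
        t_deque = timeit.timeit(lambda: drain_deque(n), number=20)
        print(f"n={n}: list.pop(0) {t_list:.4f}s, deque.popleft() {t_deque:.4f}s")
```

Both functions return the items in the same order, so the swap changes cost, not behavior.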
3fb05ac to fb5d117
I have read the Contributor License Agreement and I agree to its terms.
Thanks for this! Could you fix the linter/formatting issues? Then we can merge it.
Fixes linting issue flagged by ruff (F401: unused import).
Pull request overview
This PR optimizes object iteration in the collections layer by replacing an internal list-based cache (with O(n) front removals) with a collections.deque (O(1) front removals), improving performance when draining iterator batches.
Changes:
- Switch iterator object caches in _ObjectIterator and _ObjectAIterator from list to collections.deque.
- Replace pop(0) calls with popleft() and wrap fetched res.objects in deque(...) on assignment.
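The pattern described in these changes can be sketched roughly as follows. This is a simplified stand-in, not the Weaviate implementation: `CachedIterator`, `make_fetcher`, and the integer cursor are hypothetical names and assumptions used only to show a deque-backed cache that is refilled page by page.

```python
from collections import deque
from typing import Callable, Deque, List, Optional


class CachedIterator:
    """Hypothetical iterator that buffers one fetched page in a deque."""

    def __init__(self, fetch_page: Callable[[Optional[int]], List[int]]) -> None:
        self._fetch_page = fetch_page
        self._cache: Deque[int] = deque()
        self._last: Optional[int] = None  # cursor: last item handed out

    def __iter__(self) -> "CachedIterator":
        self._cache = deque()
        self._last = None
        return self

    def __next__(self) -> int:
        if not self._cache:
            # Wrap the fetched list in a deque on assignment.
            self._cache = deque(self._fetch_page(self._last))
            if not self._cache:
                raise StopIteration
        item = self._cache.popleft()  # O(1) front removal
        self._last = item
        return item


def make_fetcher(data: List[int], page_size: int) -> Callable[[Optional[int]], List[int]]:
    """Build a fake paged fetch over unique items (assumption for the sketch)."""

    def fetch(after: Optional[int]) -> List[int]:
        start = 0 if after is None else data.index(after) + 1
        return data[start:start + page_size]

    return fetch


if __name__ == "__main__":
    print(list(CachedIterator(make_fetcher(list(range(10)), 3))))
```

An async variant would follow the same shape with `__aiter__` and `__anext__`, raising `StopAsyncIteration` instead.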
Fixed the linting/formatting issues (removed unused |
Problem
Both _ObjectIterator.__next__() and _ObjectAIterator.__anext__() consume cached objects front-to-back via list.pop(0), which is O(n) per removal. With the default ITERATOR_CACHE_SIZE (typically 100), each batch drain is O(n²).
Solution
Switch __iter_object_cache from list to collections.deque, replacing .pop(0) with .popleft() for O(1) front removal. The res.objects list returned by fetch_objects() is wrapped in deque() on assignment.
Changes
weaviate/collections/iterator.py:
- Import deque from collections
- _ObjectIterator: type, init, __iter__, __next__ updated
- _ObjectAIterator: type, init, __aiter__, __anext__ updated
- .pop(0) → .popleft()
Testing
ast.parse()
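A syntax check with ast.parse() might look like the sketch below. The helper name `parses_cleanly` and the sample snippet are hypothetical; in practice the source text would be read from the changed file.

```python
import ast


def parses_cleanly(source: str) -> bool:
    """Return True if the source is syntactically valid Python."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


if __name__ == "__main__":
    snippet = (
        "from collections import deque\n"
        "cache = deque([1, 2, 3])\n"
        "first = cache.popleft()\n"
    )
    print(parses_cleanly(snippet))
    print(parses_cleanly("cache = deque(["))
```

Note that ast.parse() only confirms the file parses; it does not execute the code or exercise the iterators.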