
perf: use deque for iterator object cache #1968

Merged
dirkkul merged 3 commits into weaviate:main from giulio-leone:fix/iterator-deque-performance
Mar 2, 2026


@giulio-leone
Contributor

Problem

Both _ObjectIterator.__next__() and _ObjectAIterator.__anext__() consume cached objects front-to-back via list.pop(0), which is O(n) per removal because every remaining element is shifted left. With a cache of n objects (the default ITERATOR_CACHE_SIZE is typically 100), draining each batch therefore costs O(n²) element moves in total.
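
For illustration, the two drain patterns can be compared in isolation. This is a minimal sketch, not code from weaviate/collections/iterator.py: drain_list, drain_deque, and time_drains are hypothetical helper names, and the timings depend on the machine and the cache size.

```python
from collections import deque
from timeit import timeit


def drain_list(items):
    """Drain front-to-back with list.pop(0): each pop shifts the remaining elements."""
    cache = list(items)
    out = []
    while cache:
        out.append(cache.pop(0))
    return out


def drain_deque(items):
    """Drain front-to-back with deque.popleft(): each pop is O(1)."""
    cache = deque(items)
    out = []
    while cache:
        out.append(cache.popleft())
    return out


def time_drains(n, repeats=100):
    """Return (list_seconds, deque_seconds) for draining n items `repeats` times."""
    data = list(range(n))
    list_time = timeit(lambda: drain_list(data), number=repeats)
    deque_time = timeit(lambda: drain_deque(data), number=repeats)
    return list_time, deque_time
```

Both functions yield the objects in the same order, so only the cost of each removal differs. At a cache size around 100 the gap is likely small in absolute terms; calling time_drains with a larger n makes the quadratic behavior easier to see.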

Solution

Switch __iter_object_cache from list to collections.deque, replacing .pop(0) with .popleft() for O(1) front removal. The res.objects list returned by fetch_objects() is wrapped in deque() on assignment.

Changes

  • weaviate/collections/iterator.py:
    • Import deque from collections
    • _ObjectIterator: type, init, __iter__, __next__ updated
    • _ObjectAIterator: type, init, __aiter__, __anext__ updated
    • .pop(0) → .popleft()
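
The shape of the change can be sketched with a simplified stand-in. Everything below is an assumption made for illustration: CachedIterator, fetch_page, and make_fake_fetch are hypothetical names, and the real _ObjectIterator calls fetch_objects() and tracks a cursor in ways not shown here.

```python
from collections import deque
from typing import Callable, Deque, Generic, List, Optional, TypeVar

T = TypeVar("T")


class CachedIterator(Generic[T]):
    """Hypothetical stand-in for _ObjectIterator, not the weaviate implementation."""

    def __init__(
        self,
        fetch_page: Callable[[Optional[T], int], List[T]],
        page_size: int = 100,
    ) -> None:
        self._fetch_page = fetch_page  # assumed signature: (last_item, limit) -> list
        self._page_size = page_size
        self._cache: Deque[T] = deque()  # previously a list
        self._last: Optional[T] = None

    def __iter__(self) -> "CachedIterator[T]":
        # Reset the cache and cursor so the iterator can be restarted.
        self._cache = deque()
        self._last = None
        return self

    def __next__(self) -> T:
        if not self._cache:
            page = self._fetch_page(self._last, self._page_size)
            if not page:
                raise StopIteration
            self._cache = deque(page)  # wrap the fetched list on assignment
        item = self._cache.popleft()  # previously self._cache.pop(0)
        self._last = item
        return item


def make_fake_fetch(total: int) -> Callable[[Optional[int], int], List[int]]:
    """Build a fake paged source of the integers 0..total-1 for demonstration."""

    def fetch(after: Optional[int], limit: int) -> List[int]:
        start = 0 if after is None else after + 1
        return list(range(start, min(start + limit, total)))

    return fetch
```

With this fake source, list(CachedIterator(make_fake_fetch(7), page_size=3)) walks three pages and returns the seven integers in order. The async variant would follow the same pattern with __aiter__ and __anext__.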

Testing

  • Syntax verified via ast.parse()
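
A syntax check of this kind might look like the sketch below. The helper name is_valid_python is hypothetical, and the PR does not show the exact command that was run. Note that ast.parse() only confirms that the source parses; it does not execute the code or check its behavior.

```python
import ast


def is_valid_python(source: str) -> bool:
    """Return True if `source` parses as Python, False on a syntax error."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True
```

For example, is_valid_python("cache.popleft()") is true even when cache is undefined, so a behavioral test of the iteration order would be a useful complement.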


@orca-security-eu orca-security-eu bot left a comment


Orca Security Scan Summary

Check (status): issues by priority
  • Infrastructure as Code (Passed): high 0, medium 0, low 0, info 0
  • SAST (Passed): high 0, medium 0, low 0, info 0
  • Secrets (Passed): high 0, medium 0, low 0, info 0
  • Vulnerabilities (Passed): high 0, medium 0, low 0, info 0

@weaviate-git-bot

To avoid any confusion in the future about your contribution to Weaviate, we work with a Contributor License Agreement. If you agree, you can simply add a comment to this PR that you agree with the CLA so that we can merge.

beep boop - the Weaviate bot 👋🤖

PS:
Are you already a member of the Weaviate Forum?

Both _ObjectIterator and _ObjectAIterator consume cached objects
front-to-back via list.pop(0), which is O(n) per removal. With the
default ITERATOR_CACHE_SIZE (typically 100), each batch drain is O(n²).
Switch to collections.deque with popleft() for O(1) front removal.
@giulio-leone giulio-leone force-pushed the fix/iterator-deque-performance branch from 3fb05ac to fb5d117 Compare February 28, 2026 14:42
@giulio-leone
Contributor Author

I have read the Contributor License Agreement and I agree to its terms.

@dirkkul
Collaborator

dirkkul commented Mar 2, 2026

Thanks for this! Could you fix the linter/formatting issues? Then we can merge it

Fixes linting issue flagged by ruff (F401: unused import).
Copilot AI review requested due to automatic review settings March 2, 2026 06:28
Contributor

Copilot AI left a comment


Pull request overview

This PR optimizes object iteration in the collections layer by replacing an internal list-based cache (with O(n) front removals) with a collections.deque (O(1) front removals), improving performance when draining iterator batches.

Changes:

  • Switch iterator object caches in _ObjectIterator and _ObjectAIterator from list to collections.deque.
  • Replace pop(0) calls with popleft() and wrap fetched res.objects in deque(...) on assignment.


@giulio-leone
Contributor Author

Fixed the linting/formatting issues (removed unused List import flagged by ruff F401) — should be good to merge now. Thanks for the review!

@dirkkul dirkkul merged commit 9bc3a6a into weaviate:main Mar 2, 2026
123 checks passed

4 participants