Fix O(N^2) entity addition in CallbackGroup#3109
Fix O(N^2) entity addition in CallbackGroup#3109PavelGuzenfeld wants to merge 3 commits intoros2:rollingfrom
Conversation
Sanitizer ResultsAddressSanitizer (ASan)All 3 callback group-related tests pass clean with ASan — no memory errors detected: ThreadSanitizer (TSan)TSan reports pre-existing warnings in fastrtps ( Benchmark (reproducer from #2942)
|
8fa1ba7 to
d25878b
Compare
|
Thank for you for the PR! I believe that this PR was written and/or opened by an AI agent. Please note that the OSRF has a policy on the use of tools like generative AI tools. The tl;dr is that we accept contributions written partly or wholly with generative AI, but you must disclose it (including the tool/model), which I don't see here. It is also your responsibility to ensure that your contribution is valid. Also, although some bug reports or feature requests mention a specific distro, we prefer making changes to the Rolling distro ( |
|
Thanks for the review and for pointing out the policy! AI disclosure: This PR was authored with the assistance of Claude (Anthropic) as a generative AI coding tool. I have reviewed and validated the changes. Branch target: Will retarget to |
|
Correction on the AI disclosure: the model used was Claude Opus (Anthropic). |
d25878b to
765e0ae
Compare
|
Rebased onto The bug exists on |
765e0ae to
bb770e9
Compare
|
The CI failure is not caused by this PR. The build error is in This function was added to rclcpp rolling HEAD by #3089, but the CI's This PR only modifies |
|
I am not sure about this patch as a whole. The scenario, that someone adds timers and removes hundreds of timers is super artificial, I don't see a real word scenario were someone would do this. |
bb770e9 to
70fec3f
Compare
|
@jmachowinski Thanks for the review! The concern that adding/removing hundreds of timers is artificial is understandable, but this O(N²) path is hit by several real-world patterns documented in existing rclcpp issues:
The key insight is that this fix has zero runtime cost — it doesn't add overhead or complexity to the hot path. It simply moves the expired-entry cleanup from What changed in this push:
Test results (Docker, ros:rolling):
|
70fec3f to
8bf323c
Compare
| std::vector<rclcpp::SubscriptionBase::WeakPtr> subscription_ptrs_; | ||
| std::vector<rclcpp::TimerBase::WeakPtr> timer_ptrs_; | ||
| std::vector<rclcpp::ServiceBase::WeakPtr> service_ptrs_; | ||
| std::vector<rclcpp::ClientBase::WeakPtr> client_ptrs_; | ||
| std::vector<rclcpp::Waitable::WeakPtr> waitable_ptrs_; |
There was a problem hiding this comment.
Don't use mutable here, better change the function collect_all_ptrs to be non const
rclcpp/src/rclcpp/callback_group.cpp
Outdated
| template<typename T, typename Func> | ||
| void collect_and_compact( | ||
| std::vector<typename T::WeakPtr> & ptrs, | ||
| const Func & func) | ||
| { | ||
| size_t write_idx = 0; | ||
| for (size_t read_idx = 0; read_idx < ptrs.size(); ++read_idx) { | ||
| auto ref_ptr = ptrs[read_idx].lock(); | ||
| if (ref_ptr) { | ||
| func(ref_ptr); | ||
| if (write_idx != read_idx) { | ||
| ptrs[write_idx] = std::move(ptrs[read_idx]); | ||
| } | ||
| ++write_idx; | ||
| } | ||
| } | ||
| ptrs.resize(write_idx); | ||
| } | ||
| } // namespace |
There was a problem hiding this comment.
Use remove_if or erase_if (We just bumped to C++20)
|
Addressed both review comments:
Also fixed uncrustify formatting in the test file. Verified in Docker (
|
2130ddd to
023724b
Compare
Move expired weak_ptr cleanup from add_* methods (O(N) per call, O(N²) total when adding N entities) into collect_all_ptrs via a new collect_and_compact helper. collect_all_ptrs already iterates all entries on every executor spin, so cleanup happens at zero extra cost. The add_* methods become O(1) (just push_back). This fixes the quadratic slowdown reported in ros2#2942 (429ms → 6ms for 10,000 timers). This change is safe because: - collect_all_ptrs already skipped expired entries; now it also removes them in the same pass - The executor calls collect_all_ptrs on every spin iteration, so expired entries are cleaned up promptly - All find_*_ptrs_if methods already handle expired weak_ptrs gracefully Fixes ros2#2942 Signed-off-by: Pavel Guzenfeld <pavelguzenfeld@gmail.com>
Tests verify that: - Expired weak_ptrs are compacted during collect_all_ptrs - collect_all_ptrs only yields live entities - Adding N timers is not quadratic (5000 timers in < 5s) - Interleaved add/remove cycles produce correct entity counts - Mixed entity types (timers + subscriptions) are cleaned independently Signed-off-by: Pavel Guzenfeld <pavelguzenfeld@gmail.com>
- Remove mutable from entity vectors, make collect_all_ptrs non-const - Replace collect_and_compact helper with std::remove_if (erase-remove idiom) - Remove leaked test_time_source_deadlock CMake entry from another branch - Fix uncrustify formatting in test file Signed-off-by: Pavel Guzenfeld <pavelguzenfeld@gmail.com>
023724b to
98e7db3
Compare
|
The CI failure ( This was introduced by commit df2ac88 (#3089), which uses an Our changes (callback_group.hpp/cpp + tests) build and pass cleanly — verified locally in Docker with the unrelated commit reverted. |
Summary
Fixes #2942
The
add_*methods (add_timer,add_subscription,add_service,add_client,add_waitable) performed a full linear scan to remove expiredweak_ptrs on every call. When adding N entities, this resulted in O(N^2) total operations — 96% of wall time in the reproducer with 10,000 timers.Changes:
erase(remove_if(...expired...))cleanup from all fiveadd_*methods, making them O(1) (justpush_back)collect_all_ptrs, which already iterates all entries and is called regularly by the executor. A newcollect_and_compacthelper compacts each vector in a single pass during collection — no extra iteration costmutableto allow cleanup within theconstcollect_all_ptrsmethodWhy this is safe:
collect_all_ptrsalready skipped expired entries; now it also removes themcollect_all_ptrson every spin iteration, so expired entries are cleaned up promptlyfind_*_ptrs_ifmethods already handle expiredweak_ptrs gracefully (they call.lock()and skip nulls)Benchmark (reproducer from #2942 — 10,000 timers)
Test plan
test_allocator_memory_strategy— passedtest_add_callback_groups_to_executor— passedtest_memory_strategy— passed