Skip to content
Open
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
9 changes: 8 additions & 1 deletion cpp/nd/string_array_holder.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -29,12 +29,19 @@ class string_stream_array_holder
std::vector<int64_t> offsets_;
std::vector<int64_t> range_offsets_;
std::vector<const void*> holders_;

// Zero-copy buffer cache: raw pointers to buffer data and offsets.
// SAFETY: These pointers remain valid as long as holders_ contains the chunk arrays.
// This eliminates shared_ptr atomic reference counting in get_range_data() hot path.
std::vector<const uint8_t*> buffer_cache_;
std::vector<const uint32_t*> offsets_cache_;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Critical Issue: Uninitialized cache vectors could cause undefined behavior

The new buffer_cache_ and offsets_cache_ members are declared but never initialized or populated in this header file. Since initialize_single_range() signature changed but no implementation exists in this repo, any code using this class will have empty cache vectors, leading to out-of-bounds access if the get_range_data() implementation (which must exist somewhere) tries to use these caches.

Recommendation: Either include the implementation that populates these caches, or mark this as a breaking change that requires the corresponding indra implementation to be synced first. The PR description mentions this mirrors indra/cpp/nd/string_array_holder.hpp - verify that all necessary implementation code is also synced to avoid runtime crashes.


const void* dynamic_std_holder_ = nullptr;
const void* dynamic_icm_holder_ = nullptr;
bool is_valid_ = true;

void initialize(const nd::array& arr);
void initialize_single_range(const auto& range_adapter);
void initialize_single_range(const auto& range_adapter, const nd::array& source_arr);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

API Breaking Change: Missing parameter will break existing callers

The signature change from initialize_single_range(const auto& range_adapter) to initialize_single_range(const auto& range_adapter, const nd::array& source_arr) adds a required parameter. Any existing code calling this function will fail to compile.

Recommendation: Verify that no callers exist in deeplake or dependent codebases, or provide a migration path (e.g., overload for backward compatibility or update all call sites in this PR).

void initialize_complex(const nd::array& arr);
bool try_initialize_range_arrays(const auto& vstacked, const nd::array& fallback);
void clear_range_data();
Expand Down
Loading