Expose raft::memory_tracking_resources to C/Python/Java/Rust#2073
Expose raft::memory_tracking_resources to C/Python/Java/Rust#2073achirkin wants to merge 11 commits into
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughSummary by CodeRabbit
WalkthroughThis PR adds background CSV memory-allocation tracking to CuVS resources across C, Python, Java, and Rust. The core ChangesMemory Tracking Resources Feature
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@c/src/core/c_api.cpp`:
- Around line 42-54: The function cuvsResourcesCreateWithMemoryTracking must
validate the output handle and interval: before dereferencing res check that res
is not null and throw std::invalid_argument if it is, and validate
sample_interval_ms > 0 (throw std::invalid_argument for non-positive values) to
avoid unstable sampling; then only allocate the raft::memory_tracking_resources
(res_ptr) and assign *res = reinterpret_cast<uintptr_t>(res_ptr) after these
checks so the translate_exceptions wrapper still returns errors consistently.
In `@java/cuvs-java/src/main/java/com/nvidia/cuvs/spi/CuVSProvider.java`:
- Around line 50-53: Change the new abstract method in the CuVSProvider
interface into a default method to preserve SPI compatibility: declare default
CuVSResources newCuVSResources(Path tempDirectory, Path memoryTrackingCsvPath,
Duration memoryTrackingSampleInterval) throws Throwable, and implement it to
delegate to the existing older overload (e.g., call the existing
newCuVSResources(Path tempDirectory) overload) so existing providers continue to
work; if no older overload exists, have the default implementation throw an
UnsupportedOperationException with a clear message so providers can opt-in to
override the new behavior.
In `@python/cuvs/cuvs/common/resources.pyx`:
- Around line 75-85: The __cinit__ currently treats an empty string
memory_tracking_csv_path as falsy and silently skips memory tracking; change the
validation to explicitly check for None versus empty string: if
memory_tracking_csv_path is None call cuvsResourcesCreate, otherwise verify the
provided path is a non-empty string/bytes (e.g., check length > 0) and then
encode it and call cuvsResourcesCreateWithMemoryTracking with csv_bytes and
memory_tracking_sample_interval_ms, and if the path is empty raise a Python
ValueError; keep using check_cuvs around the C calls (referencing __cinit__,
memory_tracking_csv_path, cuvsResourcesCreateWithMemoryTracking,
cuvsResourcesCreate, check_cuvs).
In `@rust/cuvs/src/resources.rs`:
- Around line 49-50: The current cast from Duration::as_millis() (u128) to i64
for sample_interval_ms is lossy; replace the direct as_millis() as i64 cast with
a fallible conversion using .try_into() on the u128 result (e.g., let
sample_interval_ms: i64 = ...as_millis().try_into()?), and propagate or return a
clear error if conversion fails (do not silently wrap/truncate before passing to
the C FFI). Update the surrounding function (the code that defines
sample_interval_ms and uses sample_interval) to return or propagate an error (or
otherwise handle the failure) so invalid large durations aren’t passed into the
FFI.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: f8434388-318b-4816-a423-a13a3977db91
📒 Files selected for processing (15)
c/include/cuvs/core/c_api.hc/src/core/c_api.cppdocs/source/api_basics.rstjava/cuvs-java/src/main/java/com/nvidia/cuvs/CuVSResources.javajava/cuvs-java/src/main/java/com/nvidia/cuvs/spi/CuVSProvider.javajava/cuvs-java/src/main/java/com/nvidia/cuvs/spi/UnsupportedProvider.javajava/cuvs-java/src/main/java22/com/nvidia/cuvs/internal/CuVSResourcesImpl.javajava/cuvs-java/src/main/java22/com/nvidia/cuvs/spi/JDKProvider.javajava/cuvs-java/src/test/java/com/nvidia/cuvs/MemoryTrackingResourcesIT.javapython/cuvs/cuvs/common/c_api.pxdpython/cuvs/cuvs/common/resources.pyxpython/cuvs/cuvs/tests/test_memory_tracking.pyrust/cuvs-sys/src/bindings.rsrust/cuvs/Cargo.tomlrust/cuvs/src/resources.rs
tfeher
left a comment
There was a problem hiding this comment.
C and Python changes LGTM
mythrocks
left a comment
There was a problem hiding this comment.
One minor documentation nit. But otherwise, this should be good to ship.
yan-zaretskiy
left a comment
There was a problem hiding this comment.
Minor suggestions only
|
Moving this 26.08 to let it merge after rapidsai/raft#3004 , which now breaks the API that this PR uses |
2. Restore documentation in fern 3. Adapt to breaking changes in raft (header moved)
| from cuvs.common.c_api cimport ( | ||
| cuvsResources_t, | ||
| cuvsResourcesCreate, | ||
| cuvsResourcesCreateWithMemoryTracking, |
There was a problem hiding this comment.
Being a little forward thinking here- what happens when we have a bunch of different resources types that we want to use together? If this was a constructor flag (or just an option flag) then it would hide the impl details from underneath.
There was a problem hiding this comment.
The C function cuvsResourcesCreateWithMemoryTracking produces the same pointer type as cuvsResourcesCreate (we expose all raft::resources sub-classes via the base class). So we can freely add more such factory functions (e.g. cuvsResourcesCreateWithDryRun) and use the produced pointers interchangeably.
I think this also helps to keep the C API/ABI stable, because we just add new free functions.
There was a problem hiding this comment.
I'm tihnking about users that are going to want to use different types of resources together. These resources should all be types of "raft::resources" in the end, so that's why I'd like to see the different options exposed as flags instead of as explicit "make me this explicit resource type".
There was a problem hiding this comment.
Ideally, a user should be able to mix and match the resouece types. The only reason we have convenience subclasses like "device_resources" and "device_resources_snmg" is for user convenience, but since they are still just raft::resources in the end, the user can still populate them using the lower-level resources functions.
There was a problem hiding this comment.
These are all types of raft::resources and are interchangeable, we just use different function names to create them. Honestly, I don't see how adding flags to differentiate between the functionality is better for a user. Memory-tracking resources is not something a user should regularly use in the production environment - it's an analysis tool. I'm afraid making this a function argument on the C side would only raise confusion, because on the C++ side the same object is created using a different class/constructor name rather than an argument.
Expose the new
raft::memory_tracking_resourcesin the public API of all supported languages and add documentation.See Raft #2973 for more information about memory tracking resources.
Resolves rapidsai/raft#2982