[Feature] Base/Row alignment support for Tensors #136
zacharyvincze wants to merge 43 commits into ROCm:develop from
Conversation
Review: [Feature] Base/Row alignment support for Tensors
Major feature addition for Tensor memory layout. Key additions:
Assessment: Needs Review (large architectural change). This is a substantial PR (+2010/−1165 lines across 24 files) that changes core tensor semantics. The approach looks solid: explicit handling of padding makes the API clearer. Questions for maintainers:
Well documented and includes tests. Ready for deep review.
Pull request overview
This PR adds base-pointer and per-row alignment (padding) support to roccv::Tensor, updating stride/reshape semantics and introducing host<->device copy helpers that are padding-aware. It also updates samples, Python bindings, benchmarks, and tests to operate correctly with non-contiguous (row-padded) tensors.
Changes:
- Introduce `MemAlignment` and update `Tensor` allocation/stride/reshape logic to support row/base alignment and non-contiguous layouts.
- Add `Tensor::copyFromHost`/`Tensor::copyToHost` and refactor tests/helpers/samples to use padding-aware copies.
- Update the Python DLPack import to honor provided strides (converting element-strides to byte-strides), with a contiguous fallback.
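The row/base alignment logic referenced above can be illustrated with a small sketch. This is not the PR's actual implementation; the names `is_power_of_two`, `align_up`, and `padded_row_pitch` are hypothetical stand-ins for the kind of helpers such a change typically introduces:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical align-up helper: rounds `value` up to the next multiple of
// `alignment`. The bit trick requires `alignment` to be a power of two.
constexpr bool is_power_of_two(std::size_t x) {
    return x != 0 && (x & (x - 1)) == 0;
}

constexpr std::size_t align_up(std::size_t value, std::size_t alignment) {
    return (value + alignment - 1) & ~(alignment - 1);
}

// Alignment-aware row pitch: a tight row of `width * bytes_per_elem` bytes
// is padded out to `row_alignment`, so every row after the first starts on
// an aligned boundary. This is what makes the layout non-contiguous.
constexpr std::size_t padded_row_pitch(std::size_t width,
                                       std::size_t bytes_per_elem,
                                       std::size_t row_alignment) {
    assert(is_power_of_two(row_alignment));
    return align_up(width * bytes_per_elem, row_alignment);
}
```

For example, a 1920-pixel RGB8 row is 5760 bytes tight, but with a 256-byte row alignment each row occupies 5888 bytes, leaving 128 bytes of padding per row.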
Reviewed changes
Copilot reviewed 24 out of 24 changed files in this pull request and generated 7 comments.
Summary per file:
| File | Description |
|---|---|
| `src/core/tensor.cpp` | Core implementation updates: alignment-aware stride calculation, reshape for non-contiguous tensors, host copy helpers. |
| `include/core/tensor.hpp` | Public API updates for alignment-aware construction, plus new copy/contiguity helpers. |
| `include/core/mem_alignment.hpp` / `src/core/mem_alignment.cpp` | New alignment configuration type used during tensor allocation. |
| `include/core/utils.hpp` | New low-level helpers (power-of-two check and align-up) used for alignment/stride logic. |
| `src/core/tensor_storage.cpp` / `include/core/tensor_storage.hpp` | Expose the allocator used by `TensorStorage`. |
| `src/op_non_max_suppression.cpp` | Update reshape call sites to the new `reshape(dtype, shape)` signature. |
| `python/src/py_tensor.cpp` / `python/include/py_tensor.hpp` | DLPack import correctness fixes plus documentation/typing cleanup. |
| `tests/roccv/cpp/src/tests/core/tensor/test_tensor.cpp` | Add/adjust tensor tests for padding/reshape/copy helpers. |
| `tests/roccv/cpp/include/test_helpers.hpp` | Refactor host<->device copy helpers to use the new tensor copy APIs. |
| `samples/common/utils.hpp` | Add batch image load/write utilities that respect tensor padding. |
| `samples/*.cpp` (multiple) | Update CLI parsing and use the new padding-aware image IO helpers. |
| `benchmarks/src/roccv/roccv_bench_helpers.cpp` | Update benchmark filling logic to account for padded allocation sizes. |
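The DLPack import fix mentioned for `py_tensor.cpp` hinges on two details: DLPack reports strides in elements while the tensor works in bytes, and a null/compact stride layout can fall back to the contiguous path. A hedged sketch of both pieces (these helper names are illustrative, not the PR's actual functions):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// DLPack strides are expressed in *elements*; convert them to *byte*
// strides by multiplying through by the element size.
std::vector<std::size_t> element_to_byte_strides(
        const std::vector<std::size_t>& elem_strides,
        std::size_t bytes_per_elem) {
    std::vector<std::size_t> byte_strides(elem_strides.size());
    for (std::size_t i = 0; i < elem_strides.size(); ++i)
        byte_strides[i] = elem_strides[i] * bytes_per_elem;
    return byte_strides;
}

// A row-major layout is contiguous when each dimension's stride equals the
// product of the sizes of all faster-varying dimensions. If this holds, the
// import can take the simpler contiguous fallback path.
bool is_contiguous(const std::vector<std::size_t>& shape,
                   const std::vector<std::size_t>& elem_strides) {
    std::size_t expected = 1;
    for (std::size_t i = shape.size(); i-- > 0;) {
        if (elem_strides[i] != expected) return false;
        expected *= shape[i];
    }
    return true;
}
```

A shape of `{2, 3, 4}` with element strides `{12, 4, 1}` is contiguous; strides `{16, 4, 1}` indicate row padding and require the stride-honoring path.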
Codecov Report: ❌ Patch coverage is
Additional details and impacted files (Coverage Diff, develop vs. #136):

| | develop | #136 | +/- |
|---|---|---|---|
| Coverage | 73.88% | 73.97% | +0.09% |
| Files | 77 | 80 | +3 |
| Lines | 2998 | 3189 | +191 |
| Branches | 645 | 683 | +38 |
| Hits | 2215 | 2359 | +144 |
| Misses | 339 | 373 | +34 |
| Partials | 444 | 457 | +13 |
PR Description
General
- Added base/row alignment support for `roccv::Tensor`.
- Added a `MemAlign` class, used as a parameter when allocating memory for Tensors. This specifies what the row alignment and base alignment should be for newly created tensors.

Samples
Tensors
- Added `copyToHost` and `copyFromHost` methods to the tensor, which accept and produce contiguous host buffers. Since padding makes copying memory between host and device non-trivial, this lets the user perform a copy that is agnostic to the underlying row padding a tensor may have.

Python bindings (rocpycv)
Tests
- Tests use `Tensor::copyFromHost`/`Tensor::copyToHost` directly instead of duplicate code.
- Coverage for `roccv::Tensor` is maintained, with special attention paid to the `reshape` method, since it has undergone the most changes.
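The padding-agnostic copy semantics described in the Tensors section can be modeled on the host. This is an illustrative sketch, not the PR's implementation: the real `copyFromHost`/`copyToHost` would issue strided device transfers (e.g. a `hipMemcpy2D`-style call), while here plain `memcpy` stands in, row by row:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Sketch of a copyFromHost-style transfer: the source is a tight,
// contiguous buffer of `rows * row_bytes`, while the destination uses a
// (possibly larger) pitch of `dst_pitch` bytes per row. Only the payload
// bytes of each row are copied; the padding is skipped over.
void copy_from_host_padded(unsigned char* dst, std::size_t dst_pitch,
                           const unsigned char* src, std::size_t row_bytes,
                           std::size_t rows) {
    for (std::size_t r = 0; r < rows; ++r)
        std::memcpy(dst + r * dst_pitch, src + r * row_bytes, row_bytes);
}

// The reverse (copyToHost-style) direction compacts the padded layout back
// into a tight, contiguous host buffer.
void copy_to_host_padded(unsigned char* dst, std::size_t row_bytes,
                         const unsigned char* src, std::size_t src_pitch,
                         std::size_t rows) {
    for (std::size_t r = 0; r < rows; ++r)
        std::memcpy(dst + r * row_bytes, src + r * src_pitch, row_bytes);
}
```

Because both directions walk the pitched layout explicitly, a round trip through a padded buffer reproduces the original contiguous data exactly, which is the property the refactored test helpers rely on.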