NanoVDB Math: refactor onto MatBase/VecBase, fix correctness bugs, add tests#2207
NanoVDB Math: refactor onto MatBase/VecBase, fix correctness bugs, add tests#2207swahtz wants to merge 22 commits into
MatBase/VecBase, fix correctness bugs, add tests#2207Conversation
Lift the bulk of Mat2/Mat2x3/Mat3x2/Mat3/Mat4 and Vec2/Vec3/Vec4 onto shared base classes (MatBase<T, R, C> and a new VecBase<T, N>) so the arithmetic is implemented once and the derived classes shrink to constructors plus thin delegating wrappers. Along the way, fix several pre-existing correctness issues and round out a number of inconsistent or partially-implemented APIs. No behaviour change for any code that was already correct. Bug fixes - math::Min/Max for int32_t and uint32_t no longer route through fminf/fmaxf, eliminating silent precision loss for values above 2^24 and undefined behaviour for values near INT_MAX / UINT_MAX (heavily exercised by HDDA ray traversal). - Remove an invalid operator*(const Mat3<T>&, const Mat2x3<T>&) overload: the dimensions don't match standard matrix algebra and the body silently dropped the third row/column. - Mat2::inverse() now returns an explicit zero matrix on a singular input (previously returned an uninitialised Mat2<T>()) and uses T(1) / det instead of 1.f / det. - Vec*::operator/(T) and operator/=(T) do per-element division. The old (T(1)/s) * (*this) form produced (0,...,0) for integer T (Vec2i, Vec3i, Vec4i, and the matrices via the same path now divide correctly). - Vec2's mixed overloads (operator+/-/+= /-= with a Coord) now take a Coord2 instead of the 3D Coord; they used to silently drop z. - math::Round(Vec3<float>) and math::Round(Vec3<double>) now use the same rounding strategy (floor(x + 0.5)). The float version previously used rintf (round-half-to-even), giving different results at half-integers, e.g. Round(-1.5). - Vec2/Vec3 floor()/ceil() doc comments corrected (had been swapped). Refactor / consolidation - Promote MatBase::rows()/cols()/size() to public; add ValueType alias and data() accessors. - New generic MatBase helpers: plus, minus, negate, scale, divideBy, addAssign, subAssign, scaleAssign, divideAssign, equals, transposeAs<Result>, multiply<Result, Rhs>, multiplyVec<VecRes, VecRhs>. - New VecBase<T, N> with: plus / minus / mul / div / negate / scale / divideBy, addAssign / subAssign / mulAssign / divAssign / scaleAssign / divideAssignScalar, equals, dot, lengthSqr, length, smallestComponent, largestComponent, mergeMin / mergeMax, floorAs / ceilAs / roundAs, plus operator[] and asPointer. - All matrix and vector classes reduce to constructors, any dimension-specific extras (Mat2::inverse, Vec3::cross/outer), and one-line operator wrappers that delegate to the base. - Unify scalar constructors of Mat3 / Mat3x2 / Mat4 on T arguments, matching Mat2 / Mat2x3 (was template<typename Source>); mixed-literal construction now works uniformly across all five matrix types. Source* array constructors keep template<typename Source> for cross-type interop. API completeness - Element-wise / scalar / equality operators are now uniform across all five matrix types (operator+, -, += -=, * / *= /=, == !=, and unary -). Several of these were previously missing on Mat3x2 / Mat4 or partial on the others. - New mat*mat overloads: Mat3 * Mat3x2 -> Mat3x2 (standalone) and Mat4 * Mat4 -> Mat4 (member). - New mat*vec overloads (all members): Mat2 * Vec2, Mat2x3 * Vec3, Mat3x2 * Vec2. The standalone Mat4 * Vec4 moves to a Mat4 member to match Mat3 * Vec3. - scalar * MatN standalones added for Mat2x3, Mat3x2, Mat3, Mat4 (Mat2 already had one). - Vec4 gains the API it was missing relative to Vec3: min(), max(), asPointer(), floor(), ceil(), round() (returning Vec4<int32_t> since there is no Coord4). Tests - Add 11 new TEST_F blocks to TestNanoVDB.cc covering every change above: IntegerMinMax, Round, Vec2, Vec3Ops, Vec4Ops, Mat2, Mat3, Mat4, Mat2x3_Mat3x2, MatMul, MatVecIntrospection. Each test pins down the specific behaviour (with numeric checks where applicable, including the singular-inverse case, half-integer Round agreement, and the integer-division regression). Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
The template-template constructors and assignment operators on nanovdb::math::Vec2/Vec3/Vec4 are the entry point for converting any Vec-shaped type (including openvdb::math::Vec*) into a NanoVDB vector. Their `static_assert`s were tightened to check `Vec*T<T2>::SIZE` (uppercase) in the recent VecBase refactor, but openvdb::math::Vec* only exposes the lowercase `size` constant (inherited from openvdb::math::Tuple). That broke openvdb -> nanovdb conversion paths such as nanovdb::tools::createNanoGrid<openvdb Vec grid>() with: error: 'SIZE' is not a member of 'openvdb::v13_0::math::Vec4<float>' Switch all six static_asserts back to the lowercase `size`, which is the common-denominator constant shared by both libraries' Vec types. `static_assert` is a constant-expression context, so it never ODR-uses the constant; this revert does not reintroduce the GCC linker error fixed by making `Vec*::size` constexpr. Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
There was a problem hiding this comment.
Pull request overview
This PR refactors NanoVDB’s math/Math.h by consolidating shared vector/matrix arithmetic into new VecBase<T, N> and enhanced MatBase<T, R, C> base classes, while also fixing several long-standing correctness issues and adding unit tests to lock in the corrected behavior.
Changes:
- Refactor vector and matrix implementations onto
VecBase/MatBasehelpers to remove duplicated per-element loops and normalize APIs across types. - Fix correctness issues (e.g., integer
Min/Max, consistentRoundsemantics, singularMat2::inverse, scalar division for integerVec*/Mat*, invalid mat-mul overload removal). - Add 11 new unit tests covering the refactor, new APIs, and each bug fix with numeric checks and compile-time introspection.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
nanovdb/nanovdb/math/Math.h |
Introduces VecBase, expands MatBase, rewrites Vec*/Mat* as thin wrappers, fixes math correctness bugs, and rationalizes overload sets. |
nanovdb/nanovdb/unittest/TestNanoVDB.cc |
Adds new TEST_F blocks that exercise the refactored math types and pin down the corrected semantics. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
VecBase::floorAs / ceilAs / roundAs used util::is_floating_point<T> to detect the floating-point arm, but util::is_floating_point only matches float and double — long double is not in its set. With T = long double the constexpr branch would therefore fall through to the integer pass-through arm and silently truncate via static_cast<int32_t>, instead of going through math::Floor / math::Ceil. That was a behavior regression vs the pre-refactor code, where Vec3<long double>::round() correctly hit the Floor(x + 0.5) path (the same path that fails to compile cleanly for the float + double Floor overloads, producing a loud "ambiguous" diagnostic rather than a silent wrong result). Switch the three constexpr predicates to std::is_floating_point<T>, which does include long double, and add an explicit `#include <type_traits>` plus a comment explaining the choice. The inner math::Floor / math::Ceil calls only have float and double overloads, so Vec<long double>::floor() / ceil() / round() now once again fail to compile with an ambiguous-overload diagnostic — which is the same behavior as master and is strictly preferable to the silent truncation introduced by the refactor. Verified float / double / int rounding paths unchanged against the existing TestNanoVDB Round + Vec3Ops + Vec4Ops blocks. Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
harrism
left a comment
There was a problem hiding this comment.
This is a really great cleanup and improvement. I have comments in 4 areas.
- constexpr -- might as well constexpr all the things
- Alignas -- on the classes where it makes sense, let's guarantee alignment so the compiler can optimize / coalesce loads (this allows each row of a 4x4 matrix to be loaded with a single 128-bit global load, for example. Without it, the compiler will generate 4 32-bit loads).
- Other modernization / best practices topics:
[[nodiscard]],noexcept, consistentexplicitctors, hidden friends. - Documentation -- didn't comment on this much inline, but make sure all methods have doxygen docs. Looks a bit inconsistent?
|
I also asked an agent to review with modern C++ best practices in mind and then had it prune its list to not include what I already mentioned in my review. This first one would be good to implement. The rest are all speculative / nice to have or depend on C++20 or later. |
Add debug-only bounds checks (NANOVDB_ASSERT, no-op in release) to
every operator[] in Math.h, addressing reviewer feedback. Twelve
sites in one file, all of Math.h's indexed accessors:
* Coord::operator[] — i < 3
* Coord2::operator[] — i < 2
* VecBase::operator[] — i in [0, N)
* MatBase::operator[] — row in [0, ROWS), returns a row pointer
(col bound stays the caller's responsibility)
* BBox<Vec3T>::operator[]— i in [0, 2) (min/max)
* Rgba8::operator[] — n in [0, 4)
NANOVDB_ASSERT compiles out under NDEBUG so release-build code paths
are unchanged. For Coord and Coord2 the IndexType is uint32_t and
the lower bound is therefore implicit; for the int-indexed accessors
both bounds are checked.
Constexpr-index sites (which are the overwhelming majority in
practice — e.g. v[0], v[1], v[2], bbox[0]) fold the comparison away
at compile time even in debug builds, so the new asserts only have
runtime cost for genuinely-dynamic index expressions.
Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
Mark every function in nanovdb/math/Math.h `constexpr` that the C++17 rules allow. Addresses reviewer feedback on PR AcademySoftwareFoundation#2207 specifically calling out VecBase::operator[], VecBase::asPointer, all VecBase operators/helpers, all MatBase operators/methods, and Mat2::inverse — plus everything else in the file that's trivially constexpr-eligible while we're in here. Scope of the sweep: * All free math helpers — Tolerance::value, Delta::value, Maximum::value, Min/Max (int + uint overloads), Pow2/3/4, generic Abs, Sign, MinIndex/MaxIndex, AlignUp. * Coord and Coord2 — every method except `Floor` (which depends on the non-constexpr math::Floor). * VecBase — operator[], asPointer (both const + non-const), and all element-wise / scalar helpers (plus, minus, mul, div, negate, scale, divideBy, addAssign, subAssign, mulAssign, divAssign, scaleAssign, divideAssignScalar, equals, dot, lengthSqr, smallestComponent, largestComponent, mergeMin, mergeMax, floorAs, ceilAs, roundAs). `length()` stays non-constexpr — std::sqrt isn't constexpr until C++26 — with a one-line comment noting the blocker. * Vec2 / Vec3 / Vec4 — every operator and accessor; cross/outer on Vec3. The floor/ceil/round wrappers are constexpr only for integer T (the floating-point arms transit through math::Floor / math::Ceil which aren't constexpr). * MatBase — operator[], data, plus, minus, negate, scale, addAssign, subAssign, scaleAssign, divideBy, divideAssign, equals, transposeAs, multiply, multiplyVec. * Mat2 / Mat2x3 / Mat3x2 / Mat3 / Mat4 — every constructor (element list + Source*), every operator (unary -, binary +/-, +=/-=, mat-mat *, mat-vec *, scalar */ /, *=/=, ==/!=), transpose. Mat2::inverse is now constexpr — required also making math::isApproxZero and math::Tolerance<float|double>::value constexpr, which the reviewer specifically flagged as the precondition. * All free scalar*matrix and mixed-shape matrix*matrix operator overloads. * BaseBBox and both BBox specializations (floating + integer) — including the integer-bbox Iterator class. * Rgba8 — all six constructors plus operator<, operator==, lengthSqr, asFloat, operator[], packed, r/g/b/a, the Vec3<float> and Vec4<float> conversions. * Pre-C++20 nitpick honored throughout: every helper that builds a result element-by-element now uses value-initialization (`Derived out{};` / `Result r{};`) so the constexpr rule about uninitialized locals is satisfied even on older standards. Functions deliberately left non-constexpr (each blocked by a stdlib function that isn't constexpr in C++17): * VecBase::length, Vec*::normalize, Rgba8::length (sqrt is C++26). * math::Min(float,float) / Min(double,double) / Max(float,float) / Max(double,double) (fminf/fmaxf are C++23). * math::Clamp(float,…) / Clamp(double,…) (call the non-constexpr Min/Max). * math::Fract / Floor / Ceil / Round (floorf/ceilf are C++23). * math::Sqrt(float|double). * math::Abs(float|double|int) (fabs/abs are C++23). * Coord::Floor / Coord2::Floor — depend on math::Floor. * matMult / matMultT free overloads (use fma/fmaf — C++23). * All __device__-only *Atomic methods. * BBox<CoordT,false>::transform<Map> — depends on an external Map::applyMap whose constexpr-ness varies. Verified by a standalone TU that constructs the constexpr surface at compile time (a ~50-line block of static_asserts covering Vec, Mat, Coord, BBox, Rgba8, and the math helpers) and by recompiling the existing TestNanoVDB unittest TU at -O2 -Wall — both clean. Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
C++17 audit of the constexpr sweep flagged one strict violation:
every public Rgba8 constructor initialises `mData{{r, g, b, a}}` —
i.e. it makes the union's `c[4]` member active. Reading
`mData.packed` (the other union member) at constant evaluation in
C++17 is undefined behaviour, and a constexpr function isn't
allowed to perform UB at constant evaluation. With no public path
to make `packed` the active union member at compile time there's
no legal way to constant-evaluate these three accessors, so the
constexpr mark falls into "ill-formed, no diagnostic required"
territory under [dcl.constexpr]/6 and should be removed.
Drop the constexpr from:
* Rgba8::operator< (reads mData.packed)
* Rgba8::operator== (reads mData.packed)
* Rgba8::packed() const / non-const (returns reference to mData.packed)
The rest of Rgba8's constexpr surface (r/g/b/a, asFloat, operator[],
lengthSqr, the explicit ctors) reads only mData.c and is unaffected.
Verified with two TUs under -std=c++17 -pedantic:
* a positive TU that constant-evaluates ~40 Vec/Mat/Coord/BBox/Rgba8
surface expressions via static_assert (incl. Mat2::inverse) —
compiles clean;
* a negative TU that puts `Rgba8::operator==`, `operator<`, and
`packed()` in a constexpr context — fails to compile with three
"call to non-'constexpr' function" errors, confirming the strict
C++17 rules now reject what was previously a silent NDR.
Also re-ran TestNanoVDB.cc full TU compile at -O2 -Wall — clean.
(The C++20 "common initial sequence" union-access relaxation
doesn't apply here either: uint8_t[4] and uint32_t are not
layout-compatible types.)
Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
The fma / fmaf calls in math::matMult were the last hop blocking constexpr on Map::applyMap and BBox<CoordT,false>::transform<Map>. Switch the six matMult / matMultT overloads to plain `a * b + c` form and constexpr-mark them, then propagate the mark through every Map method (applyMap, applyMapF, applyJacobian, applyJacobianF, applyInverseMap, applyInverseMapF, applyInverseJacobian, applyInverseJacobianF, applyIJT, applyIJTF, getVoxelSize) and the two Map constructors, and finally BBox::transform itself. Trade-off: `a * b + c` rounds twice vs the single rounding of fma(a, b, c). Worst-case ~1 ulp drift; for NanoVDB voxel transforms (values in meters/kilometers, voxel sizes in sub-millimeters to meters) that's well below the geometric precision of the operation. Device-side codegen is unchanged in practice — nvcc's default -fmad=true contracts `a * b + c` back into a single hardware FMA instruction. Host-side without explicit -ffp-contract=off most compilers still contract within an expression too. The only place the difference is observable is in a compile-time constexpr evaluation, where the abstract-machine rules forbid contraction — the constexpr result is strictly double-rounded. Once the project moves to C++23, constexpr fma will be available (P1383) and the matMult bodies can revert to fma/fmaf if a 1-ulp gain is ever worth wanting in some not-yet-imagined precision audit. The runtime instruction set on every platform we care about is identical either way. Verified at compile time with a focused TU that constant-evaluates the identity Map, a scale+translate Map, every apply* variant, and BBox<Coord>::transform<Map> applied to a 5x5x5 CoordBBox. Re-ran the existing TestNanoVDB.cc unittest TU at -O2 -Wall — clean. Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
Reviewer asked for named component accessors on top of operator[].
Add them generically to VecBase<T, N>, with a body static_assert
that fires at the call site when the Vec is too small for the
requested component:
constexpr T& x() { return mVec[0]; } // N >= 1
constexpr T& y() { static_assert(N >= 2, "..."); ... }
constexpr T& z() { static_assert(N >= 3, "..."); ... }
constexpr T& w() { static_assert(N >= 4, "..."); ... }
Plus matching const overloads. All eight methods are __hostdev__ and
constexpr, matching the rest of VecBase's accessor surface. Lives on
VecBase so the same code services Vec2 / Vec3 / Vec4 and any future
dimension without per-class duplication.
Choice of static_assert over SFINAE-disabling: simplest pattern,
clearest call-site error ("VecBase::w() requires N >= 4"), and
matches function-body static_assert usage already in this file
(MatBase::multiply, BBox<Vec3T>::asReal). The SFINAE-friendly
trade-off is that vec2.w *names* something on the class — only the
call errors — but that's fine for an accessor that nobody is going
to introspect via is_invocable_v.
The MatVecIntrospection test grew a block exercising x/y/z/w at
compile time on each of Vec2 / Vec3 / Vec4, plus a runtime
write-through-non-const-ref check on Vec4. The static_assert
negative case (Vec2<T>::w() failing to compile) was verified out of
tree but isn't part of the test TU since gtest can't assert
"must-fail-to-compile".
Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
Reviewer asked for alignas on the dimensionally-friendly Vec / Mat classes. Apply the "align to full byte size" pattern to every class whose element count is already a power of 2 (so the alignment costs zero extra bytes), and document the explicit decision NOT to align the rest: * Vec2<T>: alignas(alignof(T) * 2) 8B / 16B for float / double * Vec3<T>: comment, NO alignas * Vec4<T>: alignas(alignof(T) * 4) 16B / 32B * Mat2<T>: alignas(alignof(T) * 4) 16B / 32B * Mat2x3<T>: comment, NO alignas * Mat3x2<T>: comment, NO alignas * Mat3<T>: comment, NO alignas * Mat4<T>: alignas(alignof(T) * 16) 64B / 128B Vec3 / Mat3 / Mat2x3 / Mat3x2 each carry an inline comment explaining why they stay at natural alignment: their byte sizes (3*T, 9*T, 6*T) are not powers of 2, so any alignas(N > alignof(T)) would inject tail padding and break packed-array layout plus on-disk format compatibility. The comments are there specifically so a future contributor doesn't "fix" the omission. Mat4 alignment is heavy (64-byte for float, 128-byte for double). This follows the reviewer's "align to full size" rule and lets a Mat4<float> load with a single AVX-512 instruction. Mat4's comment explicitly calls out the alternative (alignas(alignof(T) * 4) for row-alignment only, matching Vec4) so anyone hitting downstream allocator friction can drop to it. Coord / Coord2 / BBox / Rgba8 are deliberately not touched: Coord and Coord2 are serialized integer index types where layout compatibility matters; BBox composes Vec3 / Coord so it inherits their natural alignment; Rgba8 is already 4-byte aligned via its uint32_t packed union member. Verified with a focused TU of 14 static_asserts covering size + alignment for float and double instantiations of every Vec / Mat class. Re-ran the existing TestNanoVDB.cc unittest TU at -O2 -Wall to confirm the alignment changes don't ripple into any existing test failure — clean. Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
Reviewer asked for [[nodiscard]] on every value-returning query, called out specifically because m.transpose() returns a NEW matrix rather than mutating in place — and a user expecting the openvdb::Mat3::transpose() in-place semantics will silently discard the result and quietly produce a wrong-but-plausible answer downstream. 282 [[nodiscard]] annotations added across Math.h: * Free math helpers (Tolerance/Delta/Maximum, Min/Max/Clamp/Fract/ Floor/Ceil/Pow2-4/Abs/Round/RoundDown/Sqrt/Sign/MinIndex/MaxIndex/ AlignUp, isApproxZero, the six matMult/matMultT overloads): ~46 * Coord + Coord2: 46 * VecBase + Vec2/Vec3/Vec4: 74 * MatBase + Mat2/Mat2x3/Mat3x2/Mat3/Mat4: 72 * BaseBBox + BBox<Vec3T,true> + BBox<CoordT,false> (incl. Iterator): 29 * Rgba8: 7 Targeted: pure queries returning by value or returning a new value (plus, minus, cross, outer, dot, lengthSqr, length, equals, operator+ / - / * / / / == / !=, transpose, inverse, floorAs/ceilAs/ roundAs, asVec3s/asVec3d, etc.). Deliberately NOT annotated: operator[], asPointer / data, named component accessors (x/y/z/w/r/g/b/a, packed), compound-assignment operators (+=, -=, *=, /=, &=, <<=, >>=), constructors, destructors, void-returning mutators, mutating chain methods that return *this by reference (Vec::normalize, BBox::expand/intersect/ translate), pre/post operator++, and the device-only *Atomic methods. None of those exhibit the bug class the attribute is designed to catch. Verified by recompiling TestNanoVDB.cc with -Wall -Wextra at -O2 — zero new -Wunused-result / nodiscard warnings, meaning the existing test code consumes every return value it gets back. The pre-existing warnings (omp simd pragmas in MaskPrefixSum.h and VoxelBlockManager.h) are infrastructure, unchanged. Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
Three review items addressed in one commit, all confined to Math.h: 1. noexcept sweep Reviewer noted that no method in Math.h throws (verified — zero hits for throw/exception). Added noexcept to every function in the file: 489 annotations across free helpers, every Coord / Coord2 / VecBase / Vec* / MatBase / Mat* / BBox / Iterator / Rgba8 method, plus every standalone operator overload. Default constructors marked as `noexcept = default;`. Out-of-class definitions of Coord::asVec3s/asVec3d and Coord2::asVec2s/asVec2d got matching noexcept on both their in-class declarations and their out-of-class definitions so the exception specifications agree. 2. explicit on the templated Vec*T<T2> ctors Vec2/Vec3/Vec4 each have three "convert-from-some-other-Vec" ctors: the typed `Vec*(const Vec*<T2>&)` and the Coord-flavored `Vec*(const Coord*&)` overloads were already explicit, but the open-template `template<template<class> class Vec*T, class T2> Vec*(const Vec*T<T2>&)` overload was implicit. Made it explicit on all three classes for consistency. 3. final on every leaf derived class Vec2, Vec3, Vec4, Mat2, Mat2x3, Mat3x2, Mat3, Mat4 are all leaf classes — they derive from VecBase/MatBase and aren't intended as further base classes themselves. Added `final` after each class name so a future contributor can't accidentally derive from them. Placement is after the alignas / class name and before the `: public ...` inheritance. Verified by recompiling TestNanoVDB.cc at -Wall -Wextra -O2: exit code 0, zero new warnings (the existing 262 warnings are all pre-existing OpenMP / unused-parameter / unused-function in unrelated headers — unchanged from baseline). The standalone constexpr sanity TU also still compiles clean, confirming the noexcept marks didn't break constant evaluation of any of the math primitives. Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
Move the six namespace-scope `scalar*Vec` / `scalar/Vec` templates for Vec2/Vec3/Vec4 and the five namespace-scope `scalar*Mat` templates for Mat2/Mat2x3/Mat3x2/Mat3/Mat4 into their respective class bodies as `friend` definitions. Hidden friends are found only via ADL on the owning type, so they no longer pollute the enclosing namespace's overload set for unrelated calls. Behavior is unchanged; the existing scalar*Vec test at TestNanoVDB.cc:1461 and scalar*Mat test at TestNanoVDB.cc:1729 still pass. Addresses PR AcademySoftwareFoundation#2207 review feedback. Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
Add a protected variadic constructor to VecBase and MatBase that
initializes the storage array directly in the member initializer list
(`mVec{T(args)...}` / `mData{T(args)...}`) with a `static_assert` on
the argument count.
Refactor every derived Vec/Mat constructor to delegate via the
initializer list (`Vec2(T x, T y) : Base(x, y) {}` etc.) instead of
default-constructing the base subobject and then assigning each
element in the body.
For fundamental @c T this is a no-op (compilers fold the
default-init-then-assign sequence). For any future @c Vec<UserType>
or @c Mat<UserType> with a non-trivial default constructor and copy
assignment, this avoids one default-construct + copy-assign per
element in favor of a single direct initialization.
Addresses PR AcademySoftwareFoundation#2207 review feedback.
Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
|
Implemented in a1072df. Added a protected variadic ctor to both Tracking the other suggestions for follow-up but not in this PR — happy to file separate issues for the C++17-feasible ones (tuple-like interface, begin/end, CTAD guides) if useful. |
Four small, no-risk improvements: * Add `const` to `BaseBBox::isInside` — pure query, was preventing callers from invoking it on a `const BBox&`. * Remove the explicit `int32_t` / `uint32_t` overloads of `Min` and `Max`. The primary `template<typename Type>` already produces bit-identical constexpr code for integer types; the overloads predated the template's `constexpr`-ification. * Add `Tolerance<long double>` and `Delta<long double>` specializations for symmetry with `pi<long double>`. Otherwise instantiating `Vec<long double>` and touching a tolerance path yields an unhelpful "incomplete type" error. * Rename `MatBase::divideAssign(const T&)` to `divideAssignScalar` to match the existing name used in `VecBase`. Updates the five callers in `Mat2`/`Mat2x3`/`Mat3x2`/`Mat3`/`Mat4::operator/=`. No external callers in the nanovdb tree. Addresses PR AcademySoftwareFoundation#2207 follow-up audit (best practices for C++17 / gcc 11.2). Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
Earlier in this PR (commit 001dbe3) we added `explicit` to every converting ctor including the cross-template-template `Vec*(const Vec*T<T2>& v)` overload. That broke a handful of sites that relied on the long-standing pre-refactor implicit conversion. The principled fix is to keep `explicit` (consistent with the same-class `Vec*(const Vec*<T2>&)` overload, which has always been explicit) and update the call sites instead, surfacing each precision/namespace crossing in the source. Audited the full tree; the fallout is: * `nanovdb/tools/cuda/PointsToGrid.cuh` (3 prod sites in the `worldToVoxel(Vec3u8/Vec3u16/Vec3f, ...)` family): wrap the `applyInverseMap(world)` result in an explicit `Vec3d(...)` so the intentional Vec3T -> Vec3d promotion is visible. Floating-point semantics are unchanged (still computed in the input precision, then explicitly promoted). * `nanovdb/unittest/TestNanoVDB.cu` (3 test sites at lines 2078, 2205, 2334): pass `<Vec3T>` explicitly to `voxelToWorld` so the return type matches the test's `begin[i]` element type and no implicit conversion is required. * `nanovdb/unittest/TestOpenVDB.cc` (12 test sites across 4 `EXPECT_EQ` blocks): wrap the openvdb-side accessor results in `nanovdb::Vec3f(...)` so the gtest equality check resolves without depending on implicit `openvdb::Vec3f -> nanovdb::Vec3f` conversion. Also restores the explicit `int32_t`/`uint32_t` overloads of `math::Min`/`Max` removed in commit 5183bf0. They are not redundant with the `template<typename Type>` primary — they disambiguate mixed-int calls (e.g. `Min(uint32_t, size_t)` in PointsToGrid.cuh) that would otherwise fall through equally-bad implicit conversions to the `float`/`double` overloads. Verified with cmake build of nanovdb tree (host + CUDA targets) on nvcc 13.2 / gcc 11.2; no remaining implicit-conversion sites in either the nanovdb or openvdb trees. Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
Comments-only sweep across nanovdb/math/Math.h to make every public class and method discoverable in generated docs and to fix silently-broken Doxygen comments. No code or signatures changed. Touched: * Free math functions (Tolerance/Delta/Maximum/Min/Max/Clamp/Fract/ Floor/Ceil/Pow2/Pow3/Pow4/Abs/Round/RoundDown/MinIndex/MaxIndex/ isApproxZero): added missing @brief lines; upgraded plain `///` descriptions to proper @brief; documented why the integer Min/Max overloads exist; documented constexpr/non-constexpr split. * Coord / Coord2: fixed several `// @brief` comments that were missing a leading slash (so Doxygen ignored them entirely); added briefs to every previously-undocumented ctor, accessor, static, operator, atomic device-only helper, and to `round()`. Replaced stale "node centered conversion" prose with concrete behaviour. * VecBase: added briefs to plus/minus/mul/div/negate/scale, in-place addAssign/subAssign/mulAssign/divAssign/scaleAssign/divideAssign- Scalar, equals, lengthSqr, length, mergeMin, mergeMax. * Vec2 / Vec3 / Vec4: added briefs to default ctor, broadcast ctor, component ctor, cross-template + same-class converting ctors, Coord ctor, operator=, every element-wise / scalar / mixed-Coord / equality / minComponent / maxComponent operator, normalize, and the hidden-friend scalar*Vec / scalar/Vec operators. * MatBase: added a class-level @brief; documented rows/cols/size, default ctor, array ctor, operator[], addAssign/subAssign/ scaleAssign/divideAssignScalar, equals. * Mat2 / Mat2x3 / Mat3x2 / Mat3 / Mat4: lifted each class's leading comment-block into a proper @brief @details; documented every default constructor. * matMult / matMultT: filled in @brief for the 4-arg matMultT(float*, float*, ...) and matMultT(double*, double*, ...) overloads. * BaseBBox: class @brief + briefs on every operator, accessor, translate, and the protected ctors. Documented the inclusive-vs- exclusive convention split. * BBox<Vec3T, true> (float): briefs on every ctor, createCube, empty, operator bool, dim, isInside. * BBox<CoordT, false> (integer): briefs on Iterator (pre/post increment, comparison ops, dereference, conversion to bool), begin/end, default ctor, value ctor, splitting ctor, both createCube overloads, is_divisible, dim, volume, isInside(Coord), and the device-only expand/intersect atomic helpers. * Rgba8: fixed "@brief @brief" typo on the broadcast ctor; reworded "alpha channel it set to 1" to a proper sentence; added briefs to operator< / operator== / lengthSqr / length / operator[] / packed / r/g/b/a / and both `operator Vec3<float>` / `Vec4<float>` conversions. Diff: +356 / -54 lines, all in comments. cmake build of the nanovdb tree still passes cleanly on nvcc 13.2 / gcc 11.2. Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
The expression `(BuildT).5 / d_grid->voxelSize()` resolves to the hidden-friend `operator/(T1, Vec3<double>)`, which returns a `Vec3<double>` (the owning class). Assigning that result to a local `Vec3T` (`Vec3<BuildT>`, typically `Vec3<float>`) used to rely on the implicit cross-template-template ctor, which is now `explicit` to match the rest of the converting-ctor sweep in this PR. Wrap the RHS in an explicit `Vec3T(...)` cast. Floating-point semantics are unchanged: the reciprocal is still computed in double, then narrowed to `BuildT`. Verified by extracting the kernel into a standalone nvcc compile. This was the only remaining build break — the four failing CI jobs (linux-nanovdb gcc/clang × Debug/Release) all reported the identical error at the same line. Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
harrism
left a comment
There was a problem hiding this comment.
Fantastic. Thanks for addressing all my suggestions.
The vector classes only had a mutating normalize() (returns Derived&), which can't be called on a const vector. Add a non-mutating normalized() that returns a unit-length copy. The shared implementation lives on VecBase<T, N> as a Derived-templated helper (divideBy(length())); Vec2/Vec3/Vec4 expose thin delegating wrappers, matching the existing base-helper + derived-wrapper pattern. Marked [[nodiscard]] and non-constexpr (length() calls Sqrt). Extends the Vec2/Vec3Ops/Vec4Ops unit tests to call normalized() on a const vector and verify unit length and an unchanged source. Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
Consolidates
nanovdb/math/Math.hby lifting Mat / Vec arithmetic onto two shared base classes (MatBase<T, R, C>and a newVecBase<T, N>). The five Mat and three Vec derived classes become thin delegating wrappers. The pass also fixes several long-standing correctness bugs and fills inconsistent / partial APIs.Files changed:
Math.h(the refactor),NanoVDB.h(MapAPI now constexpr — depends on thematMultrework),tools/cuda/PointsToGrid.cuh(three call-site casts from theexplicitctor tightening), and three test files (additive: 11 newTEST_Fblocks plus mechanical cast updates), and one python-binding kernel.Bug fixes
math::Min/Maxforint32_t/uint32_tno longer route throughfminf/fmaxf— old form lost precision above 2^24 and could be UB nearINT_MAX. Heavily used byHDDA.h.operator*(Mat3, Mat2x3) -> Mat2x3. No callers in the tree.Mat2::inverse()now returns an explicit zero on singular input (was uninitialised) and usesT(1)/detforT=double.Vec*::operator/(T)andMat*::operator/(T)now per-element-divide. The old(T(1)/s) * (*this)did integer division for integerT, soVec3i(6,8,10) / 2returned(0,0,0).Vec2'soperator+/-/+=/-=againstCoordnow takeCoord2. Old signature silently droppedz.math::Round(Vec3<float>)andRound(Vec3<double>)both usefloor(x + 0.5)now. Thefloatoverload previously usedrintf(round-half-to-even), so the two disagreed at half-integers.floor()/ceil()doc swap onVec2/Vec3. Implementations were always right.Refactor
MatBase<T, R, C>: shared element-wise / scalar / equality / transpose / matmat / matvec implementations, all__hostdev__ constexpr.rows()/cols()/size()promoted public.VecBase<T, N>(new): shared element-wise / scalar / equality / reductions / mutating min/max / integer rounding.floorAs<Result>/ceilAs<Result>/roundAs<Result>unifyRoundsemantics acrossfloat/double/long double.Mat2::inverse,Vec3::cross/outer) + one-line operators delegating to the base.Mat3/Mat3x2/Mat4scalar ctors switched fromtemplate<typename Source>toT, matchingMat2/Mat2x3. Mixed-literal construction (Mat3<double>(1, 2.0, 3, ...)) now works uniformly. Array ctors keeptemplate<typename Source>for cross-type interop.API completeness
-, binary+/-,+=/-=,*(T)//(T),*=(T)//=(T),==/!=. Previously scattered.Mat3 * Mat3x2standalone,Mat4 * Mat4member.Mat2 * Vec2,Mat2x3 * Vec3 -> Vec2,Mat3x2 * Vec2 -> Vec3.Mat4 * Vec4promoted to member.scalar * MatNstandalones forMat2x3/Mat3x2/Mat3/Mat4.Vec4gains parity withVec3:min,max,asPointer,floor,ceil,round(returnsVec4<int32_t>— noCoord4).constexprsweepEvery function in
Math.hand theMapAPI inNanoVDB.hthat C++17 allows is nowconstexpr. Carve-outs:length/normalize(std::sqrt, C++26), thefloat/doubleoverloads ofMin/Max/Floor/Ceil/Round/Abs(C++23), and CUDA-device-only*Atomichelpers.Rewriting
matMult/matMultTfromfma(a, b, c)toa * b + cwas required (fma isn't constexpr until C++23, P1383R2). GPU codegen unchanged — nvcc-fmad=truecontracts back to hardware FMA. Host codegen unchanged under default-ffp-contract=on. Only compile-timeconstexprevaluation observes the double-rounded result. Worst-case ≤1 ulp drift, well below NanoVDB's geometric precision.Other quality passes
operator[]inMath.hviaNANOVDB_ASSERT(twelve sites). No-op underNDEBUG; const-index accesses fold at compile time even in debug.@brief; several// @brieftypos (missing leading slash — silently invisible to Doxygen) fixed.scalar*Vec/scalar/Vec/scalar*Matconverted from namespace-scope free templates to in-classfrienddefinitions — ADL-found only, no longer pollute the enclosing namespace.Tests
Eleven additive
TEST_Fblocks inTestNanoVDB.cc:IntegerMinMax2^24+1precision;INT_MAX/UINT_MAXoverflow safety.Roundfloatanddouble(especially-1.5).Vec2/Vec3Ops/Vec4OpsVec4methods; integer-T division.Mat2/Mat3/Mat4/Mat2x3_Mat3x2Mat2::inversenon-singular roundtrip + singular-returns-zero; mixed-literal construction.MatMulMatVecIntrospectionstatic_asserts forrows/cols/size/SIZE/ValueType;data()/asPointer()ordering.Risk and compatibility
matMultrounding: ≤1 ulp drift only in pure abstract-machine evaluation. GPU and host codegen identical to before.Mat3 * Mat2x3(no callers).Vec2'sCoordoverloads now requireCoord2. Code passing a 3DCoordtoVec2will fail to compile — but it was silently droppingz.explicit:Vec*(const Vec*T<T2>&)now matches the same-class ctor's convention. Audit found 18 affected sites across the nanovdb + openvdb trees, all updated here (3 inPointsToGrid.cuh, 3 inTestNanoVDB.cu, 12 inTestOpenVDB.cc, 1 inPySampleFromVoxels.cu). All mechanical and value-preserving. Downstream consumers that relied on the implicit conversion will need the same one-line cast — typicallynanovdb::Vec3f(yourVec)or an explicit template argument to a returning helper.