NanoVDB Math: refactor onto `MatBase`/`VecBase`, fix correctness bugs, add tests by swahtz · Pull Request #2207 · AcademySoftwareFoundation/openvdb

swahtz · 2026-05-20T04:25:32Z

Consolidates nanovdb/math/Math.h by lifting Mat / Vec arithmetic onto two shared base classes (MatBase<T, R, C> and a new VecBase<T, N>). The five Mat and three Vec derived classes become thin delegating wrappers. The pass also fixes several long-standing correctness bugs and fills inconsistent / partial APIs.

Files changed: Math.h (the refactor), NanoVDB.h (Map API now constexpr — depends on the matMult rework), tools/cuda/PointsToGrid.cuh (three call-site casts from the explicit ctor tightening), and three test files (additive: 11 new TEST_F blocks plus mechanical cast updates), and one python-binding kernel.

Bug fixes

math::Min / Max for int32_t / uint32_t no longer route through fminf / fmaxf — old form lost precision above 2^24 and could be UB near INT_MAX. Heavily used by HDDA.h.
Removed dimensionally-invalid operator*(Mat3, Mat2x3) -> Mat2x3. No callers in the tree.
Mat2::inverse() now returns an explicit zero on singular input (was uninitialised) and uses T(1)/det for T=double.
Vec*::operator/(T) and Mat*::operator/(T) now per-element-divide. The old (T(1)/s) * (*this) did integer division for integer T, so Vec3i(6,8,10) / 2 returned (0,0,0).
Vec2's operator+/-/+=/-= against Coord now take Coord2. Old signature silently dropped z.
math::Round(Vec3<float>) and Round(Vec3<double>) both use floor(x + 0.5) now. The float overload previously used rintf (round-half-to-even), so the two disagreed at half-integers.
Corrected floor() / ceil() doc swap on Vec2 / Vec3. Implementations were always right.

Refactor

MatBase<T, R, C>: shared element-wise / scalar / equality / transpose / matmat / matvec implementations, all __hostdev__ constexpr. rows() / cols() / size() promoted public.
VecBase<T, N> (new): shared element-wise / scalar / equality / reductions / mutating min/max / integer rounding. floorAs<Result> / ceilAs<Result> / roundAs<Result> unify Round semantics across float / double / long double.
All five Mat and three Vec classes become thin wrappers: ctors + per-type extras (Mat2::inverse, Vec3::cross/outer) + one-line operators delegating to the base.
Mat3 / Mat3x2 / Mat4 scalar ctors switched from template<typename Source> to T, matching Mat2 / Mat2x3. Mixed-literal construction (Mat3<double>(1, 2.0, 3, ...)) now works uniformly. Array ctors keep template<typename Source> for cross-type interop.

API completeness

Every Mat type has the same operator surface: unary -, binary + / -, += / -=, *(T) / /(T), *=(T) / /=(T), == / !=. Previously scattered.
New mat*mat: Mat3 * Mat3x2 standalone, Mat4 * Mat4 member.
New mat*vec members: Mat2 * Vec2, Mat2x3 * Vec3 -> Vec2, Mat3x2 * Vec2 -> Vec3. Mat4 * Vec4 promoted to member.
New scalar * MatN standalones for Mat2x3 / Mat3x2 / Mat3 / Mat4.
Vec4 gains parity with Vec3: min, max, asPointer, floor, ceil, round (returns Vec4<int32_t> — no Coord4).

`constexpr` sweep

Every function in Math.h and the Map API in NanoVDB.h that C++17 allows is now constexpr. Carve-outs: length / normalize (std::sqrt, C++26), the float / double overloads of Min / Max / Floor / Ceil / Round / Abs (C++23), and CUDA-device-only *Atomic helpers.

Rewriting matMult / matMultT from fma(a, b, c) to a * b + c was required (fma isn't constexpr until C++23, P1383R2). GPU codegen unchanged — nvcc -fmad=true contracts back to hardware FMA. Host codegen unchanged under default -ffp-contract=on. Only compile-time constexpr evaluation observes the double-rounded result. Worst-case ≤1 ulp drift, well below NanoVDB's geometric precision.

Other quality passes

Bounds checks on every operator[] in Math.h via NANOVDB_ASSERT (twelve sites). No-op under NDEBUG; const-index accesses fold at compile time even in debug.
Doxygen sweep: every previously-undocumented public method gains an @brief; several // @brief typos (missing leading slash — silently invisible to Doxygen) fixed.
Hidden friends: scalar*Vec / scalar/Vec / scalar*Mat converted from namespace-scope free templates to in-class friend definitions — ADL-found only, no longer pollute the enclosing namespace.

Tests

Eleven additive TEST_F blocks in TestNanoVDB.cc:

Test	Covers
`IntegerMinMax`	`2^24+1` precision; `INT_MAX` / `UINT_MAX` overflow safety.
`Round`	Half-integer agreement between `float` and `double` (especially `-1.5`).
`Vec2` / `Vec3Ops` / `Vec4Ops`	Full operator coverage; new `Vec4` methods; integer-T division.
`Mat2` / `Mat3` / `Mat4` / `Mat2x3_Mat3x2`	Constructors, every operator, transpose, matmat, matvec; `Mat2::inverse` non-singular roundtrip + singular-returns-zero; mixed-literal construction.
`MatMul`	Every dimensionally-valid matmat plus every member matvec across the five matrix types.
`MatVecIntrospection`	`static_assert`s for `rows`/`cols`/`size`/`SIZE`/`ValueType`; `data()`/`asPointer()` ordering.

Risk and compatibility

Behaviour preserved for every previously-correct code path.
Behaviour changes are exactly the bug fixes above — each moves an incorrect / undefined output to a defined, correct one.
matMult rounding: ≤1 ulp drift only in pure abstract-machine evaluation. GPU and host codegen identical to before.
API additions are pure. The only removed overload is the broken Mat3 * Mat2x3 (no callers).
One signature tightening: Vec2's Coord overloads now require Coord2. Code passing a 3D Coord to Vec2 will fail to compile — but it was silently dropping z.
Cross-template ctor tightened to explicit: Vec*(const Vec*T<T2>&) now matches the same-class ctor's convention. Audit found 18 affected sites across the nanovdb + openvdb trees, all updated here (3 in PointsToGrid.cuh, 3 in TestNanoVDB.cu, 12 in TestOpenVDB.cc, 1 in PySampleFromVoxels.cu). All mechanical and value-preserving. Downstream consumers that relied on the implicit conversion will need the same one-line cast — typically nanovdb::Vec3f(yourVec) or an explicit template argument to a returning helper.

Lift the bulk of Mat2/Mat2x3/Mat3x2/Mat3/Mat4 and Vec2/Vec3/Vec4 onto shared base classes (MatBase<T, R, C> and a new VecBase<T, N>) so the arithmetic is implemented once and the derived classes shrink to constructors plus thin delegating wrappers. Along the way, fix several pre-existing correctness issues and round out a number of inconsistent or partially-implemented APIs. No behaviour change for any code that was already correct. Bug fixes - math::Min/Max for int32_t and uint32_t no longer route through fminf/fmaxf, eliminating silent precision loss for values above 2^24 and undefined behaviour for values near INT_MAX / UINT_MAX (heavily exercised by HDDA ray traversal). - Remove an invalid operator*(const Mat3<T>&, const Mat2x3<T>&) overload: the dimensions don't match standard matrix algebra and the body silently dropped the third row/column. - Mat2::inverse() now returns an explicit zero matrix on a singular input (previously returned an uninitialised Mat2<T>()) and uses T(1) / det instead of 1.f / det. - Vec*::operator/(T) and operator/=(T) do per-element division. The old (T(1)/s) * (*this) form produced (0,...,0) for integer T (Vec2i, Vec3i, Vec4i, and the matrices via the same path now divide correctly). - Vec2's mixed overloads (operator+/-/+= /-= with a Coord) now take a Coord2 instead of the 3D Coord; they used to silently drop z. - math::Round(Vec3<float>) and math::Round(Vec3<double>) now use the same rounding strategy (floor(x + 0.5)). The float version previously used rintf (round-half-to-even), giving different results at half-integers, e.g. Round(-1.5). - Vec2/Vec3 floor()/ceil() doc comments corrected (had been swapped). Refactor / consolidation - Promote MatBase::rows()/cols()/size() to public; add ValueType alias and data() accessors. - New generic MatBase helpers: plus, minus, negate, scale, divideBy, addAssign, subAssign, scaleAssign, divideAssign, equals, transposeAs<Result>, multiply<Result, Rhs>, multiplyVec<VecRes, VecRhs>. - New VecBase<T, N> with: plus / minus / mul / div / negate / scale / divideBy, addAssign / subAssign / mulAssign / divAssign / scaleAssign / divideAssignScalar, equals, dot, lengthSqr, length, smallestComponent, largestComponent, mergeMin / mergeMax, floorAs / ceilAs / roundAs, plus operator[] and asPointer. - All matrix and vector classes reduce to constructors, any dimension-specific extras (Mat2::inverse, Vec3::cross/outer), and one-line operator wrappers that delegate to the base. - Unify scalar constructors of Mat3 / Mat3x2 / Mat4 on T arguments, matching Mat2 / Mat2x3 (was template<typename Source>); mixed-literal construction now works uniformly across all five matrix types. Source* array constructors keep template<typename Source> for cross-type interop. API completeness - Element-wise / scalar / equality operators are now uniform across all five matrix types (operator+, -, += -=, * / *= /=, == !=, and unary -). Several of these were previously missing on Mat3x2 / Mat4 or partial on the others. - New mat*mat overloads: Mat3 * Mat3x2 -> Mat3x2 (standalone) and Mat4 * Mat4 -> Mat4 (member). - New mat*vec overloads (all members): Mat2 * Vec2, Mat2x3 * Vec3, Mat3x2 * Vec2. The standalone Mat4 * Vec4 moves to a Mat4 member to match Mat3 * Vec3. - scalar * MatN standalones added for Mat2x3, Mat3x2, Mat3, Mat4 (Mat2 already had one). - Vec4 gains the API it was missing relative to Vec3: min(), max(), asPointer(), floor(), ceil(), round() (returning Vec4<int32_t> since there is no Coord4). Tests - Add 11 new TEST_F blocks to TestNanoVDB.cc covering every change above: IntegerMinMax, Round, Vec2, Vec3Ops, Vec4Ops, Mat2, Mat3, Mat4, Mat2x3_Mat3x2, MatMul, MatVecIntrospection. Each test pins down the specific behaviour (with numeric checks where applicable, including the singular-inverse case, half-integer Round agreement, and the integer-division regression). Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>

Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>

The template-template constructors and assignment operators on nanovdb::math::Vec2/Vec3/Vec4 are the entry point for converting any Vec-shaped type (including openvdb::math::Vec*) into a NanoVDB vector. Their `static_assert`s were tightened to check `Vec*T<T2>::SIZE` (uppercase) in the recent VecBase refactor, but openvdb::math::Vec* only exposes the lowercase `size` constant (inherited from openvdb::math::Tuple). That broke openvdb -> nanovdb conversion paths such as nanovdb::tools::createNanoGrid<openvdb Vec grid>() with: error: 'SIZE' is not a member of 'openvdb::v13_0::math::Vec4<float>' Switch all six static_asserts back to the lowercase `size`, which is the common-denominator constant shared by both libraries' Vec types. `static_assert` is a constant-expression context, so it never ODR-uses the constant; this revert does not reintroduce the GCC linker error fixed by making `Vec*::size` constexpr. Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>

Copilot

Pull request overview

This PR refactors NanoVDB’s math/Math.h by consolidating shared vector/matrix arithmetic into new VecBase<T, N> and enhanced MatBase<T, R, C> base classes, while also fixing several long-standing correctness issues and adding unit tests to lock in the corrected behavior.

Changes:

Refactor vector and matrix implementations onto VecBase/MatBase helpers to remove duplicated per-element loops and normalize APIs across types.
Fix correctness issues (e.g., integer Min/Max, consistent Round semantics, singular Mat2::inverse, scalar division for integer Vec*/Mat*, invalid mat-mul overload removal).
Add 11 new unit tests covering the refactor, new APIs, and each bug fix with numeric checks and compile-time introspection.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File	Description
`nanovdb/nanovdb/math/Math.h`	Introduces `VecBase`, expands `MatBase`, rewrites `Vec`/`Mat` as thin wrappers, fixes math correctness bugs, and rationalizes overload sets.
`nanovdb/nanovdb/unittest/TestNanoVDB.cc`	Adds new `TEST_F` blocks that exercise the refactored math types and pin down the corrected semantics.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

VecBase::floorAs / ceilAs / roundAs used util::is_floating_point<T> to detect the floating-point arm, but util::is_floating_point only matches float and double — long double is not in its set. With T = long double the constexpr branch would therefore fall through to the integer pass-through arm and silently truncate via static_cast<int32_t>, instead of going through math::Floor / math::Ceil. That was a behavior regression vs the pre-refactor code, where Vec3<long double>::round() correctly hit the Floor(x + 0.5) path (the same path that fails to compile cleanly for the float + double Floor overloads, producing a loud "ambiguous" diagnostic rather than a silent wrong result). Switch the three constexpr predicates to std::is_floating_point<T>, which does include long double, and add an explicit `#include <type_traits>` plus a comment explaining the choice. The inner math::Floor / math::Ceil calls only have float and double overloads, so Vec<long double>::floor() / ceil() / round() now once again fail to compile with an ambiguous-overload diagnostic — which is the same behavior as master and is strictly preferable to the silent truncation introduced by the refactor. Verified float / double / int rounding paths unchanged against the existing TestNanoVDB Round + Vec3Ops + Vec4Ops blocks. Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>

harrism

This is a really great cleanup and improvement. I have comments in 4 areas.

constexpr -- might as well constexpr all the things
Alignas -- on the classes where it makes sense, let's guarantee alignment so the compiler can optimize / coalesce loads (this allows each row of a 4x4 matrix to be loaded with a single 128-bit global load, for example. Without it, the compiler will generate 4 32-bit loads).
Other modernization / best practices topics: [[nodiscard]], noexcept, consistent explicit ctors, hidden friends.
Documentation -- didn't comment on this much inline, but make sure all methods have doxygen docs. Looks a bit inconsistent?

harrism · 2026-05-26T06:22:36Z

I also asked an agent to review with modern C++ best practices in mind and then had it prune its list to not include what I already mentioned in my review.

This first one would be good to implement.

 Body-assignment in derived constructors. The new derived ctors do this->mVec[0] = x;
  this->mVec[1] = y; instead of the pre-PR : mVec{x, y} {}. Invisible for fundamental T, a
  regression for any future Vec<UserType>. Fix is a protected variadic constructor on VecBase:

  template<typename... Args>
  __hostdev__ explicit VecBase(Args... args) : mVec{T(args)...} {
      static_assert(sizeof...(Args) == N, "wrong number of args");
  }

  Derived ctors delegate: Vec2(T x, T y) : Base(x, y) {}.

The rest are all speculative / nice to have or depend on C++20 or later.

Defaulted operator== (C++20 only). bool operator==(const Vec3&) const noexcept = default;
  replaces the manual equals() chain; operator!= auto-synthesizes. Only worth raising if NanoVDB's
   C++ baseline allows.

  Tuple-like interface (tuple_size / tuple_element / get) for auto [x, y, z] = v; and std::apply.
  C++17.

  begin()/end() so Vec/Mat work with range-based for and <algorithm>. Six lines per base.

  [[deprecated]] on the lowercase size alias if the goal is to migrate callers to SIZE. Nudges
  without breaking.

  static_cast<T>(x) over functional-style T(x). Identical for arithmetic types but greppable and
  doesn't silently become a constructor call for non-arithmetic T. Minor.

  CTAD deduction guides so Vec3 v(1.0f, 2.0f, 3.0f); deduces Vec3<float>. Three lines per Vec /
  Mat class. C++17.

  C++20 requires clauses on cross-Vec templated functions (dot<V>, mergeMin<V>, the templated
  converting constructor). Replaces static_assert(V::SIZE == N) in the body with constraints at
  the declaration. Better diagnostics, SFINAE-friendly.

  std::array<T, N> instead of T[N] in the base. Free .data(), iterators, structured bindings,
  .size(), .fill(), ==. Layout-compatible with T[N] for fundamental T. Possibly tangles with CUDA
  portability (cuda::std::array exists). Out of scope for this PR; worth knowing.

Add debug-only bounds checks (NANOVDB_ASSERT, no-op in release) to every operator[] in Math.h, addressing reviewer feedback. Twelve sites in one file, all of Math.h's indexed accessors: * Coord::operator[] — i < 3 * Coord2::operator[] — i < 2 * VecBase::operator[] — i in [0, N) * MatBase::operator[] — row in [0, ROWS), returns a row pointer (col bound stays the caller's responsibility) * BBox<Vec3T>::operator[]— i in [0, 2) (min/max) * Rgba8::operator[] — n in [0, 4) NANOVDB_ASSERT compiles out under NDEBUG so release-build code paths are unchanged. For Coord and Coord2 the IndexType is uint32_t and the lower bound is therefore implicit; for the int-indexed accessors both bounds are checked. Constexpr-index sites (which are the overwhelming majority in practice — e.g. v[0], v[1], v[2], bbox[0]) fold the comparison away at compile time even in debug builds, so the new asserts only have runtime cost for genuinely-dynamic index expressions. Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>

Mark every function in nanovdb/math/Math.h `constexpr` that the C++17 rules allow. Addresses reviewer feedback on PR AcademySoftwareFoundation#2207 specifically calling out VecBase::operator[], VecBase::asPointer, all VecBase operators/helpers, all MatBase operators/methods, and Mat2::inverse — plus everything else in the file that's trivially constexpr-eligible while we're in here. Scope of the sweep: * All free math helpers — Tolerance::value, Delta::value, Maximum::value, Min/Max (int + uint overloads), Pow2/3/4, generic Abs, Sign, MinIndex/MaxIndex, AlignUp. * Coord and Coord2 — every method except `Floor` (which depends on the non-constexpr math::Floor). * VecBase — operator[], asPointer (both const + non-const), and all element-wise / scalar helpers (plus, minus, mul, div, negate, scale, divideBy, addAssign, subAssign, mulAssign, divAssign, scaleAssign, divideAssignScalar, equals, dot, lengthSqr, smallestComponent, largestComponent, mergeMin, mergeMax, floorAs, ceilAs, roundAs). `length()` stays non-constexpr — std::sqrt isn't constexpr until C++26 — with a one-line comment noting the blocker. * Vec2 / Vec3 / Vec4 — every operator and accessor; cross/outer on Vec3. The floor/ceil/round wrappers are constexpr only for integer T (the floating-point arms transit through math::Floor / math::Ceil which aren't constexpr). * MatBase — operator[], data, plus, minus, negate, scale, addAssign, subAssign, scaleAssign, divideBy, divideAssign, equals, transposeAs, multiply, multiplyVec. * Mat2 / Mat2x3 / Mat3x2 / Mat3 / Mat4 — every constructor (element list + Source*), every operator (unary -, binary +/-, +=/-=, mat-mat *, mat-vec *, scalar */ /, *=/=, ==/!=), transpose. Mat2::inverse is now constexpr — required also making math::isApproxZero and math::Tolerance<float|double>::value constexpr, which the reviewer specifically flagged as the precondition. * All free scalar*matrix and mixed-shape matrix*matrix operator overloads. * BaseBBox and both BBox specializations (floating + integer) — including the integer-bbox Iterator class. * Rgba8 — all six constructors plus operator<, operator==, lengthSqr, asFloat, operator[], packed, r/g/b/a, the Vec3<float> and Vec4<float> conversions. * Pre-C++20 nitpick honored throughout: every helper that builds a result element-by-element now uses value-initialization (`Derived out{};` / `Result r{};`) so the constexpr rule about uninitialized locals is satisfied even on older standards. Functions deliberately left non-constexpr (each blocked by a stdlib function that isn't constexpr in C++17): * VecBase::length, Vec*::normalize, Rgba8::length (sqrt is C++26). * math::Min(float,float) / Min(double,double) / Max(float,float) / Max(double,double) (fminf/fmaxf are C++23). * math::Clamp(float,…) / Clamp(double,…) (call the non-constexpr Min/Max). * math::Fract / Floor / Ceil / Round (floorf/ceilf are C++23). * math::Sqrt(float|double). * math::Abs(float|double|int) (fabs/abs are C++23). * Coord::Floor / Coord2::Floor — depend on math::Floor. * matMult / matMultT free overloads (use fma/fmaf — C++23). * All __device__-only *Atomic methods. * BBox<CoordT,false>::transform<Map> — depends on an external Map::applyMap whose constexpr-ness varies. Verified by a standalone TU that constructs the constexpr surface at compile time (a ~50-line block of static_asserts covering Vec, Mat, Coord, BBox, Rgba8, and the math helpers) and by recompiling the existing TestNanoVDB unittest TU at -O2 -Wall — both clean. Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>

C++17 audit of the constexpr sweep flagged one strict violation: every public Rgba8 constructor initialises `mData{{r, g, b, a}}` — i.e. it makes the union's `c[4]` member active. Reading `mData.packed` (the other union member) at constant evaluation in C++17 is undefined behaviour, and a constexpr function isn't allowed to perform UB at constant evaluation. With no public path to make `packed` the active union member at compile time there's no legal way to constant-evaluate these three accessors, so the constexpr mark falls into "ill-formed, no diagnostic required" territory under [dcl.constexpr]/6 and should be removed. Drop the constexpr from: * Rgba8::operator< (reads mData.packed) * Rgba8::operator== (reads mData.packed) * Rgba8::packed() const / non-const (returns reference to mData.packed) The rest of Rgba8's constexpr surface (r/g/b/a, asFloat, operator[], lengthSqr, the explicit ctors) reads only mData.c and is unaffected. Verified with two TUs under -std=c++17 -pedantic: * a positive TU that constant-evaluates ~40 Vec/Mat/Coord/BBox/Rgba8 surface expressions via static_assert (incl. Mat2::inverse) — compiles clean; * a negative TU that puts `Rgba8::operator==`, `operator<`, and `packed()` in a constexpr context — fails to compile with three "call to non-'constexpr' function" errors, confirming the strict C++17 rules now reject what was previously a silent NDR. Also re-ran TestNanoVDB.cc full TU compile at -O2 -Wall — clean. (The C++20 "common initial sequence" union-access relaxation doesn't apply here either: uint8_t[4] and uint32_t are not layout-compatible types.) Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>

Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>

The fma / fmaf calls in math::matMult were the last hop blocking constexpr on Map::applyMap and BBox<CoordT,false>::transform<Map>. Switch the six matMult / matMultT overloads to plain `a * b + c` form and constexpr-mark them, then propagate the mark through every Map method (applyMap, applyMapF, applyJacobian, applyJacobianF, applyInverseMap, applyInverseMapF, applyInverseJacobian, applyInverseJacobianF, applyIJT, applyIJTF, getVoxelSize) and the two Map constructors, and finally BBox::transform itself. Trade-off: `a * b + c` rounds twice vs the single rounding of fma(a, b, c). Worst-case ~1 ulp drift; for NanoVDB voxel transforms (values in meters/kilometers, voxel sizes in sub-millimeters to meters) that's well below the geometric precision of the operation. Device-side codegen is unchanged in practice — nvcc's default -fmad=true contracts `a * b + c` back into a single hardware FMA instruction. Host-side without explicit -ffp-contract=off most compilers still contract within an expression too. The only place the difference is observable is in a compile-time constexpr evaluation, where the abstract-machine rules forbid contraction — the constexpr result is strictly double-rounded. Once the project moves to C++23, constexpr fma will be available (P1383) and the matMult bodies can revert to fma/fmaf if a 1-ulp gain is ever worth wanting in some not-yet-imagined precision audit. The runtime instruction set on every platform we care about is identical either way. Verified at compile time with a focused TU that constant-evaluates the identity Map, a scale+translate Map, every apply* variant, and BBox<Coord>::transform<Map> applied to a 5x5x5 CoordBBox. Re-ran the existing TestNanoVDB.cc unittest TU at -O2 -Wall — clean. Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>

Reviewer asked for named component accessors on top of operator[]. Add them generically to VecBase<T, N>, with a body static_assert that fires at the call site when the Vec is too small for the requested component: constexpr T& x() { return mVec[0]; } // N >= 1 constexpr T& y() { static_assert(N >= 2, "..."); ... } constexpr T& z() { static_assert(N >= 3, "..."); ... } constexpr T& w() { static_assert(N >= 4, "..."); ... } Plus matching const overloads. All eight methods are __hostdev__ and constexpr, matching the rest of VecBase's accessor surface. Lives on VecBase so the same code services Vec2 / Vec3 / Vec4 and any future dimension without per-class duplication. Choice of static_assert over SFINAE-disabling: simplest pattern, clearest call-site error ("VecBase::w() requires N >= 4"), and matches function-body static_assert usage already in this file (MatBase::multiply, BBox<Vec3T>::asReal). The SFINAE-friendly trade-off is that vec2.w *names* something on the class — only the call errors — but that's fine for an accessor that nobody is going to introspect via is_invocable_v. The MatVecIntrospection test grew a block exercising x/y/z/w at compile time on each of Vec2 / Vec3 / Vec4, plus a runtime write-through-non-const-ref check on Vec4. The static_assert negative case (Vec2<T>::w() failing to compile) was verified out of tree but isn't part of the test TU since gtest can't assert "must-fail-to-compile". Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>

Reviewer asked for alignas on the dimensionally-friendly Vec / Mat classes. Apply the "align to full byte size" pattern to every class whose element count is already a power of 2 (so the alignment costs zero extra bytes), and document the explicit decision NOT to align the rest: * Vec2<T>: alignas(alignof(T) * 2) 8B / 16B for float / double * Vec3<T>: comment, NO alignas * Vec4<T>: alignas(alignof(T) * 4) 16B / 32B * Mat2<T>: alignas(alignof(T) * 4) 16B / 32B * Mat2x3<T>: comment, NO alignas * Mat3x2<T>: comment, NO alignas * Mat3<T>: comment, NO alignas * Mat4<T>: alignas(alignof(T) * 16) 64B / 128B Vec3 / Mat3 / Mat2x3 / Mat3x2 each carry an inline comment explaining why they stay at natural alignment: their byte sizes (3*T, 9*T, 6*T) are not powers of 2, so any alignas(N > alignof(T)) would inject tail padding and break packed-array layout plus on-disk format compatibility. The comments are there specifically so a future contributor doesn't "fix" the omission. Mat4 alignment is heavy (64-byte for float, 128-byte for double). This follows the reviewer's "align to full size" rule and lets a Mat4<float> load with a single AVX-512 instruction. Mat4's comment explicitly calls out the alternative (alignas(alignof(T) * 4) for row-alignment only, matching Vec4) so anyone hitting downstream allocator friction can drop to it. Coord / Coord2 / BBox / Rgba8 are deliberately not touched: Coord and Coord2 are serialized integer index types where layout compatibility matters; BBox composes Vec3 / Coord so it inherits their natural alignment; Rgba8 is already 4-byte aligned via its uint32_t packed union member. Verified with a focused TU of 14 static_asserts covering size + alignment for float and double instantiations of every Vec / Mat class. Re-ran the existing TestNanoVDB.cc unittest TU at -O2 -Wall to confirm the alignment changes don't ripple into any existing test failure — clean. Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>

Reviewer asked for [[nodiscard]] on every value-returning query, called out specifically because m.transpose() returns a NEW matrix rather than mutating in place — and a user expecting the openvdb::Mat3::transpose() in-place semantics will silently discard the result and quietly produce a wrong-but-plausible answer downstream. 282 [[nodiscard]] annotations added across Math.h: * Free math helpers (Tolerance/Delta/Maximum, Min/Max/Clamp/Fract/ Floor/Ceil/Pow2-4/Abs/Round/RoundDown/Sqrt/Sign/MinIndex/MaxIndex/ AlignUp, isApproxZero, the six matMult/matMultT overloads): ~46 * Coord + Coord2: 46 * VecBase + Vec2/Vec3/Vec4: 74 * MatBase + Mat2/Mat2x3/Mat3x2/Mat3/Mat4: 72 * BaseBBox + BBox<Vec3T,true> + BBox<CoordT,false> (incl. Iterator): 29 * Rgba8: 7 Targeted: pure queries returning by value or returning a new value (plus, minus, cross, outer, dot, lengthSqr, length, equals, operator+ / - / * / / / == / !=, transpose, inverse, floorAs/ceilAs/ roundAs, asVec3s/asVec3d, etc.). Deliberately NOT annotated: operator[], asPointer / data, named component accessors (x/y/z/w/r/g/b/a, packed), compound-assignment operators (+=, -=, *=, /=, &=, <<=, >>=), constructors, destructors, void-returning mutators, mutating chain methods that return *this by reference (Vec::normalize, BBox::expand/intersect/ translate), pre/post operator++, and the device-only *Atomic methods. None of those exhibit the bug class the attribute is designed to catch. Verified by recompiling TestNanoVDB.cc with -Wall -Wextra at -O2 — zero new -Wunused-result / nodiscard warnings, meaning the existing test code consumes every return value it gets back. The pre-existing warnings (omp simd pragmas in MaskPrefixSum.h and VoxelBlockManager.h) are infrastructure, unchanged. Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>

Three review items addressed in one commit, all confined to Math.h: 1. noexcept sweep Reviewer noted that no method in Math.h throws (verified — zero hits for throw/exception). Added noexcept to every function in the file: 489 annotations across free helpers, every Coord / Coord2 / VecBase / Vec* / MatBase / Mat* / BBox / Iterator / Rgba8 method, plus every standalone operator overload. Default constructors marked as `noexcept = default;`. Out-of-class definitions of Coord::asVec3s/asVec3d and Coord2::asVec2s/asVec2d got matching noexcept on both their in-class declarations and their out-of-class definitions so the exception specifications agree. 2. explicit on the templated Vec*T<T2> ctors Vec2/Vec3/Vec4 each have three "convert-from-some-other-Vec" ctors: the typed `Vec*(const Vec*<T2>&)` and the Coord-flavored `Vec*(const Coord*&)` overloads were already explicit, but the open-template `template<template<class> class Vec*T, class T2> Vec*(const Vec*T<T2>&)` overload was implicit. Made it explicit on all three classes for consistency. 3. final on every leaf derived class Vec2, Vec3, Vec4, Mat2, Mat2x3, Mat3x2, Mat3, Mat4 are all leaf classes — they derive from VecBase/MatBase and aren't intended as further base classes themselves. Added `final` after each class name so a future contributor can't accidentally derive from them. Placement is after the alignas / class name and before the `: public ...` inheritance. Verified by recompiling TestNanoVDB.cc at -Wall -Wextra -O2: exit code 0, zero new warnings (the existing 262 warnings are all pre-existing OpenMP / unused-parameter / unused-function in unrelated headers — unchanged from baseline). The standalone constexpr sanity TU also still compiles clean, confirming the noexcept marks didn't break constant evaluation of any of the math primitives. Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>

Move the six namespace-scope `scalar*Vec` / `scalar/Vec` templates for Vec2/Vec3/Vec4 and the five namespace-scope `scalar*Mat` templates for Mat2/Mat2x3/Mat3x2/Mat3/Mat4 into their respective class bodies as `friend` definitions. Hidden friends are found only via ADL on the owning type, so they no longer pollute the enclosing namespace's overload set for unrelated calls. Behavior is unchanged; the existing scalar*Vec test at TestNanoVDB.cc:1461 and scalar*Mat test at TestNanoVDB.cc:1729 still pass. Addresses PR AcademySoftwareFoundation#2207 review feedback. Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>

Add a protected variadic constructor to VecBase and MatBase that initializes the storage array directly in the member initializer list (`mVec{T(args)...}` / `mData{T(args)...}`) with a `static_assert` on the argument count. Refactor every derived Vec/Mat constructor to delegate via the initializer list (`Vec2(T x, T y) : Base(x, y) {}` etc.) instead of default-constructing the base subobject and then assigning each element in the body. For fundamental @c T this is a no-op (compilers fold the default-init-then-assign sequence). For any future @c Vec<UserType> or @c Mat<UserType> with a non-trivial default constructor and copy assignment, this avoids one default-construct + copy-assign per element in favor of a single direct initialization. Addresses PR AcademySoftwareFoundation#2207 review feedback. Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>

swahtz · 2026-05-29T02:16:08Z

Implemented in a1072df. Added a protected variadic ctor to both VecBase and MatBase and converted every derived Vec/Mat ctor (Vec2/3/4, Mat2/2x3/3x2/3/4) to delegate via initializer list — no more body-assignment, no more default-construct-then-overwrite of the base subobject. The static_assert(sizeof...(Args) == N) (and == ROWS*COLS for MatBase) catches arity mismatches at the call site.

Tracking the other suggestions for follow-up but not in this PR — happy to file separate issues for the C++17-feasible ones (tuple-like interface, begin/end, CTAD guides) if useful.

Four small, no-risk improvements: * Add `const` to `BaseBBox::isInside` — pure query, was preventing callers from invoking it on a `const BBox&`. * Remove the explicit `int32_t` / `uint32_t` overloads of `Min` and `Max`. The primary `template<typename Type>` already produces bit-identical constexpr code for integer types; the overloads predated the template's `constexpr`-ification. * Add `Tolerance<long double>` and `Delta<long double>` specializations for symmetry with `pi<long double>`. Otherwise instantiating `Vec<long double>` and touching a tolerance path yields an unhelpful "incomplete type" error. * Rename `MatBase::divideAssign(const T&)` to `divideAssignScalar` to match the existing name used in `VecBase`. Updates the five callers in `Mat2`/`Mat2x3`/`Mat3x2`/`Mat3`/`Mat4::operator/=`. No external callers in the nanovdb tree. Addresses PR AcademySoftwareFoundation#2207 follow-up audit (best practices for C++17 / gcc 11.2). Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>

Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>

Earlier in this PR (commit 001dbe3) we added `explicit` to every converting ctor including the cross-template-template `Vec*(const Vec*T<T2>& v)` overload. That broke a handful of sites that relied on the long-standing pre-refactor implicit conversion. The principled fix is to keep `explicit` (consistent with the same-class `Vec*(const Vec*<T2>&)` overload, which has always been explicit) and update the call sites instead, surfacing each precision/namespace crossing in the source. Audited the full tree; the fallout is: * `nanovdb/tools/cuda/PointsToGrid.cuh` (3 prod sites in the `worldToVoxel(Vec3u8/Vec3u16/Vec3f, ...)` family): wrap the `applyInverseMap(world)` result in an explicit `Vec3d(...)` so the intentional Vec3T -> Vec3d promotion is visible. Floating-point semantics are unchanged (still computed in the input precision, then explicitly promoted). * `nanovdb/unittest/TestNanoVDB.cu` (3 test sites at lines 2078, 2205, 2334): pass `<Vec3T>` explicitly to `voxelToWorld` so the return type matches the test's `begin[i]` element type and no implicit conversion is required. * `nanovdb/unittest/TestOpenVDB.cc` (12 test sites across 4 `EXPECT_EQ` blocks): wrap the openvdb-side accessor results in `nanovdb::Vec3f(...)` so the gtest equality check resolves without depending on implicit `openvdb::Vec3f -> nanovdb::Vec3f` conversion. Also restores the explicit `int32_t`/`uint32_t` overloads of `math::Min`/`Max` removed in commit 5183bf0. They are not redundant with the `template<typename Type>` primary — they disambiguate mixed-int calls (e.g. `Min(uint32_t, size_t)` in PointsToGrid.cuh) that would otherwise fall through equally-bad implicit conversions to the `float`/`double` overloads. Verified with cmake build of nanovdb tree (host + CUDA targets) on nvcc 13.2 / gcc 11.2; no remaining implicit-conversion sites in either the nanovdb or openvdb trees. Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>

@brief

Comments-only sweep across nanovdb/math/Math.h to make every public class and method discoverable in generated docs and to fix silently-broken Doxygen comments. No code or signatures changed. Touched: * Free math functions (Tolerance/Delta/Maximum/Min/Max/Clamp/Fract/ Floor/Ceil/Pow2/Pow3/Pow4/Abs/Round/RoundDown/MinIndex/MaxIndex/ isApproxZero): added missing @brief lines; upgraded plain `///` descriptions to proper @brief; documented why the integer Min/Max overloads exist; documented constexpr/non-constexpr split. * Coord / Coord2: fixed several `// @brief` comments that were missing a leading slash (so Doxygen ignored them entirely); added briefs to every previously-undocumented ctor, accessor, static, operator, atomic device-only helper, and to `round()`. Replaced stale "node centered conversion" prose with concrete behaviour. * VecBase: added briefs to plus/minus/mul/div/negate/scale, in-place addAssign/subAssign/mulAssign/divAssign/scaleAssign/divideAssign- Scalar, equals, lengthSqr, length, mergeMin, mergeMax. * Vec2 / Vec3 / Vec4: added briefs to default ctor, broadcast ctor, component ctor, cross-template + same-class converting ctors, Coord ctor, operator=, every element-wise / scalar / mixed-Coord / equality / minComponent / maxComponent operator, normalize, and the hidden-friend scalar*Vec / scalar/Vec operators. * MatBase: added a class-level @brief; documented rows/cols/size, default ctor, array ctor, operator[], addAssign/subAssign/ scaleAssign/divideAssignScalar, equals. * Mat2 / Mat2x3 / Mat3x2 / Mat3 / Mat4: lifted each class's leading comment-block into a proper @brief @details; documented every default constructor. * matMult / matMultT: filled in @brief for the 4-arg matMultT(float*, float*, ...) and matMultT(double*, double*, ...) overloads. * BaseBBox: class @brief + briefs on every operator, accessor, translate, and the protected ctors. Documented the inclusive-vs- exclusive convention split. * BBox<Vec3T, true> (float): briefs on every ctor, createCube, empty, operator bool, dim, isInside. * BBox<CoordT, false> (integer): briefs on Iterator (pre/post increment, comparison ops, dereference, conversion to bool), begin/end, default ctor, value ctor, splitting ctor, both createCube overloads, is_divisible, dim, volume, isInside(Coord), and the device-only expand/intersect atomic helpers. * Rgba8: fixed "@brief @brief" typo on the broadcast ctor; reworded "alpha channel it set to 1" to a proper sentence; added briefs to operator< / operator== / lengthSqr / length / operator[] / packed / r/g/b/a / and both `operator Vec3<float>` / `Vec4<float>` conversions. Diff: +356 / -54 lines, all in comments. cmake build of the nanovdb tree still passes cleanly on nvcc 13.2 / gcc 11.2. Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>

The expression `(BuildT).5 / d_grid->voxelSize()` resolves to the hidden-friend `operator/(T1, Vec3<double>)`, which returns a `Vec3<double>` (the owning class). Assigning that result to a local `Vec3T` (`Vec3<BuildT>`, typically `Vec3<float>`) used to rely on the implicit cross-template-template ctor, which is now `explicit` to match the rest of the converting-ctor sweep in this PR. Wrap the RHS in an explicit `Vec3T(...)` cast. Floating-point semantics are unchanged: the reciprocal is still computed in double, then narrowed to `BuildT`. Verified by extracting the kernel into a standalone nvcc compile. This was the only remaining build break — the four failing CI jobs (linux-nanovdb gcc/clang × Debug/Release) all reported the identical error at the same line. Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>

Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>

Copilot

Pull request overview

Copilot reviewed 6 out of 7 changed files in this pull request and generated no new comments.

harrism

Fantastic. Thanks for addressing all my suggestions.

The vector classes only had a mutating normalize() (returns Derived&), which can't be called on a const vector. Add a non-mutating normalized() that returns a unit-length copy. The shared implementation lives on VecBase<T, N> as a Derived-templated helper (divideBy(length())); Vec2/Vec3/Vec4 expose thin delegating wrappers, matching the existing base-helper + derived-wrapper pattern. Marked [[nodiscard]] and non-constexpr (length() calls Sqrt). Extends the Vec2/Vec3Ops/Vec4Ops unit tests to call normalized() on a const vector and verify unit length and an unchanged source. Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>

swahtz requested a review from kmuseth as a code owner May 20, 2026 04:25

swahtz added the nanovdb label May 20, 2026

swahtz requested a review from apradhana May 20, 2026 04:27

swahtz added 2 commits May 20, 2026 16:38

NanoVDB Math: make Vec*::size constexpr to fix linker ODR-use

d4c1198

Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>

swahtz requested a review from Copilot May 21, 2026 23:45

Copilot started reviewing on behalf of swahtz May 21, 2026 23:45 View session

Copilot AI reviewed May 21, 2026

View reviewed changes

Comment thread nanovdb/nanovdb/math/Math.h Outdated

harrism suggested changes May 26, 2026

View reviewed changes

swahtz added 11 commits May 28, 2026 22:37

comment cleanup

9b9a6da

Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>

swahtz added 6 commits May 29, 2026 02:28

seperate floorAs/ceilAs doxygen comments

c129f34

Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>

comment cleanup

80ae020

Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>

swahtz requested a review from Copilot May 29, 2026 03:45

Copilot started reviewing on behalf of swahtz May 29, 2026 03:45 View session

Copilot AI reviewed May 29, 2026

View reviewed changes

harrism approved these changes Jun 2, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

NanoVDB Math: refactor onto `MatBase`/`VecBase`, fix correctness bugs, add tests#2207

NanoVDB Math: refactor onto `MatBase`/`VecBase`, fix correctness bugs, add tests#2207
swahtz wants to merge 22 commits into
AcademySoftwareFoundation:masterfrom
swahtz:nano_math_matvecbase

swahtz commented May 20, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

harrism left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

harrism commented May 26, 2026

Uh oh!

swahtz commented May 29, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

harrism left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

swahtz commented May 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Bug fixes

Refactor

API completeness

constexpr sweep

Other quality passes

Tests

Risk and compatibility

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

harrism left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

harrism commented May 26, 2026

Uh oh!

swahtz commented May 29, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

harrism left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

swahtz commented May 20, 2026 •

edited

Loading

`constexpr` sweep