Skip to content

NanoVDB Math: refactor onto MatBase/VecBase, fix correctness bugs, add tests#2207

Open
swahtz wants to merge 22 commits into
AcademySoftwareFoundation:masterfrom
swahtz:nano_math_matvecbase
Open

NanoVDB Math: refactor onto MatBase/VecBase, fix correctness bugs, add tests#2207
swahtz wants to merge 22 commits into
AcademySoftwareFoundation:masterfrom
swahtz:nano_math_matvecbase

Conversation

@swahtz
Copy link
Copy Markdown
Contributor

@swahtz swahtz commented May 20, 2026

Consolidates nanovdb/math/Math.h by lifting Mat / Vec arithmetic onto two shared base classes (MatBase<T, R, C> and a new VecBase<T, N>). The five Mat and three Vec derived classes become thin delegating wrappers. The pass also fixes several long-standing correctness bugs and fills inconsistent / partial APIs.

Files changed: Math.h (the refactor), NanoVDB.h (Map API now constexpr — depends on the matMult rework), tools/cuda/PointsToGrid.cuh (three call-site casts from the explicit ctor tightening), and three test files (additive: 11 new TEST_F blocks plus mechanical cast updates), and one python-binding kernel.

Bug fixes

  • math::Min / Max for int32_t / uint32_t no longer route through fminf / fmaxf — old form lost precision above 2^24 and could be UB near INT_MAX. Heavily used by HDDA.h.
  • Removed dimensionally-invalid operator*(Mat3, Mat2x3) -> Mat2x3. No callers in the tree.
  • Mat2::inverse() now returns an explicit zero on singular input (was uninitialised) and uses T(1)/det for T=double.
  • Vec*::operator/(T) and Mat*::operator/(T) now per-element-divide. The old (T(1)/s) * (*this) did integer division for integer T, so Vec3i(6,8,10) / 2 returned (0,0,0).
  • Vec2's operator+/-/+=/-= against Coord now take Coord2. Old signature silently dropped z.
  • math::Round(Vec3<float>) and Round(Vec3<double>) both use floor(x + 0.5) now. The float overload previously used rintf (round-half-to-even), so the two disagreed at half-integers.
  • Corrected floor() / ceil() doc swap on Vec2 / Vec3. Implementations were always right.

Refactor

  • MatBase<T, R, C>: shared element-wise / scalar / equality / transpose / matmat / matvec implementations, all __hostdev__ constexpr. rows() / cols() / size() promoted public.
  • VecBase<T, N> (new): shared element-wise / scalar / equality / reductions / mutating min/max / integer rounding. floorAs<Result> / ceilAs<Result> / roundAs<Result> unify Round semantics across float / double / long double.
  • All five Mat and three Vec classes become thin wrappers: ctors + per-type extras (Mat2::inverse, Vec3::cross/outer) + one-line operators delegating to the base.
  • Mat3 / Mat3x2 / Mat4 scalar ctors switched from template<typename Source> to T, matching Mat2 / Mat2x3. Mixed-literal construction (Mat3<double>(1, 2.0, 3, ...)) now works uniformly. Array ctors keep template<typename Source> for cross-type interop.

API completeness

  • Every Mat type has the same operator surface: unary -, binary + / -, += / -=, *(T) / /(T), *=(T) / /=(T), == / !=. Previously scattered.
  • New mat*mat: Mat3 * Mat3x2 standalone, Mat4 * Mat4 member.
  • New mat*vec members: Mat2 * Vec2, Mat2x3 * Vec3 -> Vec2, Mat3x2 * Vec2 -> Vec3. Mat4 * Vec4 promoted to member.
  • New scalar * MatN standalones for Mat2x3 / Mat3x2 / Mat3 / Mat4.
  • Vec4 gains parity with Vec3: min, max, asPointer, floor, ceil, round (returns Vec4<int32_t> — no Coord4).

constexpr sweep

Every function in Math.h and the Map API in NanoVDB.h that C++17 allows is now constexpr. Carve-outs: length / normalize (std::sqrt, C++26), the float / double overloads of Min / Max / Floor / Ceil / Round / Abs (C++23), and CUDA-device-only *Atomic helpers.

Rewriting matMult / matMultT from fma(a, b, c) to a * b + c was required (fma isn't constexpr until C++23, P1383R2). GPU codegen unchanged — nvcc -fmad=true contracts back to hardware FMA. Host codegen unchanged under default -ffp-contract=on. Only compile-time constexpr evaluation observes the double-rounded result. Worst-case ≤1 ulp drift, well below NanoVDB's geometric precision.

Other quality passes

  • Bounds checks on every operator[] in Math.h via NANOVDB_ASSERT (twelve sites). No-op under NDEBUG; const-index accesses fold at compile time even in debug.
  • Doxygen sweep: every previously-undocumented public method gains an @brief; several // @brief typos (missing leading slash — silently invisible to Doxygen) fixed.
  • Hidden friends: scalar*Vec / scalar/Vec / scalar*Mat converted from namespace-scope free templates to in-class friend definitions — ADL-found only, no longer pollute the enclosing namespace.

Tests

Eleven additive TEST_F blocks in TestNanoVDB.cc:

Test Covers
IntegerMinMax 2^24+1 precision; INT_MAX / UINT_MAX overflow safety.
Round Half-integer agreement between float and double (especially -1.5).
Vec2 / Vec3Ops / Vec4Ops Full operator coverage; new Vec4 methods; integer-T division.
Mat2 / Mat3 / Mat4 / Mat2x3_Mat3x2 Constructors, every operator, transpose, matmat, matvec; Mat2::inverse non-singular roundtrip + singular-returns-zero; mixed-literal construction.
MatMul Every dimensionally-valid matmat plus every member matvec across the five matrix types.
MatVecIntrospection static_asserts for rows/cols/size/SIZE/ValueType; data()/asPointer() ordering.

Risk and compatibility

  • Behaviour preserved for every previously-correct code path.
  • Behaviour changes are exactly the bug fixes above — each moves an incorrect / undefined output to a defined, correct one.
  • matMult rounding: ≤1 ulp drift only in pure abstract-machine evaluation. GPU and host codegen identical to before.
  • API additions are pure. The only removed overload is the broken Mat3 * Mat2x3 (no callers).
  • One signature tightening: Vec2's Coord overloads now require Coord2. Code passing a 3D Coord to Vec2 will fail to compile — but it was silently dropping z.
  • Cross-template ctor tightened to explicit: Vec*(const Vec*T<T2>&) now matches the same-class ctor's convention. Audit found 18 affected sites across the nanovdb + openvdb trees, all updated here (3 in PointsToGrid.cuh, 3 in TestNanoVDB.cu, 12 in TestOpenVDB.cc, 1 in PySampleFromVoxels.cu). All mechanical and value-preserving. Downstream consumers that relied on the implicit conversion will need the same one-line cast — typically nanovdb::Vec3f(yourVec) or an explicit template argument to a returning helper.

Lift the bulk of Mat2/Mat2x3/Mat3x2/Mat3/Mat4 and Vec2/Vec3/Vec4 onto
shared base classes (MatBase<T, R, C> and a new VecBase<T, N>) so the
arithmetic is implemented once and the derived classes shrink to
constructors plus thin delegating wrappers. Along the way, fix several
pre-existing correctness issues and round out a number of inconsistent
or partially-implemented APIs. No behaviour change for any code that
was already correct.

Bug fixes
- math::Min/Max for int32_t and uint32_t no longer route through
  fminf/fmaxf, eliminating silent precision loss for values above 2^24
  and undefined behaviour for values near INT_MAX / UINT_MAX (heavily
  exercised by HDDA ray traversal).
- Remove an invalid operator*(const Mat3<T>&, const Mat2x3<T>&)
  overload: the dimensions don't match standard matrix algebra and the
  body silently dropped the third row/column.
- Mat2::inverse() now returns an explicit zero matrix on a singular
  input (previously returned an uninitialised Mat2<T>()) and uses
  T(1) / det instead of 1.f / det.
- Vec*::operator/(T) and operator/=(T) do per-element division. The
  old (T(1)/s) * (*this) form produced (0,...,0) for integer T (Vec2i,
  Vec3i, Vec4i, and the matrices via the same path now divide
  correctly).
- Vec2's mixed overloads (operator+/-/+= /-= with a Coord) now take a
  Coord2 instead of the 3D Coord; they used to silently drop z.
- math::Round(Vec3<float>) and math::Round(Vec3<double>) now use the
  same rounding strategy (floor(x + 0.5)). The float version
  previously used rintf (round-half-to-even), giving different results
  at half-integers, e.g. Round(-1.5).
- Vec2/Vec3 floor()/ceil() doc comments corrected (had been swapped).

Refactor / consolidation
- Promote MatBase::rows()/cols()/size() to public; add ValueType alias
  and data() accessors.
- New generic MatBase helpers: plus, minus, negate, scale, divideBy,
  addAssign, subAssign, scaleAssign, divideAssign, equals,
  transposeAs<Result>, multiply<Result, Rhs>, multiplyVec<VecRes, VecRhs>.
- New VecBase<T, N> with: plus / minus / mul / div / negate / scale /
  divideBy, addAssign / subAssign / mulAssign / divAssign /
  scaleAssign / divideAssignScalar, equals, dot, lengthSqr, length,
  smallestComponent, largestComponent, mergeMin / mergeMax,
  floorAs / ceilAs / roundAs, plus operator[] and asPointer.
- All matrix and vector classes reduce to constructors, any
  dimension-specific extras (Mat2::inverse, Vec3::cross/outer), and
  one-line operator wrappers that delegate to the base.
- Unify scalar constructors of Mat3 / Mat3x2 / Mat4 on T arguments,
  matching Mat2 / Mat2x3 (was template<typename Source>); mixed-literal
  construction now works uniformly across all five matrix types.
  Source* array constructors keep template<typename Source> for
  cross-type interop.

API completeness
- Element-wise / scalar / equality operators are now uniform across
  all five matrix types (operator+, -, += -=, * / *= /=, == !=, and
  unary -). Several of these were previously missing on Mat3x2 / Mat4
  or partial on the others.
- New mat*mat overloads: Mat3 * Mat3x2 -> Mat3x2 (standalone) and
  Mat4 * Mat4 -> Mat4 (member).
- New mat*vec overloads (all members): Mat2 * Vec2, Mat2x3 * Vec3,
  Mat3x2 * Vec2. The standalone Mat4 * Vec4 moves to a Mat4 member to
  match Mat3 * Vec3.
- scalar * MatN standalones added for Mat2x3, Mat3x2, Mat3, Mat4
  (Mat2 already had one).
- Vec4 gains the API it was missing relative to Vec3: min(), max(),
  asPointer(), floor(), ceil(), round() (returning Vec4<int32_t> since
  there is no Coord4).

Tests
- Add 11 new TEST_F blocks to TestNanoVDB.cc covering every change
  above: IntegerMinMax, Round, Vec2, Vec3Ops, Vec4Ops, Mat2, Mat3,
  Mat4, Mat2x3_Mat3x2, MatMul, MatVecIntrospection. Each test pins
  down the specific behaviour (with numeric checks where applicable,
  including the singular-inverse case, half-integer Round
  agreement, and the integer-division regression).

Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
@swahtz swahtz requested a review from kmuseth as a code owner May 20, 2026 04:25
@swahtz swahtz added the nanovdb label May 20, 2026
@swahtz swahtz requested a review from apradhana May 20, 2026 04:27
swahtz added 2 commits May 20, 2026 16:38
Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
The template-template constructors and assignment operators on
nanovdb::math::Vec2/Vec3/Vec4 are the entry point for converting any
Vec-shaped type (including openvdb::math::Vec*) into a NanoVDB vector.
Their `static_assert`s were tightened to check `Vec*T<T2>::SIZE`
(uppercase) in the recent VecBase refactor, but openvdb::math::Vec*
only exposes the lowercase `size` constant (inherited from
openvdb::math::Tuple). That broke openvdb -> nanovdb conversion paths
such as nanovdb::tools::createNanoGrid<openvdb Vec grid>() with:

  error: 'SIZE' is not a member of 'openvdb::v13_0::math::Vec4<float>'

Switch all six static_asserts back to the lowercase `size`, which is
the common-denominator constant shared by both libraries' Vec types.
`static_assert` is a constant-expression context, so it never
ODR-uses the constant; this revert does not reintroduce the GCC
linker error fixed by making `Vec*::size` constexpr.

Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors NanoVDB’s math/Math.h by consolidating shared vector/matrix arithmetic into new VecBase<T, N> and enhanced MatBase<T, R, C> base classes, while also fixing several long-standing correctness issues and adding unit tests to lock in the corrected behavior.

Changes:

  • Refactor vector and matrix implementations onto VecBase/MatBase helpers to remove duplicated per-element loops and normalize APIs across types.
  • Fix correctness issues (e.g., integer Min/Max, consistent Round semantics, singular Mat2::inverse, scalar division for integer Vec*/Mat*, invalid mat-mul overload removal).
  • Add 11 new unit tests covering the refactor, new APIs, and each bug fix with numeric checks and compile-time introspection.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
nanovdb/nanovdb/math/Math.h Introduces VecBase, expands MatBase, rewrites Vec*/Mat* as thin wrappers, fixes math correctness bugs, and rationalizes overload sets.
nanovdb/nanovdb/unittest/TestNanoVDB.cc Adds new TEST_F blocks that exercise the refactored math types and pin down the corrected semantics.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread nanovdb/nanovdb/math/Math.h Outdated
VecBase::floorAs / ceilAs / roundAs used util::is_floating_point<T>
to detect the floating-point arm, but util::is_floating_point only
matches float and double — long double is not in its set. With T =
long double the constexpr branch would therefore fall through to the
integer pass-through arm and silently truncate via
static_cast<int32_t>, instead of going through math::Floor /
math::Ceil. That was a behavior regression vs the pre-refactor code,
where Vec3<long double>::round() correctly hit the Floor(x + 0.5)
path (the same path that fails to compile cleanly for the float +
double Floor overloads, producing a loud "ambiguous" diagnostic
rather than a silent wrong result).

Switch the three constexpr predicates to std::is_floating_point<T>,
which does include long double, and add an explicit
`#include <type_traits>` plus a comment explaining the choice. The
inner math::Floor / math::Ceil calls only have float and double
overloads, so Vec<long double>::floor() / ceil() / round() now once
again fail to compile with an ambiguous-overload diagnostic — which
is the same behavior as master and is strictly preferable to the
silent truncation introduced by the refactor.

Verified float / double / int rounding paths unchanged against the
existing TestNanoVDB Round + Vec3Ops + Vec4Ops blocks.

Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
Copy link
Copy Markdown
Contributor

@harrism harrism left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a really great cleanup and improvement. I have comments in 4 areas.

  1. constexpr -- might as well constexpr all the things
  2. Alignas -- on the classes where it makes sense, let's guarantee alignment so the compiler can optimize / coalesce loads (this allows each row of a 4x4 matrix to be loaded with a single 128-bit global load, for example. Without it, the compiler will generate 4 32-bit loads).
  3. Other modernization / best practices topics: [[nodiscard]], noexcept, consistent explicit ctors, hidden friends.
  4. Documentation -- didn't comment on this much inline, but make sure all methods have doxygen docs. Looks a bit inconsistent?

Comment thread nanovdb/nanovdb/math/Math.h
Comment thread nanovdb/nanovdb/math/Math.h Outdated
Comment thread nanovdb/nanovdb/math/Math.h Outdated
Comment thread nanovdb/nanovdb/math/Math.h Outdated
Comment thread nanovdb/nanovdb/math/Math.h Outdated
Comment thread nanovdb/nanovdb/math/Math.h Outdated
Comment thread nanovdb/nanovdb/math/Math.h Outdated
Comment thread nanovdb/nanovdb/math/Math.h Outdated
Comment thread nanovdb/nanovdb/math/Math.h Outdated
Comment thread nanovdb/nanovdb/math/Math.h Outdated
@harrism
Copy link
Copy Markdown
Contributor

harrism commented May 26, 2026

I also asked an agent to review with modern C++ best practices in mind and then had it prune its list to not include what I already mentioned in my review.

This first one would be good to implement.

 Body-assignment in derived constructors. The new derived ctors do this->mVec[0] = x;
  this->mVec[1] = y; instead of the pre-PR : mVec{x, y} {}. Invisible for fundamental T, a
  regression for any future Vec<UserType>. Fix is a protected variadic constructor on VecBase:

  template<typename... Args>
  __hostdev__ explicit VecBase(Args... args) : mVec{T(args)...} {
      static_assert(sizeof...(Args) == N, "wrong number of args");
  }

  Derived ctors delegate: Vec2(T x, T y) : Base(x, y) {}.

The rest are all speculative / nice to have or depend on C++20 or later.

Defaulted operator== (C++20 only). bool operator==(const Vec3&) const noexcept = default;
  replaces the manual equals() chain; operator!= auto-synthesizes. Only worth raising if NanoVDB's
   C++ baseline allows.

  Tuple-like interface (tuple_size / tuple_element / get) for auto [x, y, z] = v; and std::apply.
  C++17.

  begin()/end() so Vec/Mat work with range-based for and <algorithm>. Six lines per base.

  [[deprecated]] on the lowercase size alias if the goal is to migrate callers to SIZE. Nudges
  without breaking.

  static_cast<T>(x) over functional-style T(x). Identical for arithmetic types but greppable and
  doesn't silently become a constructor call for non-arithmetic T. Minor.

  CTAD deduction guides so Vec3 v(1.0f, 2.0f, 3.0f); deduces Vec3<float>. Three lines per Vec /
  Mat class. C++17.

  C++20 requires clauses on cross-Vec templated functions (dot<V>, mergeMin<V>, the templated
  converting constructor). Replaces static_assert(V::SIZE == N) in the body with constraints at
  the declaration. Better diagnostics, SFINAE-friendly.

  std::array<T, N> instead of T[N] in the base. Free .data(), iterators, structured bindings,
  .size(), .fill(), ==. Layout-compatible with T[N] for fundamental T. Possibly tangles with CUDA
  portability (cuda::std::array exists). Out of scope for this PR; worth knowing.

swahtz added 11 commits May 28, 2026 22:37
Add debug-only bounds checks (NANOVDB_ASSERT, no-op in release) to
every operator[] in Math.h, addressing reviewer feedback. Twelve
sites in one file, all of Math.h's indexed accessors:

* Coord::operator[]      — i < 3
* Coord2::operator[]     — i < 2
* VecBase::operator[]    — i in [0, N)
* MatBase::operator[]    — row in [0, ROWS), returns a row pointer
                           (col bound stays the caller's responsibility)
* BBox<Vec3T>::operator[]— i in [0, 2)  (min/max)
* Rgba8::operator[]      — n in [0, 4)

NANOVDB_ASSERT compiles out under NDEBUG so release-build code paths
are unchanged. For Coord and Coord2 the IndexType is uint32_t and
the lower bound is therefore implicit; for the int-indexed accessors
both bounds are checked.

Constexpr-index sites (which are the overwhelming majority in
practice — e.g. v[0], v[1], v[2], bbox[0]) fold the comparison away
at compile time even in debug builds, so the new asserts only have
runtime cost for genuinely-dynamic index expressions.

Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
Mark every function in nanovdb/math/Math.h `constexpr` that the
C++17 rules allow. Addresses reviewer feedback on PR AcademySoftwareFoundation#2207
specifically calling out VecBase::operator[], VecBase::asPointer,
all VecBase operators/helpers, all MatBase operators/methods, and
Mat2::inverse — plus everything else in the file that's trivially
constexpr-eligible while we're in here.

Scope of the sweep:

* All free math helpers — Tolerance::value, Delta::value,
  Maximum::value, Min/Max (int + uint overloads), Pow2/3/4, generic
  Abs, Sign, MinIndex/MaxIndex, AlignUp.

* Coord and Coord2 — every method except `Floor` (which depends on
  the non-constexpr math::Floor).

* VecBase — operator[], asPointer (both const + non-const), and all
  element-wise / scalar helpers (plus, minus, mul, div, negate,
  scale, divideBy, addAssign, subAssign, mulAssign, divAssign,
  scaleAssign, divideAssignScalar, equals, dot, lengthSqr,
  smallestComponent, largestComponent, mergeMin, mergeMax,
  floorAs, ceilAs, roundAs). `length()` stays non-constexpr —
  std::sqrt isn't constexpr until C++26 — with a one-line comment
  noting the blocker.

* Vec2 / Vec3 / Vec4 — every operator and accessor; cross/outer on
  Vec3. The floor/ceil/round wrappers are constexpr only for
  integer T (the floating-point arms transit through math::Floor /
  math::Ceil which aren't constexpr).

* MatBase — operator[], data, plus, minus, negate, scale,
  addAssign, subAssign, scaleAssign, divideBy, divideAssign,
  equals, transposeAs, multiply, multiplyVec.

* Mat2 / Mat2x3 / Mat3x2 / Mat3 / Mat4 — every constructor (element
  list + Source*), every operator (unary -, binary +/-, +=/-=,
  mat-mat *, mat-vec *, scalar */ /, *=/=, ==/!=), transpose.
  Mat2::inverse is now constexpr — required also making
  math::isApproxZero and math::Tolerance<float|double>::value
  constexpr, which the reviewer specifically flagged as the
  precondition.

* All free scalar*matrix and mixed-shape matrix*matrix operator
  overloads.

* BaseBBox and both BBox specializations (floating + integer) —
  including the integer-bbox Iterator class.

* Rgba8 — all six constructors plus operator<, operator==,
  lengthSqr, asFloat, operator[], packed, r/g/b/a, the Vec3<float>
  and Vec4<float> conversions.

* Pre-C++20 nitpick honored throughout: every helper that builds a
  result element-by-element now uses value-initialization
  (`Derived out{};` / `Result r{};`) so the constexpr rule about
  uninitialized locals is satisfied even on older standards.

Functions deliberately left non-constexpr (each blocked by a stdlib
function that isn't constexpr in C++17):

* VecBase::length, Vec*::normalize, Rgba8::length (sqrt is C++26).
* math::Min(float,float) / Min(double,double) /
  Max(float,float) / Max(double,double) (fminf/fmaxf are C++23).
* math::Clamp(float,…) / Clamp(double,…) (call the non-constexpr
  Min/Max).
* math::Fract / Floor / Ceil / Round (floorf/ceilf are C++23).
* math::Sqrt(float|double).
* math::Abs(float|double|int) (fabs/abs are C++23).
* Coord::Floor / Coord2::Floor — depend on math::Floor.
* matMult / matMultT free overloads (use fma/fmaf — C++23).
* All __device__-only *Atomic methods.
* BBox<CoordT,false>::transform<Map> — depends on an external
  Map::applyMap whose constexpr-ness varies.

Verified by a standalone TU that constructs the constexpr surface
at compile time (a ~50-line block of static_asserts covering Vec,
Mat, Coord, BBox, Rgba8, and the math helpers) and by recompiling
the existing TestNanoVDB unittest TU at -O2 -Wall — both clean.

Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
C++17 audit of the constexpr sweep flagged one strict violation:
every public Rgba8 constructor initialises `mData{{r, g, b, a}}` —
i.e. it makes the union's `c[4]` member active. Reading
`mData.packed` (the other union member) at constant evaluation in
C++17 is undefined behaviour, and a constexpr function isn't
allowed to perform UB at constant evaluation. With no public path
to make `packed` the active union member at compile time there's
no legal way to constant-evaluate these three accessors, so the
constexpr mark falls into "ill-formed, no diagnostic required"
territory under [dcl.constexpr]/6 and should be removed.

Drop the constexpr from:
* Rgba8::operator<  (reads mData.packed)
* Rgba8::operator== (reads mData.packed)
* Rgba8::packed() const / non-const (returns reference to mData.packed)

The rest of Rgba8's constexpr surface (r/g/b/a, asFloat, operator[],
lengthSqr, the explicit ctors) reads only mData.c and is unaffected.

Verified with two TUs under -std=c++17 -pedantic:
* a positive TU that constant-evaluates ~40 Vec/Mat/Coord/BBox/Rgba8
  surface expressions via static_assert (incl. Mat2::inverse) —
  compiles clean;
* a negative TU that puts `Rgba8::operator==`, `operator<`, and
  `packed()` in a constexpr context — fails to compile with three
  "call to non-'constexpr' function" errors, confirming the strict
  C++17 rules now reject what was previously a silent NDR.

Also re-ran TestNanoVDB.cc full TU compile at -O2 -Wall — clean.

(The C++20 "common initial sequence" union-access relaxation
doesn't apply here either: uint8_t[4] and uint32_t are not
layout-compatible types.)

Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
The fma / fmaf calls in math::matMult were the last hop blocking
constexpr on Map::applyMap and BBox<CoordT,false>::transform<Map>.
Switch the six matMult / matMultT overloads to plain `a * b + c`
form and constexpr-mark them, then propagate the mark through every
Map method (applyMap, applyMapF, applyJacobian, applyJacobianF,
applyInverseMap, applyInverseMapF, applyInverseJacobian,
applyInverseJacobianF, applyIJT, applyIJTF, getVoxelSize) and the
two Map constructors, and finally BBox::transform itself.

Trade-off: `a * b + c` rounds twice vs the single rounding of
fma(a, b, c). Worst-case ~1 ulp drift; for NanoVDB voxel
transforms (values in meters/kilometers, voxel sizes in
sub-millimeters to meters) that's well below the geometric
precision of the operation. Device-side codegen is unchanged in
practice — nvcc's default -fmad=true contracts `a * b + c` back
into a single hardware FMA instruction. Host-side without explicit
-ffp-contract=off most compilers still contract within an expression
too. The only place the difference is observable is in a
compile-time constexpr evaluation, where the abstract-machine rules
forbid contraction — the constexpr result is strictly double-rounded.

Once the project moves to C++23, constexpr fma will be available
(P1383) and the matMult bodies can revert to fma/fmaf if a 1-ulp
gain is ever worth wanting in some not-yet-imagined precision
audit. The runtime instruction set on every platform we care about
is identical either way.

Verified at compile time with a focused TU that constant-evaluates
the identity Map, a scale+translate Map, every apply* variant, and
BBox<Coord>::transform<Map> applied to a 5x5x5 CoordBBox. Re-ran
the existing TestNanoVDB.cc unittest TU at -O2 -Wall — clean.

Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
Reviewer asked for named component accessors on top of operator[].
Add them generically to VecBase<T, N>, with a body static_assert
that fires at the call site when the Vec is too small for the
requested component:

  constexpr T& x()       { return mVec[0]; }      // N >= 1
  constexpr T& y()       { static_assert(N >= 2, "..."); ... }
  constexpr T& z()       { static_assert(N >= 3, "..."); ... }
  constexpr T& w()       { static_assert(N >= 4, "..."); ... }

Plus matching const overloads. All eight methods are __hostdev__ and
constexpr, matching the rest of VecBase's accessor surface. Lives on
VecBase so the same code services Vec2 / Vec3 / Vec4 and any future
dimension without per-class duplication.

Choice of static_assert over SFINAE-disabling: simplest pattern,
clearest call-site error ("VecBase::w() requires N >= 4"), and
matches function-body static_assert usage already in this file
(MatBase::multiply, BBox<Vec3T>::asReal). The SFINAE-friendly
trade-off is that vec2.w *names* something on the class — only the
call errors — but that's fine for an accessor that nobody is going
to introspect via is_invocable_v.

The MatVecIntrospection test grew a block exercising x/y/z/w at
compile time on each of Vec2 / Vec3 / Vec4, plus a runtime
write-through-non-const-ref check on Vec4. The static_assert
negative case (Vec2<T>::w() failing to compile) was verified out of
tree but isn't part of the test TU since gtest can't assert
"must-fail-to-compile".

Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
Reviewer asked for alignas on the dimensionally-friendly Vec / Mat
classes. Apply the "align to full byte size" pattern to every class
whose element count is already a power of 2 (so the alignment costs
zero extra bytes), and document the explicit decision NOT to align
the rest:

* Vec2<T>: alignas(alignof(T) * 2)     8B / 16B for float / double
* Vec3<T>: comment, NO alignas
* Vec4<T>: alignas(alignof(T) * 4)    16B / 32B
* Mat2<T>: alignas(alignof(T) * 4)    16B / 32B
* Mat2x3<T>: comment, NO alignas
* Mat3x2<T>: comment, NO alignas
* Mat3<T>: comment, NO alignas
* Mat4<T>: alignas(alignof(T) * 16)   64B / 128B

Vec3 / Mat3 / Mat2x3 / Mat3x2 each carry an inline comment
explaining why they stay at natural alignment: their byte sizes
(3*T, 9*T, 6*T) are not powers of 2, so any alignas(N > alignof(T))
would inject tail padding and break packed-array layout plus
on-disk format compatibility. The comments are there specifically
so a future contributor doesn't "fix" the omission.

Mat4 alignment is heavy (64-byte for float, 128-byte for double).
This follows the reviewer's "align to full size" rule and lets a
Mat4<float> load with a single AVX-512 instruction. Mat4's comment
explicitly calls out the alternative (alignas(alignof(T) * 4) for
row-alignment only, matching Vec4) so anyone hitting downstream
allocator friction can drop to it.

Coord / Coord2 / BBox / Rgba8 are deliberately not touched: Coord
and Coord2 are serialized integer index types where layout
compatibility matters; BBox composes Vec3 / Coord so it inherits
their natural alignment; Rgba8 is already 4-byte aligned via its
uint32_t packed union member.

Verified with a focused TU of 14 static_asserts covering size +
alignment for float and double instantiations of every Vec / Mat
class. Re-ran the existing TestNanoVDB.cc unittest TU at -O2 -Wall
to confirm the alignment changes don't ripple into any existing
test failure — clean.

Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
Reviewer asked for [[nodiscard]] on every value-returning query,
called out specifically because m.transpose() returns a NEW matrix
rather than mutating in place — and a user expecting the
openvdb::Mat3::transpose() in-place semantics will silently discard
the result and quietly produce a wrong-but-plausible answer
downstream.

282 [[nodiscard]] annotations added across Math.h:

* Free math helpers (Tolerance/Delta/Maximum, Min/Max/Clamp/Fract/
  Floor/Ceil/Pow2-4/Abs/Round/RoundDown/Sqrt/Sign/MinIndex/MaxIndex/
  AlignUp, isApproxZero, the six matMult/matMultT overloads): ~46
* Coord + Coord2: 46
* VecBase + Vec2/Vec3/Vec4: 74
* MatBase + Mat2/Mat2x3/Mat3x2/Mat3/Mat4: 72
* BaseBBox + BBox<Vec3T,true> + BBox<CoordT,false> (incl. Iterator): 29
* Rgba8: 7

Targeted: pure queries returning by value or returning a new value
(plus, minus, cross, outer, dot, lengthSqr, length, equals,
operator+ / - / * / / / == / !=, transpose, inverse, floorAs/ceilAs/
roundAs, asVec3s/asVec3d, etc.).

Deliberately NOT annotated: operator[], asPointer / data, named
component accessors (x/y/z/w/r/g/b/a, packed), compound-assignment
operators (+=, -=, *=, /=, &=, <<=, >>=), constructors,
destructors, void-returning mutators, mutating chain methods that
return *this by reference (Vec::normalize, BBox::expand/intersect/
translate), pre/post operator++, and the device-only *Atomic
methods. None of those exhibit the bug class the attribute is
designed to catch.

Verified by recompiling TestNanoVDB.cc with -Wall -Wextra at -O2 —
zero new -Wunused-result / nodiscard warnings, meaning the existing
test code consumes every return value it gets back. The pre-existing
warnings (omp simd pragmas in MaskPrefixSum.h and VoxelBlockManager.h)
are infrastructure, unchanged.

Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
Three review items addressed in one commit, all confined to Math.h:

1. noexcept sweep
   Reviewer noted that no method in Math.h throws (verified — zero
   hits for throw/exception). Added noexcept to every function in
   the file: 489 annotations across free helpers, every Coord /
   Coord2 / VecBase / Vec* / MatBase / Mat* / BBox / Iterator /
   Rgba8 method, plus every standalone operator overload. Default
   constructors marked as `noexcept = default;`. Out-of-class
   definitions of Coord::asVec3s/asVec3d and Coord2::asVec2s/asVec2d
   got matching noexcept on both their in-class declarations and
   their out-of-class definitions so the exception specifications
   agree.

2. explicit on the templated Vec*T<T2> ctors
   Vec2/Vec3/Vec4 each have three "convert-from-some-other-Vec" ctors:
   the typed `Vec*(const Vec*<T2>&)` and the Coord-flavored
   `Vec*(const Coord*&)` overloads were already explicit, but the
   open-template `template<template<class> class Vec*T, class T2>
   Vec*(const Vec*T<T2>&)` overload was implicit. Made it explicit
   on all three classes for consistency.

3. final on every leaf derived class
   Vec2, Vec3, Vec4, Mat2, Mat2x3, Mat3x2, Mat3, Mat4 are all leaf
   classes — they derive from VecBase/MatBase and aren't intended
   as further base classes themselves. Added `final` after each
   class name so a future contributor can't accidentally derive
   from them. Placement is after the alignas / class name and
   before the `: public ...` inheritance.

Verified by recompiling TestNanoVDB.cc at -Wall -Wextra -O2: exit
code 0, zero new warnings (the existing 262 warnings are all
pre-existing OpenMP / unused-parameter / unused-function in
unrelated headers — unchanged from baseline). The standalone
constexpr sanity TU also still compiles clean, confirming the
noexcept marks didn't break constant evaluation of any of the
math primitives.

Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
Move the six namespace-scope `scalar*Vec` / `scalar/Vec` templates for
Vec2/Vec3/Vec4 and the five namespace-scope `scalar*Mat` templates for
Mat2/Mat2x3/Mat3x2/Mat3/Mat4 into their respective class bodies as
`friend` definitions.

Hidden friends are found only via ADL on the owning type, so they no
longer pollute the enclosing namespace's overload set for unrelated
calls. Behavior is unchanged; the existing scalar*Vec test at
TestNanoVDB.cc:1461 and scalar*Mat test at TestNanoVDB.cc:1729 still
pass.

Addresses PR AcademySoftwareFoundation#2207 review feedback.

Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
Add a protected variadic constructor to VecBase and MatBase that
initializes the storage array directly in the member initializer list
(`mVec{T(args)...}` / `mData{T(args)...}`) with a `static_assert` on
the argument count.

Refactor every derived Vec/Mat constructor to delegate via the
initializer list (`Vec2(T x, T y) : Base(x, y) {}` etc.) instead of
default-constructing the base subobject and then assigning each
element in the body.

For fundamental @c T this is a no-op (compilers fold the
default-init-then-assign sequence). For any future @c Vec<UserType>
or @c Mat<UserType> with a non-trivial default constructor and copy
assignment, this avoids one default-construct + copy-assign per
element in favor of a single direct initialization.

Addresses PR AcademySoftwareFoundation#2207 review feedback.

Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
@swahtz
Copy link
Copy Markdown
Contributor Author

swahtz commented May 29, 2026

Implemented in a1072df. Added a protected variadic ctor to both VecBase and MatBase and converted every derived Vec/Mat ctor (Vec2/3/4, Mat2/2x3/3x2/3/4) to delegate via initializer list — no more body-assignment, no more default-construct-then-overwrite of the base subobject. The static_assert(sizeof...(Args) == N) (and == ROWS*COLS for MatBase) catches arity mismatches at the call site.

Tracking the other suggestions for follow-up but not in this PR — happy to file separate issues for the C++17-feasible ones (tuple-like interface, begin/end, CTAD guides) if useful.

swahtz added 6 commits May 29, 2026 02:28
Four small, no-risk improvements:

* Add `const` to `BaseBBox::isInside` — pure query, was preventing
  callers from invoking it on a `const BBox&`.
* Remove the explicit `int32_t` / `uint32_t` overloads of `Min` and
  `Max`. The primary `template<typename Type>` already produces
  bit-identical constexpr code for integer types; the overloads
  predated the template's `constexpr`-ification.
* Add `Tolerance<long double>` and `Delta<long double>` specializations
  for symmetry with `pi<long double>`. Otherwise instantiating
  `Vec<long double>` and touching a tolerance path yields an
  unhelpful "incomplete type" error.
* Rename `MatBase::divideAssign(const T&)` to `divideAssignScalar` to
  match the existing name used in `VecBase`. Updates the five callers
  in `Mat2`/`Mat2x3`/`Mat3x2`/`Mat3`/`Mat4::operator/=`. No external
  callers in the nanovdb tree.

Addresses PR AcademySoftwareFoundation#2207 follow-up audit (best practices for C++17 / gcc 11.2).

Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
Earlier in this PR (commit 001dbe3) we added `explicit` to every
converting ctor including the cross-template-template
`Vec*(const Vec*T<T2>& v)` overload. That broke a handful of sites
that relied on the long-standing pre-refactor implicit conversion.
The principled fix is to keep `explicit` (consistent with the
same-class `Vec*(const Vec*<T2>&)` overload, which has always been
explicit) and update the call sites instead, surfacing each
precision/namespace crossing in the source.

Audited the full tree; the fallout is:

* `nanovdb/tools/cuda/PointsToGrid.cuh` (3 prod sites in the
  `worldToVoxel(Vec3u8/Vec3u16/Vec3f, ...)` family): wrap the
  `applyInverseMap(world)` result in an explicit `Vec3d(...)` so the
  intentional Vec3T -> Vec3d promotion is visible. Floating-point
  semantics are unchanged (still computed in the input precision,
  then explicitly promoted).
* `nanovdb/unittest/TestNanoVDB.cu` (3 test sites at lines 2078,
  2205, 2334): pass `<Vec3T>` explicitly to `voxelToWorld` so the
  return type matches the test's `begin[i]` element type and no
  implicit conversion is required.
* `nanovdb/unittest/TestOpenVDB.cc` (12 test sites across 4
  `EXPECT_EQ` blocks): wrap the openvdb-side accessor results in
  `nanovdb::Vec3f(...)` so the gtest equality check resolves
  without depending on implicit `openvdb::Vec3f -> nanovdb::Vec3f`
  conversion.

Also restores the explicit `int32_t`/`uint32_t` overloads of
`math::Min`/`Max` removed in commit 5183bf0. They are not redundant
with the `template<typename Type>` primary — they disambiguate
mixed-int calls (e.g. `Min(uint32_t, size_t)` in PointsToGrid.cuh)
that would otherwise fall through equally-bad implicit conversions
to the `float`/`double` overloads.

Verified with cmake build of nanovdb tree (host + CUDA targets) on
nvcc 13.2 / gcc 11.2; no remaining implicit-conversion sites in
either the nanovdb or openvdb trees.

Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
Comments-only sweep across nanovdb/math/Math.h to make every public
class and method discoverable in generated docs and to fix
silently-broken Doxygen comments. No code or signatures changed.

Touched:

* Free math functions (Tolerance/Delta/Maximum/Min/Max/Clamp/Fract/
  Floor/Ceil/Pow2/Pow3/Pow4/Abs/Round/RoundDown/MinIndex/MaxIndex/
  isApproxZero): added missing @brief lines; upgraded plain `///`
  descriptions to proper @brief; documented why the integer Min/Max
  overloads exist; documented constexpr/non-constexpr split.
* Coord / Coord2: fixed several `// @brief` comments that were missing
  a leading slash (so Doxygen ignored them entirely); added briefs to
  every previously-undocumented ctor, accessor, static, operator,
  atomic device-only helper, and to `round()`. Replaced stale "node
  centered conversion" prose with concrete behaviour.
* VecBase: added briefs to plus/minus/mul/div/negate/scale, in-place
  addAssign/subAssign/mulAssign/divAssign/scaleAssign/divideAssign-
  Scalar, equals, lengthSqr, length, mergeMin, mergeMax.
* Vec2 / Vec3 / Vec4: added briefs to default ctor, broadcast ctor,
  component ctor, cross-template + same-class converting ctors,
  Coord ctor, operator=, every element-wise / scalar / mixed-Coord /
  equality / minComponent / maxComponent operator, normalize, and the
  hidden-friend scalar*Vec / scalar/Vec operators.
* MatBase: added a class-level @brief; documented rows/cols/size,
  default ctor, array ctor, operator[], addAssign/subAssign/
  scaleAssign/divideAssignScalar, equals.
* Mat2 / Mat2x3 / Mat3x2 / Mat3 / Mat4: lifted each class's leading
  comment-block into a proper @brief @details; documented every
  default constructor.
* matMult / matMultT: filled in @brief for the 4-arg matMultT(float*,
  float*, ...) and matMultT(double*, double*, ...) overloads.
* BaseBBox: class @brief + briefs on every operator, accessor,
  translate, and the protected ctors. Documented the inclusive-vs-
  exclusive convention split.
* BBox<Vec3T, true> (float): briefs on every ctor, createCube, empty,
  operator bool, dim, isInside.
* BBox<CoordT, false> (integer): briefs on Iterator (pre/post
  increment, comparison ops, dereference, conversion to bool),
  begin/end, default ctor, value ctor, splitting ctor, both
  createCube overloads, is_divisible, dim, volume, isInside(Coord),
  and the device-only expand/intersect atomic helpers.
* Rgba8: fixed "@brief @brief" typo on the broadcast ctor; reworded
  "alpha channel it set to 1" to a proper sentence; added briefs to
  operator< / operator== / lengthSqr / length / operator[] / packed /
  r/g/b/a / and both `operator Vec3<float>` / `Vec4<float>`
  conversions.

Diff: +356 / -54 lines, all in comments. cmake build of the
nanovdb tree still passes cleanly on nvcc 13.2 / gcc 11.2.

Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
The expression `(BuildT).5 / d_grid->voxelSize()` resolves to the
hidden-friend `operator/(T1, Vec3<double>)`, which returns a
`Vec3<double>` (the owning class). Assigning that result to a local
`Vec3T` (`Vec3<BuildT>`, typically `Vec3<float>`) used to rely on the
implicit cross-template-template ctor, which is now `explicit` to
match the rest of the converting-ctor sweep in this PR.

Wrap the RHS in an explicit `Vec3T(...)` cast. Floating-point
semantics are unchanged: the reciprocal is still computed in double,
then narrowed to `BuildT`. Verified by extracting the kernel into a
standalone nvcc compile.

This was the only remaining build break — the four failing CI jobs
(linux-nanovdb gcc/clang × Debug/Release) all reported the identical
error at the same line.

Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 6 out of 7 changed files in this pull request and generated no new comments.

Copy link
Copy Markdown
Contributor

@harrism harrism left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fantastic. Thanks for addressing all my suggestions.

The vector classes only had a mutating normalize() (returns Derived&),
which can't be called on a const vector. Add a non-mutating normalized()
that returns a unit-length copy.

The shared implementation lives on VecBase<T, N> as a Derived-templated
helper (divideBy(length())); Vec2/Vec3/Vec4 expose thin delegating
wrappers, matching the existing base-helper + derived-wrapper pattern.
Marked [[nodiscard]] and non-constexpr (length() calls Sqrt).

Extends the Vec2/Vec3Ops/Vec4Ops unit tests to call normalized() on a
const vector and verify unit length and an unchanged source.

Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants