
Fix Jacobi eigensolver convergence and improve ArrayFire bindings#71

Merged: dmjio merged 14 commits into master from af-api-updates on Jun 13, 2026

@dmjio dmjio commented Jun 12, 2026

  • Replace sweep-based iteration with bounded rotation count and scale-invariant convergence detection (tol = 1e-14 * max|A|)
  • Return error code 2 on non-convergence instead of silent inaccurate results
  • Add input validation in af_eigsh: reject non-square or non-2D inputs
  • Fix type signatures: add Fractional constraints to trigonometric functions, correct matcher return types to (Array Word32, Array a)
  • Fix constant array creation for b8 and non-fp64 OpenCL devices
  • Fix memory leaks in getInfoString and arrayToString (free af_alloc_host strings)
  • Fix momentsAll to return variable-length list based on moment bitmask
  • Guard createSparseArrayFromDense against all-zero matrices (prevents segfault)
  • Add vision test skips for broken OpenCL backend on AF 3.8.2
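The first two bullets can be illustrated with a minimal pure-Haskell sketch. It is not the code from this PR: the list-of-lists matrix type, the `jacobiEigenvalues` name, the largest-pivot strategy, and the `maxRotations` parameter are all assumptions made for illustration. Only the tolerance `1e-14 * max|A|`, the bounded rotation count, and error code 2 come from the description.

```haskell
import Data.List (maximumBy, transpose)
import Data.Ord (comparing)

type Matrix = [[Double]]

-- Plain dense matrix product on lists of rows.
mmul :: Matrix -> Matrix -> Matrix
mmul a b = [[sum (zipWith (*) row col) | col <- transpose b] | row <- a]

-- Rotation matrix J: identity except in rows/columns p and q.
rotation :: Int -> Int -> Int -> Double -> Double -> Matrix
rotation n p q c s =
  [[entry i j | j <- [0 .. n - 1]] | i <- [0 .. n - 1]]
  where
    entry i j
      | (i, j) == (p, p) || (i, j) == (q, q) = c
      | (i, j) == (p, q) = s
      | (i, j) == (q, p) = negate s
      | i == j = 1
      | otherwise = 0

-- Eigenvalues of a symmetric matrix, or Left 2 when the rotation budget
-- (hypothetical parameter) is exhausted before convergence.
jacobiEigenvalues :: Int -> Matrix -> Either Int [Double]
jacobiEigenvalues maxRotations a0 = go maxRotations a0
  where
    n = length a0
    -- Scale-invariant tolerance: relative to the largest input entry.
    tol = 1e-14 * maximum (0 : map abs (concat a0))
    offDiag a = [(abs (a !! p !! q), p, q) | p <- [0 .. n - 1], q <- [p + 1 .. n - 1]]
    diagonal a = [a !! i !! i | i <- [0 .. n - 1]]
    go budget a
      | null (offDiag a) = Right (diagonal a)  -- 0x0 or 1x1 input
      | big <= tol = Right (diagonal a)        -- converged
      | budget <= 0 = Left 2                   -- report non-convergence
      | otherwise = go (budget - 1) (transpose j `mmul` a `mmul` j)
      where
        (big, p, q) = maximumBy (comparing (\(v, _, _) -> v)) (offDiag a)
        theta = (a !! q !! q - a !! p !! p) / (2 * (a !! p !! q))
        t = (if theta >= 0 then 1 else -1) / (abs theta + sqrt (theta * theta + 1))
        c = 1 / sqrt (t * t + 1)
        s = t * c
        j = rotation n p q c s
```

Because the tolerance is tied to `max|A|`, scaling the input by a constant does not change how many rotations are needed, and a caller can tell an exhausted budget (`Left 2`) apart from a valid result.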

@dmjio dmjio marked this pull request as draft June 13, 2026 00:24
dmjio and others added 3 commits June 12, 2026 23:08
AF's af_abs promotes through f64 internally, which makes the
`abs x * signum x == x` law fail for signed-type minBound (overflow)
and for 64-bit integers with |x| > 2^53 (precision loss).  The ring
structure is fully covered by semiringLaws + ringLaws, so numLaws is
skipped for all integral element types.

Also re-enables the hspec suite (586 examples, 0 failures, 17 platform-
specific pendings) that was previously commented out.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
af_orb crashes the process at the C level on both the OpenCL and CPU
backends in AF 3.8.2 on macOS.  The previous guard only skipped on
OpenCL, so running via the CPU backend (e.g. nix flake build) would
kill the test runner before the suite could report results.

Introduce skipOrb (unconditional pending) for the two ORB tests and
keep skipOnBrokenBackend for FAST/Harris/SUSAN which work correctly on
the CPU backend.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@dmjio dmjio marked this pull request as ready for review June 13, 2026 14:36
dmjio and others added 10 commits June 13, 2026 10:21
af_abs promotes all integer inputs to f32 internally (complex.cpp uses
implicit(in_type, f32) which returns f32 for every integer dtype), so
any value with |x| > 2^24 gets rounded.  For example abs(16777217)
returned 16777216.
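The rounding can be reproduced without ArrayFire. The sketch below models the promotion as a round trip through `Float`; `absViaFloat` and `lawHolds` are hypothetical stand-ins for illustration, not functions from the bindings.

```haskell
import Data.Int (Int32)

-- Hypothetical model of an abs that promotes integers to f32 and casts back.
absViaFloat :: Int32 -> Int32
absViaFloat = round . abs . (fromIntegral :: Int32 -> Float)

-- The Num law exercised by the test suite: abs x * signum x == x.
lawHolds :: (Int32 -> Int32) -> Int32 -> Bool
lawHolds absImpl x = absImpl x * signum x == x

-- absViaFloat 16777217 == 16777216, so the law fails for that input,
-- while the native integer abs satisfies it.
```

A 32-bit float carries a 24-bit significand, so 2^24 + 1 = 16777217 is the first positive integer it cannot represent exactly.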

Fix: dispatch abs in the Num (Array a) instance on the element dtype:
  - signed integers (s16/s32/s64): select (x < 0) (0 - x) x
  - unsigned / boolean (u8/u16/u32/u64/b8): identity (already >= 0)
  - float / complex (f32/f64/c32/c64): delegate to A.abs as before
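A rough model of that dispatch, using plain lists in place of device arrays. The `DType`, `AbsStrategy`, `absStrategy`, and `selectAbs` names are invented for this sketch and are not the identifiers used in the bindings; only the three-way split by dtype and the `select (x < 0) (0 - x) x` expression come from the commit message.

```haskell
import Data.Int (Int32)

-- Element dtypes named in the commit message.
data DType = S16 | S32 | S64 | U8 | U16 | U32 | U64 | B8 | F32 | F64 | C32 | C64
  deriving (Eq, Show)

data AbsStrategy = NegateWhereNegative | Identity | DelegateToBackend
  deriving (Eq, Show)

-- Pick the abs implementation from the element dtype.
absStrategy :: DType -> AbsStrategy
absStrategy d
  | d `elem` [S16, S32, S64] = NegateWhereNegative
  | d `elem` [U8, U16, U32, U64, B8] = Identity
  | otherwise = DelegateToBackend

-- List model of `select (x < 0) (0 - x) x`: no floating-point promotion.
selectAbs :: [Int32] -> [Int32]
selectAbs xs = zipWith3 select (map (< 0) xs) (map (0 -) xs) xs
  where
    select cond whenTrue whenFalse = if cond then whenTrue else whenFalse
```

Since the signed-integer branch never leaves the integer type, `selectAbs [16777217]` keeps the value intact, unlike a round trip through f32.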

Also restrict Arbitrary CBool to {0, 1} — AF's b8 type normalises
non-zero floats to 1 on cast-back, so CBool 2 produced abs(2) = 1.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace `head` with `listToMaybe` + explicit failure message (matching
the existing pattern for `meanWeighted`), and `tail` with `drop 1`.
Also removes unused Foreign.Marshal/Foreign.Storable imports from
Graphics.hs to silence -Wunused-imports.
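The total-function pattern looks roughly like this. `firstOrFail` and `rest` are hypothetical helpers, and returning `Either` is an assumption made so the sketch stays pure; the commit itself only says the failure message is explicit.

```haskell
import Data.Maybe (listToMaybe)

-- Total replacement for `head`: an empty list yields a named error
-- instead of a partial-function crash.
firstOrFail :: String -> [a] -> Either String a
firstOrFail name xs =
  maybe (Left (name ++ ": expected at least one result")) Right (listToMaybe xs)

-- Total replacement for `tail`: the empty list maps to the empty list.
rest :: [a] -> [a]
rest = drop 1
```

Unlike `head` and `tail`, neither helper can throw on `[]`, and the error names the calling function.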

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
af_orb was not fundamentally broken — the crash was caused by passing a
32×32 image to the CPU backend.  When min(h,w)/scl_fctr < REF_PAT_SIZE
(31), the pyramid-sizing loop exits with max_levels=0, then
lvl_best[max_levels-1] = lvl_best[UINT_MAX] writes out of bounds →
process abort.

Fix: introduce orbImg (128×128 quadrant, well above the 47px minimum for
scl_fctr=1.5) and guard with skipOnBrokenBackend instead of the blanket
skipOrb, so ORB runs on the CPU backend.
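The arithmetic behind the crash and the 47px figure can be sketched as follows. This is a simplified model of the pyramid-sizing loop, not ArrayFire's source: the function names are hypothetical, `sclFctr` is assumed to be greater than 1, and any cap on the number of levels is ignored.

```haskell
import Data.Word (Word32)

-- REF_PAT_SIZE from the commit message.
refPatSize :: Double
refPatSize = 31

-- Count pyramid levels: keep dividing the smaller image side by the scale
-- factor while the result still fits the reference pattern.
pyramidLevels :: Double -> Int -> Word32
pyramidLevels sclFctr minDim =
  fromIntegral (length (takeWhile (>= refPatSize) sizes))
  where
    sizes = drop 1 (iterate (/ sclFctr) (fromIntegral minDim))

-- Unsigned index of the last level; wraps around when there are no levels.
lastLevelIndex :: Word32 -> Word32
lastLevelIndex levels = levels - 1

-- Smallest side length that yields at least one level.
minimumSide :: Double -> Int
minimumSide sclFctr = ceiling (refPatSize * sclFctr)
```

Under this model a 32×32 image with a scale factor of 1.5 yields zero levels, `lastLevelIndex 0` wraps to the maximum unsigned value, and `minimumSide 1.5` gives the 47px minimum quoted above.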

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
af_orb returns descriptors shaped [8, n_features] (8 uint32 words per
256-bit descriptor, column-major).  dim0 is always 8; dim1 is the
feature count.  The test was checking d0 which always equals 8, not n.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
On asynchronous backends (OpenCL), a scalar array's fill kernel is
enqueued but may not have retired by the time the JIT for eqBatched
fires.  Without eval the comparison reads stale buffer contents, causing
false inequality (e.g. Scalar 1 /= Scalar 1).  The CPU backend is
synchronous so the bug doesn't surface there.

Update the comment to explain the async fill race rather than just
asserting "stale results".

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
numLaws comment claimed the laws were skipped, but they run and pass
with the new pure-Haskell abs implementation.  Remove the stale comment.

getNumDims shrink bug: af_get_numdims collapses trailing unit dims
([2,1,1] → 1), losing the constructed dimensionality.  Shrinking then
treats a generated 3-D [2,1,1] array as 1-D, reducing test coverage
for multi-dimensional Eq tests.  Derive ndim directly from getDims
(dropWhile (==1) . reverse) instead.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
af_matmul on OpenCL enqueues clBLAS kernels asynchronously without
syncing the output buffer.  Subsequent JIT operations (elementwise mul,
sumAll) read the unfilled buffer, producing wrong results.  The CPU
backend uses synchronous BLAS so it is unaffected.

Extract brokenOpenCL / skipOnBrokenOpenCL into a shared TestHelper so
VisionSpec and NumericalSpec (and future specs) can use the same guard.
Remove the now-redundant brokenVisionBackend / skipOnBrokenBackend
definitions from VisionSpec.

NumericalSpec: gate power-iteration (uses mm/matmul) and Parseval
(uses fft) against the broken OpenCL backend.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The C wrapper af_random_engine_set_seed_ takes unsigned long long, but
the FFI declaration used IntL (CLLong, signed 64-bit) and the public API
took Int.  Seeds are non-negative by definition; change the FFI to UIntL
and the public API to Word.  Binary representation is identical so no
ABI change, but the type now correctly rejects negative seeds at compile
time.
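A small illustration of why the signed declaration was misleading even though the bits are unchanged. It uses `Int64` and `Word64` as stand-ins for the FFI types, and `asUnsignedSeed` is a hypothetical helper.

```haskell
import Data.Int (Int64)
import Data.Word (Word64)

-- Reinterpret a signed 64-bit seed the way a C `unsigned long long`
-- parameter would see it: same bit pattern, different value.
asUnsignedSeed :: Int64 -> Word64
asUnsignedSeed = fromIntegral

-- asUnsignedSeed (-1) == maxBound, so a "negative" seed was silently
-- handed to the C side as a very large positive one.
```

With a `Word` in the public API, the caller has to choose that large value explicitly rather than passing a negative number.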

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@dmjio dmjio merged commit beec55c into master Jun 13, 2026
2 checks passed
@dmjio dmjio deleted the af-api-updates branch June 13, 2026 20:09