Proposal: Add fma_{mul,div} for FMA-based complex operations #146

@zhongyi51

Proposal

I propose adding fma_mul and fma_div methods to the Complex type. These methods would leverage fused multiply-add (FMA) operations for the calculation.
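To make the proposal concrete, here is a minimal sketch of what the two methods could look like. It uses a local stand-in `Complex` struct with `f64` fields rather than the crate's own type, and the choice of which products get fused is an assumption for illustration, not a settled design. The division uses the textbook formula and does no scaling against overflow or underflow.

```rust
/// Local stand-in for a complex number (hypothetical, for illustration only).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Complex {
    pub re: f64,
    pub im: f64,
}

impl Complex {
    /// (a + bi)(c + di) = (ac - bd) + (ad + bc)i,
    /// with one product per component folded into a fused multiply-add.
    pub fn fma_mul(self, other: Complex) -> Complex {
        let (a, b, c, d) = (self.re, self.im, other.re, other.im);
        Complex {
            re: a.mul_add(c, -(b * d)), // a*c - b*d
            im: a.mul_add(d, b * c),    // a*d + b*c
        }
    }

    /// (a + bi) / (c + di) = ((ac + bd) + (bc - ad)i) / (c^2 + d^2).
    /// Textbook formula only: no scaling against overflow or underflow.
    pub fn fma_div(self, other: Complex) -> Complex {
        let (a, b, c, d) = (self.re, self.im, other.re, other.im);
        let norm_sqr = c.mul_add(c, d * d); // c*c + d*d
        Complex {
            re: a.mul_add(c, b * d) / norm_sqr,    // (a*c + b*d) / |other|^2
            im: b.mul_add(c, -(a * d)) / norm_sqr, // (b*c - a*d) / |other|^2
        }
    }
}
```

For small integer inputs every intermediate value is exact, so `(1 + 2i).fma_mul(3 + 4i)` gives `-5 + 10i`, and dividing that by `3 + 4i` with `fma_div` gives back `1 + 2i`.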

Motivation

Using FMA can offer significant performance benefits on hardware with native support, but it comes with important trade-offs:

  • Performance Variance: On CPUs with native FMA instructions (e.g., AArch64, or x86-64 when compiled with `-C target-feature=+fma`), these methods can be faster. Without native hardware support enabled, the compiler falls back to a slow software library call (fma or fmaf).

  • Numerical Differences: FMA computes a * b + c with a single rounding at the end, instead of rounding the product and then the sum. As a result, an FMA-based method is not guaranteed to produce results that are bit-for-bit identical to the standard methods.
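The rounding difference can be shown with plain `f64` values. The sketch below is illustrative only. The helper names are hypothetical, and the inputs are chosen so that the exact product, 1 - 2^-60, does not fit in an `f64` and rounds to 1.0 when the multiplication is rounded on its own.

```rust
/// Two roundings: the product is rounded first, then the sum.
pub fn unfused(a: f64, b: f64, c: f64) -> f64 {
    a * b + c
}

/// One rounding: `mul_add` rounds only the final result.
pub fn fused(a: f64, b: f64, c: f64) -> f64 {
    a.mul_add(b, c)
}

/// Inputs (a, b, c) with a = 1 + 2^-30, b = 1 - 2^-30, c = -1.
/// The exact value of a * b is 1 - 2^-60, which is not representable in f64.
pub fn demo_inputs() -> (f64, f64, f64) {
    let eps = 1.0 / ((1u64 << 30) as f64); // 2^-30, exactly representable
    (1.0 + eps, 1.0 - eps, -1.0)
}
```

With these inputs, `unfused` returns `0.0` because the product has already been rounded to 1.0, while `fused` returns exactly -2^-60. The same kind of discrepancy can show up in each component of a complex product or quotient.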

Implementation

This Compiler Explorer link illustrates how the results differ between architectures and compiler settings: https://godbolt.org/z/joW4eqvT9

If this approach is ok, I would be happy to implement it.
