[Breaking] Migrate from int32 to int64 indexing (NumPy npy_intp alignment) #584

@Nucs

Int64 Index Migration Plan

Migration from int (int32) to long (int64) for all index, stride, offset, and size operations.

Rationale: Support arrays with more than int.MaxValue (2,147,483,647, about 2.1B) elements; for 1-byte element types this is the 2 GB limit. NumPy uses npy_intp = Py_ssize_t (64-bit on x64).

Performance Impact: Benchmarked at 1-3% overhead for scalar loops, <1% for SIMD loops. Acceptable.


Conversion Strategy: int ↔ long Handling

C# Conversion Rules

| Conversion | Type | Notes |
|---|---|---|
| int → long | Implicit | Always safe, zero cost |
| long → int | Explicit | Requires cast, may overflow |
| int[] → long[] | Manual | Element-by-element conversion required |
| long[] → int[] | Manual | Element-by-element + overflow check |
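
The scalar rules above can be sketched in a few lines. This is an illustrative example only; `ConversionDemo` and its method names are hypothetical and not part of NumSharp.

```csharp
using System;

public static class ConversionDemo
{
    // int -> long: implicit, value-preserving widening.
    public static long Widen(int value) => value;

    // long -> int without a check: the high 32 bits are silently dropped.
    public static int NarrowUnchecked(long value) => unchecked((int)value);

    // long -> int with a check: throws OverflowException when out of range.
    public static int NarrowChecked(long value) => checked((int)value);
}
```

The unchecked cast is the dangerous one for index code: it returns a wrong (often negative) value instead of failing, which is why the plan routes narrowing through checked helpers.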

Why Pointer Arithmetic Works

NumSharp uses unmanaged memory (byte*), not managed arrays. Pointer arithmetic natively supports long offsets:

byte* ptr = baseAddress;
long largeOffset = 3_000_000_000L;  // > int.MaxValue
byte* result = ptr + largeOffset;   // WORKS! No cast needed

This is why we can migrate internally to long without breaking memory access.

Public API Strategy: Dual Overloads

Keep int overloads for backward compatibility, delegate to long internally:

// Single index - int delegates to long; the (long) cast selects the long overload
public T this[int index] { get => this[(long)index]; set => this[(long)index] = value; }
public T this[long index] { get { /* ... */ } set { /* ... */ } }  // Main implementation

// Multi-index - int[] converts to long[] with stackalloc optimization
public NDArray this[params int[] indices]
{
    get
    {
        // Stack alloc for common case (<=8 dims), heap for rare large case
        Span<long> longIndices = indices.Length <= 8
            ? stackalloc long[indices.Length]
            : new long[indices.Length];

        for (int i = 0; i < indices.Length; i++)
            longIndices[i] = indices[i];

        return GetByIndicesInternal(longIndices);
    }
}

public NDArray this[params long[] indices]  // Main implementation
{
    get => GetByIndicesInternal(indices);
}
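
How the two params overloads coexist can be checked in isolation. The sketch below is a stand-alone illustration (the `OverloadDemo` class is hypothetical); it only reports which overload C# overload resolution picks.

```csharp
public sealed class OverloadDemo
{
    // Chosen when every argument is an int: identity conversion beats int -> long.
    public string this[params int[] indices] => "int[]";

    // Chosen as soon as any argument is a long, since long does not convert implicitly to int.
    public string this[params long[] indices] => "long[]";
}
```

With this shape, existing call sites such as `arr[1, 2]` keep binding to the int overload, while `arr[1L, 2]` binds to the long overload, so adding the long indexer should not change the meaning of existing int-based calls.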

Shape Constructor Overloads

// Backward compatible - accept int[]
public Shape(params int[] dims) : this(ToLongArray(dims)) { }

// New primary constructor - long[]
public Shape(params long[] dims)
{
    this.dimensions = dims;
    // ... rest of initialization
}

[MethodImpl(MethodImplOptions.AggressiveInlining)]
private static long[] ToLongArray(int[] arr)
{
    var result = new long[arr.Length];
    for (int i = 0; i < arr.Length; i++)
        result[i] = arr[i];
    return result;
}

Backward Compatible Properties

// Keep int[] for backward compat, throw on overflow
public int[] shape
{
    get
    {
        var dims = Shape.dimensions;  // Internal long[]
        var result = new int[dims.Length];
        for (int i = 0; i < dims.Length; i++)
        {
            if (dims[i] > int.MaxValue)
                throw new OverflowException(
                    $"Dimension {i} size {dims[i]} exceeds int.MaxValue. Use shapeLong property.");
            result[i] = (int)dims[i];
        }
        return result;
    }
}

// New property for large arrays
public long[] shapeLong => Shape.dimensions;

// size - same pattern
public int size => Size > int.MaxValue
    ? throw new OverflowException("Array size exceeds int.MaxValue. Use sizeLong.")
    : (int)Size;

public long sizeLong => Size;  // New property

What Stays int (No Change Needed)

| Member | Reason |
|---|---|
| NDim / ndim | Max ~32 dimensions, never exceeds int |
| Slice.Start/Stop/Step | Python slice semantics use int |
| Loop counters in IL (where safe) | JIT optimizes better |
| NPTypeCode enum values | Small fixed set |

Conversion Helper Methods

internal static class IndexConvert
{
    /// <summary>Throws if value exceeds int range.</summary>
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static int ToIntChecked(long value)
    {
        if (value > int.MaxValue || value < int.MinValue)
            throw new OverflowException($"Value {value} exceeds int range");
        return (int)value;
    }

    /// <summary>Converts long[] to int[], throws on overflow.</summary>
    public static int[] ToIntArrayChecked(long[] arr)
    {
        var result = new int[arr.Length];
        for (int i = 0; i < arr.Length; i++)
        {
            if (arr[i] > int.MaxValue || arr[i] < int.MinValue)
                throw new OverflowException($"Index {i} value {arr[i]} exceeds int range");
            result[i] = (int)arr[i];
        }
        return result;
    }

    /// <summary>Converts int[] to long[] (always safe).</summary>
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static long[] ToLongArray(int[] arr)
    {
        var result = new long[arr.Length];
        for (int i = 0; i < arr.Length; i++)
            result[i] = arr[i];
        return result;
    }

    /// <summary>Converts int[] to Span&lt;long&gt; by filling a caller-provided buffer (e.g. from stackalloc).</summary>
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static Span<long> ToLongSpan(int[] arr, Span<long> buffer)
    {
        for (int i = 0; i < arr.Length; i++)
            buffer[i] = arr[i];
        return buffer.Slice(0, arr.Length);
    }
}

IL Kernel Considerations

For IL-generated kernels, loop counters can often stay int when:

  • Array size is guaranteed < int.MaxValue (checked at call site)
  • Counter is only used for iteration, not offset calculation

Offset calculations must use long:

// Before: int offset = baseOffset + i * stride;
// After:  long offset = baseOffset + (long)i * stride;
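
The reason the cast must sit on an operand rather than on the result can be shown with a small sketch. The class and method names here are hypothetical; only the arithmetic is the point.

```csharp
using System;

public static class OffsetDemo
{
    // int arithmetic: i * stride is computed in 32 bits and wraps
    // before the result is widened to long for the addition.
    public static long OffsetInt32Math(long baseOffset, int i, int stride)
        => baseOffset + unchecked(i * stride);

    // long arithmetic: casting one operand first promotes the whole product.
    public static long OffsetInt64Math(long baseOffset, int i, int stride)
        => baseOffset + (long)i * stride;
}
```

For i = 100,000 and stride = 100,000 the 64-bit version yields 10,000,000,000, while the 32-bit version wraps to 1,410,065,408 and would silently address the wrong element.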

Phase 1: Core Types (CRITICAL PATH)

These changes cascade to everything else. Must be done atomically.

1.1 Shape Struct (View/Shape.cs)

| Current | Change To | Line |
|---|---|---|
| internal readonly int size | long size | 208 |
| internal readonly int[] dimensions | long[] dimensions | 209 |
| internal readonly int[] strides | long[] strides | 210 |
| internal readonly int bufferSize | long bufferSize | 218 |
| internal readonly int offset | long offset | 225 |
| public readonly int OriginalSize | long OriginalSize | 295 |
| public readonly int NDim | int NDim | 359 (KEEP int - max 32 dims) |
| public readonly int Size | long Size | 380 |
| public readonly int Offset | long Offset | 391 |
| public readonly int BufferSize | long BufferSize | 402 |
| public readonly int this[int dim] | long this[int dim] | 565 |
| public readonly int TransformOffset(int offset) | long TransformOffset(long offset) | 581 |
| public readonly int GetOffset(params int[] indices) | long GetOffset(params long[] indices) | 598 |
| public readonly int[] GetCoordinates(int offset) | long[] GetCoordinates(long offset) | 755 |

Related files:

  • View/Shape.Unmanaged.cs - unsafe pointer versions of GetOffset, GetSubshape
  • View/Shape.Reshaping.cs - reshape operations
  • View/Slice.cs - Start, Stop, Step should stay int (Python slice semantics)
  • View/SliceDef.cs - may need long for large dimension slicing

1.2 IArraySlice Interface (Backends/Unmanaged/Interfaces/IArraySlice.cs)

| Current | Change To |
|---|---|
| T GetIndex<T>(int index) | T GetIndex<T>(long index) |
| object GetIndex(int index) | object GetIndex(long index) |
| void SetIndex<T>(int index, T value) | void SetIndex<T>(long index, T value) |
| void SetIndex(int index, object value) | void SetIndex(long index, object value) |
| object this[int index] | object this[long index] |
| IArraySlice Slice(int start) | IArraySlice Slice(long start) |
| IArraySlice Slice(int start, int count) | IArraySlice Slice(long start, long count) |

1.3 ArraySlice Implementation (Backends/Unmanaged/ArraySlice.cs, ArraySlice`1.cs)

All index/count parameters and Count property → long

1.4 IMemoryBlock Interface (Backends/Unmanaged/Interfaces/IMemoryBlock.cs)

| Current | Change To |
|---|---|
| int Count | long Count |

1.5 UnmanagedStorage (Backends/Unmanaged/UnmanagedStorage.cs)

| Current | Change To | Line |
|---|---|---|
| public int Count | public long Count | 47 |

Related files (same changes):

  • UnmanagedStorage.Getters.cs - index parameters
  • UnmanagedStorage.Setters.cs - index parameters
  • UnmanagedStorage.Slicing.cs - slice parameters
  • UnmanagedStorage.Cloning.cs - count parameters

Phase 2: NDArray Public API

2.1 NDArray Core (Backends/NDArray.cs)

| Property/Method | Change |
|---|---|
| int size | long size |
| int[] shape | Keep int[] for API compat OR migrate to long[] |
| int ndim | Keep int (max 32 dimensions) |
| int[] strides | long[] strides |

2.2 NDArray Indexing (Selection/NDArray.Indexing.cs)

| Current | Change To |
|---|---|
| NDArray this[int* dims, int ndims] | NDArray this[long* dims, int ndims] |
| All coordinate arrays | int[] → long[] |

Related files:

  • NDArray.Indexing.Selection.cs
  • NDArray.Indexing.Selection.Getter.cs
  • NDArray.Indexing.Selection.Setter.cs
  • NDArray.Indexing.Masking.cs

2.3 Generic NDArray (Generics/NDArray`1.cs)

| Current | Change To |
|---|---|
| NDArray(int size, bool fillZeros) | NDArray(long size, bool fillZeros) |
| NDArray(int size) | NDArray(long size) |

Phase 3: Iterators

3.1 NDIterator (Backends/Iterators/NDIterator.cs)

| Current | Change To |
|---|---|
| Func<int[], int> getOffset | Func<long[], long> getOffset |
| Internal index tracking | int → long |

Files (12 type-specific generated files):

  • NDIterator.template.cs
  • NDIteratorCasts/NDIterator.Cast.*.cs (Boolean, Byte, Char, Decimal, Double, Int16, Int32, Int64, Single, UInt16, UInt32, UInt64)

3.2 MultiIterator (Backends/Iterators/MultiIterator.cs)

Same changes as NDIterator.

3.3 Incrementors (Utilities/Incrementors/)

| File | Changes |
|---|---|
| NDCoordinatesIncrementor.cs | coords int[] → long[] |
| NDCoordinatesAxisIncrementor.cs | coords int[] → long[] |
| NDCoordinatesLeftToAxisIncrementor.cs | coords int[] → long[] |
| NDExtendedCoordinatesIncrementor.cs | coords int[] → long[] |
| NDOffsetIncrementor.cs | offset int → long |
| ValueOffsetIncrementor.cs | offset int → long |
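
To make the coords int[] → long[] change concrete, here is a minimal sketch of a C-order coordinate incrementor over long[] dimensions. It is a simplified stand-in, not the actual NDCoordinatesIncrementor; the class name and members are hypothetical.

```csharp
public sealed class CoordinateIncrementor
{
    private readonly long[] _dims;

    // Current coordinate, one long per dimension; starts at all zeros.
    public long[] Coords { get; }

    public CoordinateIncrementor(long[] dims)
    {
        _dims = dims;
        Coords = new long[dims.Length];
    }

    // Advances to the next coordinate in C order; returns false after the last one.
    public bool Next()
    {
        for (int dim = _dims.Length - 1; dim >= 0; dim--)   // rank stays int
        {
            if (++Coords[dim] < _dims[dim])
                return true;
            Coords[dim] = 0;   // carry into the next-outer dimension
        }
        return false;
    }
}
```

Only the coordinate values widen to long; the loop over dimensions keeps an int counter, consistent with NDim staying int.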

Phase 4: IL Kernel Generator (924 occurrences)

4.1 IL Emission Changes

Pattern: Replace Ldc_I4 with Ldc_I8, Conv_I4 with Conv_I8

| Current IL | Change To |
|---|---|
| il.Emit(OpCodes.Ldc_I4, value) | il.Emit(OpCodes.Ldc_I8, (long)value) |
| il.Emit(OpCodes.Ldc_I4_0) | il.Emit(OpCodes.Ldc_I4_0); il.Emit(OpCodes.Conv_I8), or il.Emit(OpCodes.Ldc_I8, 0L) |
| il.Emit(OpCodes.Conv_I4) | il.Emit(OpCodes.Conv_I8) (only where the value is an index, offset, or count) |
| Loop counters (Ldloc/Stloc for int) | Use int64 locals |
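
The patterns above can be exercised with a small stand-alone DynamicMethod. This is a sketch under assumptions: `IlLongLoopDemo` is a hypothetical name and the emitted method is a trivial counting loop, not one of NumSharp's kernels. It uses a long local, both ways of loading an int64 constant, and a long-typed comparison.

```csharp
using System;
using System.Reflection.Emit;

public static class IlLongLoopDemo
{
    // Emits: long CountTo(long n) { long i = 0; while (i < n) i += 1; return i; }
    public static Func<long, long> Build()
    {
        var dm = new DynamicMethod("CountTo", typeof(long), new[] { typeof(long) });
        ILGenerator il = dm.GetILGenerator();
        LocalBuilder i = il.DeclareLocal(typeof(long));   // int64 loop counter
        Label check = il.DefineLabel();
        Label body = il.DefineLabel();

        il.Emit(OpCodes.Ldc_I8, 0L);   // operand must be a long (0L), so the 8-byte overload is used
        il.Emit(OpCodes.Stloc, i);
        il.Emit(OpCodes.Br, check);

        il.MarkLabel(body);
        il.Emit(OpCodes.Ldloc, i);
        il.Emit(OpCodes.Ldc_I4_1);     // alternative form: int32 constant...
        il.Emit(OpCodes.Conv_I8);      // ...widened to int64 before Add
        il.Emit(OpCodes.Add);          // both operands are int64 here
        il.Emit(OpCodes.Stloc, i);

        il.MarkLabel(check);
        il.Emit(OpCodes.Ldloc, i);
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Blt, body);    // int64 vs int64 comparison

        il.Emit(OpCodes.Ldloc, i);
        il.Emit(OpCodes.Ret);

        return (Func<long, long>)dm.CreateDelegate(typeof(Func<long, long>));
    }
}
```

Two details matter when applying this mechanically: the Ldc_I8 operand has to be typed long in C#, because an int argument binds to the Emit(OpCode, int) overload and writes only a 4-byte operand; and Add or the compare/branch opcodes need both stack operands to be int64, so every constant feeding them has to be widened as well.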

4.2 Files with IL Changes

| File | Occurrences | Focus Areas |
|---|---|---|
| ILKernelGenerator.MixedType.cs | 170 | Loop indices, stride calculations |
| ILKernelGenerator.Reduction.cs | 151 | Index tracking, accumulator positions |
| ILKernelGenerator.MatMul.cs | 130 | Matrix indices, row/col offsets |
| ILKernelGenerator.Comparison.cs | 125 | Loop counters |
| ILKernelGenerator.Unary.cs | 78 | Loop counters |
| ILKernelGenerator.Shift.cs | 73 | Loop counters |
| ILKernelGenerator.Binary.cs | 53 | Loop counters |
| ILKernelGenerator.Scan.cs | 52 | Cumulative indices |
| ILKernelGenerator.Unary.Math.cs | 41 | Loop counters |
| ILKernelGenerator.cs | 35 | Core emit helpers |
| Other partials | ~16 | Various |

4.3 DynamicMethod Signatures

Current pattern:

new DynamicMethod("Kernel", typeof(void),
    new[] { typeof(byte*), typeof(byte*), typeof(byte*), typeof(int) });
//                                                        ^^^^ count

Change to:

new DynamicMethod("Kernel", typeof(void),
    new[] { typeof(byte*), typeof(byte*), typeof(byte*), typeof(long) });
//                                                        ^^^^ count

4.4 Delegate Types

| Current | Change To |
|---|---|
| delegate void ContiguousKernel<T>(T* a, T* b, T* result, int count) | long count |
| delegate void MixedTypeKernel(...) | All index/count params → long |
| delegate void UnaryKernel(...) | All index/count params → long |
| delegate void ComparisonKernel(...) | All index/count params → long |
| delegate void TypedElementReductionKernel<T>(...) | All index/count params → long |

Phase 5: DefaultEngine Operations

5.1 Math Operations (Backends/Default/Math/)

File Changes
Default.Clip.cs Loop indices
Default.ClipNDArray.cs Loop indices
Default.Modf.cs Loop indices
Default.Round.cs Loop indices
Default.Shift.cs Loop indices

5.2 Reduction Operations (Backends/Default/Math/Reduction/)

| File | Changes |
|---|---|
| Default.Reduction.Add.cs | Index tracking |
| Default.Reduction.Product.cs | Index tracking |
| Default.Reduction.AMax.cs | Index tracking |
| Default.Reduction.AMin.cs | Index tracking |
| Default.Reduction.ArgMax.cs | Index tracking, return type int → long (NumPy returns npy_intp) |
| Default.Reduction.ArgMin.cs | Index tracking, return type int → long (NumPy returns npy_intp) |
| Default.Reduction.Mean.cs | Index tracking |
| Default.Reduction.Var.cs | Index tracking |
| Default.Reduction.Std.cs | Index tracking |

5.3 BLAS Operations (Backends/Default/Math/BLAS/)

File Changes
Default.Dot.NDMD.cs Matrix indices, blocked iteration
Default.MatMul.2D2D.cs Matrix indices
Default.MatMul.cs Matrix indices

5.4 Array Manipulation (Backends/Default/ArrayManipulation/)

File Changes
Default.Transpose.cs Stride/index calculations
Default.Broadcasting.cs Shape/stride calculations

Phase 6: API Functions

6.1 Creation (Creation/)

File Changes
np.arange.cs count parameter
np.linspace.cs num parameter
np.zeros.cs shape parameters
np.ones.cs shape parameters
np.empty.cs shape parameters
np.full.cs shape parameters
np.eye.cs N, M parameters

6.2 Manipulation (Manipulation/)

File Changes
np.repeat.cs repeats parameter
np.roll.cs shift parameter
NDArray.unique.cs index tracking

6.3 Selection (Selection/)

All indexing operations need long indices.

6.4 Statistics (Statistics/)

File Changes
np.nanmean.cs count tracking
np.nanstd.cs count tracking
np.nanvar.cs count tracking

Phase 7: Utilities

7.1 Array Utilities (Utilities/)

| File | Changes |
|---|---|
| Arrays.cs | Index parameters |
| ArrayConvert.cs | 158 loop occurrences |
| Hashset`1.cs | Index parameters |

7.2 Casting (Casting/)

File Changes
NdArrayToJaggedArray.cs 24 loop occurrences
UnmanagedMemoryBlock.Casting.cs 291 loop occurrences

Migration Strategy

Option A: Big Bang (Recommended)

  1. Create feature branch int64-indexing
  2. Change Phase 1 (core types) atomically
  3. Fix all compilation errors (cascading changes)
  4. Run full test suite
  5. Performance benchmark comparison

Pros: Clean, no hybrid state
Cons: Large PR, harder to review

Option B: Incremental with Overloads

  1. Add long overloads alongside int versions
  2. Deprecate int versions
  3. Migrate callers incrementally
  4. Remove int versions

Pros: Easier to review, can ship incrementally
Cons: Code bloat during transition, easy to miss conversions

Option C: Type Alias

// In a central location
global using npy_intp = System.Int64;

Then search/replace int → npy_intp for index-related uses.

Pros: Easy to toggle for testing, self-documenting
Cons: Requires careful identification of which int to replace


Files Summary by Impact

High Impact (Core Types)

  • View/Shape.cs - 20+ changes
  • View/Shape.Unmanaged.cs - 10+ changes
  • Backends/Unmanaged/Interfaces/IArraySlice.cs - 8 changes
  • Backends/Unmanaged/ArraySlice`1.cs - 15+ changes
  • Backends/Unmanaged/UnmanagedStorage.cs - 5+ changes

Medium Impact (IL Generation)

  • Backends/Kernels/ILKernelGenerator.*.cs - 924 IL emission changes across 13 files

Medium Impact (Iterators)

  • Backends/Iterators/*.cs - 28 files (including generated casts)

Lower Impact (API Functions)

  • Creation/*.cs - parameter changes
  • Manipulation/*.cs - parameter changes
  • Selection/*.cs - index changes
  • Math/*.cs - loop indices

Generated Code (Regen)

  • Utilities/ArrayConvert.cs - 158 changes
  • Backends/Unmanaged/UnmanagedMemoryBlock.Casting.cs - 291 changes
  • NDIterator cast files - template-based

Testing Strategy

  1. Unit Tests: Run existing 2700+ tests - all should pass
  2. Edge Cases: Add tests for arrays at int32 boundary (2.1B+ elements)
  3. Performance: Benchmark suite comparing int32 vs int64 versions
  4. Memory: Verify no memory leaks from changed allocation patterns

Breaking Changes

| Change | Impact | Migration |
|---|---|---|
| Shape.Size returns long | Low | Cast to int if needed |
| NDArray.size returns long | Low | Cast to int if needed |
| int[] shape → long[] shape | Medium | Update dependent code |
| Iterator coordinate types | Low | Internal change |

Most user code uses small arrays where int suffices. The main impact is internal code that stores/passes indices.


Estimated Effort

| Phase | Files | Estimated Hours |
|---|---|---|
| Phase 1: Core Types | 10 | 8 |
| Phase 2: NDArray API | 8 | 4 |
| Phase 3: Iterators | 30 | 6 |
| Phase 4: IL Kernels | 13 | 16 |
| Phase 5: DefaultEngine | 20 | 8 |
| Phase 6: API Functions | 30 | 6 |
| Phase 7: Utilities | 10 | 4 |
| Testing & Fixes | - | 16 |
| Total | ~120 | ~68 hours |

References

  • NumPy npy_intp definition: numpy/_core/include/numpy/npy_common.h:217
  • NumPy uses Py_ssize_t which is 64-bit on x64 platforms
  • .NET nint/nuint are platform-dependent (like NumPy's approach)
  • Benchmark proof: 1-3% overhead acceptable for >2GB array support

Implementation Plan

This plan aims to list every change required for the int32 → int64 migration.


Phase 0: Preparation

0.1 Create Branch and Conversion Helpers

git checkout -b int64-indexing

Create src/NumSharp.Core/Utilities/IndexConvert.cs:

using System;  // OverflowException
using System.Runtime.CompilerServices;

namespace NumSharp.Utilities;

internal static class IndexConvert
{
    /// <summary>Convert int[] to long[] (always safe).</summary>
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static long[] ToLong(int[] arr)
    {
        var result = new long[arr.Length];
        for (int i = 0; i < arr.Length; i++)
            result[i] = arr[i];
        return result;
    }

    /// <summary>Convert long[] to int[], throws on overflow.</summary>
    public static int[] ToIntChecked(long[] arr)
    {
        var result = new int[arr.Length];
        for (int i = 0; i < arr.Length; i++)
        {
            if (arr[i] > int.MaxValue || arr[i] < int.MinValue)
                throw new OverflowException($"Index {i} value {arr[i]} exceeds int range");
            result[i] = (int)arr[i];
        }
        return result;
    }

    /// <summary>Throws if value exceeds int range.</summary>
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static int ToIntChecked(long value)
    {
        if (value > int.MaxValue || value < int.MinValue)
            throw new OverflowException($"Value {value} exceeds int range");
        return (int)value;
    }
}

Phase 1: Core Types (Critical Path)

All other phases depend on this. Must be done atomically.

1.1 Shape Struct (View/Shape.cs)

| Line | Current | Change To |
|---|---|---|
| 207 | internal readonly int _hashCode | Keep int (hash codes are int) |
| 208 | internal readonly int size | internal readonly long size |
| 209 | internal readonly int[] dimensions | internal readonly long[] dimensions |
| 210 | internal readonly int[] strides | internal readonly long[] strides |
| 218 | internal readonly int bufferSize | internal readonly long bufferSize |
| 225 | internal readonly int offset | internal readonly long offset |

Properties to change:

| Property (line) | Change |
|---|---|
| OriginalSize (295) | Return long |
| NDim (359) | Keep int - max 32 dimensions |
| Size (380) | Return long |
| Offset (391) | Return long |
| BufferSize (402) | Return long |
| this[int dim] (565) | Return long (parameter stays int) |

Methods to change:

| Method | Change |
|---|---|
| TransformOffset(int) → TransformOffset(long) | Parameter and return type |
| GetOffset(params int[]) → GetOffset(params long[]) | Parameter and return type |
| GetCoordinates(int) → GetCoordinates(long) | Parameter and return long[] |
| ComputeSizeAndHash(int[]) → ComputeSizeAndHash(long[]) | Internal helper |
| ComputeIsBroadcastedStatic(int[], int[]) | Change to long[] |
| ComputeIsContiguousStatic(int[], int[]) | Change to long[] |
| ComputeFlagsStatic(int[], int[]) | Change to long[] |
Constructors:

| Constructor | Change |
|---|---|
| Shape(int[] dims, int[] strides, int offset, int bufferSize) | All long |
| Shape(int[] dims, int[] strides) | long[] parameters |
| Shape(int[] dims, int[] strides, Shape originalShape) | long[] parameters |
| Shape(params int[] dims) | params long[] dims |

1.2 Shape.Unmanaged.cs (View/Shape.Unmanaged.cs)

| Method | Change |
|---|---|
| GetOffset(int* indices, int ndims) | GetOffset(long* indices, int ndims) returning long |
| GetSubshape(int* dims, int ndims) | GetSubshape(long* dims, int ndims) returning (Shape, long) |
| InferNegativeCoordinates(int[], int*, int) | Change to long[], long* |

1.3 Shape.Reshaping.cs (View/Shape.Reshaping.cs)

All reshape methods: change dimension parameters to long[].

1.4 IArraySlice Interface (Backends/Unmanaged/Interfaces/IArraySlice.cs)

| Current | Change To |
|---|---|
| T GetIndex<T>(int index) | T GetIndex<T>(long index) |
| object GetIndex(int index) | object GetIndex(long index) |
| void SetIndex<T>(int index, T value) | void SetIndex<T>(long index, T value) |
| void SetIndex(int index, object value) | void SetIndex(long index, object value) |
| object this[int index] | object this[long index] |
| IArraySlice Slice(int start) | IArraySlice Slice(long start) |
| IArraySlice Slice(int start, int count) | IArraySlice Slice(long start, long count) |

1.5 IMemoryBlock Interface (Backends/Unmanaged/Interfaces/IMemoryBlock.cs)

| Current | Change To |
|---|---|
| int Count { get; } | long Count { get; } |

1.6 ArraySlice Implementation (Backends/Unmanaged/ArraySlice.cs, ArraySlice`1.cs)

  • All int index parameters → long
  • Count property → long
  • Internal _count field → long
  • All Slice() methods → long parameters

1.7 UnmanagedStorage (Backends/Unmanaged/UnmanagedStorage.cs)

| Line | Current | Change To |
|---|---|---|
| 47 | public int Count | public long Count |

1.8 UnmanagedStorage.Getters.cs

| Method | Change |
|---|---|
| GetValue(params int[] indices) | GetValue(params long[] indices) |
| GetAtIndex(int index) | GetAtIndex(long index) |
| GetAtIndex<T>(int index) | GetAtIndex<T>(long index) |
| GetData(int[] indices) | GetData(long[] indices) |
| GetData(int* indices, int length) | GetData(long* indices, int length) |
| GetValue<T>(params int[] indices) | GetValue<T>(params long[] indices) |

1.9 UnmanagedStorage.Setters.cs

| Method | Change |
|---|---|
| SetValue(ValueType value, params int[] indices) | params long[] indices |
| SetValue<T>(T value, params int[] indices) | params long[] indices |
| SetAtIndex(int index, object value) | long index |
| SetAtIndex<T>(int index, T value) | long index |

1.10 UnmanagedStorage.Slicing.cs

All slice offset/count parameters → long.

1.11 UnmanagedStorage.Cloning.cs

Count parameters → long.

1.12 UnmanagedMemoryBlock (Backends/Unmanaged/UnmanagedMemoryBlock*.cs)

  • Count property → long
  • All index parameters → long
  • Allocation size calculations → long
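
The allocation-size point can be illustrated with a small helper. This is a sketch only; `AllocationMath.ByteLength` is a hypothetical name, and how UnmanagedMemoryBlock actually computes its byte length is not shown in this issue.

```csharp
using System;

public static class AllocationMath
{
    // Byte length of a block, computed in 64-bit with an overflow check.
    public static long ByteLength(long elementCount, int elementSize)
    {
        if (elementCount < 0) throw new ArgumentOutOfRangeException(nameof(elementCount));
        if (elementSize <= 0) throw new ArgumentOutOfRangeException(nameof(elementSize));
        return checked(elementCount * elementSize);   // long * int is evaluated as long
    }
}
```

For 2.5 billion doubles this gives 20,000,000,000 bytes, a value that already overflows int for arrays well below the int32 element limit, so byte-size math needs long even before element counts do.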

Phase 2: NDArray Public API

2.1 NDArray Core (Backends/NDArray.cs)

| Property/Field | Change |
|---|---|
| public int[] shape | public long[] shape |
| public int size | public long size |
| public int[] strides | public long[] strides |
| public int ndim | Keep int |
| public int len | public long len |

2.2 NDArray.Indexing.cs (Selection/NDArray.Indexing.cs)

| Current | Change To |
|---|---|
| this[int* dims, int ndims] | this[long* dims, int ndims] |
| this[params int[] indices] (in generic) | this[params long[] indices] |

2.3 NDArray.Indexing.Selection.cs

All index arrays int[] → long[].

2.4 NDArray.Indexing.Selection.Getter.cs

All coordinate/index handling → long.

2.5 NDArray.Indexing.Selection.Setter.cs

All coordinate/index handling → long.

2.6 NDArray.Indexing.Masking.cs

Index tracking → long.

2.7 Generic NDArray (Generics/NDArray`1.cs)

| Current | Change To |
|---|---|
| NDArray(int size, bool fillZeros) | NDArray(long size, bool fillZeros) |
| NDArray(int size) | NDArray(long size) |
| this[params int[] indices] | this[params long[] indices] |
| GetAtIndex(int index) | GetAtIndex(long index) |

2.8 NDArray.String.cs (Backends/NDArray.String.cs)

Index iteration → long.


Phase 3: Iterators

3.1 NDIterator.cs (Backends/Iterators/NDIterator.cs)

| Change |
|---|
| Func<int[], int> getOffset → Func<long[], long> getOffset |
| Internal index tracking int → long |
| Coordinate arrays int[] → long[] |

3.2 NDIterator.template.cs

Same changes as NDIterator.cs - this is the template for code generation.

3.3 NDIterator Cast Files (12 files)

All files in Backends/Iterators/NDIteratorCasts/:

  • NDIterator.Cast.Boolean.cs
  • NDIterator.Cast.Byte.cs
  • NDIterator.Cast.Char.cs
  • NDIterator.Cast.Decimal.cs
  • NDIterator.Cast.Double.cs
  • NDIterator.Cast.Int16.cs
  • NDIterator.Cast.Int32.cs
  • NDIterator.Cast.Int64.cs
  • NDIterator.Cast.Single.cs
  • NDIterator.Cast.UInt16.cs
  • NDIterator.Cast.UInt32.cs
  • NDIterator.Cast.UInt64.cs

All need: Func<int[], int> → Func<long[], long>.

3.4 MultiIterator.cs (Backends/Iterators/MultiIterator.cs)

Same index/coordinate changes.

3.5 Incrementors (Utilities/Incrementors/)

| File | Changes |
|---|---|
| NDCoordinatesIncrementor.cs | coords int[] → long[] |
| NDCoordinatesAxisIncrementor.cs | coords int[] → long[] |
| NDCoordinatesLeftToAxisIncrementor.cs | coords int[] → long[] |
| NDExtendedCoordinatesIncrementor.cs | coords int[] → long[] |
| NDOffsetIncrementor.cs | offset int → long |
| ValueOffsetIncrementor.cs | offset int → long |

Phase 4: IL Kernel Generator

924 IL emission changes across 13 files.

4.1 Common Patterns to Change

| IL Pattern | Current | New |
|---|---|---|
| Load int constant | il.Emit(OpCodes.Ldc_I4, value) | il.Emit(OpCodes.Ldc_I8, (long)value) |
| Load 0 | il.Emit(OpCodes.Ldc_I4_0) | il.Emit(OpCodes.Ldc_I4_0); il.Emit(OpCodes.Conv_I8) |
| Load 1 | il.Emit(OpCodes.Ldc_I4_1) | il.Emit(OpCodes.Ldc_I4_1); il.Emit(OpCodes.Conv_I8) |
| Convert to int | il.Emit(OpCodes.Conv_I4) | il.Emit(OpCodes.Conv_I8) (only where the value is an index, offset, or count) |
| Local variable | il.DeclareLocal(typeof(int)) | il.DeclareLocal(typeof(long)) |
| Increment | il.Emit(OpCodes.Add) | Same (works for long when both operands are int64) |
| Compare | il.Emit(OpCodes.Clt) | Same (works for long when both operands are int64) |

4.2 Delegate Type Changes

| File | Delegate | Change |
|---|---|---|
| ILKernelGenerator.Binary.cs | ContiguousKernel<T> | int count → long count |
| ILKernelGenerator.MixedType.cs | MixedTypeKernel | All index/count → long |
| ILKernelGenerator.Unary.cs | UnaryKernel | All index/count → long |
| ILKernelGenerator.Comparison.cs | ComparisonKernel | All index/count → long |
| ILKernelGenerator.Reduction.cs | TypedElementReductionKernel<T> | All index/count → long |
| ILKernelGenerator.Scan.cs | Scan delegates | All index/count → long |
| ILKernelGenerator.Shift.cs | Shift delegates | All index/count → long |
| ILKernelGenerator.MatMul.cs | MatMul delegates | All index/count → long |

4.3 DynamicMethod Signature Changes

All DynamicMethod parameter types:

// Before
new DynamicMethod("Kernel", typeof(void),
    new[] { typeof(byte*), typeof(byte*), typeof(byte*), typeof(int) });

// After
new DynamicMethod("Kernel", typeof(void),
    new[] { typeof(byte*), typeof(byte*), typeof(byte*), typeof(long) });

4.4 Files and Estimated Changes

| File | Occurrences | Focus |
|---|---|---|
| ILKernelGenerator.cs | 35 | Core helpers, type utilities |
| ILKernelGenerator.Binary.cs | 53 | Loop counters, contiguous kernels |
| ILKernelGenerator.MixedType.cs | 170 | Loop indices, stride calculations |
| ILKernelGenerator.Unary.cs | 78 | Loop counters |
| ILKernelGenerator.Unary.Math.cs | 41 | Loop counters |
| ILKernelGenerator.Unary.Decimal.cs | 7 | Loop counters |
| ILKernelGenerator.Unary.Predicate.cs | 3 | Loop counters |
| ILKernelGenerator.Unary.Vector.cs | 6 | Vector indices |
| ILKernelGenerator.Comparison.cs | 125 | Loop counters, mask extraction |
| ILKernelGenerator.Reduction.cs | 151 | Index tracking, accumulator positions |
| ILKernelGenerator.Scan.cs | 52 | Cumulative indices |
| ILKernelGenerator.Shift.cs | 73 | Loop counters |
| ILKernelGenerator.MatMul.cs | 130 | Matrix indices, row/col offsets |

4.5 Reduction Axis Files

File Changes
ILKernelGenerator.Reduction.Axis.cs Index iteration
ILKernelGenerator.Reduction.Axis.Arg.cs ArgMax/ArgMin index tracking
ILKernelGenerator.Reduction.Axis.NaN.cs NaN reduction indices
ILKernelGenerator.Reduction.Axis.Simd.cs SIMD loop indices
ILKernelGenerator.Reduction.Axis.VarStd.cs Variance/std indices
ILKernelGenerator.Reduction.Boolean.cs All/Any indices

4.6 Other IL Files

File Changes
ILKernelGenerator.Clip.cs Loop indices
ILKernelGenerator.Masking.cs Mask indices
ILKernelGenerator.Masking.Boolean.cs Boolean mask indices
ILKernelGenerator.Masking.NaN.cs NaN mask indices
ILKernelGenerator.Masking.VarStd.cs Variance mask indices

Phase 5: DefaultEngine Operations

5.1 Math Operations (Backends/Default/Math/)

File Changes
Default.Clip.cs Loop indices
Default.ClipNDArray.cs Loop indices
Default.Modf.cs Loop indices
Default.Round.cs Loop indices
Default.Shift.cs Loop indices

5.2 Reduction Operations (Backends/Default/Math/Reduction/)

| File | Changes |
|---|---|
| Default.Reduction.Add.cs | Index tracking |
| Default.Reduction.Product.cs | Index tracking |
| Default.Reduction.AMax.cs | Index tracking |
| Default.Reduction.AMin.cs | Index tracking |
| Default.Reduction.ArgMax.cs | Return type int → long |
| Default.Reduction.ArgMin.cs | Return type int → long |
| Default.Reduction.Mean.cs | Index tracking |
| Default.Reduction.Var.cs | Index tracking |
| Default.Reduction.Std.cs | Index tracking |

5.3 BLAS Operations (Backends/Default/Math/BLAS/)

File Changes
Default.Dot.NDMD.cs Matrix indices, blocked iteration
Default.MatMul.2D2D.cs Matrix indices (17 loop occurrences)
Default.MatMul.cs Matrix indices

5.4 Array Manipulation (Backends/Default/ArrayManipulation/)

File Changes
Default.Transpose.cs Stride/index calculations
Default.Broadcasting.cs Shape/stride calculations

5.5 Indexing (Backends/Default/Indexing/)

File Changes
Default.NonZero.cs Index collection returns long[][]

Phase 6: API Functions

6.1 Creation (Creation/)

| File | Parameter Changes |
|---|---|
| np.arange.cs | start, stop, step calculation → long aware |
| np.linspace.cs | num parameter, index iteration |
| np.zeros.cs | shape parameters int[] → long[] |
| np.ones.cs | shape parameters int[] → long[] |
| np.empty.cs | shape parameters int[] → long[] |
| np.full.cs | shape parameters int[] → long[] |
| np.eye.cs | N, M parameters (can stay int, implicitly convert) |
| np.zeros_like.cs | Shape handling |
| np.ones_like.cs | Shape handling |
| np.empty_like.cs | Shape handling |
| np.full_like.cs | Shape handling |
| np.array.cs | Shape calculation |
| np.asarray.cs | Shape calculation |
| np.copy.cs | Size handling |
| np.stack.cs | Index iteration |
| np.hstack.cs | Index iteration |
| np.vstack.cs | Index iteration |
| np.dstack.cs | Index iteration |
| np.concatenate.cs | Index iteration |
| np.broadcast.cs | Shape handling |
| np.broadcast_to.cs | Shape handling |
| np.broadcast_arrays.cs | Shape handling |

6.2 Manipulation (Manipulation/)

| File | Changes |
|---|---|
| np.reshape.cs | Shape parameter int[] → long[] |
| np.repeat.cs | repeats parameter, index iteration |
| np.roll.cs | shift parameter, index iteration |
| NDArray.unique.cs | Index tracking |
| np.squeeze.cs | Shape handling |
| np.expand_dims.cs | Shape handling |
| np.swapaxes.cs | Stride handling |
| np.moveaxis.cs | Stride handling |
| np.rollaxis.cs | Stride handling |
| np.transpose.cs | Stride handling |
| np.ravel.cs | Size handling |
| NDArray.flatten.cs | Size handling |

6.3 Math (Math/)

File Changes
np.sum.cs Size handling
NDArray.prod.cs Size handling
np.cumsum.cs Index iteration
NDArray.negative.cs Index iteration
NdArray.Convolve.cs Index iteration
np.nansum.cs Index iteration
np.nanprod.cs Index iteration

6.4 Statistics (Statistics/)

File Changes
np.mean.cs Count handling
np.std.cs Count handling
np.var.cs Count handling
np.nanmean.cs Count tracking
np.nanstd.cs Count tracking
np.nanvar.cs Count tracking

6.5 Sorting/Searching (Sorting_Searching_Counting/)

File Changes
np.argmax.cs Return long
np.argmin.cs Return long
np.argsort.cs Index array handling
np.searchsorted.cs Index return type
np.nanmax.cs Index iteration
np.nanmin.cs Index iteration

6.6 Logic (Logic/)

File Changes
np.all.cs Index iteration
np.any.cs Index iteration
np.nonzero.cs Return long[][]

6.7 Linear Algebra (LinearAlgebra/)

File Changes
np.dot.cs Index handling
NDArray.dot.cs Index handling
np.matmul.cs Index handling
np.outer.cs Index handling
NDArray.matrix_power.cs Index iteration

6.8 Random (RandomSampling/)

File Changes
np.random.rand.cs Shape parameters
np.random.randn.cs Shape parameters
np.random.randint.cs Size parameters, index iteration
np.random.uniform.cs Shape parameters
np.random.choice.cs Size parameters
All distribution files Shape parameters

6.9 I/O (APIs/)

File Changes
np.save.cs Index iteration in Enumerate
np.load.cs Shape handling
np.fromfile.cs Count handling
np.tofile.cs Count handling

Phase 7: Utilities and Casting

7.1 Utilities (Utilities/)

File Changes
Arrays.cs Index parameters
ArrayConvert.cs 158 loop occurrences
Hashset`1.cs Index parameters
py.cs Index handling
SteppingExtension.cs Index handling

7.2 Casting (Casting/)

File Changes
NdArrayToJaggedArray.cs 24 loop occurrences
UnmanagedMemoryBlock.Casting.cs 291 loop occurrences
NdArray.ToString.cs Index iteration
Implicit/NdArray.Implicit.Array.cs Index handling

Phase 8: SIMD Helpers

8.1 SimdKernels.cs (Backends/Kernels/SimdKernels.cs)

Loop counters and index handling → long.

8.2 SimdMatMul.cs (Backends/Kernels/SimdMatMul.cs)

20 loop occurrences - matrix indices → long.

8.3 SimdReductionOptimized.cs (Backends/Kernels/SimdReductionOptimized.cs)

Loop counters → long.


Phase 9: Testing

9.1 Update Existing Tests

  • All tests using int[] for shape assertions → long[]
  • All tests storing argmax/argmin results → long
  • All tests checking size → long

9.2 New Tests: Large Array Boundary

Create test/NumSharp.UnitTest/LargeArrayTests.cs:

[Test]
public void Shape_SupportsLargeSize()
{
    // Test shape with size > int.MaxValue
    var shape = new Shape(50000, 50000);  // 2.5B elements
    Assert.That(shape.Size, Is.EqualTo(2_500_000_000L));
}

[Test]
public void ArgMax_ReturnsLongIndex()
{
    var arr = np.arange(100);
    long idx = arr.argmax();  // Should compile and work
    Assert.That(idx, Is.EqualTo(99L));
}

9.3 Performance Regression Tests

Compare benchmarks before/after:

  • np.add (scalar and SIMD paths)
  • np.cumsum
  • np.matmul
  • Verify < 5% regression

Execution Order

Phase 0: Preparation (create branch, helpers)
    ↓
Phase 1: Core Types (Shape, IArraySlice, UnmanagedStorage)
    ↓ (compilation will fail until complete)
Phase 2: NDArray Public API
    ↓
Phase 3: Iterators
    ↓
Phase 4: IL Kernel Generator (largest effort)
    ↓
Phase 5: DefaultEngine Operations
    ↓
Phase 6: API Functions
    ↓
Phase 7: Utilities and Casting
    ↓
Phase 8: SIMD Helpers
    ↓
Phase 9: Testing
    ↓
Final: Build, test, benchmark, merge

Verification Checklist

  • dotnet build succeeds with no errors
  • All 2700+ existing tests pass
  • No int remains for index/size/offset/stride (except ndim, Slice)
  • shape returns long[]
  • size returns long
  • argmax/argmin return long
  • nonzero returns long[][]
  • IL kernels use Ldc_I8 and Conv_I8
  • Performance regression < 5%
  • Large array test passes (shape with >2B elements)
