Skip to content

Add some mul overrides too#46

Merged
kshyatt merged 4 commits intomainfrom
ksh/bump
Mar 6, 2026
Merged

Add some mul overrides too#46
kshyatt merged 4 commits intomainfrom
ksh/bump

Conversation

@kshyatt
Copy link
Member

@kshyatt kshyatt commented Mar 5, 2026

No description provided.

@kshyatt kshyatt requested a review from lkdvos March 5, 2026 14:49
@codecov
Copy link

codecov bot commented Mar 5, 2026

Codecov Report

❌ Patch coverage is 70.00000% with 3 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
ext/StridedGPUArraysExt.jl 77.77% 2 Missing ⚠️
ext/StridedJLArraysExt.jl 0.00% 1 Missing ⚠️
Files with missing lines Coverage Δ
ext/StridedJLArraysExt.jl 0.00% <0.00%> (ø)
ext/StridedGPUArraysExt.jl 72.22% <77.77%> (+5.55%) ⬆️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

end

# lifted from GPUArrays.jl
function Base.fill!(A::StridedView{T, N, TA, F}, x) where {T, N, TA <: AbstractGPUArray{T}, F <: ALL_FS}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm slightly hesitant about these kinds of overloads, but I think it is just ignorance:

From what I know, the point of Strided is to have some logic that makes it such that even though A might have strides that are not just the usual multidimensional ones, it tries to figure out a way to access the parent array "as linearly as possible".
If I understand this overload correctly, this is simply doing a mapping and making use of the index computations, so in that sense it is somewhat similar to the implementation for just a regular SubArray, I think?
However, I don't actually know if the same kinds of performance ideas about linear access are even valid on the GPU, so this might be completely reasonable?

Maybe we could schedule a meeting somewhere next week to discuss this, and the possible implications of deciding what type of objects to use for our strided views?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I think that makes sense. I think coming up with a good access pattern is probably even more important on the GPU but here I've done the naive thing just to get it started, and then we should definitely do some benchmarking/profiling to see if this is really slowing us down or not.

@kshyatt kshyatt merged commit 53c46bf into main Mar 6, 2026
15 of 19 checks passed
@lkdvos lkdvos deleted the ksh/bump branch March 6, 2026 14:34
function Base.fill!(A::StridedView{T, N, TA, F}, x) where {T, N, TA <: AbstractGPUArray{T}, F <: ALL_FS}
isempty(A) && return A
@kernel function fill_kernel!(a, val)
idx = @index(Global, Linear)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would using @index(Global, Cartesian) have been an option? Linear indexing is not particularly efficient for a StridedView?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we can do that instead, but it also affects the launch configuration for the kernel of course. I'll try that in a fix PR that hopefully will also fix TagBot...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants