fix: improvements to bypass scalar indexing and improve GPU support by kshyatt · Pull Request #375 · QuantumKitHub/TensorKit.jl

kshyatt · 2026-02-18T12:42:45Z

Needed to get more MPSKit examples working

codecov · 2026-02-26T13:01:03Z

Codecov Report

❌ Patch coverage is 66.66667% with 6 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
ext/TensorKitCUDAExt/cutensormap.jl	0.00%	3 Missing ⚠️
src/tensors/tensor.jl	0.00%	2 Missing ⚠️
src/tensors/abstracttensor.jl	85.71%	1 Missing ⚠️

Files with missing lines	Coverage Δ
ext/TensorKitCUDAExt/truncation.jl	`96.77% <100.00%> (ø)`
src/tensors/adjoint.jl	`89.65% <ø> (-0.35%)`	⬇️
src/tensors/braidingtensor.jl	`88.51% <100.00%> (+0.07%)`	⬆️
src/tensors/diagonal.jl	`90.23% <100.00%> (ø)`
src/tensors/indexmanipulations.jl	`72.50% <100.00%> (-1.13%)`	⬇️
src/tensors/tensoroperations.jl	`96.27% <100.00%> (-1.40%)`	⬇️
src/tensors/abstracttensor.jl	`56.34% <85.71%> (+1.24%)`	⬆️
src/tensors/tensor.jl	`83.23% <0.00%> (-0.58%)`	⬇️
ext/TensorKitCUDAExt/cutensormap.jl	`70.83% <0.00%> (-3.84%)`	⬇️

... and 1 file with indirect coverage changes

kshyatt · 2026-02-27T11:14:43Z

Let's make this a draft too to cut down on CI thrash

lkdvos

Left some comments throughout. There are some things that I am not entirely convinced by, but the rest looks great, thanks for working through all of this!

The similarstoragetype(tensor, storagetype) calls that you added seem like something we should probably discuss in a separate PR, and it would be great if we could consolidate this one to get the remainder of the fixes in.
Would you be up for splitting these two things, and then getting this merged?

The same more or less holds for some of the comments I made: if we can postpone the things that are not obvious but already get the other parts in, that would probably be helpful.

(Note that I am very much aware that none of this is your fault: this PR has lived for too long, so the design has shifted a bit, for which I do apologize!)

kshyatt · 2026-03-31T16:05:16Z

It's completely fine!! This has stayed open as I work through adding more tests for MPSKit, so I think we can pare off the simpler stuff we agree on, and then discuss things that are more contentious.

github-actions · 2026-03-31T17:55:16Z

Your PR no longer requires formatting changes. Thank you for your contribution!

lkdvos

It seems like the rebasing and the GitHub UI have made it hard to spot the comments I left before, although I think many of them are still unresolved and could be discussed :)

kshyatt · 2026-05-18T13:20:46Z

Think I addressed everything and cleaned up the diff a bit as well

kshyatt · 2026-05-19T09:08:07Z

The AMD test failure is unrelated. Does anyone have objections to this getting merged today?

lkdvos · 2026-05-19T11:19:17Z

+function MatrixAlgebraKit.findtruncated_svd(values::CuSectorVector, strategy::S) where {S <: MatrixAlgebraKit.TruncationStrategy}
+    # returning a CuSectorVector wrecks things in truncate_{co}domain
+    # because of scalar indexing
+    return CUDA.CUDACore.Adapt.adapt(Vector, MatrixAlgebraKit.findtruncated(values, strategy))
+end
+
+for strat in (:(MatrixAlgebraKit.TruncationByOrder), :(MatrixAlgebraKit.TruncationByError), :(MatrixAlgebraKit.TruncationIntersection), :(TensorKit.Factorizations.TruncationSpace))
+    @eval function MatrixAlgebraKit.findtruncated_svd(values::CuSectorVector, strategy::$strat)
+        # returning a CuSectorVector wrecks things in truncate_{co}domain
+        # because of scalar indexing
+        return CUDA.CUDACore.Adapt.adapt(Vector, MatrixAlgebraKit.findtruncated(values, strategy))
+    end
+end
+
+function MatrixAlgebraKit.findtruncated_svd(values::CuSectorVector, strategy::MatrixAlgebraKit.TruncationByValue)
+    atol = TensorKit.Factorizations.rtol_to_atol(values, strategy.p, strategy.atol, strategy.rtol)
+    strategy′ = trunctol(; atol, strategy.by, strategy.keep_below)
+    return SectorDict(c => CUDA.CUDACore.Adapt.adapt(Vector, MatrixAlgebraKit.findtruncated_svd(d, strategy′)) for (c, d) in pairs(values))
+end
+


Would it make sense to overload the truncate_domain!, truncate_codomain! and truncate_diagonal! functions instead?
The current approach looks quite prone to method ambiguities, and I guess we would also have to copy it for findtruncated if we want to support eigenvalue decompositions.

Suggested change

-function MatrixAlgebraKit.findtruncated_svd(values::CuSectorVector, strategy::S) where {S <: MatrixAlgebraKit.TruncationStrategy}
-    # returning a CuSectorVector wrecks things in truncate_{co}domain
-    # because of scalar indexing
-    return CUDA.CUDACore.Adapt.adapt(Vector, MatrixAlgebraKit.findtruncated(values, strategy))
-end
-
-for strat in (:(MatrixAlgebraKit.TruncationByOrder), :(MatrixAlgebraKit.TruncationByError), :(MatrixAlgebraKit.TruncationIntersection), :(TensorKit.Factorizations.TruncationSpace))
-    @eval function MatrixAlgebraKit.findtruncated_svd(values::CuSectorVector, strategy::$strat)
-        # returning a CuSectorVector wrecks things in truncate_{co}domain
-        # because of scalar indexing
-        return CUDA.CUDACore.Adapt.adapt(Vector, MatrixAlgebraKit.findtruncated(values, strategy))
-    end
-end
-
-function MatrixAlgebraKit.findtruncated_svd(values::CuSectorVector, strategy::MatrixAlgebraKit.TruncationByValue)
-    atol = TensorKit.Factorizations.rtol_to_atol(values, strategy.p, strategy.atol, strategy.rtol)
-    strategy′ = trunctol(; atol, strategy.by, strategy.keep_below)
-    return SectorDict(c => CUDA.CUDACore.Adapt.adapt(Vector, MatrixAlgebraKit.findtruncated_svd(d, strategy′)) for (c, d) in pairs(values))
-end
+function TensorKit.Factorizations.truncate_domain!(tdst::CuTensorMap, tsrc::CuTensorMap, inds)
+    for (c, b) in blocks(tdst)
+        I = get(inds, c, nothing)
+        @assert !isnothing(I)
+        I = CUDA.CUDACore.Adapt.adapt(Vector, I)
+        b′ = block(tsrc, c)
+        b .= view(b′, :, I)
+    end
+    return tdst
+end
+
+function TensorKit.Factorizations.truncate_codomain!(tdst::CuTensorMap, tsrc::CuTensorMap, inds)
+    for (c, b) in blocks(tdst)
+        I = get(inds, c, nothing)
+        @assert !isnothing(I)
+        I = CUDA.CUDACore.Adapt.adapt(Vector, I)
+        b′ = block(tsrc, c)
+        b .= view(b′, I, :)
+    end
+    return tdst
+end
+
+function TensorKit.Factorizations.truncate_diagonal!(Ddst::DiagonalCuTensorMap, Dsrc::DiagonalCuTensorMap, inds)
+    for (c, b) in blocks(Ddst)
+        I = get(inds, c, nothing)
+        @assert !isnothing(I)
+        I = CUDA.CUDACore.Adapt.adapt(Vector, I)
+        diagview(b) .= view(diagview(block(Dsrc, c)), I)
+    end
+    return Ddst
+end

(Warning, did not try to run this!)
Also, should this be adapt or collect?
Also, I added DiagonalCuTensorMap; I am not sure whether we have that type alias yet (or whether this overload is even required for those tensors).

I tried overriding those, but I was getting compilation errors from GPUCompiler. It was enough of a rabbit hole that I thought it made more sense to punt on this for now.
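
For readers following the exchange above, here is a minimal CPU-only sketch of the per-sector pattern the suggested truncate_domain! overload relies on: fetch the kept indices for a sector, materialize them as a host Vector, and copy the selected columns in one broadcast. It uses only Base Julia. The names truncate_columns!, blocks_dst, blocks_src and the Dict-of-matrices layout are hypothetical stand-ins, not TensorKit or CUDA.jl API, and collect is used here only because the arrays are ordinary host arrays (the adapt-versus-collect question above is left open). It does not reproduce the GPUCompiler errors mentioned.

```julia
# Hypothetical stand-in for the suggested overload: plain Dicts of matrices
# keyed by a sector label replace the tensor blocks. Not TensorKit API.
function truncate_columns!(blocks_dst::Dict, blocks_src::Dict, inds::Dict)
    for (c, b) in blocks_dst
        I = get(inds, c, nothing)
        I === nothing && error("no kept indices for sector $c")
        # Materialize the indices as a plain host Vector so that the selection
        # below is a single vectorized copy per block.
        I = collect(I)
        b .= view(blocks_src[c], :, I)
    end
    return blocks_dst
end

# Two illustrative sectors, keeping columns [1, 3] of :a and column 2 of :b.
src = Dict(:a => [1.0 2.0 3.0; 4.0 5.0 6.0], :b => [7.0 8.0; 9.0 10.0])
inds = Dict(:a => [1, 3], :b => 2:2)
dst = Dict(c => zeros(size(src[c], 1), length(inds[c])) for c in keys(src))

truncate_columns!(dst, src, inds)
```

The codomain variant would differ only in selecting rows, view(blocks_src[c], I, :), as in the suggestion.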

Co-authored-by: Lukas Devos <ldevos98@gmail.com>

lkdvos

I think overall the current changes look good to me. The only remaining part is the one on the factorizations, but if this unblocks things, we can always revisit it if it shows up in profilers.

kshyatt · 2026-05-19T12:31:46Z

Sorry, by factorizations do you mean truncate_{co}domain?

kshyatt force-pushed the ksh/cuda_tweaks branch from 3bed38d to 8665c4a Compare February 18, 2026 13:35

lkdvos reviewed Feb 18, 2026

kshyatt mentioned this pull request Feb 20, 2026

Add a disamgiguating conversion QuantumKitHub/BlockTensorKit.jl#47

Merged

kshyatt force-pushed the ksh/cuda_tweaks branch from eabfce9 to 0c903ac Compare February 25, 2026 15:47

kshyatt marked this pull request as draft February 27, 2026 11:14

kshyatt force-pushed the ksh/cuda_tweaks branch 2 times, most recently from f5857b3 to 32e182d Compare March 12, 2026 12:36

kshyatt force-pushed the ksh/cuda_tweaks branch 2 times, most recently from f5faaf6 to 2359d28 Compare March 23, 2026 14:24

lkdvos mentioned this pull request Mar 26, 2026

MAK v0.6.5 updates #390

Merged

lkdvos reviewed Mar 31, 2026

kshyatt force-pushed the ksh/cuda_tweaks branch from d0afb2d to 8a12178 Compare April 8, 2026 06:55

kshyatt force-pushed the ksh/cuda_tweaks branch from 8a12178 to ad62dad Compare April 22, 2026 10:17

kshyatt mentioned this pull request Apr 23, 2026

Some more small changes for GPU support QuantumKitHub/BlockTensorKit.jl#48

Merged

kshyatt force-pushed the ksh/cuda_tweaks branch 3 times, most recently from d29251a to 3c5a575 Compare April 27, 2026 10:21