
Conversation

@kshyatt (Member) commented Nov 17, 2025

Base.require_one_based_indexing returns a Bool on some versions, not n (integer).

Converting to the full CuMatrix/ROCMatrix is inefficient and ugly, but dispatching on a SubArray of a Diagonal of a CuVector (or ROCVector) is also really ugly. I'm not sure what the best approach is here, but this works for now (we can revisit if it becomes a real problem).

@kshyatt kshyatt requested a review from lkdvos November 17, 2025 18:31
@lkdvos (Member) left a comment


Is it only the Diagonal giving issues?
We could also just add a specialization for Diagonal in general, which we should be able to handle in a GPU-friendly way:

function project_hermitian_native!(A::Diagonal, B::Diagonal, ::Val{anti}) where {anti}
    if anti
        diagview(A) .= imag.(diagview(B)) .* im
    else
        diagview(A) .= real.(diagview(B))
    end
    return A
end

[edit] I think we have a utility function somewhere for the imaginary part, I can't remember the name though

@kshyatt (Member Author) commented Nov 17, 2025

That's also fine. I wrote this in the post-lunch carb coma, so there's probably a better way.

@lkdvos (Member) commented Nov 17, 2025

I guess the diagonal specialization is something we want anyways so that might make sense?

codecov bot commented Nov 17, 2025

Codecov Report

❌ Patch coverage is 96.77419% with 1 line in your changes missing coverage. Please review.

File with missing lines          | Patch % | Lines
src/common/matrixproperties.jl   | 92.85%  | 1 Missing ⚠️

File                                                  | Coverage <Patch>  | Δ
...ixAlgebraKitAMDGPUExt/MatrixAlgebraKitAMDGPUExt.jl | 87.30% <100.00%>  | (+5.21%) ⬆️
...MatrixAlgebraKitCUDAExt/MatrixAlgebraKitCUDAExt.jl | 61.81% <100.00%>  | (+4.19%) ⬆️
src/MatrixAlgebraKit.jl                               | 100.00% <ø>       | (ø)
src/implementations/projections.jl                    | 96.42% <100.00%>  | (+0.32%) ⬆️
src/common/matrixproperties.jl                        | 87.85% <92.85%>   | (-1.85%) ⬇️


return USVᴴ
end
svd_full!(A::Diagonal, USVᴴ, alg::GPU_SVDAlgorithm) = svd_full!(diagm(A.diag), USVᴴ, alg)
Review comment (Member):

Is this actually being used? This is to cover the case where somehow a Diagonal ends up with a GPU_SVDAlgorithm?

Review comment (Member Author):

Yes

Review comment (Member Author):

We currently don't have a DiagonalGPUAlgorithm, which is why this happens, I suppose

Review comment (Member):

I think this might also be a problem with the specific testing setup: we are manually picking the GPUAlgorithm and then calling this on a Diagonal, while the automated algorithm selection would have picked DiagonalAlgorithm instead.
I agree that we should make sure that the diagonal gpu arrays work, but I don't think we have to have specializations to make sure the GPU_SVDAlgorithm implementations work for Diagonal.
TL;DR: instead of these specializations, I think I would rather just remove the tests, since this really is supposed to error?

Comment on lines 135 to 136
MatrixAlgebraKit.ishermitian_approx(A::StridedROCMatrix; kwargs...) =
@invoke MatrixAlgebraKit.ishermitian_approx(A::Any; kwargs...)
Review comment (Member):

Is this kind of definition useful? What happens if it is removed?

Review comment (Member Author):

cc @lkdvos who added this

Review comment (Member):

This is to avoid the StridedMatrix implementation, which does scalar indexing, and instead use norm(project_hermitian(A)) <= ..., as that is the fallback definition for Any.
I updated this and simply copied over that code now; in hindsight, the benefit of not repeating code probably doesn't outweigh the loss in clarity.
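The norm-based fallback described above can be sketched as follows. This is illustrative only (the function name, tolerance choice, and signature are assumptions, not MatrixAlgebraKit's actual API); the point is that every operation is a whole-array operation, so no scalar indexing is needed on GPU arrays:

```julia
using LinearAlgebra

# Hedged sketch of a norm-based approximate hermiticity check.
# Instead of comparing entries pairwise (scalar indexing), measure the
# norm of the anti-hermitian part of A; it is zero iff A is hermitian.
function ishermitian_approx_sketch(A::AbstractMatrix;
                                   rtol = sqrt(eps(real(float(eltype(A))))))
    nskew = norm((A - A') / 2)          # anti-hermitian residual
    return nskew <= rtol * max(norm(A), one(norm(A)))
end
```

Since `-`, `adjoint`, and `norm` all have GPU-friendly implementations, this definition works unchanged for CuArray/ROCArray inputs.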

@kshyatt (Member Author) commented Nov 26, 2025 via email

@kshyatt (Member Author) commented Dec 1, 2025

SVD now has tests, but I rather dislike the approach here, and I think I will implement GPUDiagonalAlgorithm for CUDA and ROCm in a separate PR. This could however be merged as a stop-gap to unblock the TensorKit PR, with another small update PR made to TensorKit if needed to address GPUDiagonalAlgorithm. Does that sound reasonable?

github-actions bot commented Dec 1, 2025

Your PR no longer requires formatting changes. Thank you for your contribution!

@lkdvos (Member) commented Dec 1, 2025

Apologies for the PR hijack here, I hope everything now turns green, and would be happy to merge.

  • I migrated the ishermitian implementations for Diagonal from the GPU extensions to the main package, and implemented them in a GPU-friendly manner there.
  • I also reverted the specific diagm casts for the GPUAlgorithm implementations, which I'm happy to re-revert if need be. My main idea here is that we shouldn't ever end up calling svd_compact!(A::GPUDiagonal, alg::GPU_SVDAlgorithm); algorithm selection should have picked svd_compact!(A::GPUDiagonal, alg::DiagonalAlgorithm) instead.

@kshyatt (Member Author) commented Dec 1, 2025

I think the followup PR #106 can handle the SVD case, so I'm happy to turn on automerge if you are :)

@lkdvos (Member) commented Dec 2, 2025

As a note for a potential (follow-up) improvement: I think the polar decompositions are now defaulting to PolarViaSvd(DiagonalAlgorithm()) for Diagonal inputs, which means the output has to be non-diagonal because the svd sorts the singular values.
Am I missing something, or can we bypass that and directly return diagonal W and P arrays, similar to what a positive QR would return?
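The bypass suggested above can be sketched directly: for D = Diagonal(d), the polar factors can be read off entrywise, with W holding the phases and P the absolute values, so both outputs stay Diagonal and no SVD (with its singular-value sorting) is involved. The function name is hypothetical, not an existing MatrixAlgebraKit API:

```julia
using LinearAlgebra

# Hypothetical sketch: polar decomposition of a Diagonal matrix, entrywise.
# D == W * P with W unitary (diagonal phases) and P positive semidefinite.
function polar_diagonal_sketch(D::Diagonal)
    phase(x) = iszero(x) ? one(x) : x / abs(x)  # map zeros to 1 so W stays unitary
    W = Diagonal(phase.(D.diag))
    P = Diagonal(abs.(D.diag))
    return W, P
end
```

Both factors remain Diagonal (and the broadcasts are GPU-friendly), analogous to what a positive QR would return.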

@kshyatt kshyatt enabled auto-merge (squash) December 2, 2025 08:38
@kshyatt kshyatt merged commit ba2c9ef into main Dec 2, 2025
10 checks passed
@kshyatt kshyatt deleted the ksh/tk2 branch December 2, 2025 10:01