Split VORTEXM4 from VORTEX target and fix SGEMM_DIRECT support for SME-capable targets #5423
base: develop
… reserved on MacOS
…ang compatibility
…t_performant for ARM64
#define C6 x22 //Constant6: N*SVLs
#define C2 x19 //Constant2: N + SVLs
#define C3 x20 //Constant3: K*SVLs + SVLs
#define C4 x21 //Constant4: SVLs-2
Changing x20 to x21 requires the following dependent changes:
At line 65: sub w21, w21, #2
At line 202: cmp w13, w21
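To illustrate why the register change matters, here is a minimal C sketch, not the kernel code itself. It assumes that before the change C4 shared x20 with C3, which is an inference from the review comment rather than something shown in the diff. The `regs` array, the `load_constants` and `c3_survives` helpers, and the order in which the constants are written are all hypothetical.

```c
#include <stdint.h>

/* Hypothetical model: regs[i] stands in for AArch64 register x<i>. */
static uint64_t regs[32];

/* Write C3 (K*SVLs + SVLs) and C4 (SVLs - 2) into the given register indices.
   The write order is an assumption made for this sketch. */
static void load_constants(int c3_reg, int c4_reg, uint64_t k, uint64_t svls) {
    regs[c3_reg] = k * svls + svls;  /* C3 */
    regs[c4_reg] = svls;             /* C4 starts out as SVLs ... */
    regs[c4_reg] -= 2;               /* ... then models "sub wN, wN, #2" */
}

/* Returns 1 if C3 still holds K*SVLs + SVLs after both constants are set. */
int c3_survives(int c3_reg, int c4_reg, uint64_t k, uint64_t svls) {
    load_constants(c3_reg, c4_reg, k, svls);
    return regs[c3_reg] == k * svls + svls;
}
```

In this model, `c3_survives(20, 20, 8, 16)` returns 0 because writing C4 overwrites C3, while `c3_survives(20, 21, 8, 16)` returns 1. The same reasoning explains why instructions that name the register directly (such as the `sub` and `cmp` above) have to be updated together with the `#define`.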
Oops, sorry, I had already corrected this locally but pushed the wrong version. Unfortunately, this correction has no effect on the wrong xscblat3 test results seen for odd M. Contrary to my expectations, this PR also does not fix the divergence between SGEMM and SGEMMT seen in test_sgemmt of utest/openblas_utest_ext that was flagged in #5414.
…t support it w/out SVE)
eventually fixes #5414