[VectorCombine][TTI] Prevent extract/ins rewrite to GEP #150216
@llvm/pr-subscribers-llvm-analysis @llvm/pr-subscribers-llvm-transforms

Author: Nathan Gauër (Keenuts)

Changes

Using GEP to index into a vector is not disallowed, but it is not recommended. Preventing this optimization from rewriting extract/insert instructions into a GEP helps us lower more code to SPIR-V. This change should be OK as it is only active when targeting SPIR-V and disables a non-recommended transformation.

Related to #145002

Patch is 33.46 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/150216.diff

2 Files Affected:
diff --git a/llvm/lib/Transforms/Vectorize/VectorCombine.cpp b/llvm/lib/Transforms/Vectorize/VectorCombine.cpp
index 82adc34fdbd84..20b8165ff280a 100644
--- a/llvm/lib/Transforms/Vectorize/VectorCombine.cpp
+++ b/llvm/lib/Transforms/Vectorize/VectorCombine.cpp
@@ -3750,8 +3750,9 @@ bool VectorCombine::run() {
LLVM_DEBUG(dbgs() << "\n\nVECTORCOMBINE on " << F.getName() << "\n");
+ const bool isSPIRV = F.getParent()->getTargetTriple().isSPIRV();
bool MadeChange = false;
- auto FoldInst = [this, &MadeChange](Instruction &I) {
+ auto FoldInst = [this, &MadeChange, isSPIRV](Instruction &I) {
Builder.SetInsertPoint(&I);
bool IsVectorType = isa<VectorType>(I.getType());
bool IsFixedVectorType = isa<FixedVectorType>(I.getType());
@@ -3780,13 +3781,15 @@ bool VectorCombine::run() {
// TODO: Identify and allow other scalable transforms
if (IsVectorType) {
MadeChange |= scalarizeOpOrCmp(I);
- MadeChange |= scalarizeLoadExtract(I);
- MadeChange |= scalarizeExtExtract(I);
+ if (!isSPIRV) {
+ MadeChange |= scalarizeLoadExtract(I);
+ MadeChange |= scalarizeExtExtract(I);
+ }
MadeChange |= scalarizeVPIntrinsic(I);
MadeChange |= foldInterleaveIntrinsics(I);
}
- if (Opcode == Instruction::Store)
+ if (Opcode == Instruction::Store && !isSPIRV)
MadeChange |= foldSingleElementStore(I);
// If this is an early pipeline invocation of this pass, we are done.
diff --git a/llvm/test/Transforms/VectorCombine/load-insert-store.ll b/llvm/test/Transforms/VectorCombine/load-insert-store.ll
index 93565c1a708eb..0181ec76088bd 100644
--- a/llvm/test/Transforms/VectorCombine/load-insert-store.ll
+++ b/llvm/test/Transforms/VectorCombine/load-insert-store.ll
@@ -1,6 +1,7 @@
; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
; RUN: opt -S -passes=vector-combine -data-layout=e < %s | FileCheck %s
; RUN: opt -S -passes=vector-combine -data-layout=E < %s | FileCheck %s
+; RUN: opt -S -passes=vector-combine -data-layout=E -mtriple=spirv-unknown-vulkan1.3-library %s | FileCheck %s --check-prefix=SPIRV
define void @insert_store(ptr %q, i8 zeroext %s) {
; CHECK-LABEL: @insert_store(
@@ -9,6 +10,13 @@ define void @insert_store(ptr %q, i8 zeroext %s) {
; CHECK-NEXT: store i8 [[S:%.*]], ptr [[TMP0]], align 1
; CHECK-NEXT: ret void
;
+; SPIRV-LABEL: @insert_store(
+; SPIRV-NEXT: entry:
+; SPIRV-NEXT: [[TMP0:%.*]] = load <16 x i8>, ptr [[Q:%.*]], align 16
+; SPIRV-NEXT: [[VECINS:%.*]] = insertelement <16 x i8> [[TMP0]], i8 [[S:%.*]], i32 3
+; SPIRV-NEXT: store <16 x i8> [[VECINS]], ptr [[Q]], align 16
+; SPIRV-NEXT: ret void
+;
entry:
%0 = load <16 x i8>, ptr %q
%vecins = insertelement <16 x i8> %0, i8 %s, i32 3
@@ -23,6 +31,13 @@ define void @insert_store_i16_align1(ptr %q, i16 zeroext %s) {
; CHECK-NEXT: store i16 [[S:%.*]], ptr [[TMP0]], align 2
; CHECK-NEXT: ret void
;
+; SPIRV-LABEL: @insert_store_i16_align1(
+; SPIRV-NEXT: entry:
+; SPIRV-NEXT: [[TMP0:%.*]] = load <8 x i16>, ptr [[Q:%.*]], align 16
+; SPIRV-NEXT: [[VECINS:%.*]] = insertelement <8 x i16> [[TMP0]], i16 [[S:%.*]], i32 3
+; SPIRV-NEXT: store <8 x i16> [[VECINS]], ptr [[Q]], align 1
+; SPIRV-NEXT: ret void
+;
entry:
%0 = load <8 x i16>, ptr %q
%vecins = insertelement <8 x i16> %0, i16 %s, i32 3
@@ -39,6 +54,13 @@ define void @insert_store_outofbounds(ptr %q, i16 zeroext %s) {
; CHECK-NEXT: store <8 x i16> [[VECINS]], ptr [[Q]], align 16
; CHECK-NEXT: ret void
;
+; SPIRV-LABEL: @insert_store_outofbounds(
+; SPIRV-NEXT: entry:
+; SPIRV-NEXT: [[TMP0:%.*]] = load <8 x i16>, ptr [[Q:%.*]], align 16
+; SPIRV-NEXT: [[VECINS:%.*]] = insertelement <8 x i16> [[TMP0]], i16 [[S:%.*]], i32 9
+; SPIRV-NEXT: store <8 x i16> [[VECINS]], ptr [[Q]], align 16
+; SPIRV-NEXT: ret void
+;
entry:
%0 = load <8 x i16>, ptr %q
%vecins = insertelement <8 x i16> %0, i16 %s, i32 9
@@ -53,6 +75,13 @@ define void @insert_store_vscale(ptr %q, i16 zeroext %s) {
; CHECK-NEXT: store i16 [[S:%.*]], ptr [[TMP0]], align 2
; CHECK-NEXT: ret void
;
+; SPIRV-LABEL: @insert_store_vscale(
+; SPIRV-NEXT: entry:
+; SPIRV-NEXT: [[TMP0:%.*]] = load <vscale x 8 x i16>, ptr [[Q:%.*]], align 16
+; SPIRV-NEXT: [[VECINS:%.*]] = insertelement <vscale x 8 x i16> [[TMP0]], i16 [[S:%.*]], i32 3
+; SPIRV-NEXT: store <vscale x 8 x i16> [[VECINS]], ptr [[Q]], align 16
+; SPIRV-NEXT: ret void
+;
entry:
%0 = load <vscale x 8 x i16>, ptr %q
%vecins = insertelement <vscale x 8 x i16> %0, i16 %s, i32 3
@@ -70,6 +99,13 @@ define void @insert_store_vscale_exceeds(ptr %q, i16 zeroext %s) {
; CHECK-NEXT: store <vscale x 8 x i16> [[VECINS]], ptr [[Q]], align 16
; CHECK-NEXT: ret void
;
+; SPIRV-LABEL: @insert_store_vscale_exceeds(
+; SPIRV-NEXT: entry:
+; SPIRV-NEXT: [[TMP0:%.*]] = load <vscale x 8 x i16>, ptr [[Q:%.*]], align 16
+; SPIRV-NEXT: [[VECINS:%.*]] = insertelement <vscale x 8 x i16> [[TMP0]], i16 [[S:%.*]], i32 9
+; SPIRV-NEXT: store <vscale x 8 x i16> [[VECINS]], ptr [[Q]], align 16
+; SPIRV-NEXT: ret void
+;
entry:
%0 = load <vscale x 8 x i16>, ptr %q
%vecins = insertelement <vscale x 8 x i16> %0, i16 %s, i32 9
@@ -85,6 +121,13 @@ define void @insert_store_v9i4(ptr %q, i4 zeroext %s) {
; CHECK-NEXT: store <9 x i4> [[VECINS]], ptr [[Q]], align 1
; CHECK-NEXT: ret void
;
+; SPIRV-LABEL: @insert_store_v9i4(
+; SPIRV-NEXT: entry:
+; SPIRV-NEXT: [[TMP0:%.*]] = load <9 x i4>, ptr [[Q:%.*]], align 8
+; SPIRV-NEXT: [[VECINS:%.*]] = insertelement <9 x i4> [[TMP0]], i4 [[S:%.*]], i32 3
+; SPIRV-NEXT: store <9 x i4> [[VECINS]], ptr [[Q]], align 1
+; SPIRV-NEXT: ret void
+;
entry:
%0 = load <9 x i4>, ptr %q
%vecins = insertelement <9 x i4> %0, i4 %s, i32 3
@@ -100,6 +143,13 @@ define void @insert_store_v4i27(ptr %q, i27 zeroext %s) {
; CHECK-NEXT: store <4 x i27> [[VECINS]], ptr [[Q]], align 1
; CHECK-NEXT: ret void
;
+; SPIRV-LABEL: @insert_store_v4i27(
+; SPIRV-NEXT: entry:
+; SPIRV-NEXT: [[TMP0:%.*]] = load <4 x i27>, ptr [[Q:%.*]], align 16
+; SPIRV-NEXT: [[VECINS:%.*]] = insertelement <4 x i27> [[TMP0]], i27 [[S:%.*]], i32 3
+; SPIRV-NEXT: store <4 x i27> [[VECINS]], ptr [[Q]], align 1
+; SPIRV-NEXT: ret void
+;
entry:
%0 = load <4 x i27>, ptr %q
%vecins = insertelement <4 x i27> %0, i27 %s, i32 3
@@ -113,6 +163,12 @@ define void @insert_store_v32i1(ptr %p) {
; CHECK-NEXT: [[INS:%.*]] = insertelement <32 x i1> [[VEC]], i1 true, i64 0
; CHECK-NEXT: store <32 x i1> [[INS]], ptr [[P]], align 4
; CHECK-NEXT: ret void
+;
+; SPIRV-LABEL: @insert_store_v32i1(
+; SPIRV-NEXT: [[VEC:%.*]] = load <32 x i1>, ptr [[P:%.*]], align 4
+; SPIRV-NEXT: [[INS:%.*]] = insertelement <32 x i1> [[VEC]], i1 true, i64 0
+; SPIRV-NEXT: store <32 x i1> [[INS]], ptr [[P]], align 4
+; SPIRV-NEXT: ret void
;
%vec = load <32 x i1>, ptr %p
%ins = insertelement <32 x i1> %vec, i1 true, i64 0
@@ -130,6 +186,15 @@ define void @insert_store_blk_differ(ptr %q, i16 zeroext %s) {
; CHECK-NEXT: store <8 x i16> [[VECINS]], ptr [[Q]], align 16
; CHECK-NEXT: ret void
;
+; SPIRV-LABEL: @insert_store_blk_differ(
+; SPIRV-NEXT: entry:
+; SPIRV-NEXT: [[TMP0:%.*]] = load <8 x i16>, ptr [[Q:%.*]], align 16
+; SPIRV-NEXT: br label [[CONT:%.*]]
+; SPIRV: cont:
+; SPIRV-NEXT: [[VECINS:%.*]] = insertelement <8 x i16> [[TMP0]], i16 [[S:%.*]], i32 3
+; SPIRV-NEXT: store <8 x i16> [[VECINS]], ptr [[Q]], align 16
+; SPIRV-NEXT: ret void
+;
entry:
%0 = load <8 x i16>, ptr %q
br label %cont
@@ -147,6 +212,13 @@ define void @insert_store_nonconst(ptr %q, i8 zeroext %s, i32 %idx) {
; CHECK-NEXT: store <16 x i8> [[VECINS]], ptr [[Q]], align 16
; CHECK-NEXT: ret void
;
+; SPIRV-LABEL: @insert_store_nonconst(
+; SPIRV-NEXT: entry:
+; SPIRV-NEXT: [[TMP0:%.*]] = load <16 x i8>, ptr [[Q:%.*]], align 16
+; SPIRV-NEXT: [[VECINS:%.*]] = insertelement <16 x i8> [[TMP0]], i8 [[S:%.*]], i32 [[IDX:%.*]]
+; SPIRV-NEXT: store <16 x i8> [[VECINS]], ptr [[Q]], align 16
+; SPIRV-NEXT: ret void
+;
entry:
%0 = load <16 x i8>, ptr %q
%vecins = insertelement <16 x i8> %0, i8 %s, i32 %idx
@@ -164,6 +236,13 @@ define void @insert_store_vscale_nonconst(ptr %q, i8 zeroext %s, i32 %idx) {
; CHECK-NEXT: store <vscale x 16 x i8> [[VECINS]], ptr [[Q]], align 16
; CHECK-NEXT: ret void
;
+; SPIRV-LABEL: @insert_store_vscale_nonconst(
+; SPIRV-NEXT: entry:
+; SPIRV-NEXT: [[TMP0:%.*]] = load <vscale x 16 x i8>, ptr [[Q:%.*]], align 16
+; SPIRV-NEXT: [[VECINS:%.*]] = insertelement <vscale x 16 x i8> [[TMP0]], i8 [[S:%.*]], i32 [[IDX:%.*]]
+; SPIRV-NEXT: store <vscale x 16 x i8> [[VECINS]], ptr [[Q]], align 16
+; SPIRV-NEXT: ret void
+;
entry:
%0 = load <vscale x 16 x i8>, ptr %q
%vecins = insertelement <vscale x 16 x i8> %0, i8 %s, i32 %idx
@@ -181,6 +260,15 @@ define void @insert_store_nonconst_large_alignment(ptr %q, i32 zeroext %s, i32 %
; CHECK-NEXT: store i32 [[S:%.*]], ptr [[TMP0]], align 4
; CHECK-NEXT: ret void
;
+; SPIRV-LABEL: @insert_store_nonconst_large_alignment(
+; SPIRV-NEXT: entry:
+; SPIRV-NEXT: [[CMP:%.*]] = icmp ult i32 [[IDX:%.*]], 4
+; SPIRV-NEXT: call void @llvm.assume(i1 [[CMP]])
+; SPIRV-NEXT: [[I:%.*]] = load <4 x i32>, ptr [[Q:%.*]], align 128
+; SPIRV-NEXT: [[VECINS:%.*]] = insertelement <4 x i32> [[I]], i32 [[S:%.*]], i32 [[IDX]]
+; SPIRV-NEXT: store <4 x i32> [[VECINS]], ptr [[Q]], align 128
+; SPIRV-NEXT: ret void
+;
entry:
%cmp = icmp ult i32 %idx, 4
call void @llvm.assume(i1 %cmp)
@@ -197,6 +285,14 @@ define void @insert_store_nonconst_align_maximum_8(ptr %q, i64 %s, i32 %idx) {
; CHECK-NEXT: [[TMP1:%.*]] = getelementptr inbounds <8 x i64>, ptr [[Q:%.*]], i32 0, i32 [[IDX]]
; CHECK-NEXT: store i64 [[S:%.*]], ptr [[TMP1]], align 8
; CHECK-NEXT: ret void
+;
+; SPIRV-LABEL: @insert_store_nonconst_align_maximum_8(
+; SPIRV-NEXT: [[CMP:%.*]] = icmp ult i32 [[IDX:%.*]], 2
+; SPIRV-NEXT: call void @llvm.assume(i1 [[CMP]])
+; SPIRV-NEXT: [[I:%.*]] = load <8 x i64>, ptr [[Q:%.*]], align 8
+; SPIRV-NEXT: [[VECINS:%.*]] = insertelement <8 x i64> [[I]], i64 [[S:%.*]], i32 [[IDX]]
+; SPIRV-NEXT: store <8 x i64> [[VECINS]], ptr [[Q]], align 8
+; SPIRV-NEXT: ret void
;
%cmp = icmp ult i32 %idx, 2
call void @llvm.assume(i1 %cmp)
@@ -213,6 +309,14 @@ define void @insert_store_nonconst_align_maximum_4(ptr %q, i64 %s, i32 %idx) {
; CHECK-NEXT: [[TMP1:%.*]] = getelementptr inbounds <8 x i64>, ptr [[Q:%.*]], i32 0, i32 [[IDX]]
; CHECK-NEXT: store i64 [[S:%.*]], ptr [[TMP1]], align 4
; CHECK-NEXT: ret void
+;
+; SPIRV-LABEL: @insert_store_nonconst_align_maximum_4(
+; SPIRV-NEXT: [[CMP:%.*]] = icmp ult i32 [[IDX:%.*]], 2
+; SPIRV-NEXT: call void @llvm.assume(i1 [[CMP]])
+; SPIRV-NEXT: [[I:%.*]] = load <8 x i64>, ptr [[Q:%.*]], align 4
+; SPIRV-NEXT: [[VECINS:%.*]] = insertelement <8 x i64> [[I]], i64 [[S:%.*]], i32 [[IDX]]
+; SPIRV-NEXT: store <8 x i64> [[VECINS]], ptr [[Q]], align 4
+; SPIRV-NEXT: ret void
;
%cmp = icmp ult i32 %idx, 2
call void @llvm.assume(i1 %cmp)
@@ -229,6 +333,14 @@ define void @insert_store_nonconst_align_larger(ptr %q, i64 %s, i32 %idx) {
; CHECK-NEXT: [[TMP1:%.*]] = getelementptr inbounds <8 x i64>, ptr [[Q:%.*]], i32 0, i32 [[IDX]]
; CHECK-NEXT: store i64 [[S:%.*]], ptr [[TMP1]], align 4
; CHECK-NEXT: ret void
+;
+; SPIRV-LABEL: @insert_store_nonconst_align_larger(
+; SPIRV-NEXT: [[CMP:%.*]] = icmp ult i32 [[IDX:%.*]], 2
+; SPIRV-NEXT: call void @llvm.assume(i1 [[CMP]])
+; SPIRV-NEXT: [[I:%.*]] = load <8 x i64>, ptr [[Q:%.*]], align 4
+; SPIRV-NEXT: [[VECINS:%.*]] = insertelement <8 x i64> [[I]], i64 [[S:%.*]], i32 [[IDX]]
+; SPIRV-NEXT: store <8 x i64> [[VECINS]], ptr [[Q]], align 2
+; SPIRV-NEXT: ret void
;
%cmp = icmp ult i32 %idx, 2
call void @llvm.assume(i1 %cmp)
@@ -247,6 +359,15 @@ define void @insert_store_nonconst_index_known_valid_by_assume(ptr %q, i8 zeroex
; CHECK-NEXT: store i8 [[S:%.*]], ptr [[TMP0]], align 1
; CHECK-NEXT: ret void
;
+; SPIRV-LABEL: @insert_store_nonconst_index_known_valid_by_assume(
+; SPIRV-NEXT: entry:
+; SPIRV-NEXT: [[CMP:%.*]] = icmp ult i32 [[IDX:%.*]], 4
+; SPIRV-NEXT: call void @llvm.assume(i1 [[CMP]])
+; SPIRV-NEXT: [[TMP0:%.*]] = load <16 x i8>, ptr [[Q:%.*]], align 16
+; SPIRV-NEXT: [[VECINS:%.*]] = insertelement <16 x i8> [[TMP0]], i8 [[S:%.*]], i32 [[IDX]]
+; SPIRV-NEXT: store <16 x i8> [[VECINS]], ptr [[Q]], align 16
+; SPIRV-NEXT: ret void
+;
entry:
%cmp = icmp ult i32 %idx, 4
call void @llvm.assume(i1 %cmp)
@@ -267,6 +388,15 @@ define void @insert_store_vscale_nonconst_index_known_valid_by_assume(ptr %q, i8
; CHECK-NEXT: store i8 [[S:%.*]], ptr [[TMP0]], align 1
; CHECK-NEXT: ret void
;
+; SPIRV-LABEL: @insert_store_vscale_nonconst_index_known_valid_by_assume(
+; SPIRV-NEXT: entry:
+; SPIRV-NEXT: [[CMP:%.*]] = icmp ult i32 [[IDX:%.*]], 4
+; SPIRV-NEXT: call void @llvm.assume(i1 [[CMP]])
+; SPIRV-NEXT: [[TMP0:%.*]] = load <vscale x 16 x i8>, ptr [[Q:%.*]], align 16
+; SPIRV-NEXT: [[VECINS:%.*]] = insertelement <vscale x 16 x i8> [[TMP0]], i8 [[S:%.*]], i32 [[IDX]]
+; SPIRV-NEXT: store <vscale x 16 x i8> [[VECINS]], ptr [[Q]], align 16
+; SPIRV-NEXT: ret void
+;
entry:
%cmp = icmp ult i32 %idx, 4
call void @llvm.assume(i1 %cmp)
@@ -289,6 +419,16 @@ define void @insert_store_nonconst_index_not_known_valid_by_assume_after_load(pt
; CHECK-NEXT: store <16 x i8> [[VECINS]], ptr [[Q]], align 16
; CHECK-NEXT: ret void
;
+; SPIRV-LABEL: @insert_store_nonconst_index_not_known_valid_by_assume_after_load(
+; SPIRV-NEXT: entry:
+; SPIRV-NEXT: [[CMP:%.*]] = icmp ult i32 [[IDX:%.*]], 4
+; SPIRV-NEXT: [[TMP0:%.*]] = load <16 x i8>, ptr [[Q:%.*]], align 16
+; SPIRV-NEXT: call void @maythrow()
+; SPIRV-NEXT: call void @llvm.assume(i1 [[CMP]])
+; SPIRV-NEXT: [[VECINS:%.*]] = insertelement <16 x i8> [[TMP0]], i8 [[S:%.*]], i32 [[IDX]]
+; SPIRV-NEXT: store <16 x i8> [[VECINS]], ptr [[Q]], align 16
+; SPIRV-NEXT: ret void
+;
entry:
%cmp = icmp ult i32 %idx, 4
%0 = load <16 x i8>, ptr %q
@@ -309,6 +449,15 @@ define void @insert_store_nonconst_index_not_known_valid_by_assume(ptr %q, i8 ze
; CHECK-NEXT: store <16 x i8> [[VECINS]], ptr [[Q]], align 16
; CHECK-NEXT: ret void
;
+; SPIRV-LABEL: @insert_store_nonconst_index_not_known_valid_by_assume(
+; SPIRV-NEXT: entry:
+; SPIRV-NEXT: [[CMP:%.*]] = icmp ult i32 [[IDX:%.*]], 17
+; SPIRV-NEXT: call void @llvm.assume(i1 [[CMP]])
+; SPIRV-NEXT: [[TMP0:%.*]] = load <16 x i8>, ptr [[Q:%.*]], align 16
+; SPIRV-NEXT: [[VECINS:%.*]] = insertelement <16 x i8> [[TMP0]], i8 [[S:%.*]], i32 [[IDX]]
+; SPIRV-NEXT: store <16 x i8> [[VECINS]], ptr [[Q]], align 16
+; SPIRV-NEXT: ret void
+;
entry:
%cmp = icmp ult i32 %idx, 17
call void @llvm.assume(i1 %cmp)
@@ -330,6 +479,15 @@ define void @insert_store_vscale_nonconst_index_not_known_valid_by_assume(ptr %q
; CHECK-NEXT: store <vscale x 16 x i8> [[VECINS]], ptr [[Q]], align 16
; CHECK-NEXT: ret void
;
+; SPIRV-LABEL: @insert_store_vscale_nonconst_index_not_known_valid_by_assume(
+; SPIRV-NEXT: entry:
+; SPIRV-NEXT: [[CMP:%.*]] = icmp ult i32 [[IDX:%.*]], 17
+; SPIRV-NEXT: call void @llvm.assume(i1 [[CMP]])
+; SPIRV-NEXT: [[TMP0:%.*]] = load <vscale x 16 x i8>, ptr [[Q:%.*]], align 16
+; SPIRV-NEXT: [[VECINS:%.*]] = insertelement <vscale x 16 x i8> [[TMP0]], i8 [[S:%.*]], i32 [[IDX]]
+; SPIRV-NEXT: store <vscale x 16 x i8> [[VECINS]], ptr [[Q]], align 16
+; SPIRV-NEXT: ret void
+;
entry:
%cmp = icmp ult i32 %idx, 17
call void @llvm.assume(i1 %cmp)
@@ -349,6 +507,14 @@ define void @insert_store_nonconst_index_known_noundef_and_valid_by_and(ptr %q,
; CHECK-NEXT: store i8 [[S:%.*]], ptr [[TMP0]], align 1
; CHECK-NEXT: ret void
;
+; SPIRV-LABEL: @insert_store_nonconst_index_known_noundef_and_valid_by_and(
+; SPIRV-NEXT: entry:
+; SPIRV-NEXT: [[TMP0:%.*]] = load <16 x i8>, ptr [[Q:%.*]], align 16
+; SPIRV-NEXT: [[IDX_CLAMPED:%.*]] = and i32 [[IDX:%.*]], 7
+; SPIRV-NEXT: [[VECINS:%.*]] = insertelement <16 x i8> [[TMP0]], i8 [[S:%.*]], i32 [[IDX_CLAMPED]]
+; SPIRV-NEXT: store <16 x i8> [[VECINS]], ptr [[Q]], align 16
+; SPIRV-NEXT: ret void
+;
entry:
%0 = load <16 x i8>, ptr %q
%idx.clamped = and i32 %idx, 7
@@ -367,6 +533,14 @@ define void @insert_store_vscale_nonconst_index_known_noundef_and_valid_by_and(p
; CHECK-NEXT: store i8 [[S:%.*]], ptr [[TMP0]], align 1
; CHECK-NEXT: ret void
;
+; SPIRV-LABEL: @insert_store_vscale_nonconst_index_known_noundef_and_valid_by_and(
+; SPIRV-NEXT: entry:
+; SPIRV-NEXT: [[TMP0:%.*]] = load <vscale x 16 x i8>, ptr [[Q:%.*]], align 16
+; SPIRV-NEXT: [[IDX_CLAMPED:%.*]] = and i32 [[IDX:%.*]], 7
+; SPIRV-NEXT: [[VECINS:%.*]] = insertelement <vscale x 16 x i8> [[TMP0]], i8 [[S:%.*]], i32 [[IDX_CLAMPED]]
+; SPIRV-NEXT: store <vscale x 16 x i8> [[VECINS]], ptr [[Q]], align 16
+; SPIRV-NEXT: ret void
+;
entry:
%0 = load <vscale x 16 x i8>, ptr %q
%idx.clamped = and i32 %idx, 7
@@ -384,6 +558,15 @@ define void @insert_store_nonconst_index_base_frozen_and_valid_by_and(ptr %q, i8
; CHECK-NEXT: store i8 [[S:%.*]], ptr [[TMP0]], align 1
; CHECK-NEXT: ret void
;
+; SPIRV-LABEL: @insert_store_nonconst_index_base_frozen_and_valid_by_and(
+; SPIRV-NEXT: entry:
+; SPIRV-NEXT: [[TMP0:%.*]] = load <16 x i8>, ptr [[Q:%.*]], align 16
+; SPIRV-NEXT: [[IDX_FROZEN:%.*]] = freeze i32 [[IDX:%.*]]
+; SPIRV-NEXT: [[IDX_CLAMPED:%.*]] = and i32 [[IDX_FROZEN]], 7
+; SPIRV-NEXT: [[VECINS:%.*]] = insertelement <16 x i8> [[TMP0]], i8 [[S:%.*]], i32 [[IDX_CLAMPED]]
+; SPIRV-NEXT: store <16 x i8> [[VECINS]], ptr [[Q]], align 16
+; SPIRV-NEXT: ret void
+;
entry:
%0 = load <16 x i8>, ptr %q
%idx.frozen = freeze i32 %idx
@@ -403,6 +586,15 @@ define void @insert_store_nonconst_index_frozen_and_valid_by_and(ptr %q, i8 zero
; CHECK-NEXT: store <16 x i8> [[VECINS]], ptr [[Q]], align 16
; CHECK-NEXT: ret void
;
+; SPIRV-LABEL: @insert_store_nonconst_index_frozen_and_valid_by_and(
+; SPIRV-NEXT: entry:
+; SPIRV-NEXT: [[TMP0:%.*]] = load <16 x i8>, ptr [[Q:%.*]], align 16
+; SPIRV-NEXT: [[IDX_CLAMPED:%.*]] = and i32 [[IDX:%.*]], 7
+; SPIRV-NEXT: [[IDX_CLAMPED_FROZEN:%.*]] = freeze i32 [[IDX_CLAMPED]]
+; SPIRV-NEXT: [[VECINS:%.*]] = insertelement <16 x i8> [[TMP0]], i8 [[S:%.*]], i32 [[IDX_CLAMPED_FROZEN]]
+; SPIRV-NEXT: store <16 x i8> [[VECINS]], ptr [[Q]], align 16
+; SPIRV-NEXT: ret void
+;
entry:
%0 = load <16 x i8>, ptr %q
%idx.clamped = and i32 %idx, 7
@@ -421,6 +613,14 @@ define void @insert_store_nonconst_index_known_valid_by_and_but_may_be_poison(pt
; CHECK-NEXT: store i8 [[S:%.*]], ptr [[TMP0]], align 1
; CHECK-NEXT: ret void
;
+; SPIRV-LABEL: @insert_store_nonconst_index_known_valid_by_and_but_may_be_poison(
+; SPIRV-NEXT: entry:
+; SPIRV-NEXT: [[TMP0:%.*]] = load <16 x i8>, ptr [[Q:%.*]], align 16
+; SPIRV-NEXT: [[IDX_CLAMPED:%.*]] = and i32 [[IDX:%.*]], 7
+; SPIRV-NEXT: [[VECINS:%.*]] = insertelement <16 x i8> [[TMP0]], i8 [[S:%.*]], i32 [[IDX_CLAMPED]]
+; SPIRV-NEXT: store <16 x i8> [[VECINS]], ptr [[Q]], align 16
+; SPIRV-NEXT: ret void
+;
entry:
%0 = load <16 x i8>, ptr %q
%idx.clamped = and i32 %idx, 7
@@ -438,6 +638,14 @@ define void @insert_store_nonconst_index_not_known_valid_by_and(ptr %q, i8 zeroe
; CHECK-NEXT: store <16 x i8> [[VECINS]], ptr [[Q]], align 16
; CHECK-NEXT: ret void
;
+; SPIRV-LABEL: @insert_store_nonconst_index_not_known_valid_by_and(
+; SPIRV-NEXT: entry:
+; SPIRV-NEXT: [[TMP0:%.*]] = load <16 x i8>, ptr [[Q:%.*]], align 16
+; SPIRV-NEXT: [[IDX_CLAMPED:%.*]] = and i32 [[IDX:%.*]], 16
+; SPIRV-NEXT: [[VECINS:%.*]] = insertelement <16 x i8> [[TMP0]], i8 [[S:%.*]], i32 [[IDX_CLAMPED]]
+; SPIRV-NEXT: store <16 x i8> [[VECINS]], ptr [[Q]], align 16
+; SPIRV-NEXT: ret void
+;
entry:
%0 = load <16 x i8>, ptr %q
%idx.clamped = and i32 %idx, 16
@@ -455,6 +663,14 @@ define void @insert_store_nonconst_index_known_noundef_not_known_valid_by_and(pt
; CHECK-NEXT: store <16 x i8> [[VECINS]], ptr [[Q]], align 16
; CHECK-NEXT: ret void
;
+; SPIRV-LABEL: @insert_store_nonconst_index_known_noun...
[truncated]
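For context, the rewrite this patch disables for SPIR-V has roughly the following shape. This is a simplified sketch based on the before/after forms visible in the test checks above; the concrete vector type, index, and alignments are illustrative only.

```llvm
; Before the fold: whole-vector load, single-element insert, whole-vector store.
define void @before(ptr %q, i8 zeroext %s) {
entry:
  %0 = load <16 x i8>, ptr %q, align 16
  %vecins = insertelement <16 x i8> %0, i8 %s, i32 3
  store <16 x i8> %vecins, ptr %q, align 16
  ret void
}

; After the fold: a GEP that indexes into the vector plus a scalar store,
; the untyped element access the SPIR-V backend cannot lower to a
; structured access.
define void @after(ptr %q, i8 zeroext %s) {
entry:
  %gep = getelementptr inbounds <16 x i8>, ptr %q, i32 0, i32 3
  store i8 %s, ptr %gep, align 1
  ret void
}
```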
Is there no TTI hook we can use?
I don't think there is a good hook from what I've seen.
We might use
For the same reason, we use
This is one of the solutions I'm exploring to fix the issue long-term, and for now it is our best one. But I need to evaluate the impact on optimizations (as we require some for legalization), so getting structured GEPs in a robust way will take time (hence this PR).
When I was discussing this with @Keenuts, I was wondering why we do this transformation at all considering https://discourse.llvm.org/t/status-of-geps-into-vectors-of-overaligned-elements/67497.
I don't know the state of that RFC. @jsilvanus Is anything happening with that RFC, and how should this pass be creating GEPs into the vectors?
@RKSimon Would it be OK to move forward with a target-specific switch?
I talked with Nathan. I'm in favour of a hook that could be used for this.
Sure, a TTI::disableVectorElementAccessUsingGEP() with a SPIRV override seems cleaner to me.
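For illustration, here is a minimal self-contained sketch of the hook pattern being suggested. The class names, the exact hook name, and its polarity are assumptions for this sketch rather than the verbatim TargetTransformInfo API, and the real TTI layering is more involved.

```cpp
#include <cstdio>

// Sketch of the proposed TTI hook; names are assumptions, not verbatim LLVM API.
struct TTISketch {
  // Default: targets allow rewriting vector element accesses into
  // GEP + scalar load/store, so the VectorCombine folds stay enabled.
  virtual bool disableVectorElementAccessUsingGEP() const { return false; }
  virtual ~TTISketch() = default;
};

struct SPIRVTTISketch : TTISketch {
  // SPIR-V needs structured (typed) accesses, so it opts out of the
  // GEP-producing folds.
  bool disableVectorElementAccessUsingGEP() const override { return true; }
};

// A VectorCombine-style call site would consult the hook instead of checking
// the target triple directly.
static bool shouldFoldSingleElementStore(const TTISketch &TTI) {
  return !TTI.disableVectorElementAccessUsingGEP();
}

int main() {
  TTISketch Generic;
  SPIRVTTISketch SPIRV;
  // Prints "generic: 1, spirv: 0".
  std::printf("generic: %d, spirv: %d\n",
              (int)shouldFoldSingleElementStore(Generic),
              (int)shouldFoldSingleElementStore(SPIRV));
  return 0;
}
```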
Waiting on TTI::disableVectorElementAccessUsingGEP() refactor
Thanks, added the TTI hook, PTAL 😊
All done, the suggested name is fine for me, thanks!
LGTM - cheers
Using GEP to index into a vector is not disallowed, but not recommended. The SPIR-V backend needs to generate structured accesses into types, which is impossible with an untyped GEP instruction unless we add more info to the IR. Finding a solution is a work in progress, but in the meantime, we'd like to reduce the number of failures. Preventing this optimization from rewriting extract/insert instructions into a GEP helps us lower more code to SPIR-V. This change should be OK as it's only active when targeting SPIR-V and disables a non-recommended transformation. Related to llvm#145002
Force-pushed from c149e33 to 7756809.
rebased on main, merging once CI is green, thanks for the reviews!
Using GEP to index into a vector is not disallowed, but not recommended.
The SPIR-V backend needs to generate structured accesses into types, which is impossible with an untyped GEP instruction unless we add more info to the IR. Finding a solution is a work in progress, but in the meantime, we'd like to reduce the number of failures.
Preventing this optimization from rewriting extract/insert instructions into a GEP helps us lower more code to SPIR-V. This change should be OK as it's only active when targeting SPIR-V and disables a non-recommended transformation.
Related to #145002