[VectorCombine][TTI] Prevent extract/ins rewrite to GEP #150216


Merged
merged 6 commits into llvm:main from disable-vector-combine on Jul 31, 2025

Conversation

Keenuts
Contributor

@Keenuts Keenuts commented Jul 23, 2025

Using GEP to index into a vector is not disallowed, but not recommended.
The SPIR-V backend needs to generate structured accesses into types, which is impossible with an untyped GEP instruction unless we add more information to the IR. Finding a solution is a work in progress, but in the meantime we'd like to reduce the number of failures.

Preventing this optimization from rewriting extract/insert instructions into a GEP helps us lower more code to SPIR-V. This change should be safe: it is only active when targeting SPIR-V, and it only disables a non-recommended transformation.

Related to #145002

@llvmbot
Member

llvmbot commented Jul 23, 2025

@llvm/pr-subscribers-llvm-analysis
@llvm/pr-subscribers-backend-spir-v

@llvm/pr-subscribers-llvm-transforms

Author: Nathan Gauër (Keenuts)

Changes

Using GEP to index into a vector is not disallowed, but not recommended.
The SPIR-V backend needs to generate structured accesses into types, which is impossible with an untyped GEP instruction unless we add more information to the IR. Finding a solution is a work in progress, but in the meantime we'd like to reduce the number of failures.

Preventing this optimization from rewriting extract/insert instructions into a GEP helps us lower more code to SPIR-V. This change should be safe: it is only active when targeting SPIR-V, and it only disables a non-recommended transformation.

Related to #145002


Patch is 33.46 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/150216.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Vectorize/VectorCombine.cpp (+7-4)
  • (modified) llvm/test/Transforms/VectorCombine/load-insert-store.ll (+382)
diff --git a/llvm/lib/Transforms/Vectorize/VectorCombine.cpp b/llvm/lib/Transforms/Vectorize/VectorCombine.cpp
index 82adc34fdbd84..20b8165ff280a 100644
--- a/llvm/lib/Transforms/Vectorize/VectorCombine.cpp
+++ b/llvm/lib/Transforms/Vectorize/VectorCombine.cpp
@@ -3750,8 +3750,9 @@ bool VectorCombine::run() {
 
   LLVM_DEBUG(dbgs() << "\n\nVECTORCOMBINE on " << F.getName() << "\n");
 
+  const bool isSPIRV = F.getParent()->getTargetTriple().isSPIRV();
   bool MadeChange = false;
-  auto FoldInst = [this, &MadeChange](Instruction &I) {
+  auto FoldInst = [this, &MadeChange, isSPIRV](Instruction &I) {
     Builder.SetInsertPoint(&I);
     bool IsVectorType = isa<VectorType>(I.getType());
     bool IsFixedVectorType = isa<FixedVectorType>(I.getType());
@@ -3780,13 +3781,15 @@ bool VectorCombine::run() {
     // TODO: Identify and allow other scalable transforms
     if (IsVectorType) {
       MadeChange |= scalarizeOpOrCmp(I);
-      MadeChange |= scalarizeLoadExtract(I);
-      MadeChange |= scalarizeExtExtract(I);
+      if (!isSPIRV) {
+        MadeChange |= scalarizeLoadExtract(I);
+        MadeChange |= scalarizeExtExtract(I);
+      }
       MadeChange |= scalarizeVPIntrinsic(I);
       MadeChange |= foldInterleaveIntrinsics(I);
     }
 
-    if (Opcode == Instruction::Store)
+    if (Opcode == Instruction::Store && !isSPIRV)
       MadeChange |= foldSingleElementStore(I);
 
     // If this is an early pipeline invocation of this pass, we are done.
diff --git a/llvm/test/Transforms/VectorCombine/load-insert-store.ll b/llvm/test/Transforms/VectorCombine/load-insert-store.ll
index 93565c1a708eb..0181ec76088bd 100644
--- a/llvm/test/Transforms/VectorCombine/load-insert-store.ll
+++ b/llvm/test/Transforms/VectorCombine/load-insert-store.ll
@@ -1,6 +1,7 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
 ; RUN: opt -S -passes=vector-combine -data-layout=e < %s | FileCheck %s
 ; RUN: opt -S -passes=vector-combine -data-layout=E < %s | FileCheck %s
+; RUN: opt -S -passes=vector-combine -data-layout=E -mtriple=spirv-unknown-vulkan1.3-library %s | FileCheck %s --check-prefix=SPIRV
 
 define void @insert_store(ptr %q, i8 zeroext %s) {
 ; CHECK-LABEL: @insert_store(
@@ -9,6 +10,13 @@ define void @insert_store(ptr %q, i8 zeroext %s) {
 ; CHECK-NEXT:    store i8 [[S:%.*]], ptr [[TMP0]], align 1
 ; CHECK-NEXT:    ret void
 ;
+; SPIRV-LABEL: @insert_store(
+; SPIRV-NEXT:  entry:
+; SPIRV-NEXT:    [[TMP0:%.*]] = load <16 x i8>, ptr [[Q:%.*]], align 16
+; SPIRV-NEXT:    [[VECINS:%.*]] = insertelement <16 x i8> [[TMP0]], i8 [[S:%.*]], i32 3
+; SPIRV-NEXT:    store <16 x i8> [[VECINS]], ptr [[Q]], align 16
+; SPIRV-NEXT:    ret void
+;
 entry:
   %0 = load <16 x i8>, ptr %q
   %vecins = insertelement <16 x i8> %0, i8 %s, i32 3
@@ -23,6 +31,13 @@ define void @insert_store_i16_align1(ptr %q, i16 zeroext %s) {
 ; CHECK-NEXT:    store i16 [[S:%.*]], ptr [[TMP0]], align 2
 ; CHECK-NEXT:    ret void
 ;
+; SPIRV-LABEL: @insert_store_i16_align1(
+; SPIRV-NEXT:  entry:
+; SPIRV-NEXT:    [[TMP0:%.*]] = load <8 x i16>, ptr [[Q:%.*]], align 16
+; SPIRV-NEXT:    [[VECINS:%.*]] = insertelement <8 x i16> [[TMP0]], i16 [[S:%.*]], i32 3
+; SPIRV-NEXT:    store <8 x i16> [[VECINS]], ptr [[Q]], align 1
+; SPIRV-NEXT:    ret void
+;
 entry:
   %0 = load <8 x i16>, ptr %q
   %vecins = insertelement <8 x i16> %0, i16 %s, i32 3
@@ -39,6 +54,13 @@ define void @insert_store_outofbounds(ptr %q, i16 zeroext %s) {
 ; CHECK-NEXT:    store <8 x i16> [[VECINS]], ptr [[Q]], align 16
 ; CHECK-NEXT:    ret void
 ;
+; SPIRV-LABEL: @insert_store_outofbounds(
+; SPIRV-NEXT:  entry:
+; SPIRV-NEXT:    [[TMP0:%.*]] = load <8 x i16>, ptr [[Q:%.*]], align 16
+; SPIRV-NEXT:    [[VECINS:%.*]] = insertelement <8 x i16> [[TMP0]], i16 [[S:%.*]], i32 9
+; SPIRV-NEXT:    store <8 x i16> [[VECINS]], ptr [[Q]], align 16
+; SPIRV-NEXT:    ret void
+;
 entry:
   %0 = load <8 x i16>, ptr %q
   %vecins = insertelement <8 x i16> %0, i16 %s, i32 9
@@ -53,6 +75,13 @@ define void @insert_store_vscale(ptr %q, i16 zeroext %s) {
 ; CHECK-NEXT:    store i16 [[S:%.*]], ptr [[TMP0]], align 2
 ; CHECK-NEXT:    ret void
 ;
+; SPIRV-LABEL: @insert_store_vscale(
+; SPIRV-NEXT:  entry:
+; SPIRV-NEXT:    [[TMP0:%.*]] = load <vscale x 8 x i16>, ptr [[Q:%.*]], align 16
+; SPIRV-NEXT:    [[VECINS:%.*]] = insertelement <vscale x 8 x i16> [[TMP0]], i16 [[S:%.*]], i32 3
+; SPIRV-NEXT:    store <vscale x 8 x i16> [[VECINS]], ptr [[Q]], align 16
+; SPIRV-NEXT:    ret void
+;
 entry:
   %0 = load <vscale x 8 x i16>, ptr %q
   %vecins = insertelement <vscale x 8 x i16> %0, i16 %s, i32 3
@@ -70,6 +99,13 @@ define void @insert_store_vscale_exceeds(ptr %q, i16 zeroext %s) {
 ; CHECK-NEXT:    store <vscale x 8 x i16> [[VECINS]], ptr [[Q]], align 16
 ; CHECK-NEXT:    ret void
 ;
+; SPIRV-LABEL: @insert_store_vscale_exceeds(
+; SPIRV-NEXT:  entry:
+; SPIRV-NEXT:    [[TMP0:%.*]] = load <vscale x 8 x i16>, ptr [[Q:%.*]], align 16
+; SPIRV-NEXT:    [[VECINS:%.*]] = insertelement <vscale x 8 x i16> [[TMP0]], i16 [[S:%.*]], i32 9
+; SPIRV-NEXT:    store <vscale x 8 x i16> [[VECINS]], ptr [[Q]], align 16
+; SPIRV-NEXT:    ret void
+;
 entry:
   %0 = load <vscale x 8 x i16>, ptr %q
   %vecins = insertelement <vscale x 8 x i16> %0, i16 %s, i32 9
@@ -85,6 +121,13 @@ define void @insert_store_v9i4(ptr %q, i4 zeroext %s) {
 ; CHECK-NEXT:    store <9 x i4> [[VECINS]], ptr [[Q]], align 1
 ; CHECK-NEXT:    ret void
 ;
+; SPIRV-LABEL: @insert_store_v9i4(
+; SPIRV-NEXT:  entry:
+; SPIRV-NEXT:    [[TMP0:%.*]] = load <9 x i4>, ptr [[Q:%.*]], align 8
+; SPIRV-NEXT:    [[VECINS:%.*]] = insertelement <9 x i4> [[TMP0]], i4 [[S:%.*]], i32 3
+; SPIRV-NEXT:    store <9 x i4> [[VECINS]], ptr [[Q]], align 1
+; SPIRV-NEXT:    ret void
+;
 entry:
   %0 = load <9 x i4>, ptr %q
   %vecins = insertelement <9 x i4> %0, i4 %s, i32 3
@@ -100,6 +143,13 @@ define void @insert_store_v4i27(ptr %q, i27 zeroext %s) {
 ; CHECK-NEXT:    store <4 x i27> [[VECINS]], ptr [[Q]], align 1
 ; CHECK-NEXT:    ret void
 ;
+; SPIRV-LABEL: @insert_store_v4i27(
+; SPIRV-NEXT:  entry:
+; SPIRV-NEXT:    [[TMP0:%.*]] = load <4 x i27>, ptr [[Q:%.*]], align 16
+; SPIRV-NEXT:    [[VECINS:%.*]] = insertelement <4 x i27> [[TMP0]], i27 [[S:%.*]], i32 3
+; SPIRV-NEXT:    store <4 x i27> [[VECINS]], ptr [[Q]], align 1
+; SPIRV-NEXT:    ret void
+;
 entry:
   %0 = load <4 x i27>, ptr %q
   %vecins = insertelement <4 x i27> %0, i27 %s, i32 3
@@ -113,6 +163,12 @@ define void @insert_store_v32i1(ptr %p) {
 ; CHECK-NEXT:    [[INS:%.*]] = insertelement <32 x i1> [[VEC]], i1 true, i64 0
 ; CHECK-NEXT:    store <32 x i1> [[INS]], ptr [[P]], align 4
 ; CHECK-NEXT:    ret void
+;
+; SPIRV-LABEL: @insert_store_v32i1(
+; SPIRV-NEXT:    [[VEC:%.*]] = load <32 x i1>, ptr [[P:%.*]], align 4
+; SPIRV-NEXT:    [[INS:%.*]] = insertelement <32 x i1> [[VEC]], i1 true, i64 0
+; SPIRV-NEXT:    store <32 x i1> [[INS]], ptr [[P]], align 4
+; SPIRV-NEXT:    ret void
 ;
   %vec = load <32 x i1>, ptr %p
   %ins = insertelement <32 x i1> %vec, i1 true, i64 0
@@ -130,6 +186,15 @@ define void @insert_store_blk_differ(ptr %q, i16 zeroext %s) {
 ; CHECK-NEXT:    store <8 x i16> [[VECINS]], ptr [[Q]], align 16
 ; CHECK-NEXT:    ret void
 ;
+; SPIRV-LABEL: @insert_store_blk_differ(
+; SPIRV-NEXT:  entry:
+; SPIRV-NEXT:    [[TMP0:%.*]] = load <8 x i16>, ptr [[Q:%.*]], align 16
+; SPIRV-NEXT:    br label [[CONT:%.*]]
+; SPIRV:       cont:
+; SPIRV-NEXT:    [[VECINS:%.*]] = insertelement <8 x i16> [[TMP0]], i16 [[S:%.*]], i32 3
+; SPIRV-NEXT:    store <8 x i16> [[VECINS]], ptr [[Q]], align 16
+; SPIRV-NEXT:    ret void
+;
 entry:
   %0 = load <8 x i16>, ptr %q
   br label %cont
@@ -147,6 +212,13 @@ define void @insert_store_nonconst(ptr %q, i8 zeroext %s, i32 %idx) {
 ; CHECK-NEXT:    store <16 x i8> [[VECINS]], ptr [[Q]], align 16
 ; CHECK-NEXT:    ret void
 ;
+; SPIRV-LABEL: @insert_store_nonconst(
+; SPIRV-NEXT:  entry:
+; SPIRV-NEXT:    [[TMP0:%.*]] = load <16 x i8>, ptr [[Q:%.*]], align 16
+; SPIRV-NEXT:    [[VECINS:%.*]] = insertelement <16 x i8> [[TMP0]], i8 [[S:%.*]], i32 [[IDX:%.*]]
+; SPIRV-NEXT:    store <16 x i8> [[VECINS]], ptr [[Q]], align 16
+; SPIRV-NEXT:    ret void
+;
 entry:
   %0 = load <16 x i8>, ptr %q
   %vecins = insertelement <16 x i8> %0, i8 %s, i32 %idx
@@ -164,6 +236,13 @@ define void @insert_store_vscale_nonconst(ptr %q, i8 zeroext %s, i32 %idx) {
 ; CHECK-NEXT:    store <vscale x 16 x i8> [[VECINS]], ptr [[Q]], align 16
 ; CHECK-NEXT:    ret void
 ;
+; SPIRV-LABEL: @insert_store_vscale_nonconst(
+; SPIRV-NEXT:  entry:
+; SPIRV-NEXT:    [[TMP0:%.*]] = load <vscale x 16 x i8>, ptr [[Q:%.*]], align 16
+; SPIRV-NEXT:    [[VECINS:%.*]] = insertelement <vscale x 16 x i8> [[TMP0]], i8 [[S:%.*]], i32 [[IDX:%.*]]
+; SPIRV-NEXT:    store <vscale x 16 x i8> [[VECINS]], ptr [[Q]], align 16
+; SPIRV-NEXT:    ret void
+;
 entry:
   %0 = load <vscale x 16 x i8>, ptr %q
   %vecins = insertelement <vscale x 16 x i8> %0, i8 %s, i32 %idx
@@ -181,6 +260,15 @@ define void @insert_store_nonconst_large_alignment(ptr %q, i32 zeroext %s, i32 %
 ; CHECK-NEXT:    store i32 [[S:%.*]], ptr [[TMP0]], align 4
 ; CHECK-NEXT:    ret void
 ;
+; SPIRV-LABEL: @insert_store_nonconst_large_alignment(
+; SPIRV-NEXT:  entry:
+; SPIRV-NEXT:    [[CMP:%.*]] = icmp ult i32 [[IDX:%.*]], 4
+; SPIRV-NEXT:    call void @llvm.assume(i1 [[CMP]])
+; SPIRV-NEXT:    [[I:%.*]] = load <4 x i32>, ptr [[Q:%.*]], align 128
+; SPIRV-NEXT:    [[VECINS:%.*]] = insertelement <4 x i32> [[I]], i32 [[S:%.*]], i32 [[IDX]]
+; SPIRV-NEXT:    store <4 x i32> [[VECINS]], ptr [[Q]], align 128
+; SPIRV-NEXT:    ret void
+;
 entry:
   %cmp = icmp ult i32 %idx, 4
   call void @llvm.assume(i1 %cmp)
@@ -197,6 +285,14 @@ define void @insert_store_nonconst_align_maximum_8(ptr %q, i64 %s, i32 %idx) {
 ; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr inbounds <8 x i64>, ptr [[Q:%.*]], i32 0, i32 [[IDX]]
 ; CHECK-NEXT:    store i64 [[S:%.*]], ptr [[TMP1]], align 8
 ; CHECK-NEXT:    ret void
+;
+; SPIRV-LABEL: @insert_store_nonconst_align_maximum_8(
+; SPIRV-NEXT:    [[CMP:%.*]] = icmp ult i32 [[IDX:%.*]], 2
+; SPIRV-NEXT:    call void @llvm.assume(i1 [[CMP]])
+; SPIRV-NEXT:    [[I:%.*]] = load <8 x i64>, ptr [[Q:%.*]], align 8
+; SPIRV-NEXT:    [[VECINS:%.*]] = insertelement <8 x i64> [[I]], i64 [[S:%.*]], i32 [[IDX]]
+; SPIRV-NEXT:    store <8 x i64> [[VECINS]], ptr [[Q]], align 8
+; SPIRV-NEXT:    ret void
 ;
   %cmp = icmp ult i32 %idx, 2
   call void @llvm.assume(i1 %cmp)
@@ -213,6 +309,14 @@ define void @insert_store_nonconst_align_maximum_4(ptr %q, i64 %s, i32 %idx) {
 ; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr inbounds <8 x i64>, ptr [[Q:%.*]], i32 0, i32 [[IDX]]
 ; CHECK-NEXT:    store i64 [[S:%.*]], ptr [[TMP1]], align 4
 ; CHECK-NEXT:    ret void
+;
+; SPIRV-LABEL: @insert_store_nonconst_align_maximum_4(
+; SPIRV-NEXT:    [[CMP:%.*]] = icmp ult i32 [[IDX:%.*]], 2
+; SPIRV-NEXT:    call void @llvm.assume(i1 [[CMP]])
+; SPIRV-NEXT:    [[I:%.*]] = load <8 x i64>, ptr [[Q:%.*]], align 4
+; SPIRV-NEXT:    [[VECINS:%.*]] = insertelement <8 x i64> [[I]], i64 [[S:%.*]], i32 [[IDX]]
+; SPIRV-NEXT:    store <8 x i64> [[VECINS]], ptr [[Q]], align 4
+; SPIRV-NEXT:    ret void
 ;
   %cmp = icmp ult i32 %idx, 2
   call void @llvm.assume(i1 %cmp)
@@ -229,6 +333,14 @@ define void @insert_store_nonconst_align_larger(ptr %q, i64 %s, i32 %idx) {
 ; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr inbounds <8 x i64>, ptr [[Q:%.*]], i32 0, i32 [[IDX]]
 ; CHECK-NEXT:    store i64 [[S:%.*]], ptr [[TMP1]], align 4
 ; CHECK-NEXT:    ret void
+;
+; SPIRV-LABEL: @insert_store_nonconst_align_larger(
+; SPIRV-NEXT:    [[CMP:%.*]] = icmp ult i32 [[IDX:%.*]], 2
+; SPIRV-NEXT:    call void @llvm.assume(i1 [[CMP]])
+; SPIRV-NEXT:    [[I:%.*]] = load <8 x i64>, ptr [[Q:%.*]], align 4
+; SPIRV-NEXT:    [[VECINS:%.*]] = insertelement <8 x i64> [[I]], i64 [[S:%.*]], i32 [[IDX]]
+; SPIRV-NEXT:    store <8 x i64> [[VECINS]], ptr [[Q]], align 2
+; SPIRV-NEXT:    ret void
 ;
   %cmp = icmp ult i32 %idx, 2
   call void @llvm.assume(i1 %cmp)
@@ -247,6 +359,15 @@ define void @insert_store_nonconst_index_known_valid_by_assume(ptr %q, i8 zeroex
 ; CHECK-NEXT:    store i8 [[S:%.*]], ptr [[TMP0]], align 1
 ; CHECK-NEXT:    ret void
 ;
+; SPIRV-LABEL: @insert_store_nonconst_index_known_valid_by_assume(
+; SPIRV-NEXT:  entry:
+; SPIRV-NEXT:    [[CMP:%.*]] = icmp ult i32 [[IDX:%.*]], 4
+; SPIRV-NEXT:    call void @llvm.assume(i1 [[CMP]])
+; SPIRV-NEXT:    [[TMP0:%.*]] = load <16 x i8>, ptr [[Q:%.*]], align 16
+; SPIRV-NEXT:    [[VECINS:%.*]] = insertelement <16 x i8> [[TMP0]], i8 [[S:%.*]], i32 [[IDX]]
+; SPIRV-NEXT:    store <16 x i8> [[VECINS]], ptr [[Q]], align 16
+; SPIRV-NEXT:    ret void
+;
 entry:
   %cmp = icmp ult i32 %idx, 4
   call void @llvm.assume(i1 %cmp)
@@ -267,6 +388,15 @@ define void @insert_store_vscale_nonconst_index_known_valid_by_assume(ptr %q, i8
 ; CHECK-NEXT:    store i8 [[S:%.*]], ptr [[TMP0]], align 1
 ; CHECK-NEXT:    ret void
 ;
+; SPIRV-LABEL: @insert_store_vscale_nonconst_index_known_valid_by_assume(
+; SPIRV-NEXT:  entry:
+; SPIRV-NEXT:    [[CMP:%.*]] = icmp ult i32 [[IDX:%.*]], 4
+; SPIRV-NEXT:    call void @llvm.assume(i1 [[CMP]])
+; SPIRV-NEXT:    [[TMP0:%.*]] = load <vscale x 16 x i8>, ptr [[Q:%.*]], align 16
+; SPIRV-NEXT:    [[VECINS:%.*]] = insertelement <vscale x 16 x i8> [[TMP0]], i8 [[S:%.*]], i32 [[IDX]]
+; SPIRV-NEXT:    store <vscale x 16 x i8> [[VECINS]], ptr [[Q]], align 16
+; SPIRV-NEXT:    ret void
+;
 entry:
   %cmp = icmp ult i32 %idx, 4
   call void @llvm.assume(i1 %cmp)
@@ -289,6 +419,16 @@ define void @insert_store_nonconst_index_not_known_valid_by_assume_after_load(pt
 ; CHECK-NEXT:    store <16 x i8> [[VECINS]], ptr [[Q]], align 16
 ; CHECK-NEXT:    ret void
 ;
+; SPIRV-LABEL: @insert_store_nonconst_index_not_known_valid_by_assume_after_load(
+; SPIRV-NEXT:  entry:
+; SPIRV-NEXT:    [[CMP:%.*]] = icmp ult i32 [[IDX:%.*]], 4
+; SPIRV-NEXT:    [[TMP0:%.*]] = load <16 x i8>, ptr [[Q:%.*]], align 16
+; SPIRV-NEXT:    call void @maythrow()
+; SPIRV-NEXT:    call void @llvm.assume(i1 [[CMP]])
+; SPIRV-NEXT:    [[VECINS:%.*]] = insertelement <16 x i8> [[TMP0]], i8 [[S:%.*]], i32 [[IDX]]
+; SPIRV-NEXT:    store <16 x i8> [[VECINS]], ptr [[Q]], align 16
+; SPIRV-NEXT:    ret void
+;
 entry:
   %cmp = icmp ult i32 %idx, 4
   %0 = load <16 x i8>, ptr %q
@@ -309,6 +449,15 @@ define void @insert_store_nonconst_index_not_known_valid_by_assume(ptr %q, i8 ze
 ; CHECK-NEXT:    store <16 x i8> [[VECINS]], ptr [[Q]], align 16
 ; CHECK-NEXT:    ret void
 ;
+; SPIRV-LABEL: @insert_store_nonconst_index_not_known_valid_by_assume(
+; SPIRV-NEXT:  entry:
+; SPIRV-NEXT:    [[CMP:%.*]] = icmp ult i32 [[IDX:%.*]], 17
+; SPIRV-NEXT:    call void @llvm.assume(i1 [[CMP]])
+; SPIRV-NEXT:    [[TMP0:%.*]] = load <16 x i8>, ptr [[Q:%.*]], align 16
+; SPIRV-NEXT:    [[VECINS:%.*]] = insertelement <16 x i8> [[TMP0]], i8 [[S:%.*]], i32 [[IDX]]
+; SPIRV-NEXT:    store <16 x i8> [[VECINS]], ptr [[Q]], align 16
+; SPIRV-NEXT:    ret void
+;
 entry:
   %cmp = icmp ult i32 %idx, 17
   call void @llvm.assume(i1 %cmp)
@@ -330,6 +479,15 @@ define void @insert_store_vscale_nonconst_index_not_known_valid_by_assume(ptr %q
 ; CHECK-NEXT:    store <vscale x 16 x i8> [[VECINS]], ptr [[Q]], align 16
 ; CHECK-NEXT:    ret void
 ;
+; SPIRV-LABEL: @insert_store_vscale_nonconst_index_not_known_valid_by_assume(
+; SPIRV-NEXT:  entry:
+; SPIRV-NEXT:    [[CMP:%.*]] = icmp ult i32 [[IDX:%.*]], 17
+; SPIRV-NEXT:    call void @llvm.assume(i1 [[CMP]])
+; SPIRV-NEXT:    [[TMP0:%.*]] = load <vscale x 16 x i8>, ptr [[Q:%.*]], align 16
+; SPIRV-NEXT:    [[VECINS:%.*]] = insertelement <vscale x 16 x i8> [[TMP0]], i8 [[S:%.*]], i32 [[IDX]]
+; SPIRV-NEXT:    store <vscale x 16 x i8> [[VECINS]], ptr [[Q]], align 16
+; SPIRV-NEXT:    ret void
+;
 entry:
   %cmp = icmp ult i32 %idx, 17
   call void @llvm.assume(i1 %cmp)
@@ -349,6 +507,14 @@ define void @insert_store_nonconst_index_known_noundef_and_valid_by_and(ptr %q,
 ; CHECK-NEXT:    store i8 [[S:%.*]], ptr [[TMP0]], align 1
 ; CHECK-NEXT:    ret void
 ;
+; SPIRV-LABEL: @insert_store_nonconst_index_known_noundef_and_valid_by_and(
+; SPIRV-NEXT:  entry:
+; SPIRV-NEXT:    [[TMP0:%.*]] = load <16 x i8>, ptr [[Q:%.*]], align 16
+; SPIRV-NEXT:    [[IDX_CLAMPED:%.*]] = and i32 [[IDX:%.*]], 7
+; SPIRV-NEXT:    [[VECINS:%.*]] = insertelement <16 x i8> [[TMP0]], i8 [[S:%.*]], i32 [[IDX_CLAMPED]]
+; SPIRV-NEXT:    store <16 x i8> [[VECINS]], ptr [[Q]], align 16
+; SPIRV-NEXT:    ret void
+;
 entry:
   %0 = load <16 x i8>, ptr %q
   %idx.clamped = and i32 %idx, 7
@@ -367,6 +533,14 @@ define void @insert_store_vscale_nonconst_index_known_noundef_and_valid_by_and(p
 ; CHECK-NEXT:    store i8 [[S:%.*]], ptr [[TMP0]], align 1
 ; CHECK-NEXT:    ret void
 ;
+; SPIRV-LABEL: @insert_store_vscale_nonconst_index_known_noundef_and_valid_by_and(
+; SPIRV-NEXT:  entry:
+; SPIRV-NEXT:    [[TMP0:%.*]] = load <vscale x 16 x i8>, ptr [[Q:%.*]], align 16
+; SPIRV-NEXT:    [[IDX_CLAMPED:%.*]] = and i32 [[IDX:%.*]], 7
+; SPIRV-NEXT:    [[VECINS:%.*]] = insertelement <vscale x 16 x i8> [[TMP0]], i8 [[S:%.*]], i32 [[IDX_CLAMPED]]
+; SPIRV-NEXT:    store <vscale x 16 x i8> [[VECINS]], ptr [[Q]], align 16
+; SPIRV-NEXT:    ret void
+;
 entry:
   %0 = load <vscale x 16 x i8>, ptr %q
   %idx.clamped = and i32 %idx, 7
@@ -384,6 +558,15 @@ define void @insert_store_nonconst_index_base_frozen_and_valid_by_and(ptr %q, i8
 ; CHECK-NEXT:    store i8 [[S:%.*]], ptr [[TMP0]], align 1
 ; CHECK-NEXT:    ret void
 ;
+; SPIRV-LABEL: @insert_store_nonconst_index_base_frozen_and_valid_by_and(
+; SPIRV-NEXT:  entry:
+; SPIRV-NEXT:    [[TMP0:%.*]] = load <16 x i8>, ptr [[Q:%.*]], align 16
+; SPIRV-NEXT:    [[IDX_FROZEN:%.*]] = freeze i32 [[IDX:%.*]]
+; SPIRV-NEXT:    [[IDX_CLAMPED:%.*]] = and i32 [[IDX_FROZEN]], 7
+; SPIRV-NEXT:    [[VECINS:%.*]] = insertelement <16 x i8> [[TMP0]], i8 [[S:%.*]], i32 [[IDX_CLAMPED]]
+; SPIRV-NEXT:    store <16 x i8> [[VECINS]], ptr [[Q]], align 16
+; SPIRV-NEXT:    ret void
+;
 entry:
   %0 = load <16 x i8>, ptr %q
   %idx.frozen = freeze i32 %idx
@@ -403,6 +586,15 @@ define void @insert_store_nonconst_index_frozen_and_valid_by_and(ptr %q, i8 zero
 ; CHECK-NEXT:    store <16 x i8> [[VECINS]], ptr [[Q]], align 16
 ; CHECK-NEXT:    ret void
 ;
+; SPIRV-LABEL: @insert_store_nonconst_index_frozen_and_valid_by_and(
+; SPIRV-NEXT:  entry:
+; SPIRV-NEXT:    [[TMP0:%.*]] = load <16 x i8>, ptr [[Q:%.*]], align 16
+; SPIRV-NEXT:    [[IDX_CLAMPED:%.*]] = and i32 [[IDX:%.*]], 7
+; SPIRV-NEXT:    [[IDX_CLAMPED_FROZEN:%.*]] = freeze i32 [[IDX_CLAMPED]]
+; SPIRV-NEXT:    [[VECINS:%.*]] = insertelement <16 x i8> [[TMP0]], i8 [[S:%.*]], i32 [[IDX_CLAMPED_FROZEN]]
+; SPIRV-NEXT:    store <16 x i8> [[VECINS]], ptr [[Q]], align 16
+; SPIRV-NEXT:    ret void
+;
 entry:
   %0 = load <16 x i8>, ptr %q
   %idx.clamped = and i32 %idx, 7
@@ -421,6 +613,14 @@ define void @insert_store_nonconst_index_known_valid_by_and_but_may_be_poison(pt
 ; CHECK-NEXT:    store i8 [[S:%.*]], ptr [[TMP0]], align 1
 ; CHECK-NEXT:    ret void
 ;
+; SPIRV-LABEL: @insert_store_nonconst_index_known_valid_by_and_but_may_be_poison(
+; SPIRV-NEXT:  entry:
+; SPIRV-NEXT:    [[TMP0:%.*]] = load <16 x i8>, ptr [[Q:%.*]], align 16
+; SPIRV-NEXT:    [[IDX_CLAMPED:%.*]] = and i32 [[IDX:%.*]], 7
+; SPIRV-NEXT:    [[VECINS:%.*]] = insertelement <16 x i8> [[TMP0]], i8 [[S:%.*]], i32 [[IDX_CLAMPED]]
+; SPIRV-NEXT:    store <16 x i8> [[VECINS]], ptr [[Q]], align 16
+; SPIRV-NEXT:    ret void
+;
 entry:
   %0 = load <16 x i8>, ptr %q
   %idx.clamped = and i32 %idx, 7
@@ -438,6 +638,14 @@ define void @insert_store_nonconst_index_not_known_valid_by_and(ptr %q, i8 zeroe
 ; CHECK-NEXT:    store <16 x i8> [[VECINS]], ptr [[Q]], align 16
 ; CHECK-NEXT:    ret void
 ;
+; SPIRV-LABEL: @insert_store_nonconst_index_not_known_valid_by_and(
+; SPIRV-NEXT:  entry:
+; SPIRV-NEXT:    [[TMP0:%.*]] = load <16 x i8>, ptr [[Q:%.*]], align 16
+; SPIRV-NEXT:    [[IDX_CLAMPED:%.*]] = and i32 [[IDX:%.*]], 16
+; SPIRV-NEXT:    [[VECINS:%.*]] = insertelement <16 x i8> [[TMP0]], i8 [[S:%.*]], i32 [[IDX_CLAMPED]]
+; SPIRV-NEXT:    store <16 x i8> [[VECINS]], ptr [[Q]], align 16
+; SPIRV-NEXT:    ret void
+;
 entry:
   %0 = load <16 x i8>, ptr %q
   %idx.clamped = and i32 %idx, 16
@@ -455,6 +663,14 @@ define void @insert_store_nonconst_index_known_noundef_not_known_valid_by_and(pt
 ; CHECK-NEXT:    store <16 x i8> [[VECINS]], ptr [[Q]], align 16
 ; CHECK-NEXT:    ret void
 ;
+; SPIRV-LABEL: @insert_store_nonconst_index_known_noun...
[truncated]

@llvmbot
Member

llvmbot commented Jul 23, 2025

@llvm/pr-subscribers-vectorizers


@RKSimon
Collaborator

RKSimon commented Jul 23, 2025

Is there no TTI hook we can use?

@Keenuts
Contributor Author

Keenuts commented Jul 23, 2025

Is there no TTI hook we can use?

I don't think there is a good hook from what I've seen.

  • GEP is "invalid" but used/tolerated.
  • SPIR-V has real issues with GEP/ptradd that we need to solve (not in this PR)
  • Reducing the number of GEPs lets us move forward a bit

We might use getGEPCost and set the cost to something very high for SPIR-V, but I feel this is just wrong: GEPs are not expensive in SPIR-V; it's just that LLVM lacks robust type scavenging to lower them correctly.
That's why I've preferred to explicitly list the target in the transform: this is something specific to SPIR-V, not because of cost, but because of an actual LLVM/backend limitation.

@dtcxzyw
Member

dtcxzyw commented Jul 23, 2025

Is there no TTI hook we can use?

I don't think there is a good hook from what I've seen.

  • GEP is "invalid" but used/tolerated.
  • SPIR-V has real issues with GEP/ptradd that we need to solve (not in this PR)
  • Reducing the number of GEPs lets us move forward a bit

We might use getGEPCost and set the cost to something very high for SPIR-V, but I feel this is just wrong: GEPs are not expensive in SPIR-V; it's just that LLVM lacks robust type scavenging to lower them correctly. That's why I've preferred to explicitly list the target in the transform: this is something specific to SPIR-V, not because of cost, but because of an actual LLVM/backend limitation.

For the same reason, we use @llvm.preserve.*.access.index instead of gep to preserve type information on BPF. Is it also suitable for SPIR-V?

@Keenuts
Contributor Author

Keenuts commented Jul 23, 2025

For the same reason, we use @llvm.preserve.*.access.index instead of gep to preserve type information on BPF. Is it also suitable for SPIR-V?

This is one of the solutions I'm exploring to fix the issue long-term, and for now it's our best one. But I need to evaluate the impact on optimizations (as we require some for legalization), so getting structured GEPs in a robust way will take time (hence this PR).

Contributor

@s-perron s-perron left a comment


When I was discussing this with @Keenuts, I was wondering why we do this transformation at all considering https://discourse.llvm.org/t/status-of-geps-into-vectors-of-overaligned-elements/67497.

I don't know the state of that RFC. @jsilvanus Is anything happening with that RFC, and how should this pass be creating GEPs into the vectors?

@Keenuts
Contributor Author

Keenuts commented Jul 28, 2025

Is there no TTI hook we can use?

@RKSimon Would it be OK to move forward with a target-specific switch?
I could also add a new bit like "disableVectorElementAccessUsingGEP", which would allow preventing this on a target-by-target basis until the pattern is officially disallowed (if it ever is).

@s-perron
Contributor

When I was discussing this with @Keenuts, I was wondering why we do this transformation at all considering https://discourse.llvm.org/t/status-of-geps-into-vectors-of-overaligned-elements/67497.

I don't know the state of that RFC. @jsilvanus Is anything happening with that RFC, and how should this pass be creating GEPs into the vectors?

I talked with Nathan. I'm in favour of a hook that could be used for this.

@RKSimon
Collaborator

RKSimon commented Jul 29, 2025

Is there no TTI hook we can use?

@RKSimon Would it be OK to move forward with a target-specific switch? I could also add a new bit like "disableVectorElementAccessUsingGEP" which would allow to prevent this on a target-by-target basis until the pattern is officially disallowed? (if it is one day)

Sure, a TTI::disableVectorElementAccessUsingGEP() with a SPIRV override seems cleaner to me.
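For illustration, a minimal sketch of what such a hook could look like, using the name proposed in this thread; the name, default value, and exact plumbing in the merged patch may differ, and the usual TargetTransformInfo wrapper boilerplate is omitted:

```cpp
// Hypothetical sketch only: hook name follows the proposal in this thread,
// not necessarily the merged patch, and the TTI wrapper plumbing is omitted.

// Default implementation (e.g. in TargetTransformInfoImpl.h): by default,
// targets accept GEP-based access to single vector elements.
class TargetTransformInfoImplBase {
public:
  virtual ~TargetTransformInfoImplBase() = default;
  virtual bool disableVectorElementAccessUsingGEP() const { return false; }
};

// SPIR-V override (e.g. in SPIRVTargetTransformInfo.h): the backend needs
// structured, typed accesses, so keep the insert/extractelement form rather
// than introducing an untyped GEP into the vector.
class SPIRVTTIImpl : public TargetTransformInfoImplBase {
public:
  bool disableVectorElementAccessUsingGEP() const override { return true; }
};
```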

Collaborator

@RKSimon RKSimon left a comment


Waiting on TTI::disableVectorElementAccessUsingGEP() refactor

@llvmbot llvmbot added the backend:SPIR-V and llvm:analysis labels Jul 29, 2025
@Keenuts Keenuts changed the title from [VectorCombine] Prevent extract/ins rewrite to GEP to [VectorCombine][TTI] Prevent extract/ins rewrite to GEP Jul 29, 2025
@Keenuts
Contributor Author

Keenuts commented Jul 29, 2025

Thanks, added the TTI hook, PTAL 😊
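For reference, a sketch of how the guard in VectorCombine::run from the diff above could consult such a hook instead of checking the triple directly; this assumes the hook name proposed earlier in the thread, which may differ from the merged code:

```cpp
// Fragment of the FoldInst lambda in VectorCombine::run (see the diff above).
// Folds guarded behind !isSPIRV in the original diff are now gated on the
// hypothetical TTI hook instead of a hard-coded triple check.
const bool AllowGEPRewrite = !TTI.disableVectorElementAccessUsingGEP();

if (IsVectorType) {
  MadeChange |= scalarizeOpOrCmp(I);
  if (AllowGEPRewrite) {
    MadeChange |= scalarizeLoadExtract(I);
    MadeChange |= scalarizeExtExtract(I);
  }
  MadeChange |= scalarizeVPIntrinsic(I);
  MadeChange |= foldInterleaveIntrinsics(I);
}

if (Opcode == Instruction::Store && AllowGEPRewrite)
  MadeChange |= foldSingleElementStore(I);
```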

@Keenuts Keenuts requested review from s-perron and RKSimon July 29, 2025 14:03
@Keenuts
Contributor Author

Keenuts commented Jul 30, 2025

All done, the suggested name is fine for me, thanks!

Collaborator

@RKSimon RKSimon left a comment


LGTM - cheers

Using GEP to index into a vector is not disallowed, but not
recommended.
The SPIR-V backend needs to generate structured accesses into types,
which is impossible with an untyped GEP instruction unless we add
more information to the IR. Finding a solution is a work in progress,
but in the meantime we'd like to reduce the number of failures.

Preventing this optimization from rewriting extract/insert
instructions into a GEP helps us lower more code to SPIR-V.
This change should be safe: it is only active when targeting SPIR-V,
and it only disables a non-recommended transformation.

Related to llvm#145002
@Keenuts Keenuts force-pushed the disable-vector-combine branch from c149e33 to 7756809 on July 31, 2025 11:19
@Keenuts
Contributor Author

Keenuts commented Jul 31, 2025

Rebased on main, merging once CI is green. Thanks for the reviews!

@Keenuts Keenuts merged commit 6727339 into llvm:main Jul 31, 2025
10 checks passed
@Keenuts Keenuts deleted the disable-vector-combine branch July 31, 2025 12:14
Labels
backend:SPIR-V, llvm:analysis, llvm:transforms, llvm:vectorcombine, vectorizers