[InstCombine] Allow freezing multiple operands #154336
@llvm/pr-subscribers-llvm-transforms
Author: Nikita Popov (nikic)
Changes
InstCombine tries to convert freeze(inst(op)) to inst(freeze(op)). This patch allows the transform even if multiple operands need to be frozen.
The existing limitation (at most one operand may need freezing) makes sure that we do not increase the total number of freezes, but it also means that we may fail to eliminate freezes (via poison flag dropping) and may prevent optimizations (as analysis generally can't look past freeze). Overall, I believe that aggressively pushing freezes upwards is more beneficial than harmful.
This is the middle-end version of #145939 in DAGCombine (which is currently reverted for SDAG-specific reasons).
llvm-opt-benchmark: dtcxzyw/llvm-opt-benchmark#2691
Patch is 25.90 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/154336.diff
12 Files Affected:
diff --git a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
index 5ee3bb1abe86e..17bbeefb8be7e 100644
--- a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
@@ -4961,14 +4961,11 @@ Instruction *InstCombinerImpl::visitLandingPadInst(LandingPadInst &LI) {
Value *
InstCombinerImpl::pushFreezeToPreventPoisonFromPropagating(FreezeInst &OrigFI) {
// Try to push freeze through instructions that propagate but don't produce
- // poison as far as possible. If an operand of freeze follows three
- // conditions 1) one-use, 2) does not produce poison, and 3) has all but one
- // guaranteed-non-poison operands then push the freeze through to the one
- // operand that is not guaranteed non-poison. The actual transform is as
- // follows.
+ // poison as far as possible. If an operand of freeze is one-use and does
+ // not produce poison then push the freeze through to the operands that are
+ // not guaranteed non-poison. The actual transform is as follows.
// Op1 = ... ; Op1 can be posion
- // Op0 = Inst(Op1, NonPoisonOps...) ; Op0 has only one use and only have
- // ; single guaranteed-non-poison operands
+ // Op0 = Inst(Op1, NonPoisonOps...) ; Op0 has only one use
// ... = Freeze(Op0)
// =>
// Op1 = ...
@@ -4994,29 +4991,24 @@ InstCombinerImpl::pushFreezeToPreventPoisonFromPropagating(FreezeInst &OrigFI) {
// If operand is guaranteed not to be poison, there is no need to add freeze
// to the operand. So we first find the operand that is not guaranteed to be
// poison.
- Value *MaybePoisonOperand = nullptr;
+ SmallSetVector<Value *, 4> MaybePoisonOperands;
for (Value *V : OrigOpInst->operands()) {
- if (isa<MetadataAsValue>(V) || isGuaranteedNotToBeUndefOrPoison(V) ||
- // Treat identical operands as a single operand.
- (MaybePoisonOperand && MaybePoisonOperand == V))
+ if (isa<MetadataAsValue>(V) || isGuaranteedNotToBeUndefOrPoison(V))
continue;
- if (!MaybePoisonOperand)
- MaybePoisonOperand = V;
- else
- return nullptr;
+ MaybePoisonOperands.insert(V);
}
OrigOpInst->dropPoisonGeneratingAnnotations();
// If all operands are guaranteed to be non-poison, we can drop freeze.
- if (!MaybePoisonOperand)
+ if (MaybePoisonOperands.empty())
return OrigOp;
Builder.SetInsertPoint(OrigOpInst);
- Value *FrozenMaybePoisonOperand = Builder.CreateFreeze(
- MaybePoisonOperand, MaybePoisonOperand->getName() + ".fr");
-
- OrigOpInst->replaceUsesOfWith(MaybePoisonOperand, FrozenMaybePoisonOperand);
+ for (Value *V : MaybePoisonOperands) {
+ Value *Frozen = Builder.CreateFreeze(V, V->getName() + ".fr");
+ OrigOpInst->replaceUsesOfWith(V, Frozen);
+ }
return OrigOp;
}
diff --git a/llvm/test/Transforms/InstCombine/freeze-fp-ops.ll b/llvm/test/Transforms/InstCombine/freeze-fp-ops.ll
index d1a36dcef4d65..dd08b8eff8f51 100644
--- a/llvm/test/Transforms/InstCombine/freeze-fp-ops.ll
+++ b/llvm/test/Transforms/InstCombine/freeze-fp-ops.ll
@@ -149,9 +149,10 @@ define float @freeze_sqrt(float %arg) {
define float @freeze_powi(float %arg0, i32 %arg1) {
; CHECK-LABEL: @freeze_powi(
-; CHECK-NEXT: [[OP:%.*]] = call float @llvm.powi.f32.i32(float [[ARG0:%.*]], i32 [[ARG1:%.*]])
-; CHECK-NEXT: [[FREEZE:%.*]] = freeze float [[OP]]
-; CHECK-NEXT: ret float [[FREEZE]]
+; CHECK-NEXT: [[FREEZE:%.*]] = freeze float [[OP:%.*]]
+; CHECK-NEXT: [[ARG1_FR:%.*]] = freeze i32 [[ARG1:%.*]]
+; CHECK-NEXT: [[OP1:%.*]] = call float @llvm.powi.f32.i32(float [[FREEZE]], i32 [[ARG1_FR]])
+; CHECK-NEXT: ret float [[OP1]]
;
%op = call float @llvm.powi.f32.i32(float %arg0, i32 %arg1)
%freeze = freeze float %op
diff --git a/llvm/test/Transforms/InstCombine/freeze.ll b/llvm/test/Transforms/InstCombine/freeze.ll
index 3fedead2feab8..2b083ee06142e 100644
--- a/llvm/test/Transforms/InstCombine/freeze.ll
+++ b/llvm/test/Transforms/InstCombine/freeze.ll
@@ -100,9 +100,10 @@ define <3 x i4> @partial_undef_vec() {
define i32 @early_freeze_test1(i32 %x, i32 %y) {
; CHECK-LABEL: @early_freeze_test1(
-; CHECK-NEXT: [[V1:%.*]] = add i32 [[X:%.*]], [[Y:%.*]]
-; CHECK-NEXT: [[V1_FR:%.*]] = freeze i32 [[V1]]
-; CHECK-NEXT: [[V2:%.*]] = shl i32 [[V1_FR]], 1
+; CHECK-NEXT: [[V1_FR:%.*]] = freeze i32 [[V1:%.*]]
+; CHECK-NEXT: [[Y_FR:%.*]] = freeze i32 [[Y:%.*]]
+; CHECK-NEXT: [[V4:%.*]] = add i32 [[V1_FR]], [[Y_FR]]
+; CHECK-NEXT: [[V2:%.*]] = shl i32 [[V4]], 1
; CHECK-NEXT: [[V3:%.*]] = and i32 [[V2]], 2
; CHECK-NEXT: ret i32 [[V3]]
;
@@ -889,17 +890,17 @@ exit: ; preds = %loop
}
; The recurrence for the GEP offset can't produce poison so the freeze should
-; be pushed through to the ptr, but this is not currently supported.
+; be pushed through to the ptr.
define void @fold_phi_gep_phi_offset(ptr %init, ptr %end, i64 noundef %n) {
; CHECK-LABEL: @fold_phi_gep_phi_offset(
; CHECK-NEXT: entry:
+; CHECK-NEXT: [[INIT:%.*]] = freeze ptr [[INIT1:%.*]]
; CHECK-NEXT: br label [[LOOP:%.*]]
; CHECK: loop:
-; CHECK-NEXT: [[I:%.*]] = phi ptr [ [[INIT:%.*]], [[ENTRY:%.*]] ], [ [[I_NEXT_FR:%.*]], [[LOOP]] ]
+; CHECK-NEXT: [[I:%.*]] = phi ptr [ [[INIT]], [[ENTRY:%.*]] ], [ [[I_NEXT_FR:%.*]], [[LOOP]] ]
; CHECK-NEXT: [[OFF:%.*]] = phi i64 [ [[N:%.*]], [[ENTRY]] ], [ [[OFF_NEXT:%.*]], [[LOOP]] ]
; CHECK-NEXT: [[OFF_NEXT]] = shl i64 [[OFF]], 3
-; CHECK-NEXT: [[I_NEXT:%.*]] = getelementptr i8, ptr [[I]], i64 [[OFF_NEXT]]
-; CHECK-NEXT: [[I_NEXT_FR]] = freeze ptr [[I_NEXT]]
+; CHECK-NEXT: [[I_NEXT_FR]] = getelementptr i8, ptr [[I]], i64 [[OFF_NEXT]]
; CHECK-NEXT: [[COND:%.*]] = icmp eq ptr [[I_NEXT_FR]], [[END:%.*]]
; CHECK-NEXT: br i1 [[COND]], label [[LOOP]], label [[EXIT:%.*]]
; CHECK: exit:
@@ -921,18 +922,18 @@ exit: ; preds = %loop
ret void
}
-; Offset is still guaranteed not to be poison, so the freeze could be moved
-; here if we strip inbounds from the GEP, but this is not currently supported.
+; Offset is still guaranteed not to be poison, so the freeze can be moved
+; here if we strip inbounds from the GEP.
define void @fold_phi_gep_inbounds_phi_offset(ptr %init, ptr %end, i64 noundef %n) {
; CHECK-LABEL: @fold_phi_gep_inbounds_phi_offset(
; CHECK-NEXT: entry:
+; CHECK-NEXT: [[INIT:%.*]] = freeze ptr [[INIT1:%.*]]
; CHECK-NEXT: br label [[LOOP:%.*]]
; CHECK: loop:
-; CHECK-NEXT: [[I:%.*]] = phi ptr [ [[INIT:%.*]], [[ENTRY:%.*]] ], [ [[I_NEXT_FR:%.*]], [[LOOP]] ]
+; CHECK-NEXT: [[I:%.*]] = phi ptr [ [[INIT]], [[ENTRY:%.*]] ], [ [[I_NEXT_FR:%.*]], [[LOOP]] ]
; CHECK-NEXT: [[OFF:%.*]] = phi i64 [ [[N:%.*]], [[ENTRY]] ], [ [[OFF_NEXT:%.*]], [[LOOP]] ]
; CHECK-NEXT: [[OFF_NEXT]] = shl i64 [[OFF]], 3
-; CHECK-NEXT: [[I_NEXT:%.*]] = getelementptr inbounds i8, ptr [[I]], i64 [[OFF_NEXT]]
-; CHECK-NEXT: [[I_NEXT_FR]] = freeze ptr [[I_NEXT]]
+; CHECK-NEXT: [[I_NEXT_FR]] = getelementptr i8, ptr [[I]], i64 [[OFF_NEXT]]
; CHECK-NEXT: [[COND:%.*]] = icmp eq ptr [[I_NEXT_FR]], [[END:%.*]]
; CHECK-NEXT: br i1 [[COND]], label [[LOOP]], label [[EXIT:%.*]]
; CHECK: exit:
@@ -954,18 +955,19 @@ exit: ; preds = %loop
ret void
}
-; GEP can produce poison, check freeze isn't moved.
-define void @cant_fold_phi_gep_phi_offset(ptr %init, ptr %end, i64 %n) {
-; CHECK-LABEL: @cant_fold_phi_gep_phi_offset(
+; Same as previous, but also requires freezing %n.
+define void @fold_fold_phi_gep_phi_offset_multiple(ptr %init, ptr %end, i64 %n) {
+; CHECK-LABEL: @fold_fold_phi_gep_phi_offset_multiple(
; CHECK-NEXT: entry:
+; CHECK-NEXT: [[TMP0:%.*]] = freeze ptr [[INIT:%.*]]
+; CHECK-NEXT: [[TMP1:%.*]] = freeze i64 [[N:%.*]]
; CHECK-NEXT: br label [[LOOP:%.*]]
; CHECK: loop:
-; CHECK-NEXT: [[I:%.*]] = phi ptr [ [[INIT:%.*]], [[ENTRY:%.*]] ], [ [[I_NEXT_FR:%.*]], [[LOOP]] ]
-; CHECK-NEXT: [[OFF:%.*]] = phi i64 [ [[N:%.*]], [[ENTRY]] ], [ [[OFF_NEXT:%.*]], [[LOOP]] ]
+; CHECK-NEXT: [[I:%.*]] = phi ptr [ [[TMP0]], [[ENTRY:%.*]] ], [ [[I_NEXT:%.*]], [[LOOP]] ]
+; CHECK-NEXT: [[OFF:%.*]] = phi i64 [ [[TMP1]], [[ENTRY]] ], [ [[OFF_NEXT:%.*]], [[LOOP]] ]
; CHECK-NEXT: [[OFF_NEXT]] = shl i64 [[OFF]], 3
-; CHECK-NEXT: [[I_NEXT:%.*]] = getelementptr inbounds i8, ptr [[I]], i64 [[OFF_NEXT]]
-; CHECK-NEXT: [[I_NEXT_FR]] = freeze ptr [[I_NEXT]]
-; CHECK-NEXT: [[COND:%.*]] = icmp eq ptr [[I_NEXT_FR]], [[END:%.*]]
+; CHECK-NEXT: [[I_NEXT]] = getelementptr i8, ptr [[I]], i64 [[OFF_NEXT]]
+; CHECK-NEXT: [[COND:%.*]] = icmp eq ptr [[I_NEXT]], [[END:%.*]]
; CHECK-NEXT: br i1 [[COND]], label [[LOOP]], label [[EXIT:%.*]]
; CHECK: exit:
; CHECK-NEXT: ret void
diff --git a/llvm/test/Transforms/InstCombine/icmp.ll b/llvm/test/Transforms/InstCombine/icmp.ll
index a090f9c4d2614..fe23aae953a32 100644
--- a/llvm/test/Transforms/InstCombine/icmp.ll
+++ b/llvm/test/Transforms/InstCombine/icmp.ll
@@ -5393,11 +5393,10 @@ entry:
define i1 @icmp_freeze_sext(i16 %x, i16 %y) {
; CHECK-LABEL: @icmp_freeze_sext(
-; CHECK-NEXT: [[CMP1:%.*]] = icmp uge i16 [[X:%.*]], [[Y:%.*]]
-; CHECK-NEXT: [[CMP1_FR:%.*]] = freeze i1 [[CMP1]]
-; CHECK-NEXT: [[TMP1:%.*]] = icmp eq i16 [[Y]], 0
-; CHECK-NEXT: [[CMP2:%.*]] = or i1 [[TMP1]], [[CMP1_FR]]
-; CHECK-NEXT: ret i1 [[CMP2]]
+; CHECK-NEXT: [[Y:%.*]] = freeze i16 [[Y1:%.*]]
+; CHECK-NEXT: [[X:%.*]] = freeze i16 [[X1:%.*]]
+; CHECK-NEXT: [[CMP1:%.*]] = icmp uge i16 [[X]], [[Y]]
+; CHECK-NEXT: ret i1 [[CMP1]]
;
%cmp1 = icmp uge i16 %x, %y
%ext = sext i1 %cmp1 to i16
diff --git a/llvm/test/Transforms/InstCombine/nsw.ll b/llvm/test/Transforms/InstCombine/nsw.ll
index b00f2e58add78..18fb661603e4d 100644
--- a/llvm/test/Transforms/InstCombine/nsw.ll
+++ b/llvm/test/Transforms/InstCombine/nsw.ll
@@ -255,9 +255,10 @@ define i32 @sub_sub1_nsw_nsw(i32 %a, i32 %b, i32 %c) {
define i8 @neg_nsw_freeze(i8 %a1, i8 %a2) {
; CHECK-LABEL: @neg_nsw_freeze(
-; CHECK-NEXT: [[A_NEG:%.*]] = sub nsw i8 [[A2:%.*]], [[A1:%.*]]
-; CHECK-NEXT: [[FR_NEG:%.*]] = freeze i8 [[A_NEG]]
-; CHECK-NEXT: ret i8 [[FR_NEG]]
+; CHECK-NEXT: [[FR_NEG:%.*]] = freeze i8 [[A_NEG:%.*]]
+; CHECK-NEXT: [[A2_FR:%.*]] = freeze i8 [[A2:%.*]]
+; CHECK-NEXT: [[A_NEG1:%.*]] = sub i8 [[A2_FR]], [[FR_NEG]]
+; CHECK-NEXT: ret i8 [[A_NEG1]]
;
%a = sub nsw i8 %a1, %a2
%fr = freeze i8 %a
diff --git a/llvm/test/Transforms/InstCombine/select.ll b/llvm/test/Transforms/InstCombine/select.ll
index 1f9ee83536016..8ce55eb2d6421 100644
--- a/llvm/test/Transforms/InstCombine/select.ll
+++ b/llvm/test/Transforms/InstCombine/select.ll
@@ -2683,7 +2683,8 @@ define void @cond_freeze_multipleuses(i8 %x, i8 %y) {
define i32 @select_freeze_icmp_eq(i32 %x, i32 %y) {
; CHECK-LABEL: @select_freeze_icmp_eq(
-; CHECK-NEXT: ret i32 [[Y:%.*]]
+; CHECK-NEXT: [[Y:%.*]] = freeze i32 [[Y1:%.*]]
+; CHECK-NEXT: ret i32 [[Y]]
;
%c = icmp eq i32 %x, %y
%c.fr = freeze i1 %c
@@ -2693,7 +2694,8 @@ define i32 @select_freeze_icmp_eq(i32 %x, i32 %y) {
define i32 @select_freeze_icmp_ne(i32 %x, i32 %y) {
; CHECK-LABEL: @select_freeze_icmp_ne(
-; CHECK-NEXT: ret i32 [[X:%.*]]
+; CHECK-NEXT: [[X:%.*]] = freeze i32 [[X1:%.*]]
+; CHECK-NEXT: ret i32 [[X]]
;
%c = icmp ne i32 %x, %y
%c.fr = freeze i1 %c
@@ -2703,9 +2705,9 @@ define i32 @select_freeze_icmp_ne(i32 %x, i32 %y) {
define i32 @select_freeze_icmp_else(i32 %x, i32 %y) {
; CHECK-LABEL: @select_freeze_icmp_else(
-; CHECK-NEXT: [[C:%.*]] = icmp ult i32 [[X:%.*]], [[Y:%.*]]
-; CHECK-NEXT: [[C_FR:%.*]] = freeze i1 [[C]]
-; CHECK-NEXT: [[V:%.*]] = select i1 [[C_FR]], i32 [[X]], i32 [[Y]]
+; CHECK-NEXT: [[Y_FR:%.*]] = freeze i32 [[Y:%.*]]
+; CHECK-NEXT: [[X_FR:%.*]] = freeze i32 [[X:%.*]]
+; CHECK-NEXT: [[V:%.*]] = call i32 @llvm.umin.i32(i32 [[X_FR]], i32 [[Y_FR]])
; CHECK-NEXT: ret i32 [[V]]
;
%c = icmp ult i32 %x, %y
@@ -2718,10 +2720,10 @@ declare void @use_i1_i32(i1, i32)
define void @select_freeze_icmp_multuses(i32 %x, i32 %y) {
; CHECK-LABEL: @select_freeze_icmp_multuses(
-; CHECK-NEXT: [[C:%.*]] = icmp ne i32 [[X:%.*]], [[Y:%.*]]
-; CHECK-NEXT: [[C_FR:%.*]] = freeze i1 [[C]]
-; CHECK-NEXT: [[V:%.*]] = select i1 [[C_FR]], i32 [[X]], i32 [[Y]]
-; CHECK-NEXT: call void @use_i1_i32(i1 [[C_FR]], i32 [[V]])
+; CHECK-NEXT: [[Y:%.*]] = freeze i32 [[Y1:%.*]]
+; CHECK-NEXT: [[X:%.*]] = freeze i32 [[X1:%.*]]
+; CHECK-NEXT: [[C:%.*]] = icmp ne i32 [[X]], [[Y]]
+; CHECK-NEXT: call void @use_i1_i32(i1 [[C]], i32 [[X]])
; CHECK-NEXT: ret void
;
%c = icmp ne i32 %x, %y
diff --git a/llvm/test/Transforms/InstCombine/sub-of-negatible-inseltpoison.ll b/llvm/test/Transforms/InstCombine/sub-of-negatible-inseltpoison.ll
index 3bbb9b931e433..514f74af465cc 100644
--- a/llvm/test/Transforms/InstCombine/sub-of-negatible-inseltpoison.ll
+++ b/llvm/test/Transforms/InstCombine/sub-of-negatible-inseltpoison.ll
@@ -1353,9 +1353,10 @@ define i8 @dont_negate_ordinary_select(i8 %x, i8 %y, i8 %z, i1 %c) {
; Freeze is transparent as far as negation is concerned
define i4 @negate_freeze(i4 %x, i4 %y, i4 %z) {
; CHECK-LABEL: @negate_freeze(
-; CHECK-NEXT: [[T0_NEG:%.*]] = sub i4 [[Y:%.*]], [[X:%.*]]
-; CHECK-NEXT: [[T1_NEG:%.*]] = freeze i4 [[T0_NEG]]
-; CHECK-NEXT: [[T2:%.*]] = add i4 [[T1_NEG]], [[Z:%.*]]
+; CHECK-NEXT: [[T1_NEG:%.*]] = freeze i4 [[T0_NEG:%.*]]
+; CHECK-NEXT: [[Y_FR:%.*]] = freeze i4 [[Y:%.*]]
+; CHECK-NEXT: [[T0_NEG1:%.*]] = sub i4 [[Y_FR]], [[T1_NEG]]
+; CHECK-NEXT: [[T2:%.*]] = add i4 [[T0_NEG1]], [[Z:%.*]]
; CHECK-NEXT: ret i4 [[T2]]
;
%t0 = sub i4 %x, %y
@@ -1365,8 +1366,9 @@ define i4 @negate_freeze(i4 %x, i4 %y, i4 %z) {
}
define i4 @negate_freeze_extrause(i4 %x, i4 %y, i4 %z) {
; CHECK-LABEL: @negate_freeze_extrause(
-; CHECK-NEXT: [[T0:%.*]] = sub i4 [[X:%.*]], [[Y:%.*]]
-; CHECK-NEXT: [[T1:%.*]] = freeze i4 [[T0]]
+; CHECK-NEXT: [[X_FR:%.*]] = freeze i4 [[X:%.*]]
+; CHECK-NEXT: [[Y_FR:%.*]] = freeze i4 [[Y:%.*]]
+; CHECK-NEXT: [[T1:%.*]] = sub i4 [[X_FR]], [[Y_FR]]
; CHECK-NEXT: call void @use4(i4 [[T1]])
; CHECK-NEXT: [[T2:%.*]] = sub i4 [[Z:%.*]], [[T1]]
; CHECK-NEXT: ret i4 [[T2]]
diff --git a/llvm/test/Transforms/InstCombine/sub-of-negatible.ll b/llvm/test/Transforms/InstCombine/sub-of-negatible.ll
index 871cf37976d89..191941ed997b6 100644
--- a/llvm/test/Transforms/InstCombine/sub-of-negatible.ll
+++ b/llvm/test/Transforms/InstCombine/sub-of-negatible.ll
@@ -1443,9 +1443,10 @@ define <2 x i32> @negate_select_of_negation_poison(<2 x i1> %c, <2 x i32> %x) {
; Freeze is transparent as far as negation is concerned
define i4 @negate_freeze(i4 %x, i4 %y, i4 %z) {
; CHECK-LABEL: @negate_freeze(
-; CHECK-NEXT: [[T0_NEG:%.*]] = sub i4 [[Y:%.*]], [[X:%.*]]
-; CHECK-NEXT: [[T1_NEG:%.*]] = freeze i4 [[T0_NEG]]
-; CHECK-NEXT: [[T2:%.*]] = add i4 [[T1_NEG]], [[Z:%.*]]
+; CHECK-NEXT: [[T1_NEG:%.*]] = freeze i4 [[T0_NEG:%.*]]
+; CHECK-NEXT: [[Y_FR:%.*]] = freeze i4 [[Y:%.*]]
+; CHECK-NEXT: [[T0_NEG1:%.*]] = sub i4 [[Y_FR]], [[T1_NEG]]
+; CHECK-NEXT: [[T2:%.*]] = add i4 [[T0_NEG1]], [[Z:%.*]]
; CHECK-NEXT: ret i4 [[T2]]
;
%t0 = sub i4 %x, %y
@@ -1455,8 +1456,9 @@ define i4 @negate_freeze(i4 %x, i4 %y, i4 %z) {
}
define i4 @negate_freeze_extrause(i4 %x, i4 %y, i4 %z) {
; CHECK-LABEL: @negate_freeze_extrause(
-; CHECK-NEXT: [[T0:%.*]] = sub i4 [[X:%.*]], [[Y:%.*]]
-; CHECK-NEXT: [[T1:%.*]] = freeze i4 [[T0]]
+; CHECK-NEXT: [[X_FR:%.*]] = freeze i4 [[X:%.*]]
+; CHECK-NEXT: [[Y_FR:%.*]] = freeze i4 [[Y:%.*]]
+; CHECK-NEXT: [[T1:%.*]] = sub i4 [[X_FR]], [[Y_FR]]
; CHECK-NEXT: call void @use4(i4 [[T1]])
; CHECK-NEXT: [[T2:%.*]] = sub i4 [[Z:%.*]], [[T1]]
; CHECK-NEXT: ret i4 [[T2]]
diff --git a/llvm/test/Transforms/InstCombine/urem-via-cmp-select.ll b/llvm/test/Transforms/InstCombine/urem-via-cmp-select.ll
index 02be67a2ca250..4f2bc3a79fcc8 100644
--- a/llvm/test/Transforms/InstCombine/urem-via-cmp-select.ll
+++ b/llvm/test/Transforms/InstCombine/urem-via-cmp-select.ll
@@ -90,8 +90,9 @@ define i8 @urem_assume_with_unexpected_const(i8 %x, i8 %n) {
; https://alive2.llvm.org/ce/z/gNhZ2x
define i8 @urem_without_assume(i8 %arg, i8 %arg2) {
; CHECK-LABEL: @urem_without_assume(
-; CHECK-NEXT: [[X:%.*]] = urem i8 [[ARG:%.*]], [[ARG2:%.*]]
-; CHECK-NEXT: [[X_FR:%.*]] = freeze i8 [[X]]
+; CHECK-NEXT: [[ARG2:%.*]] = freeze i8 [[ARG3:%.*]]
+; CHECK-NEXT: [[ARG_FR:%.*]] = freeze i8 [[ARG:%.*]]
+; CHECK-NEXT: [[X_FR:%.*]] = urem i8 [[ARG_FR]], [[ARG2]]
; CHECK-NEXT: [[ADD:%.*]] = add i8 [[X_FR]], 1
; CHECK-NEXT: [[TMP1:%.*]] = icmp eq i8 [[ADD]], [[ARG2]]
; CHECK-NEXT: [[OUT:%.*]] = select i1 [[TMP1]], i8 0, i8 [[ADD]]
diff --git a/llvm/test/Transforms/LoopVectorize/forked-pointers.ll b/llvm/test/Transforms/LoopVectorize/forked-pointers.ll
index 677163b51ec64..efd420c11ef06 100644
--- a/llvm/test/Transforms/LoopVectorize/forked-pointers.ll
+++ b/llvm/test/Transforms/LoopVectorize/forked-pointers.ll
@@ -17,21 +17,22 @@ target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
define dso_local void @forked_ptrs_different_base_same_offset(ptr nocapture readonly %Base1, ptr nocapture readonly %Base2, ptr nocapture %Dest, ptr nocapture readonly %Preds) {
; CHECK-LABEL: @forked_ptrs_different_base_same_offset(
; CHECK-NEXT: entry:
+; CHECK-NEXT: [[BASE1:%.*]] = freeze ptr [[BASE3:%.*]]
+; CHECK-NEXT: [[BASE2:%.*]] = freeze ptr [[BASE4:%.*]]
+; CHECK-NEXT: [[DEST:%.*]] = freeze ptr [[DEST2:%.*]]
; CHECK-NEXT: br i1 false, label [[SCALAR_PH:%.*]], label [[VECTOR_MEMCHECK:%.*]]
; CHECK: vector.memcheck:
-; CHECK-NEXT: [[DEST1:%.*]] = ptrtoint ptr [[DEST:%.*]] to i64
+; CHECK-NEXT: [[DEST1:%.*]] = ptrtoint ptr [[DEST]] to i64
; CHECK-NEXT: [[PREDS2:%.*]] = ptrtoint ptr [[PREDS:%.*]] to i64
-; CHECK-NEXT: [[BASE23:%.*]] = ptrtoint ptr [[BASE2:%.*]] to i64
-; CHECK-NEXT: [[BASE15:%.*]] = ptrtoint ptr [[BASE1:%.*]] to i64
+; CHECK-NEXT: [[BASE23:%.*]] = ptrtoint ptr [[BASE2]] to i64
+; CHECK-NEXT: [[BASE15:%.*]] = ptrtoint ptr [[BASE1]] to i64
; CHECK-NEXT: [[TMP0:%.*]] = sub i64 [[DEST1]], [[PREDS2]]
; CHECK-NEXT: [[DIFF_CHECK:%.*]] = icmp ult i64 [[TMP0]], 16
; CHECK-NEXT: [[TMP1:%.*]] = sub i64 [[DEST1]], [[BASE23]]
-; CHECK-NEXT: [[DOTFR:%.*]] = freeze i64 [[TMP1]]
-; CHECK-NEXT: [[DIFF_CHECK4:%.*]] = icmp ult i64 [[DOTFR]], 16
+; CHECK-NEXT: [[DIFF_CHECK4:%.*]] = icmp ult i64 [[TMP1]], 16
; CHECK-NEXT: [[CONFLICT_RDX:%.*]] = or i1 [[DIFF_CHECK]], [[DIFF_CHECK4]]
; CHECK-NEXT: [[TMP2:%.*]] = sub i64 [[DEST1]], [[BASE15]]
-; CHECK-NEXT: [[DOTFR10:%.*]] = freeze i64 [[TMP2]]
-; CHECK-NEXT: [[DIFF_CHECK6:%.*]] = icmp ult i64 [[DOTFR10]], 16
+; CHECK-NEXT: [[DIFF_CHECK6:%.*]] = icmp ult i64 [[TMP2]], 16
; CHECK-NEXT: [[CONFLICT_RDX7:%.*]] = or i1 [[CONFLICT_RDX]], [[DIFF_CHECK6]]
; CHECK-NEXT: br i1 [[CONFLICT_RDX7]], label [[SCALAR_PH]], label [[VECTOR_PH:%.*]]
; CHECK: vector.ph:
diff --git a/llvm/test/Transforms/PGOProfile/chr.ll b/llvm/test/Transforms/PGOProfile/chr.ll
index 46f9a2bde7a23..f0a1574c5f209 100644
--- a/llvm/test/Transforms/PGOProfile/chr.ll
+++ b/llvm/test/Transforms/PGOProfile/chr.ll
@@ -1295,6 +1295,7 @@ define i32 @test_chr_14(ptr %i, ptr %j, i32 %sum0, i1 %pred, i32 %z) !prof !14 {
; CHECK-NEXT: entry:
; CHECK-NEXT: [[Z_FR:%.*]] = freeze i32 [[Z:%.*]]
; CHECK-NEXT: [[I0:%.*]] = load i32, ptr [[I:%.*]], align 4
+; CHECK-NEXT: [[I0_FR:%.*]] = freeze i32 [[I0]]
; CHECK-NEXT: [[V1_NOT:%.*]] = icmp eq i32 [[Z_FR]], 1
; CHECK-NEXT: br i1 [[V1_NOT]], label [[BB1:%.*]], label [[ENTRY_SPLIT_NONCHR:%.*]], !prof [[PROF15]]
; CHECK: entry.split.nonchr:
@@ -1307,27 +1308,26 @@ define i32 @test_chr_14(ptr %i, ptr %j, i32 %sum0, i1 %pred, i32 %z) !prof !14 {
; CHECK-NEXT: br label [[BB1]]
; CHECK: bb1:
; CHECK-NEXT: [[J0:%.*]] = load i32, ptr [[J:%.*]], align 4
-; CHECK-NEXT: [[V6:%.*]] = and i32 [[I0]], 2
-; CHECK-NEXT: [[V4:%.*]] = icmp ne i32 [[V6]], [[J0]]
+; CHECK-NEXT: [[J0_FR:%.*]] = freeze i...
[truncated]
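For readers who want to see the shape of the change to pushFreezeToPreventPoisonFromPropagating without building LLVM, here is a minimal standalone sketch. It is a toy model, not the patch itself: the Operand struct, collectMaybePoisonOperands, and freezeOperands are hypothetical names, a boolean flag stands in for isGuaranteedNotToBeUndefOrPoison, and a plain std::vector with a duplicate check stands in for SmallSetVector. It only illustrates the idea of gathering every maybe-poison operand once, in first-seen order, and replacing all of its uses with one frozen copy.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for an LLVM operand: a name plus a flag playing the
// role of isGuaranteedNotToBeUndefOrPoison(V).
struct Operand {
  std::string Name;
  bool GuaranteedNotPoison;
};

// Collect every operand that may be poison, keeping first-seen order and
// dropping repeats (the job SmallSetVector does in the patch).
std::vector<std::string>
collectMaybePoisonOperands(const std::vector<Operand> &Ops) {
  std::vector<std::string> Result;
  for (const Operand &Op : Ops) {
    if (Op.GuaranteedNotPoison)
      continue;
    if (std::find(Result.begin(), Result.end(), Op.Name) == Result.end())
      Result.push_back(Op.Name);
  }
  return Result;
}

// Rewrite the operand list in place so each maybe-poison operand is replaced
// by its frozen copy "<name>.fr"; returns the names of the freezes created.
std::vector<std::string> freezeOperands(std::vector<Operand> &Ops) {
  std::vector<std::string> Frozen;
  for (const std::string &Name : collectMaybePoisonOperands(Ops)) {
    std::string FrozenName = Name + ".fr";
    for (Operand &Op : Ops)
      if (Op.Name == Name) {
        Op.Name = FrozenName;
        Op.GuaranteedNotPoison = true;
      }
    Frozen.push_back(FrozenName);
  }
  return Frozen;
}
```

In this model, an instruction with operands (x, c, y, x), where only c is known non-poison, ends up with two freezes (x.fr and y.fr) and both uses of x rewritten to x.fr. The old code would have bailed out on such an instruction because more than one distinct operand needed freezing.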
@@ -1353,9 +1353,10 @@ define i8 @dont_negate_ordinary_select(i8 %x, i8 %y, i8 %z, i1 %c) {
; Freeze is transparent as far as negation is concerned
define i4 @negate_freeze(i4 %x, i4 %y, i4 %z) {
; CHECK-LABEL: @negate_freeze(
-; CHECK-NEXT: [[T0_NEG:%.*]] = sub i4 [[Y:%.*]], [[X:%.*]]
-; CHECK-NEXT: [[T1_NEG:%.*]] = freeze i4 [[T0_NEG]]
-; CHECK-NEXT: [[T2:%.*]] = add i4 [[T1_NEG]], [[Z:%.*]]
+; CHECK-NEXT: [[T1_NEG:%.*]] = freeze i4 [[T0_NEG:%.*]]
+; CHECK-NEXT: [[Y_FR:%.*]] = freeze i4 [[Y:%.*]]
+; CHECK-NEXT: [[T0_NEG1:%.*]] = sub i4 [[Y_FR]], [[T1_NEG]]
+; CHECK-NEXT: [[T2:%.*]] = add i4 [[T0_NEG1]], [[Z:%.*]]
; CHECK-NEXT: ret i4 [[T2]]
;
%t0 = sub i4 %x, %y
@@ -1365,8 +1366,9 @@ define i4 @negate_freeze(i4 %x, i4 %y, i4 %z) {
}
define i4 @negate_freeze_extrause(i4 %x, i4 %y, i4 %z) {
; CHECK-LABEL: @negate_freeze_extrause(
-; CHECK-NEXT: [[T0:%.*]] = sub i4 [[X:%.*]], [[Y:%.*]]
-; CHECK-NEXT: [[T1:%.*]] = freeze i4 [[T0]]
+; CHECK-NEXT: [[X_FR:%.*]] = freeze i4 [[X:%.*]]
+; CHECK-NEXT: [[Y_FR:%.*]] = freeze i4 [[Y:%.*]]
+; CHECK-NEXT: [[T1:%.*]] = sub i4 [[X_FR]], [[Y_FR]]
; CHECK-NEXT: call void @use4(i4 [[T1]])
; CHECK-NEXT: [[T2:%.*]] = sub i4 [[Z:%.*]], [[T1]]
; CHECK-NEXT: ret i4 [[T2]]
diff --git a/llvm/test/Transforms/InstCombine/sub-of-negatible.ll b/llvm/test/Transforms/InstCombine/sub-of-negatible.ll
index 871cf37976d89..191941ed997b6 100644
--- a/llvm/test/Transforms/InstCombine/sub-of-negatible.ll
+++ b/llvm/test/Transforms/InstCombine/sub-of-negatible.ll
@@ -1443,9 +1443,10 @@ define <2 x i32> @negate_select_of_negation_poison(<2 x i1> %c, <2 x i32> %x) {
; Freeze is transparent as far as negation is concerned
define i4 @negate_freeze(i4 %x, i4 %y, i4 %z) {
; CHECK-LABEL: @negate_freeze(
-; CHECK-NEXT: [[T0_NEG:%.*]] = sub i4 [[Y:%.*]], [[X:%.*]]
-; CHECK-NEXT: [[T1_NEG:%.*]] = freeze i4 [[T0_NEG]]
-; CHECK-NEXT: [[T2:%.*]] = add i4 [[T1_NEG]], [[Z:%.*]]
+; CHECK-NEXT: [[T1_NEG:%.*]] = freeze i4 [[T0_NEG:%.*]]
+; CHECK-NEXT: [[Y_FR:%.*]] = freeze i4 [[Y:%.*]]
+; CHECK-NEXT: [[T0_NEG1:%.*]] = sub i4 [[Y_FR]], [[T1_NEG]]
+; CHECK-NEXT: [[T2:%.*]] = add i4 [[T0_NEG1]], [[Z:%.*]]
; CHECK-NEXT: ret i4 [[T2]]
;
%t0 = sub i4 %x, %y
@@ -1455,8 +1456,9 @@ define i4 @negate_freeze(i4 %x, i4 %y, i4 %z) {
}
define i4 @negate_freeze_extrause(i4 %x, i4 %y, i4 %z) {
; CHECK-LABEL: @negate_freeze_extrause(
-; CHECK-NEXT: [[T0:%.*]] = sub i4 [[X:%.*]], [[Y:%.*]]
-; CHECK-NEXT: [[T1:%.*]] = freeze i4 [[T0]]
+; CHECK-NEXT: [[X_FR:%.*]] = freeze i4 [[X:%.*]]
+; CHECK-NEXT: [[Y_FR:%.*]] = freeze i4 [[Y:%.*]]
+; CHECK-NEXT: [[T1:%.*]] = sub i4 [[X_FR]], [[Y_FR]]
; CHECK-NEXT: call void @use4(i4 [[T1]])
; CHECK-NEXT: [[T2:%.*]] = sub i4 [[Z:%.*]], [[T1]]
; CHECK-NEXT: ret i4 [[T2]]
diff --git a/llvm/test/Transforms/InstCombine/urem-via-cmp-select.ll b/llvm/test/Transforms/InstCombine/urem-via-cmp-select.ll
index 02be67a2ca250..4f2bc3a79fcc8 100644
--- a/llvm/test/Transforms/InstCombine/urem-via-cmp-select.ll
+++ b/llvm/test/Transforms/InstCombine/urem-via-cmp-select.ll
@@ -90,8 +90,9 @@ define i8 @urem_assume_with_unexpected_const(i8 %x, i8 %n) {
; https://alive2.llvm.org/ce/z/gNhZ2x
define i8 @urem_without_assume(i8 %arg, i8 %arg2) {
; CHECK-LABEL: @urem_without_assume(
-; CHECK-NEXT: [[X:%.*]] = urem i8 [[ARG:%.*]], [[ARG2:%.*]]
-; CHECK-NEXT: [[X_FR:%.*]] = freeze i8 [[X]]
+; CHECK-NEXT: [[ARG2:%.*]] = freeze i8 [[ARG3:%.*]]
+; CHECK-NEXT: [[ARG_FR:%.*]] = freeze i8 [[ARG:%.*]]
+; CHECK-NEXT: [[X_FR:%.*]] = urem i8 [[ARG_FR]], [[ARG2]]
; CHECK-NEXT: [[ADD:%.*]] = add i8 [[X_FR]], 1
; CHECK-NEXT: [[TMP1:%.*]] = icmp eq i8 [[ADD]], [[ARG2]]
; CHECK-NEXT: [[OUT:%.*]] = select i1 [[TMP1]], i8 0, i8 [[ADD]]
diff --git a/llvm/test/Transforms/LoopVectorize/forked-pointers.ll b/llvm/test/Transforms/LoopVectorize/forked-pointers.ll
index 677163b51ec64..efd420c11ef06 100644
--- a/llvm/test/Transforms/LoopVectorize/forked-pointers.ll
+++ b/llvm/test/Transforms/LoopVectorize/forked-pointers.ll
@@ -17,21 +17,22 @@ target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
define dso_local void @forked_ptrs_different_base_same_offset(ptr nocapture readonly %Base1, ptr nocapture readonly %Base2, ptr nocapture %Dest, ptr nocapture readonly %Preds) {
; CHECK-LABEL: @forked_ptrs_different_base_same_offset(
; CHECK-NEXT: entry:
+; CHECK-NEXT: [[BASE1:%.*]] = freeze ptr [[BASE3:%.*]]
+; CHECK-NEXT: [[BASE2:%.*]] = freeze ptr [[BASE4:%.*]]
+; CHECK-NEXT: [[DEST:%.*]] = freeze ptr [[DEST2:%.*]]
; CHECK-NEXT: br i1 false, label [[SCALAR_PH:%.*]], label [[VECTOR_MEMCHECK:%.*]]
; CHECK: vector.memcheck:
-; CHECK-NEXT: [[DEST1:%.*]] = ptrtoint ptr [[DEST:%.*]] to i64
+; CHECK-NEXT: [[DEST1:%.*]] = ptrtoint ptr [[DEST]] to i64
; CHECK-NEXT: [[PREDS2:%.*]] = ptrtoint ptr [[PREDS:%.*]] to i64
-; CHECK-NEXT: [[BASE23:%.*]] = ptrtoint ptr [[BASE2:%.*]] to i64
-; CHECK-NEXT: [[BASE15:%.*]] = ptrtoint ptr [[BASE1:%.*]] to i64
+; CHECK-NEXT: [[BASE23:%.*]] = ptrtoint ptr [[BASE2]] to i64
+; CHECK-NEXT: [[BASE15:%.*]] = ptrtoint ptr [[BASE1]] to i64
; CHECK-NEXT: [[TMP0:%.*]] = sub i64 [[DEST1]], [[PREDS2]]
; CHECK-NEXT: [[DIFF_CHECK:%.*]] = icmp ult i64 [[TMP0]], 16
; CHECK-NEXT: [[TMP1:%.*]] = sub i64 [[DEST1]], [[BASE23]]
-; CHECK-NEXT: [[DOTFR:%.*]] = freeze i64 [[TMP1]]
-; CHECK-NEXT: [[DIFF_CHECK4:%.*]] = icmp ult i64 [[DOTFR]], 16
+; CHECK-NEXT: [[DIFF_CHECK4:%.*]] = icmp ult i64 [[TMP1]], 16
; CHECK-NEXT: [[CONFLICT_RDX:%.*]] = or i1 [[DIFF_CHECK]], [[DIFF_CHECK4]]
; CHECK-NEXT: [[TMP2:%.*]] = sub i64 [[DEST1]], [[BASE15]]
-; CHECK-NEXT: [[DOTFR10:%.*]] = freeze i64 [[TMP2]]
-; CHECK-NEXT: [[DIFF_CHECK6:%.*]] = icmp ult i64 [[DOTFR10]], 16
+; CHECK-NEXT: [[DIFF_CHECK6:%.*]] = icmp ult i64 [[TMP2]], 16
; CHECK-NEXT: [[CONFLICT_RDX7:%.*]] = or i1 [[CONFLICT_RDX]], [[DIFF_CHECK6]]
; CHECK-NEXT: br i1 [[CONFLICT_RDX7]], label [[SCALAR_PH]], label [[VECTOR_PH:%.*]]
; CHECK: vector.ph:
diff --git a/llvm/test/Transforms/PGOProfile/chr.ll b/llvm/test/Transforms/PGOProfile/chr.ll
index 46f9a2bde7a23..f0a1574c5f209 100644
--- a/llvm/test/Transforms/PGOProfile/chr.ll
+++ b/llvm/test/Transforms/PGOProfile/chr.ll
@@ -1295,6 +1295,7 @@ define i32 @test_chr_14(ptr %i, ptr %j, i32 %sum0, i1 %pred, i32 %z) !prof !14 {
; CHECK-NEXT: entry:
; CHECK-NEXT: [[Z_FR:%.*]] = freeze i32 [[Z:%.*]]
; CHECK-NEXT: [[I0:%.*]] = load i32, ptr [[I:%.*]], align 4
+; CHECK-NEXT: [[I0_FR:%.*]] = freeze i32 [[I0]]
; CHECK-NEXT: [[V1_NOT:%.*]] = icmp eq i32 [[Z_FR]], 1
; CHECK-NEXT: br i1 [[V1_NOT]], label [[BB1:%.*]], label [[ENTRY_SPLIT_NONCHR:%.*]], !prof [[PROF15]]
; CHECK: entry.split.nonchr:
@@ -1307,27 +1308,26 @@ define i32 @test_chr_14(ptr %i, ptr %j, i32 %sum0, i1 %pred, i32 %z) !prof !14 {
; CHECK-NEXT: br label [[BB1]]
; CHECK: bb1:
; CHECK-NEXT: [[J0:%.*]] = load i32, ptr [[J:%.*]], align 4
-; CHECK-NEXT: [[V6:%.*]] = and i32 [[I0]], 2
-; CHECK-NEXT: [[V4:%.*]] = icmp ne i32 [[V6]], [[J0]]
+; CHECK-NEXT: [[J0_FR:%.*]] = freeze i...
[truncated]
@@ -2683,7 +2683,8 @@ define void @cond_freeze_multipleuses(i8 %x, i8 %y) {
define i32 @select_freeze_icmp_eq(i32 %x, i32 %y) {
This comes from:
static Value *foldSelectWithFrozenICmp(SelectInst &Sel, InstCombiner::BuilderTy &Builder) {
We regress this case by inserting an unnecessary freeze now, but I don't think this is important. We can drop the specialized transform after this patch.
What caused so much trouble in #145939 is that SelectionDAG is terrible at topological combine order. That resulted in the replaceUses calls causing infinite loops, because the freeze was getting pushed through beyond the recursion depth that some other users (combined later) could see. Is that likely to happen in InstCombine? I've started work on a much tamer version in #152107, which relies on a separate fold of frozen/unfrozen versions of a node getting merged later on.
No, this is not a problem for InstCombine.
This is actually how it works in InstCombine. We do a plain freeze on the operands, and then there's a separate freezeOtherUses() transform.
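The two-step scheme described in this reply can be pictured with a toy model. The sketch below is plain C++, not LLVM code: `Inst`, `replaceUsesAfterFreeze`, and `demoSelectMultiUse` are hypothetical names, and it assumes a single straight-line block in which "comes later in the block" stands in for dominance. It only illustrates the idea of rewriting the remaining unfrozen uses of a value to its frozen copy.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy instruction: result name, opcode, operand names (hypothetical model).
struct Inst {
  std::string result;
  std::string opcode;
  std::vector<std::string> operands;
};

// Rewrite every later use of the frozen value to use the freeze result.
// Returns the number of operands that were rewritten.
int replaceUsesAfterFreeze(std::vector<Inst> &block, std::size_t freezeIdx) {
  assert(freezeIdx < block.size() && block[freezeIdx].opcode == "freeze");
  const std::string value = block[freezeIdx].operands.at(0);
  const std::string frozen = block[freezeIdx].result;
  int replaced = 0;
  for (std::size_t i = freezeIdx + 1; i < block.size(); ++i) {
    for (std::string &op : block[i].operands) {
      if (op == value) {
        op = frozen;
        ++replaced;
      }
    }
  }
  return replaced;
}

// Loosely modeled on select_freeze_icmp_multuses, after the freeze on the
// icmp has already been pushed to %x and %y (assumed starting point).
std::vector<Inst> demoSelectMultiUse() {
  std::vector<Inst> block = {
      {"%y.fr", "freeze", {"%y"}},
      {"%x.fr", "freeze", {"%x"}},
      {"%c", "icmp ne", {"%x.fr", "%y.fr"}},
      {"%v", "select", {"%c", "%x", "%y"}},
  };
  replaceUsesAfterFreeze(block, 0); // %y -> %y.fr in later instructions
  replaceUsesAfterFreeze(block, 1); // %x -> %x.fr in later instructions
  return block;
}
```

After both calls the toy `select` reads `%c, %x.fr, %y.fr`, so a select on `icmp ne` of the same two values can fold to `%x.fr`. That is consistent with the updated CHECK lines for `select_freeze_icmp_multuses`, where the frozen `%x` is passed to `@use_i1_i32` directly.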
Thanks for the patch! This fixes an issue which the tests I added in #145541 demonstrate. I'll abandon my attempt #150420 at fixing that as this seems like a nicer approach.
@david-arm and I also considered what you're doing here when looking at this code. One thing that put me off is that it seemed at odds with the "total number of freezes" metric, where any reduction is seen as a good thing. I'm not sure if my assessment there is correct; it is just something I observed when I started looking into this and reading old slides etc. It occurred to me that perhaps what matters is not the number of freezes but how much of the DAG is frozen, so aggressively pushing freezes upwards makes sense to me 👍
left a couple of nits and a question, but otherwise LGTM cheers
-; CHECK-NEXT: [[A_NEG:%.*]] = sub nsw i8 [[A2:%.*]], [[A1:%.*]]
-; CHECK-NEXT: [[FR_NEG:%.*]] = freeze i8 [[A_NEG]]
-; CHECK-NEXT: ret i8 [[FR_NEG]]
+; CHECK-NEXT: [[FR_NEG:%.*]] = freeze i8 [[A_NEG:%.*]]
nit: CHECK variable names are out of sync with IR here and elsewhere, could do with re-generating these tests with --reset-variable-names.
could the PHI node restriction also be removed so freeze can be pushed through PHIs?
The phi case is trickier because we need to make sure that we don't end up pushing a freeze around a loop. We should at least extend foldFreezeIntoRecurrence() to allow freezing multiple out-of-loop values though.
InstCombine tries to convert `freeze(inst(op))` to `inst(freeze(op))`. Currently, this is limited to the case where a single operand needs to be frozen, and all other operands are guaranteed non-poison.

This patch allows the transform even if multiple operands need to be frozen. The existing limitation makes sure that we do not increase the total number of freezes, but it also means that we may fail to eliminate freezes (via poison flag dropping) and may prevent optimizations (as analysis generally can't look past freeze). Overall, I believe that aggressively pushing freezes upwards is more beneficial than harmful.

This is the middle-end version of llvm#145939 in DAGCombine (which is currently reverted for SDAG-specific reasons).

To avoid compile time issues when pushing freeze up a large value graph, I've rewritten the implementation to directly push freeze to the leaves, instead of doing this step by step.
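As a rough illustration of pushing a freeze straight to the leaves in one traversal, here is a toy model in plain C++. It is not the code from this patch: `Node`, `pushFreezeToLeaves`, and `demoNegNswFreeze` are hypothetical names, and the model deliberately ignores restrictions raised in the discussion (PHI nodes, use counts). It assumes each node knows whether it can create poison even without its flags and whether it is already guaranteed non-poison.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy value graph node (hypothetical, not an LLVM class).
struct Node {
  std::string name;
  std::vector<Node *> ops;     // empty for leaves such as arguments
  bool createsPoison = false;  // can create poison even with flags dropped
  bool hasPoisonFlags = false; // droppable flags such as nsw
  bool knownNonPoison = false; // already guaranteed not to be poison
  bool frozen = false;         // set when a freeze is placed on this node

  explicit Node(std::string n, std::vector<Node *> o = {})
      : name(std::move(n)), ops(std::move(o)) {}
};

// Walk from the frozen root towards the leaves with a worklist. Nodes that
// only propagate poison are looked through and lose their poison flags; any
// node we cannot look through is frozen in place. Returns the frozen nodes.
std::vector<Node *> pushFreezeToLeaves(Node *root) {
  assert(root != nullptr);
  std::vector<Node *> worklist{root};
  std::set<Node *> visited;
  std::vector<Node *> frozenNodes;
  while (!worklist.empty()) {
    Node *n = worklist.back();
    worklist.pop_back();
    if (!visited.insert(n).second)
      continue; // shared operand already handled
    if (n->knownNonPoison)
      continue; // no freeze needed
    if (n->ops.empty() || n->createsPoison) {
      n->frozen = true; // cannot push further, freeze here
      frozenNodes.push_back(n);
      continue;
    }
    n->hasPoisonFlags = false; // flags are dropped when looking through
    for (Node *op : n->ops)
      worklist.push_back(op);
  }
  return frozenNodes;
}

// Loosely mirrors neg_nsw_freeze: freeze(sub nsw a1, a2) becomes
// sub(freeze a1, freeze a2) with the nsw flag dropped.
int demoNegNswFreeze() {
  Node a1("a1"), a2("a2");
  Node sub("sub", {&a1, &a2});
  sub.hasPoisonFlags = true;
  std::vector<Node *> frozenNodes = pushFreezeToLeaves(&sub);
  assert(!sub.hasPoisonFlags);
  return static_cast<int>(frozenNodes.size());
}
```

In this toy, the visited set keeps a shared operand from being frozen twice, and the single worklist pass avoids re-running a one-step push at every level of a deep value graph, which is the compile-time concern the commit message mentions.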
LG
- // follows.
+ // poison as far as possible. If an operand of freeze does not produce poison
+ // then push the freeze through to the operands that are not guaranteed
+ // non-poison. The actual transform is as follows.
// Op1 = ... ; Op1 can be posion
- // Op1 = ... ; Op1 can be posion
+ // Op1 = ... ; Op1 can be poison
-; CHECK-NEXT: [[OP:%.*]] = call float @llvm.powi.f32.i32(float [[ARG0:%.*]], i32 [[ARG1:%.*]])
-; CHECK-NEXT: [[FREEZE:%.*]] = freeze float [[OP]]
-; CHECK-NEXT: ret float [[FREEZE]]
+; CHECK-NEXT: [[FREEZE:%.*]] = freeze float [[OP:%.*]]
-; CHECK-NEXT: [[FREEZE:%.*]] = freeze float [[OP:%.*]]
+; CHECK-NEXT: [[FREEZE:%.*]] = freeze float [[ARG0:%.*]]
Can you please regenerate the check lines?