-
Notifications
You must be signed in to change notification settings - Fork 14.7k
[IRCE] Check loop clone legality before attempting to do so #151062
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Not all loop bodies are safe to clone. For example, they may have `noduplicate` call in them. Additionally, if the transformation may cause any of the cloned loop bodies to execute conditionally, we must also disallow `convergent` calls, as well as any call that returns a token that is used outside of the loop. The loop unswitching pass had an appropriate legality check for this, but IRCE did not, causing issues downstream (JuliaLang/julia#48918). As a particular example, consider the test case in this commit, where `opt -passes=irce`, creates invalid IR: ``` opt --passes=irce llvm/test/Transforms/IRCE/pre_post_loops_clone.ll WARNING: You're attempting to print out a bitcode file. This is inadvisable as it may cause display problems. If you REALLY want to taste LLVM bitcode first-hand, you can force output with the `-f' option. Instruction does not dominate all uses! %token = call token @llvm.source_token() call void @llvm.sink_token(token %token) Instruction does not dominate all uses! %token = call token @llvm.source_token() call void @llvm.sink_token(token %token) LLVM ERROR: Broken module found, compilation aborted! ``` Fix this by factoring out the legality check from the loop unswitch pass and using it in IRCE as well. I personally don't have any code that is `noduplicate` or `convergent`, but I don't see any reason why IRCE would be different here. I will also mention llvm#56243 as related, which contains some related discussion of LCSSA in the presence of tokens. Fixes llvm#63984
@llvm/pr-subscribers-llvm-transforms Author: Keno Fischer (Keno) ChangesNot all loop bodies are safe to clone. For example, they may have As a particular example, consider the test case in this commit, where
Fix this by factoring out the legality check from the loop unswitch pass and using it in IRCE as well. I personally don't have any code that is I will also mention #56243 as related, which contains some related discussion of LCSSA in the presence of tokens. Fixes #63984 Full diff: https://github.com/llvm/llvm-project/pull/151062.diff 5 Files Affected:
diff --git a/llvm/include/llvm/Analysis/LoopInfo.h b/llvm/include/llvm/Analysis/LoopInfo.h
index a7a6a2753709c..7969d3f3d061c 100644
--- a/llvm/include/llvm/Analysis/LoopInfo.h
+++ b/llvm/include/llvm/Analysis/LoopInfo.h
@@ -327,6 +327,12 @@ class LLVM_ABI Loop : public LoopBase<BasicBlock, Loop> {
/// Return true if the loop body is safe to clone in practice.
bool isSafeToClone() const;
+ /// Like `isSafeToClone`, but also checks whether we may form phis for all
+ /// values that are live-out from the loop. This is required if either of
+ /// the cloned loop bodies may run conditionally.
+ bool isSafeToCloneConditionally(const DominatorTree &DT,
+ bool AllowConvergent = false) const;
+
/// Returns true if the loop is annotated parallel.
///
/// A parallel loop can be assumed to not contain any dependencies between
diff --git a/llvm/lib/Analysis/LoopInfo.cpp b/llvm/lib/Analysis/LoopInfo.cpp
index 901cfe03ecd33..833c7bfe71073 100644
--- a/llvm/lib/Analysis/LoopInfo.cpp
+++ b/llvm/lib/Analysis/LoopInfo.cpp
@@ -429,6 +429,27 @@ bool Loop::isCanonical(ScalarEvolution &SE) const {
return true;
}
+static bool loopContainsUser(const Loop &L, const BasicBlock &BB, const Use &U,
+ const DominatorTree &DT) {
+ const Instruction *UI = cast<Instruction>(U.getUser());
+ const BasicBlock *UserBB = UI->getParent();
+
+ // For practical purposes, we consider that the use in a PHI
+ // occurs in the respective predecessor block. For more info,
+ // see the `phi` doc in LangRef and the LCSSA doc.
+ if (const PHINode *P = dyn_cast<PHINode>(UI))
+ UserBB = P->getIncomingBlock(U);
+
+ // Check the current block, as a fast-path, before checking whether
+ // the use is anywhere in the loop. Most values are used in the same
+ // block they are defined in. Also, blocks not reachable from the
+ // entry are special; uses in them don't need to go through PHIs.
+ if (UserBB != &BB && !L.contains(UserBB) && DT.isReachableFromEntry(UserBB))
+ return false;
+
+ return true;
+}
+
// Check that 'BB' doesn't have any uses outside of the 'L'
static bool isBlockInLCSSAForm(const Loop &L, const BasicBlock &BB,
const DominatorTree &DT, bool IgnoreTokens) {
@@ -440,21 +461,7 @@ static bool isBlockInLCSSAForm(const Loop &L, const BasicBlock &BB,
continue;
for (const Use &U : I.uses()) {
- const Instruction *UI = cast<Instruction>(U.getUser());
- const BasicBlock *UserBB = UI->getParent();
-
- // For practical purposes, we consider that the use in a PHI
- // occurs in the respective predecessor block. For more info,
- // see the `phi` doc in LangRef and the LCSSA doc.
- if (const PHINode *P = dyn_cast<PHINode>(UI))
- UserBB = P->getIncomingBlock(U);
-
- // Check the current block, as a fast-path, before checking whether
- // the use is anywhere in the loop. Most values are used in the same
- // block they are defined in. Also, blocks not reachable from the
- // entry are special; uses in them don't need to go through PHIs.
- if (UserBB != &BB && !L.contains(UserBB) &&
- DT.isReachableFromEntry(UserBB))
+ if (!loopContainsUser(L, BB, U, DT))
return false;
}
}
@@ -500,6 +507,29 @@ bool Loop::isSafeToClone() const {
return true;
}
+bool Loop::isSafeToCloneConditionally(const DominatorTree &DT,
+ bool AllowConvergent) const {
+ // Return false if any loop blocks contain indirectbrs, or there are any calls
+ // to noduplicate functions.
+ for (BasicBlock *BB : this->blocks()) {
+ if (isa<IndirectBrInst>(BB->getTerminator()))
+ return false;
+
+ for (Instruction &I : *BB) {
+ if (I.getType()->isTokenTy()) {
+ for (const Use &U : I.uses()) {
+ if (!loopContainsUser(*this, *BB, U, DT))
+ return false;
+ }
+ }
+ if (auto *CB = dyn_cast<CallBase>(&I))
+ if (CB->cannotDuplicate() || (AllowConvergent || CB->isConvergent()))
+ return false;
+ }
+ }
+ return true;
+}
+
MDNode *Loop::getLoopID() const {
MDNode *LoopID = nullptr;
diff --git a/llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp b/llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp
index 9b40fc03da6bb..d046929061ea6 100644
--- a/llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp
+++ b/llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp
@@ -3277,19 +3277,10 @@ static bool collectUnswitchCandidatesWithInjections(
return Found;
}
-static bool isSafeForNoNTrivialUnswitching(Loop &L, LoopInfo &LI) {
- if (!L.isSafeToClone())
+static bool isSafeForNoNTrivialUnswitching(const DominatorTree &DT, Loop &L,
+ LoopInfo &LI) {
+ if (!L.isSafeToCloneConditionally(DT))
return false;
- for (auto *BB : L.blocks())
- for (auto &I : *BB) {
- if (I.getType()->isTokenTy() && I.isUsedOutsideOfBlock(BB))
- return false;
- if (auto *CB = dyn_cast<CallBase>(&I)) {
- assert(!CB->cannotDuplicate() && "Checked by L.isSafeToClone().");
- if (CB->isConvergent())
- return false;
- }
- }
// Check if there are irreducible CFG cycles in this loop. If so, we cannot
// easily unswitch non-trivial edges out of the loop. Doing so might turn the
@@ -3665,7 +3656,7 @@ static bool unswitchLoop(Loop &L, DominatorTree &DT, LoopInfo &LI,
}
// Perform legality checks.
- if (!isSafeForNoNTrivialUnswitching(L, LI))
+ if (!isSafeForNoNTrivialUnswitching(DT, L, LI))
return false;
// For non-trivial unswitching, because it often creates new loops, we rely on
diff --git a/llvm/lib/Transforms/Utils/LoopConstrainer.cpp b/llvm/lib/Transforms/Utils/LoopConstrainer.cpp
index 8f103153059e8..56e806222e764 100644
--- a/llvm/lib/Transforms/Utils/LoopConstrainer.cpp
+++ b/llvm/lib/Transforms/Utils/LoopConstrainer.cpp
@@ -750,6 +750,13 @@ bool LoopConstrainer::run() {
const SCEVConstant *MinusOneS =
cast<SCEVConstant>(SE.getConstant(IVTy, -1, true /* isSigned */));
+ if ((NeedsPreLoop || NeedsPostLoop) &&
+ !OriginalLoop.isSafeToCloneConditionally(DT)) {
+ LLVM_DEBUG(dbgs() << "cannot clone loop " << OriginalLoop.getName()
+ << " because it is not safe to clone.\n");
+ return false;
+ }
+
if (NeedsPreLoop) {
const SCEV *ExitPreLoopAtSCEV = nullptr;
diff --git a/llvm/test/Transforms/IRCE/pre_post_loops_clone.ll b/llvm/test/Transforms/IRCE/pre_post_loops_clone.ll
new file mode 100644
index 0000000000000..2ad20c15b569d
--- /dev/null
+++ b/llvm/test/Transforms/IRCE/pre_post_loops_clone.ll
@@ -0,0 +1,61 @@
+; RUN: opt -verify-loop-info -irce-print-changed-loops -passes=irce -S < %s 2>&1 | FileCheck %s
+; This test is the same as pre_post_loops.ll, except that the loop body contains a token-generating
+; call. We need to ensure that IRCE does not try to introduce a PHI or otherwise generate invalid IE.
+
+declare token @llvm.source_token()
+declare void @llvm.sink_token(token)
+
+; CHECK: define void @test_01
+define void @test_01(ptr %arr, ptr %a_len_ptr) {
+entry:
+ %len = load i32, ptr %a_len_ptr, !range !0
+ br label %loop
+
+loop:
+ %idx = phi i32 [ 0, %entry ], [ %idx.next, %in.bounds ]
+ %idx.next = add i32 %idx, 1
+ %abc = icmp slt i32 %idx, %len
+ %token = call token @llvm.source_token()
+ br i1 %abc, label %in.bounds, label %out.of.bounds
+
+in.bounds:
+ %addr = getelementptr i32, ptr %arr, i32 %idx
+ store i32 0, ptr %addr
+ %next = icmp slt i32 %idx.next, 2147483647
+ br i1 %next, label %loop, label %exit
+
+out.of.bounds:
+ ret void
+
+exit:
+ call void @llvm.sink_token(token %token)
+ ret void
+}
+
+define void @test_02(ptr %arr, ptr %a_len_ptr) {
+entry:
+ %len = load i32, ptr %a_len_ptr, !range !0
+ br label %loop
+
+loop:
+ %idx = phi i32 [ 2147483647, %entry ], [ %idx.next, %in.bounds ]
+ %idx.next = add i32 %idx, -1
+ %abc = icmp slt i32 %idx, %len
+ %token = call token @llvm.source_token()
+ br i1 %abc, label %in.bounds, label %out.of.bounds
+
+in.bounds:
+ %addr = getelementptr i32, ptr %arr, i32 %idx
+ store i32 0, ptr %addr
+ %next = icmp sgt i32 %idx.next, -1
+ br i1 %next, label %loop, label %exit
+
+out.of.bounds:
+ ret void
+
+exit:
+ call void @llvm.sink_token(token %token)
+ ret void
+}
+
+!0 = !{i32 0, i32 50}
|
@llvm/pr-subscribers-llvm-analysis Author: Keno Fischer (Keno) ChangesNot all loop bodies are safe to clone. For example, they may have As a particular example, consider the test case in this commit, where
Fix this by factoring out the legality check from the loop unswitch pass and using it in IRCE as well. I personally don't have any code that is I will also mention #56243 as related, which contains some related discussion of LCSSA in the presence of tokens. Fixes #63984 Full diff: https://github.com/llvm/llvm-project/pull/151062.diff 5 Files Affected:
diff --git a/llvm/include/llvm/Analysis/LoopInfo.h b/llvm/include/llvm/Analysis/LoopInfo.h
index a7a6a2753709c..7969d3f3d061c 100644
--- a/llvm/include/llvm/Analysis/LoopInfo.h
+++ b/llvm/include/llvm/Analysis/LoopInfo.h
@@ -327,6 +327,12 @@ class LLVM_ABI Loop : public LoopBase<BasicBlock, Loop> {
/// Return true if the loop body is safe to clone in practice.
bool isSafeToClone() const;
+ /// Like `isSafeToClone`, but also checks whether we may form phis for all
+ /// values that are live-out from the loop. This is required if either of
+ /// the cloned loop bodies may run conditionally.
+ bool isSafeToCloneConditionally(const DominatorTree &DT,
+ bool AllowConvergent = false) const;
+
/// Returns true if the loop is annotated parallel.
///
/// A parallel loop can be assumed to not contain any dependencies between
diff --git a/llvm/lib/Analysis/LoopInfo.cpp b/llvm/lib/Analysis/LoopInfo.cpp
index 901cfe03ecd33..833c7bfe71073 100644
--- a/llvm/lib/Analysis/LoopInfo.cpp
+++ b/llvm/lib/Analysis/LoopInfo.cpp
@@ -429,6 +429,27 @@ bool Loop::isCanonical(ScalarEvolution &SE) const {
return true;
}
+static bool loopContainsUser(const Loop &L, const BasicBlock &BB, const Use &U,
+ const DominatorTree &DT) {
+ const Instruction *UI = cast<Instruction>(U.getUser());
+ const BasicBlock *UserBB = UI->getParent();
+
+ // For practical purposes, we consider that the use in a PHI
+ // occurs in the respective predecessor block. For more info,
+ // see the `phi` doc in LangRef and the LCSSA doc.
+ if (const PHINode *P = dyn_cast<PHINode>(UI))
+ UserBB = P->getIncomingBlock(U);
+
+ // Check the current block, as a fast-path, before checking whether
+ // the use is anywhere in the loop. Most values are used in the same
+ // block they are defined in. Also, blocks not reachable from the
+ // entry are special; uses in them don't need to go through PHIs.
+ if (UserBB != &BB && !L.contains(UserBB) && DT.isReachableFromEntry(UserBB))
+ return false;
+
+ return true;
+}
+
// Check that 'BB' doesn't have any uses outside of the 'L'
static bool isBlockInLCSSAForm(const Loop &L, const BasicBlock &BB,
const DominatorTree &DT, bool IgnoreTokens) {
@@ -440,21 +461,7 @@ static bool isBlockInLCSSAForm(const Loop &L, const BasicBlock &BB,
continue;
for (const Use &U : I.uses()) {
- const Instruction *UI = cast<Instruction>(U.getUser());
- const BasicBlock *UserBB = UI->getParent();
-
- // For practical purposes, we consider that the use in a PHI
- // occurs in the respective predecessor block. For more info,
- // see the `phi` doc in LangRef and the LCSSA doc.
- if (const PHINode *P = dyn_cast<PHINode>(UI))
- UserBB = P->getIncomingBlock(U);
-
- // Check the current block, as a fast-path, before checking whether
- // the use is anywhere in the loop. Most values are used in the same
- // block they are defined in. Also, blocks not reachable from the
- // entry are special; uses in them don't need to go through PHIs.
- if (UserBB != &BB && !L.contains(UserBB) &&
- DT.isReachableFromEntry(UserBB))
+ if (!loopContainsUser(L, BB, U, DT))
return false;
}
}
@@ -500,6 +507,29 @@ bool Loop::isSafeToClone() const {
return true;
}
+bool Loop::isSafeToCloneConditionally(const DominatorTree &DT,
+ bool AllowConvergent) const {
+ // Return false if any loop blocks contain indirectbrs, or there are any calls
+ // to noduplicate functions.
+ for (BasicBlock *BB : this->blocks()) {
+ if (isa<IndirectBrInst>(BB->getTerminator()))
+ return false;
+
+ for (Instruction &I : *BB) {
+ if (I.getType()->isTokenTy()) {
+ for (const Use &U : I.uses()) {
+ if (!loopContainsUser(*this, *BB, U, DT))
+ return false;
+ }
+ }
+ if (auto *CB = dyn_cast<CallBase>(&I))
+ if (CB->cannotDuplicate() || (AllowConvergent || CB->isConvergent()))
+ return false;
+ }
+ }
+ return true;
+}
+
MDNode *Loop::getLoopID() const {
MDNode *LoopID = nullptr;
diff --git a/llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp b/llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp
index 9b40fc03da6bb..d046929061ea6 100644
--- a/llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp
+++ b/llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp
@@ -3277,19 +3277,10 @@ static bool collectUnswitchCandidatesWithInjections(
return Found;
}
-static bool isSafeForNoNTrivialUnswitching(Loop &L, LoopInfo &LI) {
- if (!L.isSafeToClone())
+static bool isSafeForNoNTrivialUnswitching(const DominatorTree &DT, Loop &L,
+ LoopInfo &LI) {
+ if (!L.isSafeToCloneConditionally(DT))
return false;
- for (auto *BB : L.blocks())
- for (auto &I : *BB) {
- if (I.getType()->isTokenTy() && I.isUsedOutsideOfBlock(BB))
- return false;
- if (auto *CB = dyn_cast<CallBase>(&I)) {
- assert(!CB->cannotDuplicate() && "Checked by L.isSafeToClone().");
- if (CB->isConvergent())
- return false;
- }
- }
// Check if there are irreducible CFG cycles in this loop. If so, we cannot
// easily unswitch non-trivial edges out of the loop. Doing so might turn the
@@ -3665,7 +3656,7 @@ static bool unswitchLoop(Loop &L, DominatorTree &DT, LoopInfo &LI,
}
// Perform legality checks.
- if (!isSafeForNoNTrivialUnswitching(L, LI))
+ if (!isSafeForNoNTrivialUnswitching(DT, L, LI))
return false;
// For non-trivial unswitching, because it often creates new loops, we rely on
diff --git a/llvm/lib/Transforms/Utils/LoopConstrainer.cpp b/llvm/lib/Transforms/Utils/LoopConstrainer.cpp
index 8f103153059e8..56e806222e764 100644
--- a/llvm/lib/Transforms/Utils/LoopConstrainer.cpp
+++ b/llvm/lib/Transforms/Utils/LoopConstrainer.cpp
@@ -750,6 +750,13 @@ bool LoopConstrainer::run() {
const SCEVConstant *MinusOneS =
cast<SCEVConstant>(SE.getConstant(IVTy, -1, true /* isSigned */));
+ if ((NeedsPreLoop || NeedsPostLoop) &&
+ !OriginalLoop.isSafeToCloneConditionally(DT)) {
+ LLVM_DEBUG(dbgs() << "cannot clone loop " << OriginalLoop.getName()
+ << " because it is not safe to clone.\n");
+ return false;
+ }
+
if (NeedsPreLoop) {
const SCEV *ExitPreLoopAtSCEV = nullptr;
diff --git a/llvm/test/Transforms/IRCE/pre_post_loops_clone.ll b/llvm/test/Transforms/IRCE/pre_post_loops_clone.ll
new file mode 100644
index 0000000000000..2ad20c15b569d
--- /dev/null
+++ b/llvm/test/Transforms/IRCE/pre_post_loops_clone.ll
@@ -0,0 +1,61 @@
+; RUN: opt -verify-loop-info -irce-print-changed-loops -passes=irce -S < %s 2>&1 | FileCheck %s
+; This test is the same as pre_post_loops.ll, except that the loop body contains a token-generating
+; call. We need to ensure that IRCE does not try to introduce a PHI or otherwise generate invalid IE.
+
+declare token @llvm.source_token()
+declare void @llvm.sink_token(token)
+
+; CHECK: define void @test_01
+define void @test_01(ptr %arr, ptr %a_len_ptr) {
+entry:
+ %len = load i32, ptr %a_len_ptr, !range !0
+ br label %loop
+
+loop:
+ %idx = phi i32 [ 0, %entry ], [ %idx.next, %in.bounds ]
+ %idx.next = add i32 %idx, 1
+ %abc = icmp slt i32 %idx, %len
+ %token = call token @llvm.source_token()
+ br i1 %abc, label %in.bounds, label %out.of.bounds
+
+in.bounds:
+ %addr = getelementptr i32, ptr %arr, i32 %idx
+ store i32 0, ptr %addr
+ %next = icmp slt i32 %idx.next, 2147483647
+ br i1 %next, label %loop, label %exit
+
+out.of.bounds:
+ ret void
+
+exit:
+ call void @llvm.sink_token(token %token)
+ ret void
+}
+
+define void @test_02(ptr %arr, ptr %a_len_ptr) {
+entry:
+ %len = load i32, ptr %a_len_ptr, !range !0
+ br label %loop
+
+loop:
+ %idx = phi i32 [ 2147483647, %entry ], [ %idx.next, %in.bounds ]
+ %idx.next = add i32 %idx, -1
+ %abc = icmp slt i32 %idx, %len
+ %token = call token @llvm.source_token()
+ br i1 %abc, label %in.bounds, label %out.of.bounds
+
+in.bounds:
+ %addr = getelementptr i32, ptr %arr, i32 %idx
+ store i32 0, ptr %addr
+ %next = icmp sgt i32 %idx.next, -1
+ br i1 %next, label %loop, label %exit
+
+out.of.bounds:
+ ret void
+
+exit:
+ call void @llvm.sink_token(token %token)
+ ret void
+}
+
+!0 = !{i32 0, i32 50}
|
LGTM. Though I would note the description makes it sound NFC for the existing loop unstitching, but it looks like you also improved the used outside BB check to be a more permissive used outside loop check, which should perhaps be noted in the commit message |
Yes, that's correct, since I also unified the logic with isBlockInLCSSAForm and that's what I wanted for IRCE to fix this bug (while not pessimizing unnecessarily for uses inside the loop). |
@xortator @aleks-tmb Any comments on this? Would be good to get input from people that have touched IRCE in the past. I would like to get this in to address the compiler error. I guess I'll wait another few days, otherwise, I'll make the commit message adjustment that @vtjnash requested and get this in. |
Not all loop bodies are safe to clone. For example, they may have
noduplicate
call in them. Additionally, if the transformation may cause any of the cloned loop bodies to execute conditionally, we must also disallowconvergent
calls, as well as any call that returns a token that is used outside of the loop. The loop unswitching pass had an appropriate legality check for this, but IRCE did not, causing issues downstream (JuliaLang/julia#48918).As a particular example, consider the test case in this commit, where
opt -passes=irce
, creates invalid IR:Fix this by factoring out the legality check from the loop unswitch pass and using it in IRCE as well. I personally don't have any code that is
noduplicate
orconvergent
, but I don't see any reason why IRCE would be different here.I will also mention #56243 as related, which contains some related discussion of LCSSA in the presence of tokens.
Fixes #63984