Skip to content

[MSSAUpdater] Replace recursion with worklist and cap it. #150543

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

alinas
Copy link
Contributor

@alinas alinas commented Jul 24, 2025

Replace the recursion used for finding and removing trivial MemoryPhis with a worklist approach and add a cap for it for pathological cases.

This came up in a profile that also found #150531, where a lot of time was found on copying the MemoryPhi operands during the recusion.

Replace the recursion used for finding and removing trivial MemoryPhis
with a worklist approach and add a cap for it for pathological cases.

This came up in a profile that also found #Issue150531, where a lot of
time was found on copying the MemoryPhi operands during the recusion.
@alinas alinas added the llvm:analysis Includes value tracking, cost tables and constant folding label Jul 24, 2025
@llvmbot
Copy link
Member

llvmbot commented Jul 24, 2025

@llvm/pr-subscribers-llvm-analysis

Author: Alina Sbirlea (alinas)

Changes

Replace the recursion used for finding and removing trivial MemoryPhis with a worklist approach and add a cap for it for pathological cases.

This came up in a profile that also found #Issue150531, where a lot of time was found on copying the MemoryPhi operands during the recusion.


Full diff: https://github.com/llvm/llvm-project/pull/150543.diff

2 Files Affected:

  • (modified) llvm/include/llvm/Analysis/MemorySSAUpdater.h (-1)
  • (modified) llvm/lib/Analysis/MemorySSAUpdater.cpp (+60-15)
diff --git a/llvm/include/llvm/Analysis/MemorySSAUpdater.h b/llvm/include/llvm/Analysis/MemorySSAUpdater.h
index 96bf99922d848..8c03f241e8609 100644
--- a/llvm/include/llvm/Analysis/MemorySSAUpdater.h
+++ b/llvm/include/llvm/Analysis/MemorySSAUpdater.h
@@ -257,7 +257,6 @@ class MemorySSAUpdater {
   MemoryAccess *
   getPreviousDefRecursive(BasicBlock *,
                           DenseMap<BasicBlock *, TrackingVH<MemoryAccess>> &);
-  MemoryAccess *recursePhi(MemoryAccess *Phi);
   MemoryAccess *tryRemoveTrivialPhi(MemoryPhi *Phi);
   template <class RangeType>
   MemoryAccess *tryRemoveTrivialPhi(MemoryPhi *Phi, RangeType &Operands);
diff --git a/llvm/lib/Analysis/MemorySSAUpdater.cpp b/llvm/lib/Analysis/MemorySSAUpdater.cpp
index ecfecb03c3757..56743e7e7caca 100644
--- a/llvm/lib/Analysis/MemorySSAUpdater.cpp
+++ b/llvm/lib/Analysis/MemorySSAUpdater.cpp
@@ -18,12 +18,17 @@
 #include "llvm/Analysis/MemorySSA.h"
 #include "llvm/IR/BasicBlock.h"
 #include "llvm/IR/Dominators.h"
+#include "llvm/Support/CommandLine.h"
 #include "llvm/Support/Debug.h"
 #include <algorithm>
 
 #define DEBUG_TYPE "memoryssa"
 using namespace llvm;
 
+static cl::opt<uint32_t> TrivialPhiProcessingLimit(
+    "mssaupdater-processing-limit", cl::Hidden, cl::init(20),
+    cl::desc("The worklist limit for trivial phi removal (default = 20)"));
+
 // This is the marker algorithm from "Simple and Efficient Construction of
 // Static Single Assignment Form"
 // The simple, non-marker algorithm places phi nodes at any join
@@ -181,18 +186,6 @@ MemoryAccess *MemorySSAUpdater::getPreviousDefFromEnd(
 
   return getPreviousDefRecursive(BB, CachedPreviousDef);
 }
-// Recurse over a set of phi uses to eliminate the trivial ones
-MemoryAccess *MemorySSAUpdater::recursePhi(MemoryAccess *Phi) {
-  if (!Phi)
-    return nullptr;
-  TrackingVH<MemoryAccess> Res(Phi);
-  SmallVector<TrackingVH<Value>, 8> Uses;
-  std::copy(Phi->user_begin(), Phi->user_end(), std::back_inserter(Uses));
-  for (auto &U : Uses)
-    if (MemoryPhi *UsePhi = dyn_cast<MemoryPhi>(&*U))
-      tryRemoveTrivialPhi(UsePhi);
-  return Res;
-}
 
 // Eliminate trivial phis
 // Phis are trivial if they are defined either by themselves, or all the same
@@ -230,9 +223,61 @@ MemoryAccess *MemorySSAUpdater::tryRemoveTrivialPhi(MemoryPhi *Phi,
     removeMemoryAccess(Phi);
   }
 
-  // We should only end up recursing in case we replaced something, in which
-  // case, we may have made other Phis trivial.
-  return recursePhi(Same);
+  // Continue traversal in a DFS worklist approach, in case we might find
+  // other trivial Phis.
+  if (!Same)
+    return nullptr;
+
+  TrackingVH<MemoryAccess> Result(Same);
+
+  // Worklist approach to recursively removing trivial Phis.
+  SmallVector<TrackingVH<Value>, 5> Worklist;
+
+  for (auto U : Same->users()) {
+    if (dyn_cast<MemoryPhi>(&*U))
+      Worklist.push_back(TrackingVH<Value>(U));
+  }
+
+  while (!Worklist.empty() || Worklist.size() > TrivialPhiProcessingLimit) {
+    MemoryPhi *RecPhi = dyn_cast<MemoryPhi>(&*(Worklist[Worklist.size() - 1]));
+    Worklist.pop_back();
+
+    if (!RecPhi)
+      continue;
+    auto RecOperands = RecPhi->operands();
+
+    // Repeat above algorithm.
+    if (NonOptPhis.count(RecPhi))
+      continue;
+
+    bool nextone = false;
+    MemoryAccess *RecSame = nullptr;
+    for (auto &Op : RecOperands) {
+      if (Op == RecPhi || Op == RecSame)
+        continue;
+      if (RecSame) {
+        nextone = true;
+        break;
+      }
+      RecSame = cast<MemoryAccess>(&*Op);
+    }
+
+    // Move on to the next Phi in the worklist.
+    if (nextone || RecSame == nullptr)
+      continue;
+
+    if (RecPhi) {
+      RecPhi->replaceAllUsesWith(RecSame);
+      removeMemoryAccess(RecPhi);
+    }
+
+    for (auto U : RecSame->users()) {
+      if (dyn_cast<MemoryPhi>(&*U))
+        Worklist.push_back(TrackingVH<Value>(U));
+    }
+  }
+
+  return Result;
 }
 
 void MemorySSAUpdater::insertUse(MemoryUse *MU, bool RenameUses) {

if (!Same)
return nullptr;

TrackingVH<MemoryAccess> Result(Same);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the key issue in this code is not really that we can visit many users (which should be ok as they are only visited if an actual simplification occurs), but that we use TrackingVH for ... everything here, which is going to be very expensive.

Can we avoid it by using a visited set so that we don't have to worry about revisiting a phi that has been erased in a different iteration?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fully agree with you on this one. The use of TrackingVH has been really baked into MSSA and its updater since the beginning and this is trying to deal with that (e.g.
MemorySSAUpdater.cpp:68 passes in a VH, and removing phis MemorySSAUpdater.cpp:1326 checks if a VH is passed in, to do RAUW). My initial attempts to use a visited list here and drop the VH resulted in asserting where a VH was expected (in Value.cpp ).

I'm happy to keep trying to address this and I'm open to feedback on what I missed when triggering the assert. The updated profile with this change, for the original test that found this issue, does show too much time spent on creating VH here. But it is a step fwd since with this change I can cap the recursion and number of phis processed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed a change that I think resolves this. PTAL?

continue;
auto RecOperands = RecPhi->operands();

// Repeat above algorithm.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of repeating it, can we handle the base case by starting with a one element worklist?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the issue here is that this API is used in two scenarios: 1) when looking at a normal MemoryPhi (called from tryRemoveTrivialPhi with the phi's operands), in which case this is just repeating the algorithm, and 2) when looking at an empty MemoryPhi with potential operands (during an update) and the goal is deciding whether to actually complete that Phi. This second case is also the only scenario where the return value is used.
Because the two cases have different types for how the list of operands is represented, this is templated to use a range, and this is also why this was originally a recursive function, so the range type can change (it only really changes at most once though).

I don't know how to resolve this without some duplication while avoiding copying the operands. Here's an alternative, but I'm open to suggestions.

// common case, public API
tryRemoveTrivialPhi(Phi) {
// calls tryRemoveTrivialPhiHelper and discards its return value, void
tryRemoveTrivialPhiHelper
}
// uncommon case, public API, remove template
tryRemoveIncompelteTrivialPhi(Phi, SmallVector<>& PhiOps) {
// duplicates the algorithm for the incoming range, before calling the helper
// returns the value from the helper
tryRemoveTrivialPhiHelper
}
// private API, returns a Phi
tryRemoveTrivialPhiHelper(Phi) {
// worklist algorithm using the Phi's operands, no repetition here
}

WDYT?

Copy link

github-actions bot commented Jul 28, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
llvm:analysis Includes value tracking, cost tables and constant folding
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants