lib/Dialect/Torch/IR/TorchOps.cpp: fix: use-after-free: erasing an operation during folding #4274

Manewing · 2025-07-16T08:55:40Z

This fixes a SEGFAULT in the GreedyPatternRewriteDriver and adds a missing size check to the torch.aten._assert_tensor_metadata operation.

Erasing an operation during folding is not allowed. Folding the operation may eithermodify it in place or return a set of replacements, but may not erase the operation. (see https://github.com/llvm/llvm-project/blob/e56384ff540e68f9d0500fa27a95354c0730e37b/mlir/lib/Transforms/Utils/GreedyPatternRewriteDriver.cpp#L492-L508)

Doing this causes a SEGFAULT (witnessed on macOS Sequoia 15.5, Apple M4):

Stack dump:
0.	Program arguments: build/bin/torch-mlir-opt -canonicalize --split-input-file -verify-diagnostics test/Dialect/Torch/invalid_canonicalize.mlir
 #0 0x0000000104091524 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (build/bin/torch-mlir-opt+0x10140d524)
 #1 0x000000010408fa5c llvm::sys::RunSignalHandlers() (build/bin/torch-mlir-opt+0x10140ba5c)
 #2 0x0000000104091bc8 SignalHandler(int, __siginfo*, void*) (build/bin/torch-mlir-opt+0x10140dbc8)
 #3 0x0000000181e10624 (/usr/lib/system/libsystem_platform.dylib+0x1804ac624)
 #4 0x0000000103c1f7a8 (anonymous namespace)::GreedyPatternRewriteDriver::processWorklist() (build/bin/torch-mlir-opt+0x100f9b7a8)
 #5 0x0000000103c1cf4c mlir::applyPatternsGreedily(mlir::Region&, mlir::FrozenRewritePatternSet const&, mlir::GreedyRewriteConfig, bool*) (build/bin/torch-mlir-opt+0x100f98f4c)
 #6 0x0000000102c8f62c (anonymous namespace)::Canonicalizer::runOnOperation() (build/bin/torch-mlir-opt+0x10000b62c)
 #7 0x0000000103c72fa4 mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int) (build/bin/torch-mlir-opt+0x100feefa4)
 #8 0x0000000103c750d4 mlir::PassManager::run(mlir::Operation*) (build/bin/torch-mlir-opt+0x100ff10d4)
 #9 0x0000000102c8d774 performActions(llvm::raw_ostream&, std::__1::shared_ptr<llvm::SourceMgr> const&, mlir::MLIRContext*, mlir::MlirOptMainConfig const&) (build/bin/torch-mlir-opt+0x100009774)
#10 0x0000000102c8d35c llvm::LogicalResult llvm::function_ref<llvm::LogicalResult (std::__1::unique_ptr<llvm::MemoryBuffer, std::__1::default_delete<llvm::MemoryBuffer>>, llvm::raw_ostream&)>::callback_fn<mlir::MlirOptMain(llvm::raw_ostream&, std::__1::unique_ptr<llvm::MemoryBuffer, std::__1::default_delete<llvm::MemoryBuffer>>, mlir::DialectRegistry&, mlir::MlirOptMainConfig const&)::$_0>(long, std::__1::unique_ptr<llvm::MemoryBuffer, std::__1::default_delete<llvm::MemoryBuffer>>, llvm::raw_ostream&) (build/bin/torch-mlir-opt+0x10000935c)
#11 0x000000010403194c mlir::splitAndProcessBuffer(std::__1::unique_ptr<llvm::MemoryBuffer, std::__1::default_delete<llvm::MemoryBuffer>>, llvm::function_ref<llvm::LogicalResult (std::__1::unique_ptr<llvm::MemoryBuffer, std::__1::default_delete<llvm::MemoryBuffer>>, llvm::raw_ostream&)>, llvm::raw_ostream&, llvm::StringRef, llvm::StringRef)::$_0::operator()(llvm::StringRef) const (build/bin/torch-mlir-opt+0x1013ad94c)
#12 0x00000001040316a4 mlir::splitAndProcessBuffer(std::__1::unique_ptr<llvm::MemoryBuffer, std::__1::default_delete<llvm::MemoryBuffer>>, llvm::function_ref<llvm::LogicalResult (std::__1::unique_ptr<llvm::MemoryBuffer, std::__1::default_delete<llvm::MemoryBuffer>>, llvm::raw_ostream&)>, llvm::raw_ostream&, llvm::StringRef, llvm::StringRef) (build/bin/torch-mlir-opt+0x1013ad6a4)
#13 0x0000000102c87078 mlir::MlirOptMain(llvm::raw_ostream&, std::__1::unique_ptr<llvm::MemoryBuffer, std::__1::default_delete<llvm::MemoryBuffer>>, mlir::DialectRegistry&, mlir::MlirOptMainConfig const&) (build/bin/torch-mlir-opt+0x100003078)
#14 0x0000000102c8731c mlir::MlirOptMain(int, char**, llvm::StringRef, llvm::StringRef, mlir::DialectRegistry&) (build/bin/torch-mlir-opt+0x10000331c)
#15 0x0000000102c87538 mlir::MlirOptMain(int, char**, llvm::StringRef, mlir::DialectRegistry&) (build/bin/torch-mlir-opt+0x100003538)
#16 0x0000000102c85cd0 main (build/bin/torch-mlir-opt+0x100001cd0)
#17 0x0000000181a36b98 
build/tools/torch-mlir/test/Dialect/Torch/Output/invalid_canonicalize.mlir.script: line 1: 72586 Segmentation fault: 11  build/bin/torch-mlir-opt -canonicalize --split-input-file -verify-diagnostics test/Dialect/Torch/invalid_canonicalize.mlir

Since the torch.aten._assert_tensor_metadata operation is only used for static assertion during compile time the folding can be replaced by a canonicalization that checks the assert and then uses a rewriter to erase the operation.

The second commit deals with a missing size check in the assert operation before using a zip operation. Without the explicit checkout of the size, the would assert not fail in case the size of the dimensions were the same, but there are either less or more dimensions in the input than specified in the assert.

…eration during folding is illegal, convert to a canonicalization pattern instead

…ng check on size before using zip

sahas3

Thanks for the fix. LGTM but please wait for some-one else to approve before merging.

lib/Dialect/Torch/IR/TorchOps.cpp

…ding but canonicalizing Signed-off-by: Florian Walbroel <walbroel@roofline.ai>

Manewing · 2025-07-24T12:40:09Z

@sahas3 Thanks for your review. I updated the error message, good catch. Who else could review this?

Manewing · 2025-07-29T11:14:58Z

@vivekkhandelwal1 could you have a look at this please?

sahas3 · 2025-08-06T15:00:36Z

@sahas3 Thanks for your review. I updated the error message, good catch. Who else could review this?

Sorry for the late reply. I was on vacation. I think @vivekkhandelwal1 / @zjgarvey can provide the final approval. Thanks!

zjgarvey

Ah, great catch, thanks. This looks good to me. Let me know if you need me to press the merge button.

Manewing · 2025-08-06T19:42:57Z

@zjgarvey yes please

Florian Walbroel added 2 commits July 15, 2025 17:58

lib/Dialect/Torch/IR/TorchOps.cpp: fix: use-after-free: erasing an op…

9c356ff

…eration during folding is illegal, convert to a canonicalization pattern instead

lib/Dialect/Torch/IR/TorchOps.cpp: fix: assert_tensor_metadata: missi…

41c866e

…ng check on size before using zip

sahas3 approved these changes Jul 22, 2025

View reviewed changes

lib/Dialect/Torch/IR/TorchOps.cpp Outdated Show resolved Hide resolved

lib/Dialect/Torch/IR/TorchOps.cpp: fix: update error message, not fol…

a297d32

…ding but canonicalizing Signed-off-by: Florian Walbroel <walbroel@roofline.ai>

zjgarvey approved these changes Aug 6, 2025

View reviewed changes

zjgarvey merged commit 60ffb91 into llvm:main Aug 6, 2025
3 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

lib/Dialect/Torch/IR/TorchOps.cpp: fix: use-after-free: erasing an operation during folding #4274

lib/Dialect/Torch/IR/TorchOps.cpp: fix: use-after-free: erasing an operation during folding #4274

Uh oh!

Manewing commented Jul 16, 2025 •

edited

Loading

Uh oh!

sahas3 left a comment

Uh oh!

Uh oh!

Manewing commented Jul 24, 2025

Uh oh!

Manewing commented Jul 29, 2025

Uh oh!

sahas3 commented Aug 6, 2025

Uh oh!

zjgarvey left a comment

Uh oh!

Manewing commented Aug 6, 2025

Uh oh!

Uh oh!

Uh oh!

lib/Dialect/Torch/IR/TorchOps.cpp: fix: use-after-free: erasing an operation during folding #4274

lib/Dialect/Torch/IR/TorchOps.cpp: fix: use-after-free: erasing an operation during folding #4274

Uh oh!

Conversation

Manewing commented Jul 16, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

sahas3 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Manewing commented Jul 24, 2025

Uh oh!

Manewing commented Jul 29, 2025

Uh oh!

sahas3 commented Aug 6, 2025

Uh oh!

zjgarvey left a comment

Choose a reason for hiding this comment

Uh oh!

Manewing commented Aug 6, 2025

Uh oh!

Uh oh!

Uh oh!

Manewing commented Jul 16, 2025 •

edited

Loading