CANN: Use smart pointers to manage ACL objects #17238

hipudding · 2025-11-13T13:10:52Z

Previously, ACL objects were managed via manual destruction, which led to multiple memory-leak issues during runtime.
This patch replaces manual memory management with smart pointers so that ACL objects are properly released and ownership is clearly defined.
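
As a rough illustration of the pattern described above, here is a minimal sketch of a `std::unique_ptr` with a custom deleter wrapping a C-style handle. It does not use the real ACL headers: `fake_acl_tensor`, `fake_create_tensor`, `fake_destroy_tensor`, and `fake_tensor_ptr` are hypothetical stand-ins, and the actual types and deleters in this patch may be written differently.

```cpp
#include <cassert>
#include <memory>

namespace deleter_sketch {

// Stand-in for an opaque ACL handle; NOT the real aclTensor type.
struct fake_acl_tensor { int id; };

static int destroy_calls = 0;  // counts destroy calls so the demo can be checked

// Hypothetical create/destroy pair, mimicking a handle-based C API.
fake_acl_tensor * fake_create_tensor(int id) { return new fake_acl_tensor{id}; }
void fake_destroy_tensor(fake_acl_tensor * t) { ++destroy_calls; delete t; }

// Custom deleter: unique_ptr calls this for a non-null pointer when the owner dies.
struct fake_tensor_deleter {
    void operator()(fake_acl_tensor * t) const noexcept { fake_destroy_tensor(t); }
};
using fake_tensor_ptr = std::unique_ptr<fake_acl_tensor, fake_tensor_deleter>;

// Returns the number of destroy calls observed after the owning scope ends.
int scoped_tensor_demo() {
    destroy_calls = 0;
    {
        fake_tensor_ptr t(fake_create_tensor(1));
        assert(t->id == 1);
    }  // the tensor is destroyed here; the same happens on early return or exception
    return destroy_calls;
}

}  // namespace deleter_sketch
```

Because the deleter is part of the pointer type, no call site has to remember a matching destroy call.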

Note that the ownership of an ACL object belongs to the function that creates it.
Other internal functions should operate on these ACL objects using raw pointers to avoid unintended ownership transfers.
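
The ownership rule above could look roughly like the following sketch, assuming a hypothetical stand-in handle type (`fake_acl_tensor`) rather than the real ACL API. The creating function holds the smart pointer, and the helper (`read_value`, also hypothetical) only borrows a raw pointer obtained with `get()`.

```cpp
#include <cassert>
#include <memory>

namespace borrow_sketch {

// Stand-in for an opaque ACL handle; NOT the real aclTensor type.
struct fake_acl_tensor { int value; };

static int destroy_calls = 0;  // counts how many times the object is destroyed

struct fake_tensor_deleter {
    void operator()(fake_acl_tensor * t) const noexcept { ++destroy_calls; delete t; }
};
using fake_tensor_ptr = std::unique_ptr<fake_acl_tensor, fake_tensor_deleter>;

// Internal helper: borrows the tensor through a raw pointer and never frees it.
int read_value(const fake_acl_tensor * t) {
    assert(t != nullptr);
    return t->value;
}

// The creating function is the single owner of the object.
int owner_function() {
    fake_tensor_ptr t(new fake_acl_tensor{42});
    int v = read_value(t.get());  // lend the pointer, do not transfer ownership
    return v;                     // t is destroyed exactly once when this function returns
}

}  // namespace borrow_sketch
```

Passing `t.get()` instead of moving the `unique_ptr` keeps the lifetime tied to the creating function, so a helper cannot accidentally end up owning the object.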

Additionally, since aclTensorList automatically frees its contained aclTensor objects, any aclTensor added to a tensor list must release ownership to avoid double free operations.
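
A minimal sketch of that hand-off, under the assumption that the list frees its elements when it is destroyed, as described above. All names here (`fake_acl_tensor`, `fake_tensor_list`, `fake_create_tensor_list`, and so on) are hypothetical stand-ins and not the real ACL functions.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

namespace list_sketch {

// Stand-in for an opaque ACL handle; NOT the real aclTensor type.
struct fake_acl_tensor { int id; };

static int tensor_destroy_calls = 0;  // a double free would show up as a count above 2
void fake_destroy_tensor(fake_acl_tensor * t) { ++tensor_destroy_calls; delete t; }

// Stand-in list that frees its contained tensors when it is destroyed.
struct fake_tensor_list { std::vector<fake_acl_tensor *> items; };

fake_tensor_list * fake_create_tensor_list(fake_acl_tensor * const * tensors, std::size_t n) {
    return new fake_tensor_list{std::vector<fake_acl_tensor *>(tensors, tensors + n)};
}
void fake_destroy_tensor_list(fake_tensor_list * l) {
    for (fake_acl_tensor * t : l->items) { fake_destroy_tensor(t); }
    delete l;
}

struct tensor_deleter {
    void operator()(fake_acl_tensor * t) const noexcept { fake_destroy_tensor(t); }
};
struct list_deleter {
    void operator()(fake_tensor_list * l) const noexcept { fake_destroy_tensor_list(l); }
};
using tensor_ptr = std::unique_ptr<fake_acl_tensor, tensor_deleter>;
using list_ptr   = std::unique_ptr<fake_tensor_list, list_deleter>;

// Returns how many times a tensor was destroyed; each tensor should be freed once.
int build_list_demo() {
    tensor_destroy_calls = 0;
    {
        tensor_ptr a(new fake_acl_tensor{1});
        tensor_ptr b(new fake_acl_tensor{2});
        // release() gives up ownership, so only the list will free these tensors.
        fake_acl_tensor * raw[] = { a.release(), b.release() };
        list_ptr list(fake_create_tensor_list(raw, 2));
        assert(a == nullptr && b == nullptr);
    }  // list is destroyed first and frees both tensors; a and b are empty
    return tensor_destroy_calls;
}

}  // namespace list_sketch
```

Had the sketch passed `a.get()` and `b.get()` instead, both the list and the smart pointers would try to free the same tensors.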

This PR also removes the asynchronous task submission mechanism.
Due to changes in recent CANN versions, tiling time has significantly decreased. Even with a dual-thread submission model, the dispatch overhead still falls on the critical path, making async submission less beneficial.
Moreover, aclGraph support provides a much better path to reducing operator dispatch latency.



Copilot

Pull Request Overview

This PR refactors the CANN backend to replace manual memory management of ACL objects with smart pointers (std::unique_ptr with custom deleters), eliminating memory leaks and clarifying ownership semantics. Additionally, it removes the asynchronous task submission mechanism which is no longer beneficial due to CANN version improvements.

Key changes:

- Introduces smart pointer wrappers (acl_tensor_ptr, acl_scalar_ptr, acl_int_array_ptr, acl_tensor_list_ptr) with custom deleters for automatic ACL resource cleanup
- Removes task queue infrastructure (cann_task_queue, cann_task, aclnn_task, etc.)
- Replaces all manual aclDestroy* calls with automatic cleanup via smart pointers
- Replaces async helper functions with direct ACL API calls
- Removes the Doxyfile configuration (documentation generation file)

Reviewed Changes

Copilot reviewed 5 out of 7 changed files in this pull request and generated 1 comment.


| File | Description |
| --- | --- |
| acl_tensor.h | Defines smart pointer types and custom deleters for ACL objects; adds factory functions returning smart pointers |
| acl_tensor.cpp | Implements factory functions returning smart pointers; removes unreachable return statement |
| aclnn_ops.h | Updates function signatures to use smart pointers; removes task queue classes and async helper functions |
| aclnn_ops.cpp | Converts all ACL object creation/usage to smart pointers; removes manual resource cleanup calls |
| ggml-cann.cpp | Updates tensor creation calls to use smart pointers; replaces async helpers with direct ACL calls; removes task queue usage |
| common.h | Removes task queue and async task infrastructure; updates constructor |
| Doxyfile | Removes Doxygen configuration file (2579 lines deleted) |

ggml/src/ggml-cann/aclnn_ops.cpp

hipudding · 2025-11-14T06:28:19Z

Test cases passed:

2701/2701 tests passed
Backend CANN0: OK
Backend 2/3: CANN1
Skipping
Backend 3/3: CPU
Skipping
3/3 backends

noemotiovon

Thanks for your contribution! This PR is really exciting — we finally don’t have to manually release host-side ACL tensors anymore!

ggml/src/ggml-cann/acl_tensor.h

noemotiovon · 2025-11-14T06:35:28Z

Additionally, I think we still need to verify that there are currently no memory leaks.

ggml/src/ggml-cann/common.h

hipudding · 2025-11-14T09:04:45Z

> Additionally, I think we still need to verify that there are currently no memory leaks.

I ran Valgrind on llama-server and didn’t find any leaks. There were a few false positives, though, coming from inside CANN.

DajanaV mentioned this pull request Nov 13, 2025

UPSTREAM PR #17238: CANN: use unique_ptr for Ascend Tensors auroralabs-loci/llama.cpp#190

Open

github-actions bot added the ggml (changes relating to the ggml tensor library for machine learning) and Ascend NPU (issues specific to Ascend NPUs) labels Nov 13, 2025

hipudding force-pushed the unique_ptr branch from 9efc21c to 8981848 Compare November 14, 2025 02:42

hipudding changed the title ~~CANN: use unique_ptr for Ascend Tensors~~ CANN: Use smart pointers to manage ACL objects Nov 14, 2025

hipudding marked this pull request as ready for review November 14, 2025 06:02

hipudding self-assigned this Nov 14, 2025

hipudding requested review from Copilot and noemotiovon November 14, 2025 06:06

Copilot started reviewing on behalf of hipudding November 14, 2025 06:06

Copilot finished reviewing on behalf of hipudding November 14, 2025 06:07

Copilot AI reviewed Nov 14, 2025

ggml/src/ggml-cann/aclnn_ops.cpp (outdated)

noemotiovon approved these changes Nov 14, 2025

ggml/src/ggml-cann/acl_tensor.h (outdated)

ggml/src/ggml-cann/acl_tensor.h (outdated)

noemotiovon approved these changes Nov 14, 2025

ggml/src/ggml-cann/common.h

CANN: resolve review comments

4dd3c07

hipudding force-pushed the unique_ptr branch from 5ad0e56 to 4dd3c07 Compare November 14, 2025 08:19

hipudding requested review from ggerganov and slaren November 14, 2025 09:43

slaren approved these changes Nov 14, 2025
