Description
Output of 'strings libarm_compute.so | grep arm_compute_version':
arm_compute_version=v23.11 Build options: {'Werror': '0', 'debug': '0', 'neon': '1', 'opencl': '0', 'embed_kernels': '0', 'os': 'linux', 'arch': 'armv8a', 'build': 'native', 'multi_isa': '1', 'fixed_format_kernels': '1', 'openmp': '1', 'cppthreads': '0'} Git hash=b'add70ace1e57f65d1ae4d0cedaec6e4578cf87ff'
Platform:
AWS c7g.16xl
Operating System:
Ubuntu 22.04
Problem description:
One important optimization for inference performance is cutting down kernel-initialization overhead. This can be achieved by caching an operator after its first initialization and reusing it across tensors of the same shape. Today it is not possible to cache ACL operators, because they maintain workspace state alongside their configuration, and that workspace is specific to each gemm operation.
The requirement is to make the operators stateless, so that they can be initialized once and reused across multiple gemm operations of the same shapes.
More details are in this oneDNN discussion: uxlfoundation/oneDNN#1455 (comment)