skip per-page fault loop for CMA buffers #975
Merged
+21
−12
Problem and Fix for https://jira.xilinx.com/browse/AIESW-21501:
Buffer Object (BO) import time scaled linearly, O(n), with buffer size,
causing significant latency in VART applications. Import time increased
from ~250μs for 1.5MB buffers to ~2900μs for 24MB buffers.
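For a rough sense of scale, the sketch below counts how many pages a per-page loop would visit for the two buffer sizes quoted above. It assumes 4 KiB pages and reads "MB" as MiB; neither is stated in this PR, and `pages_for` is a hypothetical helper, not a driver function.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of the driver): number of pages a
 * per-page loop would visit for a buffer of `bytes` bytes, rounding
 * up to a whole page.
 */
static size_t pages_for(size_t bytes, size_t page_size)
{
    assert(page_size > 0);
    return (bytes + page_size - 1) / page_size;
}
```

Under those assumptions, `pages_for(1536 * 1024, 4096)` is 384 and `pages_for(24 * 1024 * 1024, 4096)` is 6144, so the larger buffer needs 16 times as many loop iterations. The measured import time grew by roughly 11.6 times (2900μs / 250μs), which fits a per-page cost plus some fixed overhead.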
Root Cause:
In amdxdna_gem_shmem_insert_pages(), after calling dma_buf_mmap() for
imported buffers, a per-page handle_mm_fault() loop was executed to fault
in each page. For CMA buffers, this is redundant because dma_buf_mmap()
already establishes the complete mapping efficiently.
Solution:
Add a check to detect CMA buffers and skip the
per-page fault loop for them. Non-CMA buffer types retain the existing
behavior for backward compatibility with other platforms.
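The control flow described above can be modeled in userspace as follows. This is only an illustrative sketch, not the driver code: every `mock_*` name is hypothetical, the `is_cma` flag stands in for whatever CMA detection the actual patch uses (the description does not say how it is done), and the two helpers merely simulate dma_buf_mmap() and handle_mm_fault().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an imported buffer object. */
struct mock_bo {
    size_t num_pages;
    bool is_cma;        /* placeholder for the driver's real CMA check */
    bool mapped;        /* set once the simulated mmap has run */
    size_t fault_calls; /* counts simulated per-page faults */
};

/* Simulates dma_buf_mmap(): establishes the mapping in a single call. */
static int mock_dma_buf_mmap(struct mock_bo *bo)
{
    bo->mapped = true;
    return 0;
}

/* Simulates one handle_mm_fault() call for page `idx`. */
static int mock_fault_page(struct mock_bo *bo, size_t idx)
{
    (void)idx;
    bo->fault_calls++;
    return 0;
}

/* Models the import path: mmap first, then fault pages only if not CMA. */
static int mock_insert_pages(struct mock_bo *bo)
{
    int ret;

    assert(bo != NULL);

    ret = mock_dma_buf_mmap(bo);
    if (ret)
        return ret;

    /* CMA: the mapping is already complete, so skip the O(n) loop. */
    if (bo->is_cma)
        return 0;

    /* Non-CMA: keep the existing per-page behavior. */
    for (size_t i = 0; i < bo->num_pages; i++) {
        ret = mock_fault_page(bo, i);
        if (ret)
            return ret;
    }
    return 0;
}
```

In this model a CMA buffer finishes with `fault_calls == 0` regardless of size, while a non-CMA buffer still performs one simulated fault per page.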
Results (24MB buffer import):
Tested: