Commit 4a245b8
[SYCL] optimize createSyclObjFromImpl calls to take rvalue-ref to shared_ptr (#20859)
With this change, _createSyclObjFromImpl_ takes its _shared_ptr_ argument by
rvalue reference, so the pointer is moved into the new object instead of being
copied. This saves two atomic operations (see e.g. [this SO
thread](https://stackoverflow.com/a/41874953/1654158)).
I've applied it everywhere possible in the code, leaving copies only where
they are actually needed (mostly for _context_impl_ use).
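
To illustrate the idea, here is a minimal standalone sketch. It is not the actual SYCL code: `Widget`, `WidgetImpl`, `makeFromImpl` and `demoMoveVsCopy` are hypothetical names made up for this example. Copying a `std::shared_ptr` increments the shared reference count (and the copy's destruction later decrements it), while moving only transfers ownership and leaves the count alone.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical implementation class (stand-in for e.g. an event impl).
struct WidgetImpl {
  int value = 42;
};

// Hypothetical user-facing handle that owns a shared_ptr to its impl.
class Widget {
public:
  // Rvalue overload: steals the pointer, the reference count is untouched.
  explicit Widget(std::shared_ptr<WidgetImpl> &&Impl)
      : MImpl(std::move(Impl)) {}
  // Lvalue overload: copies the pointer, which bumps the reference count.
  explicit Widget(const std::shared_ptr<WidgetImpl> &Impl) : MImpl(Impl) {}

  long useCount() const { return MImpl.use_count(); }

private:
  std::shared_ptr<WidgetImpl> MImpl;
};

// Hypothetical factories in the spirit of createSyclObjFromImpl.
// Rvalue arguments select this overload and are moved into the object.
template <class T, class ImplT>
T makeFromImpl(std::shared_ptr<ImplT> &&Impl) {
  return T(std::move(Impl));
}

// Lvalue arguments select this overload; the caller keeps its reference.
template <class T, class ImplT>
T makeFromImpl(const std::shared_ptr<ImplT> &Impl) {
  return T(Impl);
}

void demoMoveVsCopy() {
  auto Impl = std::make_shared<WidgetImpl>();

  // Copy path: both Impl and Copied now share ownership.
  Widget Copied = makeFromImpl<Widget>(Impl);
  assert(Impl.use_count() == 2);
  assert(Copied.useCount() == 2);

  // Move path: ownership is transferred, the count stays at 2
  // (held by Copied and Moved) and Impl is left empty.
  Widget Moved = makeFromImpl<Widget>(std::move(Impl));
  assert(Impl == nullptr);
  assert(Moved.useCount() == 2);
}
```

Call sites that no longer need their local pointer would pass `std::move(Impl)`; call sites that must keep it (the copying cases mentioned above) pass the lvalue and pay for the copy.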
### Results summary
Overhead over UR is reduced by ~8% in scenarios using events. Other
benchmarks also show visible improvements in many cases, including the new
PyTorch multiqueue benchmarks, which improved overall by 2.7%.
### Results Examples
In each plot, the new results are the dots on the right-hand side.
<img width="1548" height="735" alt="SubmitKernel out of order using
events long kernel, CPU count(1)"
src="https://github.com/user-attachments/assets/829ab30e-76f3-42a8-b6d9-c17714a5a145"
/>
old = 134.6, new = 132.8, UR baseline = 113, **overhead over UR reduced by
8.3%**
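
The "overhead over UR" percentages are consistent with the calculation below. This is an assumption about how the figures were derived, not something stated by the benchmark tooling; the symbols $t_\text{old}$, $t_\text{new}$ and $t_\text{UR}$ are introduced here for illustration.

$$
\text{reduction}
= \frac{(t_\text{old} - t_\text{UR}) - (t_\text{new} - t_\text{UR})}{t_\text{old} - t_\text{UR}}
= \frac{134.6 - 132.8}{134.6 - 113}
= \frac{1.8}{21.6}
\approx 8.3\%
$$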
<img width="1548" height="735" alt="SubmitKernel out of order with
completion using events, CPU count(4)"
src="https://github.com/user-attachments/assets/17e9d7cf-c88d-455e-b116-39e2f1b8f04c"
/>
old = 140, new = 138.2, UR baseline = 118.1, **overhead over UR reduced
by 8.1%**
<img width="1548" height="735" alt="SubmitKernel in order, CPU count(5)"
src="https://github.com/user-attachments/assets/1e3f784a-efdb-47cc-8dea-bc516bdad33a"
/>
old = 122.3, new = 121.3, UR baseline = 108.1, **overhead over UR
reduced by 7.0%**
<img width="1548" height="810" alt="SubmitKernel in order using
events(2)"
src="https://github.com/user-attachments/assets/e96770aa-9716-4293-a0b6-29babd25ee5e"
/>
old time = 13.91, new time = 13.58, **whole stack reduced by 2.4%**
And finally, the new PyTorch microbenchmarks:
<img width="1548" height="735" alt="KernelSubmitMultiQueue small"
src="https://github.com/user-attachments/assets/d23b3c40-ae0e-4528-9c5e-f2726e3030ce"
/>
old time = 1.81, new time = 1.76, L0 baseline = 1.44,
whole stack reduced by 2.8%, **overhead over L0 reduced by 13.5%**

1 parent 4fa9f71, commit 4a245b8
File tree

7 files changed, +14 -12 lines changed

- sycl
  - include/sycl
    - ext/oneapi
  - source
    - detail
      - program_manager