Commit 2147e29

[AMDGPU] Tests for unnecessary S_WAIT_XCNT insertion (#145688)
Hardware does an implicit "S_WAIT_XCNT 0" between SMEM and VMEM instructions, so there will never be outstanding address translations for both SMEM and VMEM at the same time.
1 parent 18edd82 commit 2147e29
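
The reasoning in the commit message can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the LLVM pass that inserts the waits: the class `XcntModel` and its methods are hypothetical names invented for illustration, and the model assumes that a pending address translation is only a hazard for the address registers of the instruction that started it. Switching between SMEM and VMEM is treated as the implicit "S_WAIT_XCNT 0" described above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class XcntModel:
    """Toy tracker (hypothetical) for registers with pending address translations."""

    pending_kind: Optional[str] = None  # "SMEM", "VMEM", or None if nothing is pending
    pending_regs: set = field(default_factory=set)

    def issue(self, kind: str, addr_regs: list) -> None:
        # Assumption from the commit message: switching between SMEM and
        # VMEM behaves like an implicit "S_WAIT_XCNT 0", so everything
        # that was outstanding for the other kind is complete.
        if self.pending_kind is not None and kind != self.pending_kind:
            self.pending_regs.clear()
        self.pending_kind = kind
        self.pending_regs.update(addr_regs)

    def needs_wait_before_write(self, reg: str) -> bool:
        # An explicit wait is only needed if the register still backs
        # an outstanding translation.
        return reg in self.pending_regs


# Sequence from overwrite_vgpr_after_smem: VMEM load, SMEM load, write vgpr0.
vgpr_case = XcntModel()
vgpr_case.issue("VMEM", ["vgpr0", "vgpr1"])
vgpr_case.issue("SMEM", ["sgpr0", "sgpr1"])
print(vgpr_case.needs_wait_before_write("vgpr0"))

# Sequence from overwrite_sgpr_after_vmem: SMEM load, VMEM load, write sgpr0.
sgpr_case = XcntModel()
sgpr_case.issue("SMEM", ["sgpr0", "sgpr1"])
sgpr_case.issue("VMEM", ["vgpr0", "vgpr1"])
print(sgpr_case.needs_wait_before_write("sgpr0"))
```

Under this model neither overwrite needs an explicit wait, which matches the TODO comments in the two tests added by this commit.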

1 file changed: +42 -0 lines changed
llvm/test/CodeGen/AMDGPU/wait-xcnt.mir

Lines changed: 42 additions & 0 deletions
@@ -966,3 +966,45 @@ body: |
     $vgpr2 = V_MOV_B32_e32 $vgpr2, implicit $exec
     $sgpr0 = S_MOV_B32 0
 ...
+
+# TODO: Unnecessary wait before overwriting vgpr0.
+---
+name: overwrite_vgpr_after_smem
+tracksRegLiveness: true
+machineFunctionInfo:
+  isEntryFunction: true
+body: |
+  bb.0:
+    liveins: $vgpr0_vgpr1, $sgpr0_sgpr1
+    ; GCN-LABEL: name: overwrite_vgpr_after_smem
+    ; GCN: liveins: $vgpr0_vgpr1, $sgpr0_sgpr1
+    ; GCN-NEXT: {{ $}}
+    ; GCN-NEXT: $vgpr2 = GLOBAL_LOAD_DWORD $vgpr0_vgpr1, 0, 0, implicit $exec
+    ; GCN-NEXT: $sgpr2 = S_LOAD_DWORD_IMM $sgpr0_sgpr1, 0, 0
+    ; GCN-NEXT: S_WAIT_XCNT 0
+    ; GCN-NEXT: $vgpr0 = V_MOV_B32_e32 0, implicit $exec
+    $vgpr2 = GLOBAL_LOAD_DWORD $vgpr0_vgpr1, 0, 0, implicit $exec
+    $sgpr2 = S_LOAD_DWORD_IMM $sgpr0_sgpr1, 0, 0
+    $vgpr0 = V_MOV_B32_e32 0, implicit $exec
+...
+
+# TODO: Unnecessary wait before overwriting sgpr0.
+---
+name: overwrite_sgpr_after_vmem
+tracksRegLiveness: true
+machineFunctionInfo:
+  isEntryFunction: true
+body: |
+  bb.0:
+    liveins: $vgpr0_vgpr1, $sgpr0_sgpr1
+    ; GCN-LABEL: name: overwrite_sgpr_after_vmem
+    ; GCN: liveins: $vgpr0_vgpr1, $sgpr0_sgpr1
+    ; GCN-NEXT: {{ $}}
+    ; GCN-NEXT: $sgpr2 = S_LOAD_DWORD_IMM $sgpr0_sgpr1, 0, 0
+    ; GCN-NEXT: $vgpr2 = GLOBAL_LOAD_DWORD $vgpr0_vgpr1, 0, 0, implicit $exec
+    ; GCN-NEXT: S_WAIT_XCNT 0
+    ; GCN-NEXT: $sgpr0 = S_MOV_B32 0
+    $sgpr2 = S_LOAD_DWORD_IMM $sgpr0_sgpr1, 0, 0
+    $vgpr2 = GLOBAL_LOAD_DWORD $vgpr0_vgpr1, 0, 0, implicit $exec
+    $sgpr0 = S_MOV_B32 0
+...
