Contributor

@Flakebi Flakebi commented Jan 2, 2026

There is an ongoing discussion in #150452 about using address spaces from the Rust language in some way.
Since that discussion will likely not conclude soon, this PR adds a single rustc intrinsic that performs an addrspacecast. This unblocks access to basic information such as the launch size and workgroup size, and makes it possible to implement something like core::gpu.

Add a rustc intrinsic amdgpu_dispatch_ptr to access the kernel dispatch packet on amdgpu.
The HSA kernel dispatch packet contains important information like the launch size and workgroup size.

The Rust intrinsic lowers to the llvm.amdgcn.dispatch.ptr LLVM intrinsic, which returns a ptr addrspace(4), plus an addrspacecast to addrspace(0), so it can be returned as a Rust reference.
The returned pointer/reference is valid for the whole program lifetime, and is therefore 'static.
The return type of the intrinsic (&'static ()) does not mention the struct, so rustc does not need to know the exact struct type. An alternative would be to define the struct as a lang item or to add a generic argument to the function.
Is this ok or is there a better way (also, should it return a pointer instead of a reference)?
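
To make the two API shapes concrete, here is a rough host-side sketch of an erased return type next to the generic-argument alternative. All names (`FAKE_PACKET`, `dispatch_ptr_erased`, `dispatch_ptr_generic`) are hypothetical and not part of this PR, and a static array stands in for the packet memory that the runtime provides on amdgpu. The erased variant is shown with a raw pointer.

```rust
/// Host-side stand-in for the dispatch packet memory (assumption: on amdgpu
/// this memory is provided by the runtime, not by a Rust static).
static FAKE_PACKET: [u16; 4] = [0, 0, 64, 1];

/// Shape 1: erased pointee type. The compiler needs no knowledge of the
/// packet layout; the caller casts to whatever struct it defines.
pub fn dispatch_ptr_erased() -> *const () {
    FAKE_PACKET.as_ptr().cast::<()>()
}

/// Shape 2: generic argument. The caller names the pointee type at the call
/// site, but the function still cannot check that `T` matches the real layout.
pub fn dispatch_ptr_generic<T>() -> *const T {
    dispatch_ptr_erased().cast::<T>()
}
```

Either way the caller is responsible for choosing the right layout; the generic form only moves the cast from the call site into the signature.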

Short version:

#[cfg(target_arch = "amdgpu")]
pub fn amdgpu_dispatch_ptr() -> &'static ();

Tracking issue: #135024

r? RalfJung as you are already aware of the background (feel free to re-assign)

Collaborator

rustbot commented Jan 2, 2026

Some changes occurred to the intrinsics. Make sure the CTFE / Miri interpreter
gets adapted for the changes, if necessary.

cc @rust-lang/miri, @RalfJung, @oli-obk, @lcnr

@rustbot added the A-LLVM, S-waiting-on-review, T-compiler, and T-libs labels on Jan 2, 2026
Member

RalfJung commented Jan 2, 2026

I can help with design review but not for the implementation, sorry.
@rustbot reroll

Regarding the design, if the return type is "erased" I would suggest using a raw pointer instead of a reference.

@workingjubilee
Member

yoink.

If anyone wants to chip in on the review, please feel free, I just want to make sure I have a gander before it ships so I can keep vaguely abreast of what's happening in this space.

#[rustc_intrinsic]
#[cfg(target_arch = "amdgpu")]
#[must_use = "returns a reference that does nothing unless used"]
pub fn amdgpu_dispatch_ptr() -> &'static ();
Member

@RalfJung RalfJung Jan 2, 2026

Maybe we should have a new "amdgpu" file or so for this, to keep it separate from the typically more portable intrinsics in the rest of this file?

Or a new "gpu" file that offload also goes into? I don't know what a sensible grouping here would look like.

Member

gpu.rs seems like a good place to start to me. That way, other GPU targets are encouraged to reuse code from amdgpu by generalizing it instead of repeating it (as we have discussed before, GPU targets are like siblings: they make much of their tiny differences, while being mostly similar).

Member

Glancing at NVPTX, it seems they achieve the same things as AMDGPU here by having special registers that are read, whereas AMDGPU uses the struct pointer, so at least for this case they will differ.

Member

(Given that their code is JIT-compiled by the device driver anyway, for all I know they both use the same pattern in the machine code they lower to.)

Add a rustc intrinsic `amdgpu_dispatch_ptr` to access the kernel
dispatch packet on amdgpu.
The HSA kernel dispatch packet contains important information like the
launch size and workgroup size.

The Rust intrinsic lowers to the `llvm.amdgcn.dispatch.ptr` LLVM
intrinsic, which returns a `ptr addrspace(4)`, plus an addrspacecast to
`addrspace(0)`, so it can be returned as an ordinary Rust pointer.

The returned pointer is valid for the whole program lifetime, so the
data it points to can be treated as `'static`.

The return type of the intrinsic (`*const ()`) does not mention the
struct so that rustc does not need to know the exact struct type.
An alternative would be to define the struct as a lang item or to add
a generic argument to the function.

Short version:
```rust
#[cfg(target_arch = "amdgpu")]
pub fn amdgpu_dispatch_ptr() -> *const ();
```
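
As a rough illustration of how library code might consume the erased pointer, the sketch below mirrors the HSA kernel dispatch packet layout by hand and reads the workgroup and grid sizes from it. The struct and function names are hypothetical and not part of this PR. The field layout follows `hsa_kernel_dispatch_packet_t` from the HSA runtime headers as I understand it, and the pointer is passed in as a parameter (rather than obtained from `amdgpu_dispatch_ptr`) so the sketch also compiles on a host.

```rust
/// Hand-written mirror of `hsa_kernel_dispatch_packet_t` (64 bytes).
/// Assumption: the Rust name and field types are illustrative; pointer-sized
/// fields are stored as `u64` to keep the layout fixed on any host.
#[repr(C)]
#[derive(Debug, Default, Clone, Copy)]
pub struct KernelDispatchPacket {
    pub header: u16,
    pub setup: u16,
    pub workgroup_size_x: u16,
    pub workgroup_size_y: u16,
    pub workgroup_size_z: u16,
    pub reserved0: u16,
    pub grid_size_x: u32,
    pub grid_size_y: u32,
    pub grid_size_z: u32,
    pub private_segment_size: u32,
    pub group_segment_size: u32,
    pub kernel_object: u64,
    pub kernarg_address: u64,
    pub reserved2: u64,
    pub completion_signal: u64,
}

/// Reinterprets the erased pointer as a dispatch packet.
///
/// # Safety
/// `dispatch_ptr` must point to a live, properly aligned dispatch packet.
unsafe fn packet<'a>(dispatch_ptr: *const ()) -> &'a KernelDispatchPacket {
    unsafe { &*dispatch_ptr.cast::<KernelDispatchPacket>() }
}

/// Workgroup size in work-items, per dimension.
///
/// # Safety
/// Same requirements as [`packet`].
pub unsafe fn workgroup_size(dispatch_ptr: *const ()) -> [u32; 3] {
    let p = unsafe { packet(dispatch_ptr) };
    [
        u32::from(p.workgroup_size_x),
        u32::from(p.workgroup_size_y),
        u32::from(p.workgroup_size_z),
    ]
}

/// Launch (grid) size in work-items, per dimension.
///
/// # Safety
/// Same requirements as [`packet`].
pub unsafe fn grid_size(dispatch_ptr: *const ()) -> [u32; 3] {
    let p = unsafe { packet(dispatch_ptr) };
    [p.grid_size_x, p.grid_size_y, p.grid_size_z]
}

/// Number of workgroups per dimension, rounding up for partial workgroups.
/// Panics if a workgroup dimension is zero.
///
/// # Safety
/// Same requirements as [`packet`].
pub unsafe fn num_workgroups(dispatch_ptr: *const ()) -> [u32; 3] {
    let wg = unsafe { workgroup_size(dispatch_ptr) };
    let grid = unsafe { grid_size(dispatch_ptr) };
    [
        grid[0].div_ceil(wg[0]),
        grid[1].div_ceil(wg[1]),
        grid[2].div_ceil(wg[2]),
    ]
}
```

On the device, the argument would be the value returned by the intrinsic. On a host, a pointer to a locally constructed `KernelDispatchPacket` can be used to exercise the helpers.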
@Flakebi Flakebi force-pushed the dispatch-ptr-intrinsic branch from 1d98b13 to 13d7a3c Compare January 2, 2026 19:09
Contributor Author

Flakebi commented Jan 2, 2026

Thanks for the quick reviews!
I changed the return type from a reference to *const () and moved the intrinsic to a new gpu.rs module.

cc @ZuseZ4 FYI for the core::intrinsics::gpu module.

@Flakebi Flakebi mentioned this pull request Jan 2, 2026
26 tasks