improve va_arg assembly on arm targets #144549

Open: folkertdev wants to merge 3 commits into master

folkertdev
Contributor

tracking issue: #44930

For this example

#![feature(c_variadic)]

#[unsafe(no_mangle)]
unsafe extern "C" fn variadic(a: f64, mut args: ...) -> f64 {
    let b = args.arg::<f64>();
    let c = args.arg::<f64>();

    a + b + c
}

We currently generate (via LLVM):

variadic:
    sub     sp, sp, #12
    stmib   sp, {r2, r3}
    vmov    d0, r0, r1
    add     r0, sp, #4
    vldr    d1, [sp, #4]
    add     r0, r0, #15
    bic     r0, r0, #7
    vadd.f64        d0, d0, d1
    add     r1, r0, #8
    str     r1, [sp]
    vldr    d1, [r0]
    vadd.f64        d0, d0, d1
    vmov    r0, r1, d0
    add     sp, sp, #12
    bx      lr

LLVM does not do a good job here: the listing above is 15 instructions. By using our own emit_ptr_va_arg we can save 3 of them:

variadic:
    sub sp, sp, #12
    stmib sp, {r2, r3}
    vmov d0, r0, r1
    add r2, sp, #4
    vldr d1, [sp, #4]
    add r2, r2, #16
    vldr d2, [sp, #12]
    vadd.f64 d0, d0, d1
    vadd.f64 d0, d0, d2
    vmov r0, r1, d0
    str r2, [sp], #12
    bx lr
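
To make the pointer arithmetic in the listings above concrete, here is a rough model in safe Rust of a pointer-style va_list: a cursor that is rounded up to the argument's alignment, read from, and then advanced. This is only a sketch and not the code in rustc; `VaCursor`, `next_bytes`, `next_f64` and `sum_variadic` are hypothetical names, and it assumes a little-endian target and that the saved-argument area starts at an 8-byte-aligned address (so aligning the offset is the same as aligning the address).

```rust
/// Hypothetical model of a pointer-style va_list: a cursor into the
/// stack area where the variadic arguments were spilled.
struct VaCursor<'a> {
    area: &'a [u8],
    offset: usize,
}

impl<'a> VaCursor<'a> {
    fn new(area: &'a [u8]) -> Self {
        Self { area, offset: 0 }
    }

    /// Round the cursor up to `align` (a power of two), read `N` bytes,
    /// then advance the cursor past them.
    fn next_bytes<const N: usize>(&mut self, align: usize) -> [u8; N] {
        debug_assert!(align.is_power_of_two());
        // Same idea as an add/bic pair: add (align - 1) and clear the low bits.
        let start = (self.offset + align - 1) & !(align - 1);
        let mut out = [0u8; N];
        out.copy_from_slice(&self.area[start..start + N]);
        self.offset = start + N;
        out
    }

    /// Read the next f64, assuming 8-byte size and alignment.
    fn next_f64(&mut self) -> f64 {
        f64::from_le_bytes(self.next_bytes::<8>(8))
    }
}

/// Mirrors the shape of `variadic` from the example: one fixed f64
/// plus two f64 values read from the argument area.
fn sum_variadic(a: f64, area: &[u8]) -> f64 {
    let mut args = VaCursor::new(area);
    let b = args.next_f64();
    let c = args.next_f64();
    a + b + c
}
```

When every alignment and size is a compile-time constant, as in this model, the rounding folds away and the two reads become loads at fixed offsets, which is the shape of the `vldr d1, [sp, #4]` and `vldr d2, [sp, #12]` pair in the second listing.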

Clang generates even fewer instructions because it does not need r2 at all. The crucial missing piece is ending the lifetime of the va_list value; see https://godbolt.org/z/TK3rsdExG. The va_list is generated internally, which I'm guessing is why the lifetime end is not emitted by default. But the va_list is a stack value, so its lifetime really does end before the function returns.

With that we generate exactly the instructions that clang generates (though the loads and adds are slightly reordered):

variadic:
    sub sp, sp, #12
    stmib sp, {r2, r3}
    vmov d0, r0, r1
    vldr d1, [sp, #4]
    vldr d2, [sp, #12]
    vadd.f64 d0, d0, d1
    vadd.f64 d0, d0, d2
    vmov r0, r1, d0
    add sp, sp, #12
    bx lr

The arguments to emit_ptr_va_arg are based on the clang implementation.

r? @workingjubilee (I can re-roll if your queue is too full, but you do seem like the right person here)

@rustbot
Collaborator

rustbot commented Jul 27, 2025

workingjubilee is currently at their maximum review capacity.
They may take a while to respond.

rustbot added the A-LLVM, S-waiting-on-review, and T-compiler labels on Jul 27, 2025
@rustbot
Collaborator

rustbot commented Jul 27, 2025

Some changes occurred in compiler/rustc_codegen_ssa

cc @WaffleLapkin

Labels
A-LLVM - Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.
F-c_variadic - `#![feature(c_variadic)]`
S-waiting-on-review - Status: Awaiting review from the assignee but also interested parties.
T-compiler - Relevant to the compiler team, which will review and decide on the PR/issue.
3 participants