improve `va_arg` assembly on arm targets
#144549
Open
+50
−3
tracking issue: #44930
For this example
We currently generate (via llvm):
LLVM is not doing a good job. By using our own `emit_ptr_va_arg` we can save 3 instructions:

But clang generates even fewer instructions by removing the use of `r2` in the above code. The crucial missing piece is ending the lifetime of the `va_list` value, see https://godbolt.org/z/TK3rsdExG. That `va_list` is generated internally, so I'm guessing that is why the lifetime end is not emitted by default? But the `va_list` is a stack value so obviously the lifetime really does end before a return.

With that we generate exactly the instructions that clang generates (though the loads and adds are slightly reordered):
The arguments to `emit_ptr_va_arg` are based on the clang implementation.

r? @workingjubilee (I can re-roll if your queue is too full, but you do seem like the right person here)
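For readers less familiar with what a pointer-style `va_arg` does, here is a minimal sketch of the idea in plain Rust. It is not the compiler code from this PR: `ArgCursor`, `align_up`, `example_area` and the 4-byte slot size are hypothetical names and assumptions made for illustration, and the byte buffer only stands in for the argument save area on a little-endian 32-bit target. Each read aligns the cursor, loads the value, and bumps the cursor past it.

```rust
/// Hypothetical model of a pointer-style `va_list`: a cursor into a byte
/// buffer that stands in for the argument save area.
struct ArgCursor<'a> {
    area: &'a [u8],
    offset: usize,
}

/// Round `offset` up to the next multiple of `align` (a power of two).
fn align_up(offset: usize, align: usize) -> usize {
    (offset + align - 1) & !(align - 1)
}

impl<'a> ArgCursor<'a> {
    /// Assumed slot size for a 32-bit target.
    const SLOT_SIZE: usize = 4;

    fn new(area: &'a [u8]) -> Self {
        ArgCursor { area, offset: 0 }
    }

    /// Align the cursor, copy out `N` bytes, then advance the cursor by the
    /// size rounded up to a whole number of slots.
    fn next_bytes<const N: usize>(&mut self, align: usize) -> [u8; N] {
        let start = align_up(self.offset, align.max(Self::SLOT_SIZE));
        let mut out = [0u8; N];
        out.copy_from_slice(&self.area[start..start + N]);
        self.offset = start + align_up(N, Self::SLOT_SIZE);
        out
    }

    fn next_i32(&mut self) -> i32 {
        i32::from_le_bytes(self.next_bytes::<4>(4))
    }

    fn next_f64(&mut self) -> f64 {
        // An 8-byte-aligned type may force the cursor to skip a slot.
        f64::from_le_bytes(self.next_bytes::<8>(8))
    }
}

/// Build a small save area holding an i32, padding, an f64, and an i32.
fn example_area() -> Vec<u8> {
    let mut area = Vec::new();
    area.extend_from_slice(&7i32.to_le_bytes()); // bytes 0..4
    area.extend_from_slice(&[0u8; 4]); // padding so the f64 starts at byte 8
    area.extend_from_slice(&1.5f64.to_le_bytes()); // bytes 8..16
    area.extend_from_slice(&(-2i32).to_le_bytes()); // bytes 16..20
    area
}
```

Reading an `i32`, an `f64`, and another `i32` from `example_area()` walks the cursor through offsets 4, 16, and 20. The align, load, and add steps in this model are the same kind of operations the loads and adds in the generated assembly correspond to.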