-
Notifications
You must be signed in to change notification settings - Fork 5.3k
[RISC-V] correct comments in emitLoadImmediate #122382
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[RISC-V] correct comments in emitLoadImmediate #122382
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR corrects and improves comments in the emitLoadImmediate function for RISC-V code generation. The changes clarify a previously speculative comment that has been proven incorrect through investigation, fix a mismatch between comment and implementation for formula ordering, and modernize comment style to follow repository conventions.
Key changes:
- Clarifies that smaller offsets do not always yield fewer instructions, providing a concrete counterexample
- Corrects the formula in condition (3) from
(b - a) - (y - x) >= 11to(y - x) - (b - a) >= 11to match the actual implementation - Converts all block comments from
/* */style to//style per repository conventions
tomeksowi
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, cc @fuad1502
|
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
|
Hi @credo-quia-absurdum, thank you for looking into this 🙏 I always wondered how the heuristic compared with an optimal solution. GCC, for example, if I recall correctly, uses backtracking to find the optimal solution. If you have the analysis publicly available, I would love to see it 🙏 If you want to look into optimizing this further as mentioned by @tomeksowi, you can refer to a draft PR I've made before here. I forgot why I didn't follow through with it, but maybe there's something there that can help 🙏 |
fuad1502
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you 🙏
@fuad1502 Here is the analysis I performed. I reproduced the emitter logic in a small C++ program with a simple register emulator and uniformly sampled across 64-bit space. In the text files linked below, Validating the "Smaller Offset Is Better" AssumptionFirst, I examined whether the assumption "the smaller offset always yields fewer instructions" holds. The results, available here, show that this assumption does not hold in a noticeable portion. In fact, 9,617 out of 1 million samples (~1%) were counterexamples where the larger offset produced a shorter instruction sequence. Potential Improvement Over the Current Heuristic with FallbackNext, I evaluated how much improvement would be possible if the heuristic were perfect, under the constraint that we fall back to For this analysis, I only tracked cases where:
Out of 1,000,000,000 sampled immediates, only 1,389 cases (~1 per 0.7M) showed any opportunity for improvement over the current implementation. This extremely small gain is primarily due to the fallback rule: once the instruction count exceeds five, the emitter switches to |
|
@credo-quia-absurdum Please check jit format error and remove trailing whitespace. |
|
@clamp03 Thanks for pointing it out. I've updated the PR and removed the trailing whitespace. |
Summary
This PR makes three improvements to the comments in
emitLoadImmediate://instead of/* */).@clamp03 @tomeksowi @SkyShield, @namu-lee
part of #84834, cc @dotnet/samsung
Details
1. Speculation in the comment proven incorrect
The previous comment included the following speculative statement.
I investigated this and confirmed that the statement is not always true.
In most cases, a smaller offset with subtract mode produces fewer instructions, but not always.
For example, for the immediate value:
The add form with the larger offset requires only 5 instructions,
while the subtract form with the smaller offset requires 8 instructions.
Addition Mode
Offset :
0xFC80'05F4lui0x7'39B80x0000'0000'739B'8000slli110x0000'039C'DC00'0000addi0x7E40x0000'039C'DC00'07E4slli210x739B'8000'FC80'0000addi0x5F40x739B'8000'FC80'05F4Subtract Mode
Offset :
0x037F'FA0Clui0x7'39B80x0000'0000'739B'8000addiw0x0010x0000'0000'739B'8001slli170x0000'E737'0002'0000addi0x9010x0000'E737'0001'F901slli110x0739'B800'0FC8'0800addi0x8600x0739'B800'0FC8'0060slli40x739B'8000'FC80'0600addi0xFF40x739B'8000'FC80'05F4I experimented with an optimization that computes the exact instruction count for each mode and selects the cheaper one. However, the overall improvement was minimal.
Uniformly sampling the 64-bit space showed that only about one case per million improved compared to the existing heuristic, so I decided to drop this approach.
Although there are more cases where the larger add offset yields a better sequence, the practical impact remains small because the code generator falls back to
emitDataConstwhenever the required instruction count exceeds five.2. Mismatch between comment and implementation
In the original comment, the third condition for enabling the left-shift form was written as:
(b - a) - (y - x) >= 11However, the actual implementation evaluates the expression in the opposite order:
This PR corrects the comment to match the implementation and also formats the conditions in the comment to clarify the combined condition
(1) && ( (2) || (3) ).