Commit 01bab43
RV64: Avoid repeated VLEN-evaluation in rejection sampling
For VLEN >= 512, there are tail iterations in the rejection
handling loop where we require fewer coefficients than fit into
a vector, requiring an adjustment of the dynamic VL.
The previous code re-evaluated the dynamic VL in every iteration,
which incurred a significant runtime cost. This commit instead splits
the rejection sampling loop into two nested loops, where the inner loop
proceeds for a fixed VL and the outer loop re-evaluates the VL.
For VLEN <= 256, there is only one iteration of the outer loop, rendering
it as efficient as the original version.
Signed-off-by: Hanno Becker <beckphan@amazon.co.uk>
1 parent 6dc4a61
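The loop split described in the commit message can be illustrated with a portable scalar model. This is only a sketch and not the code from this commit: `set_vl`, `filter_block`, `rej_sample`, and the counter `vl_evals` are hypothetical names, `set_vl` merely imitates a `vsetvl`-style query by returning `min(want, vlmax)`, and the rejection bound 3329 is assumed for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define Q 3329u /* assumed rejection bound for this sketch */

/* Hypothetical stand-in for a vsetvl-style query: returns min(want, vlmax)
 * and counts how often the dynamic VL is evaluated. */
static unsigned vl_evals;
static size_t set_vl(size_t want, size_t vlmax)
{
  vl_evals++;
  return want < vlmax ? want : vlmax;
}

/* Scalar model of one vector step: inspect vl candidates and append the
 * accepted ones to out. Returns the number of accepted candidates (<= vl). */
static size_t filter_block(int16_t *out, const uint16_t *in, size_t vl)
{
  size_t k = 0;
  for (size_t i = 0; i < vl; i++)
  {
    if (in[i] < Q)
    {
      out[k++] = (int16_t)in[i];
    }
  }
  return k;
}

/* Nested-loop rejection sampling: the outer loop evaluates the VL, the
 * inner loop reuses it for as long as a full block is safe to process. */
static size_t rej_sample(int16_t *out, size_t len, const uint16_t *in,
                         size_t inlen, size_t vlmax)
{
  size_t ctr = 0; /* coefficients written so far */
  size_t pos = 0; /* candidates consumed so far */

  while (ctr < len && pos < inlen)
  {
    /* Outer loop: pick a VL no larger than the remaining demand, the
     * remaining input, or the hardware maximum. */
    size_t need = len - ctr;
    size_t avail = inlen - pos;
    size_t vl = set_vl(need < avail ? need : avail, vlmax);

    /* Inner loop: fixed VL. A block yields at most vl coefficients, so
     * the write stays in bounds while len - ctr >= vl. */
    while (len - ctr >= vl && inlen - pos >= vl)
    {
      ctr += filter_block(out + ctr, in + pos, vl);
      pos += vl;
    }
  }
  return ctr;
}
```

In this model the VL is re-evaluated only once fewer than `vl` coefficients are still needed or fewer than `vl` candidates remain, rather than once per block. For instance, with `vlmax = 16`, `len = 256`, and an input in which every candidate is accepted, `set_vl` is evaluated exactly once.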
1 file changed, 22 additions and 16 deletions.

Diff summary (line contents were not captured): original lines 754-766, 768, and 770-771 removed; new lines 754-757 and 759-776 added.