@stephenswat
Member

This commit, which is very similar to #1006, flattens several separate calls to the navigator initialization into a single loop. This reduces the amount of code inlining, improves instruction cache performance, and reduces divergence.
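
To illustrate the general idea, here is a minimal sketch of this kind of flattening. It is not the code from this PR: `toy_state`, `toy_init`, `run_nested` and `run_flat` are hypothetical names, and the "navigator" is reduced to a counter. The nested variant calls the initialization from two places, so an inlining compiler may duplicate its body; the flattened variant raises a flag and lets one call site at the top of the loop do the work.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the navigation state (not the real type).
struct toy_state {
    int volume = 0;          // volume the track is currently in
    int inits = 0;           // how many times initialization ran
    bool needs_init = true;  // set whenever a (re-)initialization is due
};

// Hypothetical stand-in for an expensive initialization routine.
inline void toy_init(toy_state& s) {
    ++s.inits;
    s.needs_init = false;
}

// Nested form: the initialization is called from two separate sites.
inline int run_nested(const std::vector<int>& volumes) {
    toy_state s;
    toy_init(s);  // call site 1: before the loop
    for (int v : volumes) {
        if (v != s.volume) {
            s.volume = v;
            toy_init(s);  // call site 2: on every volume switch
        }
    }
    return s.inits;
}

// Flattened form: a single call site inside one loop, driven by a flag.
inline int run_flat(const std::vector<int>& volumes) {
    toy_state s;
    std::size_t i = 0;
    while (true) {
        if (s.needs_init) {
            toy_init(s);  // the only call site
        }
        if (i == volumes.size()) {
            break;
        }
        const int v = volumes[i++];
        if (v != s.volume) {
            s.volume = v;
            s.needs_init = true;  // defer the work to the top of the loop
        }
    }
    return s.inits;
}
```

Both variants run the initialization the same number of times for the same input; only the number of call sites differs. On a GPU, having all threads reach the same call site can also mean that threads needing initialization execute it together rather than at different points in the code.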
@stephenswat requested a review from niermann999 on June 20, 2025, 12:53
@stephenswat added the labels refactor (refactoring the current codes) and performance (Improvements to compute performance) on Jun 20, 2025
@stephenswat
Member Author

Before

------------------------------------------------------------------------------------------------------------------------------
Benchmark                                                                    Time             CPU   Iterations UserCounters...
------------------------------------------------------------------------------------------------------------------------------
BM_PROPAGATION_TOY_DETECTOR_W_COV_TRANSPORT_262144_TRACKS/real_time   51675881 ns     51580750 ns           13 TracksPropagated=5.07285M/s
BM_PROPAGATION_TOY_DETECTOR_262144_TRACKS/real_time                   16518001 ns     16488511 ns           42 TracksPropagated=15.8702M/s

After

------------------------------------------------------------------------------------------------------------------------------
Benchmark                                                                    Time             CPU   Iterations UserCounters...
------------------------------------------------------------------------------------------------------------------------------
BM_PROPAGATION_TOY_DETECTOR_W_COV_TRANSPORT_262144_TRACKS/real_time   48408848 ns     48318820 ns           15 TracksPropagated=5.41521M/s
BM_PROPAGATION_TOY_DETECTOR_262144_TRACKS/real_time                   15791432 ns     15763779 ns           44 TracksPropagated=16.6004M/s

So this seems to improve performance by about 4-7%: throughput rises by roughly 6.7% with covariance transport and by roughly 4.6% without.
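
As a rough cross-check of the figures above, the relative change can be computed directly from the benchmark output. The helpers below are a sketch with hypothetical names (`throughput_gain_percent`, `time_reduction_percent`); the inputs in the usage note are the `TracksPropagated` and `Time` values printed in the Before and After tables.

```cpp
#include <cassert>
#include <cmath>

// Relative throughput gain in percent, e.g. from tracks/s before and after.
inline double throughput_gain_percent(double before, double after) {
    return (after / before - 1.0) * 100.0;
}

// Relative reduction in wall-clock time in percent, e.g. from ns per iteration.
inline double time_reduction_percent(double before_ns, double after_ns) {
    return (1.0 - after_ns / before_ns) * 100.0;
}
```

Calling `throughput_gain_percent(5.07285, 5.41521)` gives roughly 6.7% for the covariance transport benchmark, and `throughput_gain_percent(15.8702, 16.6004)` gives roughly 4.6% for the plain propagation benchmark. The corresponding time reductions, from `time_reduction_percent(51675881, 48408848)` and `time_reduction_percent(16518001, 15791432)`, are roughly 6.3% and 4.4%.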
