Closed
Labels
performance: Issue relates to the speed, memory usage, or scaling aspects of the package.
Description
This initialization happens at every n-sized node with cost O(N) where N is the full data set size. It should just be initialized once at instantiation of the splitting rule.
To do that, we just need to add one more optional argument to the SplittingRuleFactory::create signature (the caller has everything it needs for that).
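To illustrate the idea, here is a minimal sketch, not the actual grf code. `ToySplittingRule`, `split_with_reinit`, `split_with_reuse`, and `count_init_cost` are hypothetical names invented for this example, and the constructor's optional `num_samples` argument only mirrors the kind of extra argument proposed above. It counts how many buffer cells get initialized when the N-sized buffer is rebuilt at every node versus allocated once at instantiation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a splitting rule (not grf's actual class).
class ToySplittingRule {
public:
  // Optional argument, mirroring the proposed signature change: the caller
  // passes the full data set size N once, at instantiation.
  explicit ToySplittingRule(std::size_t num_samples = 0)
      : scratch(num_samples, 0), init_cost(num_samples) {}

  // Current behavior: re-initialize the N-sized buffer at every node.
  void split_with_reinit(const std::vector<std::size_t>& node_samples,
                         std::size_t num_samples) {
    scratch.assign(num_samples, 0);  // O(N) per node
    init_cost += num_samples;
    process(node_samples);
  }

  // Proposed behavior: reuse the buffer allocated in the constructor.
  void split_with_reuse(const std::vector<std::size_t>& node_samples) {
    process(node_samples);  // O(n) per node
  }

  // Total number of buffer cells initialized so far.
  std::size_t total_init_cost() const { return init_cost; }

private:
  void process(const std::vector<std::size_t>& node_samples) {
    for (std::size_t sample : node_samples) {
      assert(sample < scratch.size());
      scratch[sample] = 1;  // placeholder for the real per-sample work
    }
    for (std::size_t sample : node_samples) {
      scratch[sample] = 0;  // leave the buffer clean for the next node
    }
  }

  std::vector<std::size_t> scratch;
  std::size_t init_cost;
};

// Runs every node through one rule and reports the initialization cost.
std::size_t count_init_cost(std::size_t num_samples,
                            const std::vector<std::vector<std::size_t>>& nodes,
                            bool reuse) {
  ToySplittingRule rule(reuse ? num_samples : 0);
  for (const auto& node : nodes) {
    if (reuse) {
      rule.split_with_reuse(node);
    } else {
      rule.split_with_reinit(node, num_samples);
    }
  }
  return rule.total_init_cost();
}
```

With N = 1000 and three small nodes, the re-initializing variant initializes 3000 cells while the reuse variant initializes 1000. The gap grows with the number of nodes, since the per-node cost drops from O(N) to O(n).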
Back when adding SurvivalSplittingRule, I skipped this "optimization" and the resulting signature change, since it made practically zero difference in performance: log-rank splitting was the bottleneck. But with #1509 (which also uses relabeled_failures), this change could give a gain of around 10%.
This is not super urgent, but worth addressing. Are you OK with this change, @jtibshirani?