-
Notifications
You must be signed in to change notification settings - Fork 0
feat: parallelize maximization step #72
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: refactor/rework-arch
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
| def _optimization_worker( | ||
| state: PipelineState, | ||
| block: OptimizationBlock, | ||
| strategy: Callable, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unnecessary argument, you can get the strategy from the OptimizationBlock as follows:
block.maximization_strategy| strategy = self._strategies[block.maximization_strategy] | ||
| component_id, new_params = strategy(curr_mixture[block.component_id], state, block, self.optimizer) | ||
| results.append((component_id, new_params)) | ||
| results = Parallel(n_jobs=-1)( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
User may want to choose number of CPU cores
I would add n_jobs as a key parameter in the class constructor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a bug with joblib on Windows where running something with joblib outside of the if __name__ == "__main__" block causes the program to crash.
To solve this problem, I suggest explicitly stating in the documentation that a special launch method is required on Windows, and providing sample code.
Additionally, within run, you could check for the operating system, the number of processes, and whether it's inside the required if block. I'm not sure this is possible, but if possible, it would be a good idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't forget add your name in __author__
This pull request introduces parallel computation to the MaximizationStep by leveraging the joblib.Parallel library.
Parallel MaximizationStep Implementation
Testing
Results