Skip to content

Conversation

@xImoZA
Copy link

@xImoZA xImoZA commented Oct 13, 2025

This pull request introduces parallel computation to the MaximizationStep by leveraging the joblib.Parallel library.

Parallel MaximizationStep Implementation

  • Added Parallel Execution in rework_pysatl_mpest/estimators/iterative/steps/maximization_step.py

Testing

  • A serial_executor mocking strategy was implemented to ensure that the parallel logic is executed sequentially during testing.

Results

  • Benchmarks were executed with 5,000,000 samples and 100 components on 20 runs.
Version Mean time (s) Std Relative Speed
Sequential 8.4879 ±0.5904 ---
Parallel (threading) 6.3824 ±0.0438 1.33× faster

@xImoZA xImoZA linked an issue Oct 13, 2025 that may be closed by this pull request
Copy link

@iraedeus iraedeus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

def _optimization_worker(
state: PipelineState,
block: OptimizationBlock,
strategy: Callable,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unnecessary argument, you can get the strategy from the OptimizationBlock as follows:

block.maximization_strategy

strategy = self._strategies[block.maximization_strategy]
component_id, new_params = strategy(curr_mixture[block.component_id], state, block, self.optimizer)
results.append((component_id, new_params))
results = Parallel(n_jobs=-1)(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

User may want to choose number of CPU cores
I would add n_jobs as a key parameter in the class constructor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a bug with joblib on Windows where running something with joblib outside of the if __name__ == "__main__" block causes the program to crash.

To solve this problem, I suggest explicitly stating in the documentation that a special launch method is required on Windows, and providing sample code.

Additionally, within run, you could check for the operating system, the number of processes, and whether it's inside the required if block. I'm not sure this is possible, but if possible, it would be a good idea.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't forget add your name in __author__

@xImoZA xImoZA requested a review from iraedeus October 13, 2025 17:12
@iraedeus iraedeus added the invalid This doesn't seem right label Oct 13, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

invalid This doesn't seem right

Projects

None yet

Development

Successfully merging this pull request may close these issues.

FEAT: Parallelize MaximizationStep

3 participants