Add another OEM update post #174


Merged: 2 commits merged into master on Jul 28, 2025

Conversation

@odow (Member) commented Jul 23, 2025

@jajhall @galabovaa @Opt-Mucca @joaquimg let me know if you have any changes or additions.

Review thread on this excerpt from the post:

> bound tree. The downside to this approach is that it is not deterministic;
> repeated runs of the same model are not guaranteed to find the same optimal
> solution. In our preliminary testing, the speed-up that can be expected is
> problem-dependent, but, with eight threads, it is common for the speed-up to be
Testing (22/07) on easy MIPLIB problems with 4 instances racing, the MIP race winner is usually about 50% slower, with just a few instances showing a good speed-up.

@mathgeekcoder wasn't getting 2-5 times speed-up with the Python variant.

I'll do more testing in due course, but there's still development work to do.
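As an aside for readers who want to try this kind of comparison themselves, the sketch below is purely illustrative (it is not the test harness used above): it uses highspy to time the same model under a few HiGHS random seeds, which is the seed-to-seed variation the racing approach relies on. The model file name and the seed list are placeholders.

```python
# Illustrative sketch only: time one MIP under several HiGHS random seeds.
# Assumes highspy is installed and an MPS instance (placeholder name
# "fiball.mps") is available in the working directory.
import time

import highspy

MODEL = "fiball.mps"  # placeholder: any local MPS file

for seed in (0, 1, 2, 3):
    h = highspy.Highs()
    h.setOptionValue("output_flag", False)  # silence the solver log
    h.setOptionValue("random_seed", seed)   # HiGHS integer option
    h.readModel(MODEL)
    t0 = time.perf_counter()
    h.run()
    elapsed = time.perf_counter() - t0
    obj = h.getInfo().objective_function_value
    print(f"seed={seed}  time={elapsed:6.1f}s  objective={obj:.6g}")
```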

@odow (Member Author):

What about larger problems? Why is it slower?

Another participant replied:

BTW: I wasn't testing first-to-fail with highspy, so I can't comment on speed-up. That said, a user on the HiGHS Discord noticed a 19.5 hr -> 5.25 hr speed-up (3.7x) with highspy concurrent + first-to-fail.

Also, I'm not sure if it's the same issue @jajhall is noticing, but I've seen a slow-down on fiball.mps when using the C++ concurrent solver. However, if I perform the presolve step once and share the presolve information (an extremely ugly hack), fiball.mps no longer shows the slow-down.

To clarify:

- original: 33.8s
- original (best seed): 6.6s
- concurrent (individual presolve): 20.9s
- concurrent (individual presolve, but only sharing solution with main thread): 11.4s
- concurrent (shared presolve): 6.6s

Clearly more testing is required, but hopefully this gives more insight.

@odow (Member Author):

Why is "but only sharing solution with main thread" 2x faster than the individual presolve?

Is it overhead of the communication between threads? Or is it because the new solutions cause a change in the search tree?

I don't see why the concurrent solver should ever be slower than serial. If sharing solutions is a bottleneck, then we shouldn't share solutions; just run in total isolation.

Reply:

I'm inclined to think it's changing the search tree. But this is just one example. Need to test more.

Obviously threading has overheads, and there are already parts of the MIP solve that are parallelized, so we may need to be careful about task prioritization. But yeah, concurrent should typically be faster than synchronous (with the default seed).
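To make the "run in total isolation" idea above concrete, here is a hedged sketch (not HiGHS's built-in C++ concurrent solver): it races independent highspy solves with different seeds in separate processes and reports whichever finishes first. The model path, seeds, and worker count are placeholders.

```python
# Illustrative sketch only: race fully isolated solves with different seeds
# and take the first to finish. Not the HiGHS C++ concurrent implementation.
import time
from concurrent.futures import FIRST_COMPLETED, ProcessPoolExecutor, wait

import highspy

MODEL = "fiball.mps"  # placeholder: any local MPS file


def solve_with_seed(seed):
    h = highspy.Highs()
    h.setOptionValue("output_flag", False)
    h.setOptionValue("random_seed", seed)
    h.readModel(MODEL)
    t0 = time.perf_counter()
    h.run()
    obj = h.getInfo().objective_function_value
    return seed, time.perf_counter() - t0, obj


if __name__ == "__main__":
    seeds = (0, 1, 2, 3)
    with ProcessPoolExecutor(max_workers=len(seeds)) as pool:
        futures = [pool.submit(solve_with_seed, s) for s in seeds]
        done, _ = wait(futures, return_when=FIRST_COMPLETED)
        seed, elapsed, obj = next(iter(done)).result()
        print(f"race winner: seed={seed} in {elapsed:.1f}s, objective={obj:.6g}")
        # Note: the losing solves keep running until they finish; a process
        # pool cannot interrupt a worker mid-solve, so leaving the "with"
        # block waits for them. A real racing implementation needs a way to
        # stop the losers, which this sketch does not attempt.
```

Processes rather than threads are used here so that each solve is fully isolated and the Python GIL is not a factor; nothing is shared between the racers.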

@jajhall commented Jul 24, 2025

Sharing solutions will change the behaviour of an individual thread, so may well slow down a "lucky" fast serial run with one of the random seeds used when racing.

This code needs more R&D.


> I don't see why the concurrent solver should ever be slower than serial. If sharing solutions is a bottleneck, then we shouldn't share solutions; just run in total isolation.

With enough threads there will be a slow-down because threads are competing with each other for memory access in a process that is memory-bound. This is why I've not tried racing more than 4 yet.

A slow-down can be avoided only if there are no more threads than memory channels.

@jajhall left a comment:

See in-line comments

@odow merged commit 8ef80b3 into master on Jul 28, 2025.
@odow deleted the od/oem-jul branch on Jul 28, 2025 at 10:02.