Add another OEM update post #174
_posts/2025-07-21-oem_update.md (outdated)
bound tree. The downside to this approach is that it is not deterministic;
repeated runs of the same model are not guaranteed to find the same optimal
solution. In our preliminary testing, the speedup that can be expected is
problem-dependent, but, with eight threads, it is common for the speedup to be
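The racing scheme the excerpt describes — several solves of the same model launched concurrently, first result wins — can be sketched with plain Python threading. This is not the HiGHS implementation; `solve_mip`, `race`, and the objective value are hypothetical stand-ins:

```python
import concurrent.futures as cf

def solve_mip(problem, seed):
    # Hypothetical stand-in: a real race would run the MIP solver with a
    # different random seed per worker; every seed reaches the same optimum,
    # just at a seed-dependent speed.
    return {"seed": seed, "objective": 42.0}

def race(problem, seeds):
    # Launch one solve per seed and take the first result to arrive;
    # the remaining solves are cancelled (or simply ignored) once
    # there is a winner.
    with cf.ThreadPoolExecutor(max_workers=len(seeds)) as pool:
        futures = [pool.submit(solve_mip, problem, s) for s in seeds]
        done, not_done = cf.wait(futures, return_when=cf.FIRST_COMPLETED)
        for f in not_done:
            f.cancel()
        return next(iter(done)).result()

print(race("model.mps", seeds=[0, 1, 2, 3]))
```

Which seed wins depends on scheduling, which is exactly why repeated runs of such a race are not guaranteed to return the same incumbent path.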
Testing (22/07) on easy MIPLIB problems with four instances racing, the MIP race winner is usually about 50% slower, with just a few showing good speed-up.
@mathgeekcoder wasn't getting a 2–5× speed-up with the Python variant.
I'll do more testing in due course, but there's still development work to do.
What about larger problems? Why is it slower?
BTW: I wasn't testing first-to-fail with highspy, so I can't comment on speed-up. That said, a user on the HiGHS Discord noticed a 19.5 hr → 5.25 hr speed-up (3.7×) with highspy concurrent + first-to-fail.
Also, I'm not sure if it's the same issue @jajhall is noticing, but I've seen a slowdown on fiball.mps when using the C++ concurrent solver. However, if I perform the presolve step once and share the presolve info (an extremely ugly hack), fiball.mps no longer shows the slowdown.
To clarify:
- Original: 33.8 s
- Original (best seed): 6.6 s
- Concurrent (individual presolve): 20.9 s
- Concurrent (individual presolve, but only sharing solution with main thread): 11.4 s
- Concurrent (shared presolve): 6.6 s

Clearly more testing is required, but hopefully this gives more insight.
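The "shared presolve" variant above amounts to running presolve once and handing the reduced model to every racing worker, instead of each worker repeating the reduction itself. A minimal sketch, assuming hypothetical `presolve` and `solve_reduced` stand-ins (not the actual HiGHS internals):

```python
import concurrent.futures as cf

def presolve(problem):
    # Hypothetical stand-in for the expensive presolve/reduction step.
    return {"reduced_model": problem}

def solve_reduced(reduced, seed):
    # Hypothetical stand-in for solving the already-presolved model
    # with a given random seed.
    return {"seed": seed, "objective": 1.0}

def race_shared_presolve(problem, seeds):
    reduced = presolve(problem)  # run presolve once, share it with all workers
    with cf.ThreadPoolExecutor(max_workers=len(seeds)) as pool:
        futures = [pool.submit(solve_reduced, reduced, s) for s in seeds]
        done, not_done = cf.wait(futures, return_when=cf.FIRST_COMPLETED)
        for f in not_done:
            f.cancel()
        return next(iter(done)).result()
```

Under this structure each worker skips its own presolve pass, which is where the individual-presolve runs appear to lose time on fiball.mps.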
Why is "but only sharing solution with main thread" 2× faster than individual presolve?
Is it the overhead of communication between threads? Or is it because the new solutions cause a change in the search tree?
I don't see why the concurrent solver should ever be slower than serial. If sharing solutions is a bottleneck, then we shouldn't share solutions; just run in total isolation.
I'm inclined to think it's changing the search tree, but this is just one example; we need to test more.
Obviously threading has overheads, and there are already parts of the MIP solve that are parallelized, so we may need to be careful about task prioritization. But yeah, concurrent should typically be faster than synchronous (with the default seed).
Sharing solutions will change the behaviour of an individual thread, so it may well slow down a "lucky" fast serial run with one of the random seeds used when racing.
This code needs more R&D.
> I don't see why the concurrent solver should ever be slower than serial. If sharing solutions is a bottleneck, then we shouldn't share solutions; just run in total isolation.
With enough threads there will be a slowdown, because the threads compete with each other for memory access in a process that is memory-bound. This is why I've not tried racing more than four instances yet.
A slowdown is avoided only if there are no more threads than memory channels.
See in-line comments
@jajhall @galabovaa @Opt-Mucca @joaquimg let me know if you have any changes/additions to add.