Feature: Enable appending to trajectory when using ts.optimize/ts.integrate
#361
Conversation
orionarcher left a comment
> Also fixes a bug where, whenever a structure has converged, all other trajectories are reset by re-opening them in "w" mode: https://github.com/TorchSim/torch-sim/blob/main/torch_sim/runners.py#L483.
Can you say more here? This would be a pretty serious bug because it would mean all the optimization trajectories are wrong.
More generally, I think this PR needs to grapple with the fact that if we are appending, the trajectories we are appending to might have different past progress. In a batched setting, we can't just take the maximum progress of one as the progress of all. I realize that makes literally everything more annoying but such is the curse and blessing of TorchSim.
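For context, a minimal illustration of the failure mode, using plain file I/O as a stand-in for the HDF5-backed trajectory files (filenames are illustrative):

```python
from pathlib import Path

path = Path("traj_0.txt")  # stand-in for an HDF5 trajectory file

path.write_text("frame 0\nframe 1\n")  # initial write: file is created

with open(path, "w") as f:             # re-opening in "w" truncates it
    f.write("frame 2\n")
print(path.read_text())                # only "frame 2" survives

with open(path, "a") as f:             # "a" appends, preserving frames
    f.write("frame 3\n")
print(path.read_text())                # "frame 2" then "frame 3"
```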
Unfortunately I believe this affects all optimization trajectories. Looking at …
I would rather push for a PR allowing a restart of simulations (#43). In practice the difference is that, supposing you run multiple times, …
@thomasloux happy to change it to this way of running. Just to make sure I understand, the proposal is to: …
If that's the case, I think we have to do it like in the …
For MD simulations at least, there is no such thing; all systems evolve at the same time. I actually assume it's the same for optimize, if not for the convergence criteria that would be met.
I don't have any strict opinion on optimization as I don't use it that much.
Thanks @thomasloux! I've updated the code to assume that for integrate() all trajectories start from the same time step, and then only integrate for n_steps - initial_step steps.
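In other words, under this scheme n_steps is a step count to reach, not a number of new steps to take. A minimal sketch of the arithmetic (the function name is illustrative, not torch-sim's API):

```python
def remaining_steps(n_steps: int, initial_step: int) -> int:
    """Steps still to run when n_steps is the step count to REACH."""
    return max(n_steps - initial_step, 0)

# e.g. a 1000-step target interrupted after step 600 resumes for 400 steps
assert remaining_steps(1000, 600) == 400
```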
thomasloux left a comment
That's exactly what I wanted for restart and ts.integrate! Just waiting for the tests to pass to approve.
Will take a look at the failures and hopefully fix them.
Tests pass locally for me now, let's see what happens in the CI.
thomasloux left a comment
One part to clarify for ts.optimize, otherwise it's really nice work. I think in a later PR we should clarify the Trajectory objects. I'm not a fan of the mode 'r' vs 'a'; I would probably prefer an argument: either restart=True or overwrite_if_exists=True.
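One possible shape for that suggestion, sketched with plain file I/O (the parameter name and mapping are illustrative, not the PR's implementation):

```python
# Illustrative only: hide the mode string behind a boolean flag.
def open_trajectory(path: str, restart: bool = False):
    mode = "a" if restart else "w"  # restart=True resumes; default overwrites
    return open(path, mode)         # stand-in for the HDF5 trajectory opener
```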
thomasloux left a comment
good for me now
orionarcher left a comment
> I've updated the code to assume that for integrate() all trajectories start from the same time step, and then only integrate for n_steps - initial_step steps. Is that in line with what you had in mind?
Sorry for being a bit late on this, but I am going to push back on this implementation.
I think that n_steps should mean the number of steps you take, not the number of steps you reach. My heavy prior would be that n_steps is a number of steps taken; as a user, I'd find it quite strange if that weren't the case. What do you think?
I also don't think that if the steps don't match that we should truncate all the trajectories to the same length. In my mind, we should detect the current step across all the trajectories, then progressively append to that step + i, where i is the number of steps taken by the integrator. Does that make sense?
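A sketch of those semantics (names are hypothetical stand-ins for torch-sim internals): n_steps new steps are taken, and each trajectory resumes from its own last recorded step rather than a shared maximum.

```python
def resume_offsets(last_steps: dict[str, int]) -> dict[str, int]:
    """Per-trajectory resume point; no shared maximum across the batch."""
    return {path: step + 1 for path, step in last_steps.items()}

last_steps = {"traj_0.h5": 599, "traj_1.h5": 350}  # uneven past progress
n_steps = 100                                      # steps TAKEN, not reached
for path, offset in resume_offsets(last_steps).items():
    print(f"{path}: writes steps {offset} through {offset + n_steps - 1}")
```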
I want to say I am quite appreciative of the PR and I think being able to append with the runners is a good feature. I am being nit-picky because I think it's challenging to get the right behavior and expectations for this in a batched context.
Hey Orion, I think that both implementations make sense. I see two possibilities: …
I think the 2nd option is more appealing than the first one. Also, we could easily obtain the first behaviour by adding an argument to the functions. The truncation is also a direct result of the fact that I pushed for a restart behaviour. At the moment an MD simulation from ts.integrate should always be in sync for all states in the batch, so it does not make sense to allow different indices in this PR. We could change that in another PR, though.
Good framing. Both use cases are valid, so perhaps it's worth considering the left-out group. Finishing the simulation if …
In the case of an interrupted simulation, it's likely that there will be some files that are complete, some that are incomplete, and some that have not started. We need a way to resume simulations that balances ease, sharp edges, and encouraging best practices. Here is what I propose: …
Does this workflow feel too complicated? Let me know what y'all think.
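Since any resume workflow starts by detecting each file's state, here is a minimal sketch of the three-way classification the comment describes (helper and frame counting are assumptions; the real files are HDF5):

```python
from pathlib import Path

def count_frames(path: Path) -> int:
    # Stand-in frame counter: one line per frame in this toy format.
    return len(path.read_text().splitlines())

def classify(path: Path, target_steps: int) -> str:
    """Label a trajectory file as not started, incomplete, or complete."""
    if not path.exists():
        return "not started"
    return "complete" if count_frames(path) >= target_steps else "incomplete"
```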
Hi @orionarcher, thanks for the detailed suggestions. That makes sense to me; I'll incorporate these suggestions, though due to vacation I might only have time towards the end of the month. Let's also iron out the last two open discussions in this PR and then hopefully we're good to go :). Daniel
@orionarcher could you take another look? I've changed the behavior to always append. The test failures here appear to be unrelated to this PR; feel free to let me know if you think otherwise. Thanks!
Summary
- Added tests test_optimize_append_to_trajectory and test_integrate_append_to_trajectory to ensure trajectories are being appended to.
- Fixes a bug where, whenever a structure has converged, all other trajectories are reset by re-opening them in "w" mode: https://github.com/TorchSim/torch-sim/blob/main/torch_sim/runners.py#L483
- Closes #356.
Checklist
Before a pull request can be merged, the following items must be checked: