Description
Describe the bug
A PreequilibratedSimulation workflow depends on sourcing from a file storage system (named stored_data by default) that contains equilibrated data. When stored_data is empty, running a PreequilibratedSimulation workflow with multiple protocols [?] triggers the error "A protocol with the same id already exists in this workflow".
To Reproduce
Run a re-fit without copying over stored_data or providing its path to LocalFileStorage.
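As a stopgap on the user side, a pre-flight check along these lines would surface the missing data before the job is submitted. This is only a sketch: `check_stored_data` is a hypothetical helper, not part of Evaluator, and it assumes the storage is a plain directory on disk whose path is the one that would be handed to LocalFileStorage.

```python
from pathlib import Path


def check_stored_data(path="stored_data"):
    """Fail early if the storage directory is missing or empty.

    Hypothetical helper, not an Evaluator API. Assumes ``path`` is the
    directory that would otherwise be passed to LocalFileStorage.
    """
    root = Path(path)
    if not root.is_dir():
        raise FileNotFoundError(f"storage directory not found: {root}")
    # any() stops at the first entry, so this stays cheap for large stores.
    if not any(root.iterdir()):
        raise ValueError(f"storage directory is empty: {root}")
    return root
```

Calling `check_stored_data("stored_data")` at the top of the re-fit script would turn the confusing setup-time error into an immediate, explicit one.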
Output
Link to gist, also my own error below:
File "/data/homezvol3/lilyw7/miniforge3/envs/evaluator-050/lib/python3.11/site-packages/openff/evaluator/server/server.py", line 697, in _handle_connections
self._handle_stream(connection, connection.getpeername())
File "/data/homezvol3/lilyw7/miniforge3/envs/evaluator-050/lib/python3.11/site-packages/openff/evaluator/server/server.py", line 678, in _handle_stream
self._handle_job_submission(connection, address, message_length)
File "/data/homezvol3/lilyw7/miniforge3/envs/evaluator-050/lib/python3.11/site-packages/openff/evaluator/server/server.py", line 617, in _handle_job_submission
self._launch_batch(batch)
File "/data/homezvol3/lilyw7/miniforge3/envs/evaluator-050/lib/python3.11/site-packages/openff/evaluator/server/server.py", line 541, in _launch_batch
current_layer.schedule_calculation(
File "/data/homezvol3/lilyw7/miniforge3/envs/evaluator-050/lib/python3.11/site-packages/openff/evaluator/layers/layers.py", line 416, in schedule_calculation
futures = cls._schedule_calculation(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/homezvol3/lilyw7/miniforge3/envs/evaluator-050/lib/python3.11/site-packages/openff/evaluator/layers/workflow.py", line 229, in _schedule_calculation
workflow_graph, provenance = cls._build_workflow_graph(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/homezvol3/lilyw7/miniforge3/envs/evaluator-050/lib/python3.11/site-packages/openff/evaluator/layers/workflow.py", line 155, in _build_workflow_graph
workflow_graph.add_workflows(*workflows)
File "/data/homezvol3/lilyw7/miniforge3/envs/evaluator-050/lib/python3.11/site-packages/openff/evaluator/workflow/workflow.py", line 913, in add_workflows
workflow.replace_protocol(original_protocol, new_protocol, True)
File "/data/homezvol3/lilyw7/miniforge3/envs/evaluator-050/lib/python3.11/site-packages/openff/evaluator/workflow/workflow.py", line 487, in replace_protocol
raise ValueError(
ValueError: ('A protocol with the same id already exists in this workflow: ', 'dens_2188924364196997512|unpack_data')
Computing environment (please complete the following information):
- Operating system
- Output of running conda list
Additional context
The expected behaviour is that the unpack_data_mixture protocol simply errors when it runs. Instead, this error is raised during workflow setup, before any protocols are executed (or even passed to the Dask scheduler).
(thinking out loud below)
Why are these protocols mergeable? I think mergeability checks the inputs, and when simulation_data_path is undefined that confuses Evaluator. simulation_data_path is the only input to UnpackStoredEquilibrationData, so it's guaranteed to be confusing when more than one box has no simulation_data_path. I think this has not been caught by tests because, in trying to keep them small and minimal, I've only tested the case where a single box is missing.
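The hypothesis above can be illustrated with a toy model. This is not Evaluator's code: `ToyProtocol` and `add_workflow` are hypothetical stand-ins, and the merge rule (protocols merge when all inputs compare equal, with two undefined inputs counting as equal) is an assumption about how the real check behaves.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToyProtocol:
    """Hypothetical stand-in for an unpack protocol with a single input."""

    id: str
    simulation_data_path: Optional[str] = None  # None models an undefined input

    def can_merge(self, other: "ToyProtocol") -> bool:
        # Assumed rule: mergeable when every input matches. Two undefined
        # inputs compare equal, so distinct boxes look identical.
        return self.simulation_data_path == other.simulation_data_path


def add_workflow(graph: List[ToyProtocol], workflow: List[ToyProtocol]) -> List[str]:
    """Merge a workflow into a shared graph and return its final protocol ids."""
    ids = [protocol.id for protocol in workflow]
    for index, protocol in enumerate(workflow):
        match = next((p for p in graph if p.can_merge(protocol)), None)
        if match is None:
            graph.append(protocol)  # nothing to merge with; keep as-is
            continue
        if match.id == protocol.id:
            continue  # already the same protocol
        if match.id in ids:
            # The merged-into id is already used inside this workflow.
            raise ValueError(
                f"A protocol with the same id already exists in this workflow: {match.id}"
            )
        ids[index] = match.id
    return ids


# One box missing its data: inputs differ, so nothing merges.
one_missing = [
    ToyProtocol("box_a|unpack_data"),
    ToyProtocol("box_b|unpack_data", "stored_data/box_b"),
]
add_workflow([], one_missing)

# Two boxes missing their data: both inputs are undefined, so the second
# protocol merges into the first and the ids collide.
two_missing = [ToyProtocol("box_a|unpack_data"), ToyProtocol("box_b|unpack_data")]
try:
    add_workflow([], two_missing)
except ValueError as error:
    print(error)
```

Under these assumptions the single-missing-box case passes silently, which would be consistent with the existing tests not catching the problem.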
We could probably band-aid this with the same solution as #637, but for UnpackStoredEquilibrationData.