Skip to content

Conversation

Shaobo-Zhou
Copy link
Contributor

@Shaobo-Zhou Shaobo-Zhou commented Jul 29, 2025

Description

This PR introduces significant improvements to the action space and handling of stochastic mapping passes in the reinforcement learning environment:

🚀 Major Changes
Expanded Action Space

  • Added AIRouting as a routing option and wrapped it in SafeAIRouting for robust integration into the RL pipeline.
  • Introduced hybrid mapping actions that combine layout and routing stages.

Motivation for Mapping Redesign

  • The previous separation of layout and routing did not truly reflect the state machine described in the MQT Predictor paper.
  • If a circuit is mapped (i.e., layout + routing) and then undergoes further optimization passes, it may become "unmapped" again — violating hardware constraints or invalidating the original layout.
  • In such cases, continuing with the previous layout and only re-running routing leads to suboptimal results, as the old layout is no longer optimal for the changed circuit structure.
  • The redesigned mapping action ensures layout and routing are always performed together when remapping is needed, preventing mismatches between layout and circuit state and improving both optimality and reliability.

Support for Stochastic Passes

  • Wrapped stochastic actions (e.g., AIRouting, SabreLayout) in a multi-trial evaluation loop, similar to QiskitO3 pipeline.
  • Introduced (layout_trials, routing_trials) as parameters to control trial counts, enabling improved predictor performance and stability.
  • Applied Qiskit O3 default parameters for SabreLayout and VF2Layout.

Fixes and Enhancements

  • Fixed a bug in OptimizeCliffords by ensuring CollectCliffords runs beforehand.
  • Prevented Qiskit fallback to default gate basis ['id', 'u1', 'u2', 'u3', 'cx'] (see [https://quantum.cloud.ibm.com/docs/en/api/qiskit/0.24/transpiler]) during:
    • decompose
    • tk_to_qiskit conversion
    • Optimize1qGatesDecomposition
      ensuring correct native gate usage which is essential if individual gate counts are included in the feature space—otherwise RL cannot meaningfully learn from a generic gate basis.
  • Fixed incorrect usage of GatesInBasis in rl/predictorenv.py
  • Changed benchmark level to INDEP in test_predictor_rl.py, since the current action space does not guarantee support for high-level gates.

Dependency Update

  • Added qiskit-ibm-ai-local-transpiler to the dependencies
  • Pinned networkx==2.8.5 to ensure compatibility with qiskit-ibm-ai-local-transpiler
  • Upgraded pytket_qiskit>=0.71.0
  • Upgraded torch>=2.7.1,<2.8.0
  • Removed support for Python 3.13 due to incompatibility with qiskit-ibm-ai-local-transpiler (requires <=3.12)

Checklist:

  • The pull request only contains commits that are focused and relevant to this change.
  • I have added appropriate tests that cover the new/changed functionality.
  • I have updated the documentation to reflect these changes.
  • I have added entries to the changelog for any noteworthy additions, changes, fixes, or removals.
  • I have added migration instructions to the upgrade guide (if needed).
  • The changes follow the project's style guidelines and introduce no new warnings.
  • The changes are fully tested and pass the CI checks.
  • I have reviewed my own code changes.

@Shaobo-Zhou Shaobo-Zhou marked this pull request as ready for review July 29, 2025 16:13
@Shaobo-Zhou Shaobo-Zhou marked this pull request as draft July 29, 2025 16:36
@Shaobo-Zhou Shaobo-Zhou marked this pull request as ready for review July 29, 2025 16:37
Fix: resolve pre-commit issues and add missing annotations

Fix: resolve pre-commit issues and add missing annotations

Remove example_test.py

Remove example_test.py
@Shaobo-Zhou Shaobo-Zhou force-pushed the hybrid-mapping branch 4 times, most recently from fd457a0 to 4b78134 Compare July 29, 2025 19:39
@Shaobo-Zhou Shaobo-Zhou marked this pull request as draft July 29, 2025 19:53
Fix: resolve pre-commit issues and add missing annotations

Fix: resolve pre-commit issues and add missing annotations

Fix: resolve pre-commit issues and add missing annotations
@burgholzer
Copy link
Member

@Shaobo-Zhou Just fyi: you can also run mypy locally to debug this on your machine without relying on CI by following: https://mqt.readthedocs.io/projects/predictor/en/latest/development_guide.html#code-formatting-and-linting
You can also run the tests locally as described here: https://mqt.readthedocs.io/projects/predictor/en/latest/development_guide.html#running-tests

@burgholzer
Copy link
Member

@Shaobo-Zhou Is there any reason you keep closing your issues?

@burgholzer burgholzer reopened this Jul 30, 2025
@Shaobo-Zhou
Copy link
Contributor Author

@burgholzer Hi, sorry for the confusion. I had been closing PRs because I noticed that each push triggered notifications, and I didn’t want to create unnecessary noise while I was still figuring things out.
I’ll keep this PR open now and run the checks locally first, then push updates once everything passes. Thanks for your clarification!

@burgholzer
Copy link
Member

Don't worry about the notifications. That's fine and it's exactly what draft PRs are for 😌
You also do not need to force push. You can simply accumulate commits. We squash merge PRs anyway.
If you want to avoid the automatic commits from the pre-commit bot in the CI, you can simply run pre-commit locally before pushing

pre-commit run -a

or

nox -s lint

and commit the resulting changes.

Once the PR is really ready for review, simply give us a ping here.

@burgholzer burgholzer added enhancement New feature or request minor Part of a minor release refactor PR or issues that refactor code labels Aug 4, 2025
Fix bugs

Fix bugs

Fix bugs
@burgholzer
Copy link
Member

By the way: If you are annoyed that you always have to wait for someone from the team to trigger the CI, the easiest way to get around that is to submit another PR with a small change (could be a typo fix or similar) that we can quickly merge. This makes you a repeating contributor, which automatically runs CI without direct approval.
Just in case that's starting to become annoying.

@Shaobo-Zhou
Copy link
Contributor Author

@burgholzer Thanks for the info!

Copy link

codecov bot commented Aug 7, 2025

Codecov Report

❌ Patch coverage is 83.15217% with 31 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/mqt/predictor/rl/helper.py 70.8% 14 Missing ⚠️
src/mqt/predictor/rl/actions.py 81.9% 13 Missing ⚠️
src/mqt/predictor/rl/predictorenv.py 92.8% 4 Missing ⚠️

📢 Thoughts on this report? Let us know!

@Shaobo-Zhou Shaobo-Zhou marked this pull request as ready for review August 8, 2025 12:05
@Shaobo-Zhou
Copy link
Contributor Author

@burgholzer Could you also add @flowerthrower as a reviewer for this PR so we can clarify any potential confusion during the review process?

@burgholzer
Copy link
Member

@burgholzer Could you also add @flowerthrower as a reviewer for this PR so we can clarify any potential confusion during the review process?

Done ✅

Copy link
Collaborator

@flowerthrower flowerthrower left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Shaobo, and thank you for setting up this PR.
Great to see you working with our setup — this should make future PRs easier to manage.

A few general points (in addition to the inline comments):

  • Why is it no longer possible to work with algorithm-level circuits (but was before)?
  • This PR should focus as much as possible on adding the new actions.
  • Other improvements not strictly required for these passes should be left out of this single-purpose PR, perhaps we can make this a bit more lean.

determine_valid_actions_for_state
I agree, better documentation would help, but the function generally works as intended.
No need to merge layout and routing.

If a circuit becomes unmapped, determine_valid_actions_for_state already enforces:

  • Re-layout if self.layout is unset
  • Re-routing if unrouted

This depends on self.layout being correctly updated after optimizations.
If that is currently not the case, I would suggest to keep the current structure, but ensure self.layout always matches the circuit state after each optimization.


Next steps:
I haven’t yet reviewed the new passes in detail, but you can already address this feedback (may affect the combined actions) and request another review once done.

Overall, I am happy to see this progressing — this brings us really closer to integrating your work into the MQT Predictor.

@@ -29,7 +30,7 @@

# TODO(denialhaag): Add 3.14 when all dependencies support it
# https://github.com/munich-quantum-toolkit/predictor/issues/420
PYTHON_ALL_VERSIONS = ["3.10", "3.11", "3.12", "3.13"]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems rather unfortunate to lose Python 3.13 support to enable a single pass; it would be preferable to make the pass optional based on the available Python version (ideally with a short comment that mentions the current limitation and can be revisited in the future).

@@ -66,7 +67,9 @@ def _run_tests(
"test",
*install_args,
"pytest",
"-v",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No expert on this, but these flags seem like artefacts from debugging and apparently were not necessary before (are they now?).

@@ -82,6 +85,9 @@ def tests(session: nox.Session) -> None:
@nox.session(reuse_venv=True, venv_backend="uv", python=PYTHON_ALL_VERSIONS)
def minimums(session: nox.Session) -> None:
"""Test the minimum versions of dependencies."""
if platform.system() == "Windows":
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we know why this is suddenly too slow? Seems unfortunate to miss this test if it was running fine before (note, there have always been longer test durations with the Predictor, so this is potentially fine).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually the windows test runs fine and is not skipped...I will delete this part

@@ -133,5 +139,7 @@ def docs(session: nox.Session) -> None:
"--frozen",
"sphinx-autobuild" if serve else "sphinx-build",
*shared_args,
"-v",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

@@ -44,8 +44,10 @@ dependencies = [
"numpy>=1.24; python_version >= '3.11'",
"numpy>=1.22",
"numpy>=1.22,<2; sys_platform == 'darwin' and 'x86_64' in platform_machine and python_version < '3.13'", # Restrict numpy v2 for macOS x86 since it is not supported anymore since torch v2.3.0
"torch>=2.7.1,<2.8.0; sys_platform == 'darwin' and 'x86_64' in platform_machine and python_version < '3.13'", # Restrict torch v2.3.0 for macOS x86 since it is not supported anymore.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reason for deleting this?

layouted_qc = layout_pm.run(qc)
layout_props = dict(layout_pm.property_set)
except Exception:
return qc, {}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps an error message as below would be nice.

qc: The input quantum circuit.
max_iteration: A tuple (layout_trials, routing_trials) specifying
how many times to try.
metric_fn: Optional function to score circuits; defaults to circuit depth.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could use the already existing figure_of_merits here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be an option, I tried to use the standard SWAP count metric as in Sabre

if getattr(action, "stochastic", False):

def metric_fn(circ: QuantumCircuit) -> float:
return float(circ.count_ops().get("swap", 0))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the docstring above, it was mentioned that this defaults to circuit depth?

@@ -37,7 +38,7 @@ def test_predictor_env_reset_from_string() -> None:
device = get_device("ibm_eagle_127")
predictor = Predictor(figure_of_merit="expected_fidelity", device=device)
qasm_path = Path("test.qasm")
qc = get_benchmark("dj", BenchmarkLevel.ALG, 3)
qc = get_benchmark("dj", BenchmarkLevel.INDEP, 3)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this restriction is really unavoidable (see general comment), then the examples and documentation should be updated accordingly.

ml_predictor.compile_training_circuits(
timeout=600, path_compiled_circuits=target_path, path_uncompiled_circuits=source_path, num_workers=1
)
ml_predictor.compile_training_circuits(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not be an issue as the timeout_watcherstill executes the function (even though without the timeout).

@Shaobo-Zhou
Copy link
Contributor Author

Hi Shaobo, and thank you for setting up this PR. Great to see you working with our setup — this should make future PRs easier to manage.

A few general points (in addition to the inline comments):

  • Why is it no longer possible to work with algorithm-level circuits (but was before)?
  • This PR should focus as much as possible on adding the new actions.
  • Other improvements not strictly required for these passes should be left out of this single-purpose PR, perhaps we can make this a bit more lean.

determine_valid_actions_for_state I agree, better documentation would help, but the function generally works as intended. No need to merge layout and routing.

If a circuit becomes unmapped, determine_valid_actions_for_state already enforces:

  • Re-layout if self.layout is unset
  • Re-routing if unrouted

This depends on self.layout being correctly updated after optimizations. If that is currently not the case, I would suggest to keep the current structure, but ensure self.layout always matches the circuit state after each optimization.

Next steps: I haven’t yet reviewed the new passes in detail, but you can already address this feedback (may affect the combined actions) and request another review once done.

Overall, I am happy to see this progressing — this brings us really closer to integrating your work into the MQT Predictor.

Thanks for the feedback! I will work on those points in detail, but maybe first some comments to the general points you mentioned:

  • Why it is no longer possible to work with algorithm-level circuits (but was before):
    From my experience with earlier releases, the RL predictor was specifically designed for indep-level circuits, so I am not sure why the current tests still use algorithm-level circuits in this way. Given that the action space does not include HighLevelSynthesis, it should not be able to resolve special gates such as Oracle. I will try to debug how this worked previously, but my suspicion is that the terminate status might never have been reached and the tests passed due to timeout(see the latest closed PR for reference).

  • Merging of layout and routing:
    As you mentioned, one reason for merging is that the current structure does not perform re-layout after the circuit state changes. Introducing re-layout would require additional checks in the logic to ensure that a re-layout is enforced after every optimization step, in order to match the current circuit state. This can get a bit messy. I can try to do this in the current PR and then include a merged version of layout and routing in a follow-up PR.

@flowerthrower
Copy link
Collaborator

flowerthrower commented Aug 13, 2025

  • Merging of layout and routing:
    As you mentioned, one reason for merging is that the current structure does not perform re-layout after the circuit state changes. Introducing re-layout would require additional checks in the logic to ensure that a re-layout is enforced after every optimization step, in order to match the current circuit state. This can get a bit messy. I can try to do this in the current PR and then include a merged version of layout and routing in a follow-up PR.

Currently, determine_valid_actions_for_state works as follows:

  • If not synthesized:

    • If laid out → perform synthesis, optimization, or routing
    • Else → perform synthesis and optimization only
  • If synthesized:

    • If laid out and mapped → perform optimization or terminate
    • If only laid out → perform routing
    • Else → perform layout, mapping, or optimization

This structure already ensures re-layout whenever necessary and enforces the flow described in the predictor paper — i.e., start with synthesis before mapping. That is of course, as long as the layout attribute is correctly set after it has changed, e.g., in an optimization pass.

EDIT: I can see how this order is broken in the case of not synthesized + laid out, and a routing pass is selected, but I don't see how that would break the re-layout whenever necessary.

@Shaobo-Zhou
Copy link
Contributor Author

Shaobo-Zhou commented Aug 14, 2025

  • Merging of layout and routing:
    As you mentioned, one reason for merging is that the current structure does not perform re-layout after the circuit state changes. Introducing re-layout would require additional checks in the logic to ensure that a re-layout is enforced after every optimization step, in order to match the current circuit state. This can get a bit messy. I can try to do this in the current PR and then include a merged version of layout and routing in a follow-up PR.

Currently, determine_valid_actions_for_state works as follows:

  • If not synthesized:

    • If laid out → perform synthesis, optimization, or routing
    • Else → perform synthesis and optimization only
  • If synthesized:

    • If laid out and mapped → perform optimization or terminate
    • If only laid out → perform routing
    • Else → perform layout, mapping, or optimization

This structure already ensures re-layout whenever necessary and enforces the flow described in the predictor paper — i.e., start with synthesis before mapping. That is of course, as long as the layout attribute is correctly set after it has changed, e.g., in an optimization pass.

EDIT: I can see how this order is broken in the case of not synthesized + laid out, and a routing pass is selected, but I don't see how that would break the re-layout whenever necessary.

That is of course, as long as the layout attribute is correctly set after it has changed, e.g., in an optimization pass.

I think the real issue is exactly that optimization passes change the circuit structure but don’t reset the layout (only layout actions set the layout), so the layout attribute never changes. Since a layout exists, the only viable option becomes routing, even though a re-layout might actually be needed.

@burgholzer
Copy link
Member

I think the real issue is exactly that optimization passes change the circuit structure but don’t reset the layout (only layout actions set the layout), so the layout attribute never changes. Since a layout exists, the only viable option becomes routing, even though a re-layout might actually be needed.

Just throwing a random comment in here because I was roughly following the discussion.
Originally, we took special care that the optimizations that are applied after layout and routing are routing-aware in the sense that they preserve the circuit layout.
I think this distinction in the optimizations makes sense. Many of the existing passes can be configured to take a coupling map into account. While that typically limits the optimization potential, it does not require re-routing or re-layouting.
Just my two cents here though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request minor Part of a minor release refactor PR or issues that refactor code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants