@hzyhhzy commented Dec 31, 2025

This PR enables the TensorRT backend to support ONNX models, facilitating experimentation with new architectures such as Transformers. This implementation references the work of @yehu3d.

  • Dual Model Support: Supports both .onnx models and traditional .bin.gz models. The format is automatically detected based on the file extension.

  • Model Export: The ONNX export script is available here: export_onnx.py.

    • Note: This script is designed for the KataGo_Transformer repository. Minor modifications may be required to adapt it to the official KataGo PyTorch code.
  • Known Limitation: pos_len is fixed at ONNX export time, so TensorRT inference must use nnXLen and nnYLen values equal to that fixed pos_len. Smaller board sizes are still supported, but only via masking, which incurs some performance loss.
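To illustrate the extension-based auto-detection described above, here is a minimal sketch. The function name and returned labels are hypothetical, not the actual identifiers used in the backend; the real detection lives in the TensorRT backend's C++ model-loading code.

```python
from pathlib import Path


def detect_model_format(model_path: str) -> str:
    """Guess the model format from the file extension.

    Mirrors the PR's auto-detection idea: '.onnx' files are ONNX
    models, '.bin.gz' files are traditional KataGo models.
    (Hypothetical helper, not the backend's actual code.)
    """
    name = Path(model_path).name.lower()
    if name.endswith(".onnx"):
        return "onnx"
    if name.endswith(".bin.gz"):
        return "katago"
    raise ValueError(f"Unrecognized model extension: {model_path}")
```

For example, `detect_model_format("net.onnx")` would select the ONNX path, while `detect_model_format("b18.bin.gz")` would select the traditional loader.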
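The masking approach for smaller boards can be sketched as follows: pad the input feature planes up to the fixed pos_len spatial size and supply a binary mask marking the real board positions. This is a simplified illustration under assumed shapes and names (22 input channels, pos_len of 19, function `build_masked_input`), not the backend's actual implementation.

```python
import numpy as np

POS_LEN = 19  # fixed at ONNX export time (assumed value for illustration)


def build_masked_input(board_planes: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Pad (C, H, W) feature planes for an H x W board up to the fixed
    POS_LEN x POS_LEN spatial size, and build a binary mask marking
    which positions lie on the real board.

    Returns (padded_planes, mask) with shapes (C, POS_LEN, POS_LEN)
    and (1, POS_LEN, POS_LEN). Hypothetical helper for illustration.
    """
    c, h, w = board_planes.shape
    assert h <= POS_LEN and w <= POS_LEN
    padded = np.zeros((c, POS_LEN, POS_LEN), dtype=board_planes.dtype)
    padded[:, :h, :w] = board_planes
    mask = np.zeros((1, POS_LEN, POS_LEN), dtype=board_planes.dtype)
    mask[:, :h, :w] = 1.0
    return padded, mask


# Example: a 9x9 board fed into a network exported with pos_len=19.
planes = np.ones((22, 9, 9), dtype=np.float32)
x, mask = build_masked_input(planes)
```

The performance loss mentioned above comes from always running the network at the full POS_LEN x POS_LEN resolution even for small boards: the padded positions are computed and then suppressed by the mask rather than skipped.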
