Support ONNX models in TensorRT backend #1139

hzyhhzy · 2025-12-31T08:49:43Z

This PR enables the TensorRT backend to support ONNX models, facilitating experimentation with new architectures such as Transformers. This implementation references the work of @yehu3d.

Dual Model Support: Supports both .onnx models and traditional .bin.gz models. The format is automatically detected based on the file extension.
Model Export: The ONNX export script is available here: export_onnx.py.
- Note: This script is designed for the KataGo_Transformer repository. Minor modifications may be required to adapt it for the official KataGo's PyTorch code.
Different board sizes are supported, but with a known limitation: pos_len is fixed at ONNX export time, so TensorRT inference can only use nnXLen and nnYLen values that match that fixed pos_len. Other board sizes are currently handled via masking, which incurs some performance loss.
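As a rough illustration of the extension-based format detection described above, here is a minimal Python sketch. The names `ModelFormat` and `detect_model_format` are hypothetical and are not taken from this PR, whose actual implementation lives in the TensorRT backend. Case-insensitive matching and rejecting any other extension are assumptions made for the example.

```python
from enum import Enum
from pathlib import Path


class ModelFormat(Enum):
    """Hypothetical tag for the two formats named in the PR description."""
    ONNX = "onnx"
    BIN_GZ = "bin.gz"


def detect_model_format(model_path: str) -> ModelFormat:
    """Choose a model format from the file extension alone.

    Assumption: matching is case-insensitive and any other extension
    is rejected; the real backend may behave differently.
    """
    name = Path(model_path).name.lower()
    if name.endswith(".onnx"):
        return ModelFormat.ONNX
    if name.endswith(".bin.gz"):
        return ModelFormat.BIN_GZ
    raise ValueError(f"unrecognized model extension: {model_path}")
```

A caller could then branch on the returned value to pick either the ONNX parsing path or the traditional loader.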
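The masking approach can be sketched as follows, assuming the network always sees a pos_len x pos_len grid and a smaller board occupies its top-left corner. The helper name `board_mask` and the corner placement are assumptions for illustration, not details confirmed by this PR.

```python
def board_mask(board_x: int, board_y: int, pos_len: int) -> list[list[float]]:
    """Build a pos_len x pos_len mask: 1.0 on the board, 0.0 on padding.

    Assumption: the board is placed in the top-left corner of the grid.
    """
    if not (0 < board_x <= pos_len and 0 < board_y <= pos_len):
        raise ValueError("board must fit inside the fixed pos_len grid")
    return [
        [1.0 if (x < board_x and y < board_y) else 0.0 for x in range(pos_len)]
        for y in range(pos_len)
    ]


def on_board_cells(mask: list[list[float]]) -> int:
    """Count the grid cells that belong to the real board."""
    return int(sum(sum(row) for row in mask))


# A 9x9 board evaluated on a grid exported with pos_len = 19.
mask = board_mask(9, 9, 19)
useful = on_board_cells(mask)   # cells on the actual board
total = 19 * 19                 # cells the network still processes
```

Here only `useful` of the `total` cells carry board information, while the network still computes over the whole grid. This is one way to see where the performance loss mentioned above comes from.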

hzyhhzy added 5 commits December 29, 2025 02:57

to support onnx model (todo: load metadata)

1a5a875

load metadata from onnx

4901680

load more onnx metadata

eeb25a7

more onnx metadata; correct file name for plan cache and timing cache

a2f45cc

make trt-onnx work on boardsizes not equal to nn_len

d5806f0
