
Example of TensorRT-LLM Whisper backend for PyTriton #65

@aleksandr-smechov

Describe the solution you'd like
With the recent TensorRT-LLM support for Whisper, and now that PyTriton supports TensorRT-LLM, it would be great to get examples of efficient client and server code, as well as decoupled mode examples.

Describe alternatives you've considered
I've experimented with WhisperS2T coupled with FastAPI and PyTriton, and both perform well. It would be great to get a more involved example, like here and here.

Labels: enhancement, non-stale