Warmup batching #112

@jkiczka-nvidia

Description

@jkiczka-nvidia

Description

batching=False doesn't work with warmup. An additional dimension is always added at the beginning of the request, making it impossible to warm up a sample without a batch dimension, regardless of whether batch_size=0 or batch_size=1 is set in ModelWarmup().

To reproduce

# server
import numpy as np
from pytriton.model_config import ModelConfig, Tensor
from pytriton.model_config.common import ModelWarmup, WarmupInput
from pytriton.triton import Triton


def _infer_fn(requests):
    # With batching=False this should receive the raw (2, 3) sample,
    # but during warmup an extra leading dimension shows up.
    print(requests[0].data['input1'])
    print(requests[0].data['input1'].shape)
    return [{"out": np.array([1.0], dtype=np.float32)}]

with Triton() as triton:
    warmup = ModelWarmup(
        name="warmup",
        batch_size=1,  # setting to 0 or 1
        inputs={
            "input1": WarmupInput(
                dtype=np.float32,
                shape=(2, 3),
                zero_data=True,
            ),
        },
        count=1,
    )

    triton.bind(
        model_name="MyModel",
        infer_func=_infer_fn,
        inputs=[
            Tensor(name="input1", dtype=np.float32, shape=(2, 3)),
        ],
        outputs=[
            Tensor(name="out", dtype=np.float32, shape=(-1,)),
        ],
        config=ModelConfig(
            batching=False,
            model_warmup=[warmup],
        ),
    )
    triton.serve()


# output for batch_size=0:
# []
# (0, 2, 3)

# output for batch_size=1:
# [[[0. 0. 0.]
#   [0. 0. 0.]]]
# (1, 2, 3)
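Until this is fixed, one possible workaround (a sketch, not part of the original report) is to strip the spurious leading axis inside the inference function when batching=False. The helper name below is hypothetical; note that with batch_size=0 the warmup tensor is empty, so the sample data cannot be recovered at all, which is why only the size-1 case is handled:

```python
import numpy as np


def strip_leading_batch_dim(arr: np.ndarray, expected_ndim: int) -> np.ndarray:
    """Drop a spurious size-1 leading batch axis added during warmup.

    Hypothetical workaround helper: if the array has one more dimension
    than the model declares and that extra axis has size 1, squeeze it;
    otherwise return the array unchanged.
    """
    if arr.ndim == expected_ndim + 1 and arr.shape[0] == 1:
        return np.squeeze(arr, axis=0)
    return arr


# Warmup with batch_size=1 delivers shape (1, 2, 3) instead of (2, 3):
warmed = np.zeros((1, 2, 3), dtype=np.float32)
print(strip_leading_batch_dim(warmed, expected_ndim=2).shape)  # (2, 3)

# A regular non-batched request with the declared shape passes through:
sample = np.zeros((2, 3), dtype=np.float32)
print(strip_leading_batch_dim(sample, expected_ndim=2).shape)  # (2, 3)
```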

Environment

pytriton version: 0.6.0

Labels

non-stale