Labels: non-stale (this label can be used to prevent marking issues or PRs as Stale)
Description
`batching=False` doesn't work with warmup: an additional dimension is always prepended to the warmup request, which makes it impossible to warm up the model on a sample without a leading batch dimension, regardless of whether `batch_size=0` or `batch_size=1` is set in `ModelWarmup()`.
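For context, plain NumPy makes the two observed cases easy to see (a minimal sketch, independent of pytriton): `batch_size=0` yields an array with zero elements, `batch_size=1` yields a singleton leading axis, and neither matches the unbatched `(2, 3)` sample the model expects.

```python
import numpy as np

# Shapes of the warmup payload for each batch_size setting
for batch_size in (0, 1):
    arr = np.zeros((batch_size, 2, 3), dtype=np.float32)
    print(batch_size, arr.shape, arr.size)  # (0, 2, 3) has no elements at all

# The unbatched sample shape an infer function with batching=False expects
expected = np.zeros((2, 3), dtype=np.float32)
print(expected.shape)
```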
To reproduce

```python
# server
import numpy as np

from pytriton.model_config import ModelConfig, Tensor
from pytriton.model_config.common import ModelWarmup, WarmupInput
from pytriton.triton import Triton


def _infer_fn(requests):
    # With batching=False each request should carry an unbatched (2, 3) sample,
    # but warmup requests arrive with an extra leading dimension.
    print(requests[0].data["input1"])
    print(requests[0].data["input1"].shape)
    return [{"out": np.array((1,), dtype=np.float32)}]


with Triton() as triton:
    warmup = ModelWarmup(
        name="warmup",
        batch_size=1,  # setting this to 0 or 1 makes no difference
        inputs={
            "input1": WarmupInput(
                dtype=np.float32,
                shape=(2, 3),
                zero_data=True,
            ),
        },
        count=1,
    )
    triton.bind(
        model_name="MyModel",
        infer_func=_infer_fn,
        inputs=[
            Tensor(name="input1", dtype=np.float32, shape=(2, 3)),
        ],
        outputs=[
            Tensor(name="out", dtype=np.float32, shape=(-1,)),
        ],
        config=ModelConfig(
            batching=False,
            model_warmup=[warmup],
        ),
    )
    triton.serve()
```

Output for `batch_size=0`:

```
[]
(0, 2, 3)
```

Output for `batch_size=1`:

```
[[[0. 0. 0.]
  [0. 0. 0.]]]
(1, 2, 3)
```
Environment

- pytriton version: 0.6.0