Open
Description
I ran the benchmark using the script below, and it reports that some methods cannot be found.
https://github.com/intel/xFasterTransformer/blob/main/serving/vllm-xft.md#benchmark-offline-inference-throughput
numactl -C 0-15 -l python benchmark_throughput.py --tokenizer /data/DeepSeek-R1-Distill-Qwen-32B --model /data/DeepSeek-R1-Distill-Qwen-32B-xft --dataset /data/ShareGPT_V3_unfiltered_cleaned_split.json
Traceback (most recent call last):
  File "/data/vllm/benchmarks/benchmark_throughput.py", line 24, in <module>
    from vllm.entrypoints.openai.api_server import (
ImportError: cannot import name 'build_async_engine_client_from_engine_args' from 'vllm.entrypoints.openai.api_server' (/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py)
Currently vllm-xft==0.5.5.4 is installed; vllm itself is not installed.
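To narrow this down, a small check like the one below may help. It is only a sketch: the helper names `installed_version` and `has_symbol` are made up for illustration, and it assumes (not confirmed here) that the `benchmark_throughput.py` under `/data/vllm/benchmarks/` comes from a different vllm version than the `vllm` package that vllm-xft 0.5.5.4 installs. It prints which distributions are installed and whether the installed `api_server` module exposes the name from the traceback.

```python
import importlib
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def has_symbol(module_name, symbol):
    """Return True if module_name can be imported and exposes symbol."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Module missing, or one of its own imports failed.
        return False
    return hasattr(module, symbol)


if __name__ == "__main__":
    # Distribution names taken from the report above.
    for dist in ("vllm", "vllm-xft"):
        print(dist, "->", installed_version(dist))

    # Module and symbol names taken from the traceback above.
    print(
        "symbol present:",
        has_symbol(
            "vllm.entrypoints.openai.api_server",
            "build_async_engine_client_from_engine_args",
        ),
    )
```

If the last line prints `False`, the installed package simply does not provide that function, which would be consistent with the benchmark script and the installed package being from different versions.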