Commit eda2913

XkunW and Copilot authored
Update docs/user_guide.md
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
1 parent 5aa5171 commit eda2913

1 file changed: +1 -1 lines changed

docs/user_guide.md

Lines changed: 1 addition & 1 deletion
@@ -94,7 +94,7 @@ export VEC_INF_CONFIG=/h/<username>/my-model-config.yaml
 **NOTE**
 * There are other parameters that can also be added to the config but not shown in this example, check the [`ModelConfig`](https://github.com/VectorInstitute/vector-inference/blob/main/vec_inf/client/config.py) for details.
 * Check [vLLM Engine Arguments](https://docs.vllm.ai/en/stable/serving/engine_args.html) for the full list of available vLLM engine arguments. The default parallel size for any parallelization defaults to 1, so none of the sizes were set specifically in this example.
-* For GPU partitions with non-Ampere architectures, e.g. `rtx6000`, `t4v2`, BF16 isn't supported. For models that have BF16 as the default type, when using a non-Ampere GPU, use FP16 instead, i.e. `--dtype: float16`
+* For GPU partitions with non-Ampere architectures, e.g. `rtx6000`, `t4v2`, BF16 isn't supported. For models that have BF16 as the default type, when using a non-Ampere GPU, use FP16 instead, i.e. `--dtype: float16`.
 * Setting `--compilation-config` to `3` currently breaks multi-node model launches, so we don't set them for models that require multiple nodes of GPUs.
 
 ### `status` command
