From aff89a2186d750bc9a412e8bd698c8ae11f87ff4 Mon Sep 17 00:00:00 2001
From: haic0 <149741444+haic0@users.noreply.github.com>
Date: Wed, 10 Sep 2025 10:36:29 +0800
Subject: [PATCH 1/2] Update vllm_deployment_guide.md for AMD GPU support

---
 docs/vllm_deployment_guide.md | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/docs/vllm_deployment_guide.md b/docs/vllm_deployment_guide.md
index 3516335..61a9a28 100644
--- a/docs/vllm_deployment_guide.md
+++ b/docs/vllm_deployment_guide.md
@@ -151,6 +151,36 @@ cd vllm/
 pip install -e .
 ```
 
+### AMD GPU Support
+
+### Please follow the steps here to install and run LongCat-Flash-Chat models on AMD MI300X GPU.
+### Step-by-Step Guide
+#### Step 1
+Launch the ROCm vLLM Docker container:
+
+```shell
+docker pull rocm/vllm-dev:nightly
+
+docker run -d -it --ipc=host --network=host --privileged --cap-add=CAP_SYS_ADMIN --device=/dev/kfd --device=/dev/dri --device=/dev/mem --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v /:/work -e SHELL=/bin/bash --name vllm_minimax rocm/vllm-dev:nightly
+```
+
+#### Step 2
+Log in to Hugging Face:
+
+```shell
+huggingface-cli login
+```
+
+
+#### Step 3
+Start the vLLM online server.
+A sample command:
+
+```shell
+SAFETENSORS_FAST_GPU=1 VLLM_USE_V1=0 vllm serve MiniMaxAI/MiniMax-M1-80k -tp 8 --gpu-memory-utilization 0.95 --no-enable-prefix-caching --trust-remote-code --max_model_len 4096 --dtype bfloat16
+```
+
+
 ## 📮 Getting Support
 
 If you encounter any issues while deploying MiniMax-M1 model:

From c6c72982ab1e9a7cfc8beabe65f26fa839f48158 Mon Sep 17 00:00:00 2001
From: haic0 <149741444+haic0@users.noreply.github.com>
Date: Sat, 13 Sep 2025 11:30:29 +0800
Subject: [PATCH 2/2] Update vllm_deployment_guide.md

---
 docs/vllm_deployment_guide.md | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/docs/vllm_deployment_guide.md b/docs/vllm_deployment_guide.md
index 61a9a28..17d3253 100644
--- a/docs/vllm_deployment_guide.md
+++ b/docs/vllm_deployment_guide.md
@@ -152,8 +152,7 @@ pip install -e .
 ```
 
 ### AMD GPU Support
-
-### Please follow the steps here to install and run LongCat-Flash-Chat models on AMD MI300X GPU.
+Follow the steps below to install and run MiniMax-M1 models on AMD MI300X GPUs.
 ### Step-by-Step Guide
 #### Step 1
 Launch the ROCm vLLM Docker container:
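
Once the container from Step 1 is up, it is worth confirming that the MI300X GPUs are actually visible inside it before starting the server. A minimal check, assuming the `rocm/vllm-dev` image ships the standard `rocm-smi` tool and reusing the `vllm_minimax` container name from the `docker run` command above:

```shell
# Open a shell in the running container (name taken from the docker run command in Step 1)
docker exec -it vllm_minimax /bin/bash

# Inside the container: list the visible AMD GPUs; all eight MI300X devices should appear
rocm-smi
```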
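
Step 2's `huggingface-cli login` prompts for an access token interactively. For scripted or headless setups, the token can also be supplied non-interactively; a sketch assuming the token is kept in an `HF_TOKEN` environment variable:

```shell
# Export the access token (placeholder value; use your own token from huggingface.co/settings/tokens)
export HF_TOKEN=hf_xxxxxxxxxxxxxxxxxxxx

# Pass the token directly instead of typing it at the prompt
huggingface-cli login --token "$HF_TOKEN"
```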
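
After Step 3, `vllm serve` exposes an OpenAI-compatible API, by default on port 8000. A quick smoke test with `curl`, assuming the server is reachable on the same host:

```shell
# Send a single chat completion request to the locally served model
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "MiniMaxAI/MiniMax-M1-80k",
        "messages": [{"role": "user", "content": "Hello, who are you?"}],
        "max_tokens": 64
      }'
```

A JSON response with a `choices` array indicates the server is serving the model correctly.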