Conversation

@AndreasKaratzas AndreasKaratzas (Contributor) commented Nov 19, 2025

This PR fixes the tests behind the following labels:

  • Pytorch Nightly Dependency Override Check
  • Python-only Installation Test

It also adds the torchaudio package to the Dockerfile.

- Fixed setup.py logging format (G001); see the sketch after this list
- Added ROCm support for precompiled wheels
- Improved Dockerfile.rocm test stage with hf_transfer and v1 package
- Fixed pytorch_nightly_dependency.sh to detect ROCm version correctly
- Added source code directory for python_only_compile.sh test
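
For context, ruff's G001 rule flags logging calls that format their message eagerly with str.format(). A minimal sketch of the kind of change involved (the variable and message are illustrative, not the actual setup.py call sites):

```python
import logging

logger = logging.getLogger(__name__)
version = "0.6.0"  # illustrative value, not a real setup.py variable

# Flagged by G001: the message is formatted eagerly, even when the
# record ends up filtered out and never emitted.
logger.info("building vLLM version: {}".format(version))

# The fix: pass arguments lazily so the logging framework interpolates
# only when the record is actually emitted.
logger.info("building vLLM version: %s", version)
```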

Signed-off-by: Andreas Karatzas <akaratza@amd.com>
@mergify mergify bot added the ci/build and rocm (Related to AMD ROCm) labels Nov 19, 2025
@gemini-code-assist gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request introduces fixes for PyTorch nightly and Python-only build tests on the ROCm platform, and adds torchaudio to the ROCm Docker image. The changes in setup.py and the test scripts are well-implemented, improving the robustness of the build process. My review focuses on optimizing the Dockerfiles. I've suggested combining several RUN instructions to reduce image layers, which is a best practice that improves build times and reduces image size. I've also recommended adding apt-get update and cache cleanup to an apt-get install command to ensure correctness and efficiency.

Comment on lines 104 to 105
RUN mkdir src
RUN mv vllm src/vllm

Severity: high

To optimize the Docker image size and reduce the number of layers, it's a best practice to chain related commands into a single RUN instruction. The mkdir and mv commands can be combined.

RUN mkdir src && mv vllm src/vllm

&& python3 --version && python3 -m pip --version

RUN pip install -U packaging 'cmake<4' ninja wheel 'setuptools<80' pybind11 Cython
RUN apt-get install -y libjpeg-dev libsox-dev libsox-fmt-all sox

Severity: high

For correctness and to keep the Docker image size minimal, it's crucial to run apt-get update before apt-get install and to clean up the apt cache in the same RUN layer. This prevents issues with stale package lists and removes unnecessary files.

RUN apt-get update && apt-get install -y libjpeg-dev libsox-dev libsox-fmt-all sox && rm -rf /var/lib/apt/lists/*

Comment on lines +87 to +92
RUN git clone ${PYTORCH_AUDIO_REPO} audio
RUN cd audio && git checkout ${PYTORCH_AUDIO_BRANCH} \
&& git submodule update --init --recursive \
&& pip install -r requirements.txt \
&& python3 setup.py bdist_wheel --dist-dir=dist \
&& pip install dist/*.whl

Severity: high

To reduce the number of Docker image layers and improve build efficiency, it's recommended to combine the git clone and the subsequent build commands for torchaudio into a single RUN instruction.

RUN git clone ${PYTORCH_AUDIO_REPO} audio && cd audio \
    && git checkout ${PYTORCH_AUDIO_BRANCH} \
    && git submodule update --init --recursive \
    && pip install -r requirements.txt \
    && python3 setup.py bdist_wheel --dist-dir=dist \
    && pip install dist/*.whl

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.


setup.py Outdated
Comment on lines 680 to 695
# If using precompiled, extract and patch package_data (in advance of setup)
if envs.VLLM_USE_PRECOMPILED:
-    assert _is_cuda(), "VLLM_USE_PRECOMPILED is only supported for CUDA builds"
+    assert _is_cuda() or _is_hip(), (
+        "VLLM_USE_PRECOMPILED is only supported for CUDA or ROCm builds."
+    )

P1: Allowing precompiled mode on ROCm installs the CUDA wheel

The precompiled path now explicitly allows _is_hip() (ROCm) builds, but the precompiled wheel download below still always targets the CUDA binary (wheels.vllm.ai/...vllm-1.0.0.dev-cp38-abi3-<arch>.whl, no ROCm variant). On ROCm agents where VLLM_USE_PRECOMPILED=1 is used (e.g., python_only_compile.sh in the AMD pipeline), this will install CUDA .so files without the CUDA runtime present, causing import-time failures instead of a clean skip. Either keep the CUDA-only guard or fetch a ROCm-specific wheel before enabling this path on ROCm.
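
A minimal sketch of the guard the reviewer is suggesting. `envs`, `_is_cuda`, and `_is_hip` are the names from the quoted setup.py hunk, stubbed here so the snippet stands alone; the wheel URL keeps the elided pattern from the comment above:

```python
class envs:  # stand-in for vllm.envs
    VLLM_USE_PRECOMPILED = True


def _is_cuda() -> bool:  # stub for the real platform check
    return False


def _is_hip() -> bool:  # stub for the real platform check
    return True


if envs.VLLM_USE_PRECOMPILED:
    if _is_cuda():
        # CUDA wheels are published; the real URL pattern is elided.
        wheel_url = "https://wheels.vllm.ai/...vllm-1.0.0.dev-cp38-abi3-<arch>.whl"
    elif _is_hip():
        # No ROCm precompiled wheel exists, so fail fast rather than
        # install CUDA .so files that break at import time on ROCm.
        raise RuntimeError(
            "VLLM_USE_PRECOMPILED is set, but no ROCm precompiled wheel "
            "is available; build from source instead."
        )
    else:
        raise RuntimeError("VLLM_USE_PRECOMPILED requires a CUDA build.")
```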


RUN git clone ${PYTORCH_REPO} pytorch && cd pytorch \
    && pip install -r requirements.txt && git submodule update --init --recursive \

ARG PYTORCH_AUDIO_REPO
Collaborator

Please don't combine these steps. It won't affect the final Docker image size because of the multi-stage build, but keeping them separate lets the git clone layer be cached when you're debugging torch build issues.

Contributor Author
Thanks for pointing this out. Let me know if the modifications are now in line with the feedback above.

# -----------------------
# Test vLLM image
FROM base AS test
ARG PYTHON_VERSION=3.12
Collaborator

This is already set in the base image. A better way would be to set it as an ENV there and inherit it in this stage.

ENV HF_HUB_ENABLE_HF_TRANSFER=1

# Copy in the v1 package for testing (it isn't distributed yet)
COPY vllm/v1 /usr/local/lib/python${PYTHON_VERSION}/dist-packages/vllm/v1
Collaborator

Looks kinda hacky. Why do we need it for tests but not for the normal distribution?

Signed-off-by: Andreas Karatzas <akaratza@amd.com>
SERVER_ARGS = ["--enforce-eager"]


@pytest.fixture(scope="module", autouse=True)
Contributor

Could this be put in conftest.py?

@AndreasKaratzas AndreasKaratzas (Contributor Author) commented Nov 21, 2025
Thanks for the review. I moved this into conftest. Let me know if it looks better now.
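
For reference, a minimal sketch of what the hoisted fixture could look like in conftest.py, matching the quoted hunk above; the RemoteOpenAIServer import path and MODEL_NAME value are assumptions:

```python
# conftest.py (sketch): a module-scoped, autouse server fixture shared
# by every test module in this directory.
import pytest

from tests.utils import RemoteOpenAIServer  # assumed import path

MODEL_NAME = "some-org/some-model"  # placeholder model id
SERVER_ARGS = ["--enforce-eager"]


@pytest.fixture(scope="module", autouse=True)
def server():
    # Start one server per test module and tear it down afterwards.
    with RemoteOpenAIServer(MODEL_NAME, SERVER_ARGS) as remote_server:
        yield remote_server
```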

@AndreasKaratzas AndreasKaratzas (Contributor Author)

Added support for the Entrypoints Integration Test (Pooling) test group on ROCm.

@AndreasKaratzas AndreasKaratzas (Contributor Author)

Added FLEX_ATTENTION back, since it is integral to some encoder-only self-attention tests in the entrypoints test group. Also updated terratorch to 1.1.1, because the current version produced this error:

FAILED entrypoints/openai/test_vision_embeds.py::test_single_request[ibm-nasa-geospatial/Prithvi-EO-2.0-300M-TL-Sen1Floods11]

```log
self = _ModelRegistry(models={'AfmoeForCausalLM': LazyRegisteredModel(module_name='vllm.model_executor.models.afmoe', class...(module_name='vllm.model_executor.models.transformers', class_name='TransformersMultiModalForSequenceClassification')})
architectures = ['Terratorch']
model_config = ModelConfig(model='ibm-nasa-geospatial/Prithvi-EO-2.0-300M-TL-Sen1Floods11', runner='pooling', convert='auto', task=No...its_processors=None, io_processor_plugin=None, pooler_config=None, override_pooler_config=None, multimodal_config=None)

    def is_text_generation_model(
        self,
        architectures: str | list[str],
        model_config: ModelConfig,
    ) -> bool:
        model_cls, _ = self.inspect_model_cls(architectures, model_config)
        return model_cls.is_text_generation_model

E   AttributeError: 'NoneType' object has no attribute 'is_text_generation_model'
```

is_eager_execution = compilation_config == CUDAGraphMode.NONE

use_aiter_rms_norm = rocm_aiter_ops.is_rmsnorm_enabled()

Contributor Author
Also added logic to auto-set the block size to 64 on ROCm when the attention backend is AITER Unified Attention, since it yields significant performance gains.
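
A hedged sketch of that auto-selection, with illustrative helper and backend names (the real change lives in vLLM's config resolution, not in a free function):

```python
def resolve_block_size(
    requested: int | None,
    attention_backend: str,
    is_rocm: bool,
) -> int:
    """Pick a KV-cache block size, preferring 64 for AITER Unified Attention."""
    if requested is not None:
        # Never override an explicit user setting.
        return requested
    if is_rocm and attention_backend == "AITER_UNIFIED_ATTN":  # assumed name
        # Empirically faster on ROCm with AITER Unified Attention.
        return 64
    return 16  # common default elsewhere


assert resolve_block_size(None, "AITER_UNIFIED_ATTN", True) == 64
```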

…or terratorch to follow upstream

Signed-off-by: Andreas Karatzas <akaratza@amd.com>

Labels

ci/build, gpt-oss (Related to GPT-OSS models), rocm (Related to AMD ROCm), v1

Projects

Status: To Triage
