[BugFix][CI/Build][ROCM] Fix import error and apply assert in appropriate case in test_struct_output_generate #28311
Conversation
Signed-off-by: Randall Smith <ransmith@amd.com>
Code Review
This pull request correctly addresses two issues: an import error in rocm_aiter_fa.py and an overly strict assertion in test_struct_output_generate.py. The change in rocm_aiter_fa.py corrects the import path for get_num_sms. The modification in test_struct_output_generate.py makes the assertion for content conditional on the model name, which aligns with the intended testing logic. The changes are well-implemented and resolve the described bugs. I have no further comments.
Signed-off-by: Randall Smith <ransmith@amd.com>
AITER main has
It doesn't match what vLLM is currently installing. When I looked at what was actually installed, this was the correct import.
Yep, the AITER commit in the base dockerfile is slightly old now; just adding the context here for anyone playing around with newer AITER commits.
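Since the correct import path for `get_num_sms` depends on which AITER commit is installed, one version-tolerant pattern is to probe candidate modules at import time. This is only a sketch of that idea, not the PR's actual fix (which simply corrects the import path); the module names passed in are placeholders, not real AITER paths.

```python
import importlib


def resolve_get_num_sms(candidate_modules):
    """Return get_num_sms from the first candidate module that provides it.

    Sketch of a version-tolerant import, assuming the symbol may live in
    either of two utility modules depending on the installed AITER commit.
    The module names are caller-supplied placeholders, not real AITER paths.
    """
    for name in candidate_modules:
        try:
            module = importlib.import_module(name)
        except ImportError:
            continue
        fn = getattr(module, "get_num_sms", None)
        if fn is not None:
            return fn
    raise ImportError(f"get_num_sms not found in any of: {candidate_modules}")
```

A pattern like this trades an explicit failure for silent version drift, which is why pinning the AITER commit (as the base dockerfile does) and fixing the single import is the simpler choice here.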
@rasmith: The OOM issue can be reproduced if you run `pytest -v -s v1/entrypoints` to run all the tests (the same behavior as in CI): https://github.com/vllm-project/vllm/blob/main/.buildkite/test-amd.yaml#L328
This pull request has merge conflicts that must be resolved before it can be merged.
Signed-off-by: Randall Smith <ransmith@amd.com>
Signed-off-by: Randall Smith <ransmith@amd.com>
This pull request has merge conflicts that must be resolved before it can be merged.
@rasmith Can you rebase this as well?
Signed-off-by: Randall Smith <ransmith@amd.com>
…iate case in test_struct_output_generate (vllm-project#28311) Signed-off-by: Randall Smith <ransmith@amd.com> Co-authored-by: Randall Smith <ransmith@amd.com> Signed-off-by: ernie-chang <ernie.chang@cienet.com>
…iate case in test_struct_output_generate (vllm-project#28311) Signed-off-by: Randall Smith <ransmith@amd.com> Co-authored-by: Randall Smith <ransmith@amd.com> Signed-off-by: George D. Torres <gdavtor@gmail.com>
…iate case in test_struct_output_generate (vllm-project#28311) Signed-off-by: Randall Smith <ransmith@amd.com> Co-authored-by: Randall Smith <ransmith@amd.com> Signed-off-by: Bram Wasti <bwasti@meta.com>
This PR fixes an import error in test_struct_output_generate, which was importing get_num_sms from `device_info` when it should have been `arch_info`. There was also an assert that should have been applied only for Qwen3 but was being applied to other models. All tests now pass.
This fixes issue 27844.
FYI, I found no evidence of OOM during a successful test run. If a test fails, it looks like memory leakage can occur, which can cause the remaining tests to OOM, but that seems like a separate issue.
============================= test session starts ==============================
platform linux -- Python 3.12.12, pytest-8.4.2, pluggy-1.6.0
rootdir: /vllm-upstream
configfile: pyproject.toml
plugins: asyncio-1.2.0, anyio-4.11.0
asyncio: mode=Mode.STRICT, debug=False, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function
collected 20 items
tests/v1/entrypoints/llm/test_struct_output_generate.py ................ [ 80%]
.... [100%]
=============================== warnings summary ===============================
:488
:488: DeprecationWarning: builtin type SwigPyPacked has no module attribute
:488
:488: DeprecationWarning: builtin type SwigPyObject has no module attribute
tests/v1/entrypoints/llm/test_struct_output_generate.py::test_structured_output[mistralai/Ministral-8B-Instruct-2410-xgrammar-auto-None]
tests/v1/entrypoints/llm/test_struct_output_generate.py::test_structured_output[mistralai/Ministral-8B-Instruct-2410-guidance-auto-None]
tests/v1/entrypoints/llm/test_struct_output_generate.py::test_structured_output[mistralai/Ministral-8B-Instruct-2410-lm-format-enforcer-auto-None]
tests/v1/entrypoints/llm/test_struct_output_generate.py::test_structured_output[mistralai/Ministral-8B-Instruct-2410-outlines-auto-speculative_config6]
tests/v1/entrypoints/llm/test_struct_output_generate.py::test_structured_output[mistralai/Ministral-8B-Instruct-2410-guidance-auto-speculative_config7]
tests/v1/entrypoints/llm/test_struct_output_generate.py::test_structured_output_auto_mode[mistralai/Ministral-8B-Instruct-2410-auto]
/vllm-upstream/vllm/transformers_utils/tokenizer.py:287: FutureWarning: It is strongly recommended to run mistral models with `--tokenizer-mode "mistral"` to ensure correct encoding and decoding.
  return get_tokenizer(
tests/v1/entrypoints/llm/test_struct_output_generate.py::test_structured_output_with_structural_tag[xgrammar]
/vllm-upstream/tests/v1/entrypoints/llm/test_struct_output_generate.py:899: DeprecationWarning: guided_decoding is deprecated. This will be removed in v0.12.0 or v1.0.0, which ever is soonest. Please use structured_outputs instead.
sampling_params = SamplingParams(
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
================== 20 passed, 9 warnings in 762.31s (0:12:42) ==================