Conversation

@yao-matrix (Contributor):

@SunMarc, please help review, thanks very much.

yao-matrix changed the title from "fix continuous batching issues on XPU, extend ut cases to xpu" to "fix continuous batching issues, extend ut cases to xpu" on Oct 29, 2025.
@yao-matrix (Contributor, Author):

@SunMarc, could you please take a look? Thanks very much.

SunMarc requested a review from remi-or on November 4, 2025.
@remi-or (Collaborator) left a comment:

LGTM! Thanks for adding, I just have 2 nits :)

Diff context (imports from transformers.testing_utils):
    Expectations,
    require_kernels,
    require_torch_accelerator,
    require_torch_gpu,
@remi-or (Collaborator):

Is @require_torch_gpu still used after those changes?

@yao-matrix (Contributor, Author):

@remi-or Paged attention enablement on XPU in the kernels library is still in progress and should land soon; once it does, we will move the remaining cases from CUDA-only to XPU. Until then, some paged attention cases are still CUDA-only.
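As a minimal illustration (not part of the PR), the gating could look like the sketch below, using the require_torch_accelerator and require_torch_gpu decorators from transformers.testing_utils shown in the diff above; the test names and bodies are hypothetical:

    import torch
    from transformers.testing_utils import require_torch_accelerator, require_torch_gpu

    @require_torch_accelerator  # runs on any supported accelerator, e.g. CUDA or XPU
    def test_continuous_batching_generation():  # hypothetical test name
        # Resolve the device type the same way the PR's one-liner does ("cuda", "xpu", ...).
        device_type = (
            torch.accelerator.current_accelerator().type if hasattr(torch, "accelerator") else "cuda"
        )
        print(device_type)  # a real test would build its model and inputs on this device

    @require_torch_gpu  # kept CUDA-only until paged attention kernels are enabled on XPU
    def test_paged_attention_generation():  # hypothetical test name
        ...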

Diff context:
    blocks = blocks.reshape(local_experts, -1, module.intermediate_size // 2)
    if getattr(target_device, "type", target_device) == "cpu":
    -    target_device = "cuda"
    +    target_device = torch.accelerator.current_accelerator().type if hasattr(torch, "accelerator") else "cuda"
@remi-or (Collaborator):

If @SunMarc can OK this; I'm not familiar with the different accelerators.

@SunMarc (Member):

Should be fine!
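For reference, a hedged sketch of the device resolution that the one-liner above relies on, assuming a PyTorch build that ships the torch.accelerator API (2.6+); the is_available() guard is an extra safety check, not part of the PR:

    import torch

    # Prefer the runtime-detected accelerator type ("cuda" on NVIDIA, "xpu" on Intel GPUs, ...);
    # fall back to "cuda" on torch builds that predate the torch.accelerator API.
    if hasattr(torch, "accelerator") and torch.accelerator.is_available():
        target_device = torch.accelerator.current_accelerator().type
    else:
        target_device = "cuda"

    print(target_device)  # e.g. "xpu" on an Intel GPU host, "cuda" on an NVIDIA one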

@yao-matrix (Contributor, Author):

@SunMarc, I think we are OK to go now, thanks very much.

SunMarc enabled auto-merge (squash) on November 10, 2025.
@HuggingFaceDocBuilderDev:

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

SunMarc merged commit dba6aeb into huggingface:main on Nov 10, 2025.
23 checks passed
Abdennacer-Badaoui pushed a commit to Abdennacer-Badaoui/transformers that referenced this pull request on Nov 10, 2025:
…1830)

* extend continuous batching cases to xpu

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>

---------

Signed-off-by: Yao, Matrix <matrix.yao@intel.com>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
yao-matrix deleted the cb-xpu branch on November 10, 2025.