Conversation

@dg845
Collaborator

@dg845 dg845 commented Oct 9, 2025

What does this PR do?

This PR removes self.rope_cache in QwenEmbedRope, so that RoPE frequency caching happens in exactly one place: the functools.lru_cache decorator on _compute_video_freqs. It also changes the maxsize argument of lru_cache from None to 128, bounding the cache so it cannot grow without limit and cause OOM errors.

Fixes #12401
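The caching change above can be sketched as follows. This is a minimal, hypothetical stand-in for QwenEmbedRope (the real class computes actual RoPE frequency tensors); the point is only the bounded lru_cache replacing a hand-rolled dict:

```python
from functools import lru_cache

class RopeEmbedSketch:
    """Toy stand-in for QwenEmbedRope: frequencies are memoized per
    (frame, height, width, idx) key by lru_cache alone, with no extra
    self.rope_cache dict."""

    @lru_cache(maxsize=128)  # bounded cache instead of an unbounded one
    def _compute_video_freqs(self, frame, height, width, idx=0):
        # Placeholder for the real RoPE frequency computation.
        return [(frame, height, width, idx)]

emb = RopeEmbedSketch()
a = emb._compute_video_freqs(1, 32, 32)
b = emb._compute_video_freqs(1, 32, 32)
assert a is b  # second call is a cache hit, same object returned
```

With maxsize=128, least-recently-used entries are evicted once the cache fills, which is what prevents the OOM behavior reported in the linked issue.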

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@sayakpaul
@naykun
@chenxiao111222

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Member

@sayakpaul sayakpaul left a comment


@naykun could you also give it a look?

@dg845
Collaborator Author

dg845 commented Oct 9, 2025

One thing to note is that the current PR implementation (as of 52cf252) isn't quite equivalent to the implementation on main when not compiling, since frame is part of the cache key for the lru_cache but not part of the cache key for self.rope_cache.

@naykun do you think changing the cache key to include frame makes sense? If frame is the same for most calls to _compute_video_freqs, then I think the two implementations are approximately equivalent.
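The cache-key difference can be illustrated with a toy example. The key format and return values here are hypothetical stand-ins, not the actual diffusers code; the point is only that a dict keyed without frame can return a stale entry, while lru_cache keys on every positional argument:

```python
from functools import lru_cache

# Old-style approach: a dict keyed on (height, width, idx) only,
# so calls differing only in frame collide on one cache entry.
rope_cache = {}

def old_freqs(frame, height, width, idx):
    key = f"{height}_{width}_{idx}"  # frame omitted from the key
    if key not in rope_cache:
        rope_cache[key] = (frame, height, width, idx)
    return rope_cache[key]

# New-style approach: lru_cache keys on ALL positional arguments,
# including frame, so entries for different frames stay distinct.
@lru_cache(maxsize=128)
def new_freqs(frame, height, width, idx):
    return (frame, height, width, idx)

print(old_freqs(1, 32, 32, 0))  # (1, 32, 32, 0)
print(old_freqs(2, 32, 32, 0))  # stale hit: still (1, 32, 32, 0)
print(new_freqs(2, 32, 32, 0))  # (2, 32, 32, 0)
```

If frame is constant across calls (as it is for image workloads), the two schemes behave identically; they only diverge when frame varies.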

@sayakpaul
Member

I think the frame-based notation is there to allow easy extension to videos. For images, this shouldn't matter at all, because idx is always constant and img_shapes is supplied just once from the pipeline:

img_shapes: Optional[List[Tuple[int, int, int]]] = None,

(example)

> do you think changing the cache key to include frame makes sense? If frame is the same for most calls to _compute_video_freqs, then I think the two implementations are approximately equivalent.

As for including frame in the rope_cache key, though, I'm not sure how much of a readability compromise that would be.

@dg845
Collaborator Author

dg845 commented Oct 17, 2025

Gentle ping @naykun to take a look :).

@naykun
Contributor

naykun commented Oct 19, 2025

Hi everyone!
Apologies for missing the notifications earlier.

As @sayakpaul mentioned, we’re currently keeping the frame parameter in place to maintain future compatibility with video-related features. For text-to-image and image editing scenarios, the frame will always be 1, and the idx will vary depending on the input conditions or output images. The changes in this PR won’t affect that logic or impact caching efficiency.

That said, I’d like to double-check with @sayakpaul: using lru_cache won’t interfere with compilation in any edge cases, right?

@sayakpaul
Member

> That said, I’d like to double-check with @sayakpaul: using lru_cache won’t interfere with compilation in any edge cases, right?

At least for the current image-based workloads, it won't. We cannot fully compile the model without graph breaks anyway, because:
https://github.com/huggingface/diffusers/blob/23ebbb4bc81a17ebea17cb7cb94f301199e49a7f/src/diffusers/models/transformers/transformer_qwenimage.py#L185C9-L186C37

But that's fine because we can still apply regional compilation and obtain speedups from the QwenImage family.
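Regional compilation here means compiling the repeated transformer blocks individually rather than the whole forward pass, so graph breaks outside the blocks don't matter. A minimal sketch, using a hypothetical TinyTransformer rather than the actual QwenImage model (backend="eager" keeps the demo toolchain-free; drop it for real speedups):

```python
import torch
import torch.nn as nn

class TinyTransformer(nn.Module):
    """Hypothetical stand-in for a diffusers transformer: a stack of
    structurally identical blocks that dominate the runtime."""

    def __init__(self, dim=16, depth=3):
        super().__init__()
        self.blocks = nn.ModuleList(nn.Linear(dim, dim) for _ in range(depth))

    def forward(self, x):
        for block in self.blocks:
            x = block(x)
        return x

model = TinyTransformer()

# Regional compilation: compile each repeated block on its own instead
# of wrapping the full model, sidestepping graph breaks elsewhere.
for i, block in enumerate(model.blocks):
    model.blocks[i] = torch.compile(block, backend="eager")

out = model(torch.randn(2, 16))
print(out.shape)  # torch.Size([2, 16])
```

Because all blocks share one structure, the compiled artifact is reused across them, which also keeps compile time low compared to whole-model compilation.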

@dg845 dg845 merged commit 7853bfb into huggingface:main Oct 20, 2025
10 of 11 checks passed
@sayakpaul
Member

@dg845 could you also run the compilation tests for QwenImage? Sorry, I forgot to mention it.

class QwenImageTransformerCompileTests(TorchCompileTesterMixin, unittest.TestCase):

@dg845
Collaborator Author

dg845 commented Oct 20, 2025

Running

RUN_SLOW=1 RUN_COMPILE=1 pytest tests/models/transformers/test_models_transformer_qwenimage.py::QwenImageTransformerCompileTests

gives no errors (at least locally, on a DGX machine).


Merging this pull request closes: Redundant lru_cache in QwenEmbedRope