[loading] Fix device when source and target are different #42246

Cyrilvallez · 2025-11-17T17:54:04Z

What does this PR do?

The device_map is keyed by the target parameter names, i.e. the names used when loading into the model. This PR updates the loading accordingly; otherwise we have issues when using a device_map with any model that has a _conversion_mapping (the VLMs, for example), where source and target names are different.
Currently, the only thing saving us is that accelerate_dispatch moves the parameters during post-processing if their device is not correct, which is why this was not detected before! But of course this is much more costly than our smart loading.
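
To make the mismatch concrete, here is a minimal sketch of the lookup problem. It is not the actual transformers loading code: the rename rules, the helper names (`to_target_key`, `device_for`), and the assumption that the device map holds one entry per target parameter name are all hypothetical and only serve as illustration.

```python
import re

# Hypothetical source -> target rename rules (regex pattern, replacement).
# Illustrative only; this is NOT the real Aria conversion mapping.
RENAMES = [
    (r"^vision_tower\.", "model.vision_tower."),
    (r"^language_model\.model\.", "model.language_model."),
]


def to_target_key(source_key):
    """Apply the first matching rename rule; return the key unchanged if none match."""
    for pattern, replacement in RENAMES:
        new_key, count = re.subn(pattern, replacement, source_key)
        if count:
            return new_key
    return source_key


def device_for(key, device_map, default="cpu"):
    """Look up a parameter's device; fall back to `default` when the key is unknown."""
    return device_map.get(key, default)


# Assumed layout: device map with one entry per *target* parameter name.
device_map = {
    "model.vision_tower.embeddings.patch_embedding.weight": 0,
    "model.language_model.embed_tokens.weight": 0,
}

source_key = "vision_tower.embeddings.patch_embedding.weight"

# Looking up the raw checkpoint (source) key misses and silently falls back to cpu.
wrong_device = device_for(source_key, device_map)
# Renaming to the target key first finds the intended device.
right_device = device_for(to_target_key(source_key), device_map)

print(wrong_device, right_device)
```

In this sketch the failure is silent: the lookup with the source key does not raise, it just returns the default device, which matches the symptom of parameters quietly ending up on cpu until a later dispatch step moves them.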

I checked very carefully (by running benchmarks AND reading the source code), and performance is the same with this PR as when opening the safetensors files directly on device! This can also be verified by looking at the safetensors rust bindings here: it simply calls tensor.to(device) internally when calling get_slice, so this PR has no impact on performance (it may even be slightly better, since it avoids opening the files again and again).

To understand the issue, consider the following snippet:

import transformers
from transformers import AriaForConditionalGeneration
import torch

# Monkey-patch `accelerate_dispatch` just to illustrate the problem
def dummy_dispatch(*args, **kwargs):
    pass
transformers.modeling_utils.accelerate_dispatch = dummy_dispatch

model_name = "rhymes-ai/Aria"
model = AriaForConditionalGeneration.from_pretrained(model_name, device_map=0, dtype=torch.float16)

for k, v in model.state_dict().items():
    if v.device != torch.device(0):
        print(f"Param {k} is not on the correct device! Expected {0}, found {v.device}")

On main, it currently outputs:

Param model.vision_tower.embeddings.patch_embedding.weight is not on the correct device! Expected 0, found cpu
Param model.vision_tower.embeddings.patch_embedding.bias is not on the correct device! Expected 0, found cpu
Param model.vision_tower.embeddings.position_embedding.weight is not on the correct device! Expected 0, found cpu
Param model.vision_tower.encoder.layers.0.self_attn.k_proj.weight is not on the correct device! Expected 0, found cpu
Param model.vision_tower.encoder.layers.0.self_attn.k_proj.bias is not on the correct device! Expected 0, found cpu
Param model.vision_tower.encoder.layers.0.self_attn.v_proj.weight is not on the correct device! Expected 0, found cpu
Param model.vision_tower.encoder.layers.0.self_attn.v_proj.bias is not on the correct device! Expected 0, found cpu
Param model.vision_tower.encoder.layers.0.self_attn.q_proj.weight is not on the correct device! Expected 0, found cpu
...

Basically, all params are on cpu instead of 0, due to the mismatch between targets and sources in the _checkpoint_conversion_mapping.
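
When the list of misplaced params is long, a grouped summary is easier to read than one line per param. The helper below is a small sketch for that; `summarize_misplaced` is a hypothetical name, and the example input is made up for illustration (it is not the real output of the run above).

```python
from collections import Counter


def summarize_misplaced(param_devices, expected):
    """Count parameters not on `expected`, grouped by (name prefix, device found)."""
    counts = Counter()
    for name, device in param_devices.items():
        if device != expected:
            # Group by the first two name components, e.g. "model.vision_tower".
            prefix = ".".join(name.split(".")[:2])
            counts[(prefix, device)] += 1
    return dict(counts)


# Made-up input: device strings as produced by str(v.device).
example = {
    "model.vision_tower.embeddings.patch_embedding.weight": "cpu",
    "model.vision_tower.embeddings.patch_embedding.bias": "cpu",
    "model.language_model.embed_tokens.weight": "cuda:0",
}
summary = summarize_misplaced(example, expected="cuda:0")
print(summary)
```

With a loaded model, the input could be built as `{k: str(v.device) for k, v in model.state_dict().items()}` and compared against `"cuda:0"`.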

With this PR, everything is fine again, and params are loaded directly on the correct device.
This is also needed for my other offloading PR, #42242.

HuggingFaceDocBuilderDev · 2025-11-17T18:04:48Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

ArthurZucker

ty, it's simpler as well

fix device

c8ee5e8

Cyrilvallez added 3 commits November 17, 2025 20:03

fix

ae5c0bf

CI

68d21f0

simplify a bit

c4d2737

ArthurZucker approved these changes Nov 18, 2025


Cyrilvallez merged commit 1742d11 into main Nov 18, 2025
24 checks passed

Cyrilvallez deleted the fix-device-map branch November 18, 2025 08:57
