
IndexError: The shape of the mask [1406] at index 0 does not match the shape of the indexed tensor [1405] at index 0 #41093

@wyn1015

Description


System Info

transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py", in get_rope_index:
[rank3]: input_ids = input_ids[attention_mask[i] == 1]
IndexError: The shape of the mask [1406] at index 0 does not match the shape of the indexed tensor [1405] at index 0

Reproduced with both transformers==4.49.0 and transformers==4.51.2.
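For context, a minimal sketch of the error class (the tensors below are illustrative, not the actual model inputs): boolean-mask indexing in PyTorch requires the mask and the indexed tensor to have the same length along the masked dimension, so an off-by-one like 1406 vs. 1405 fails immediately.

```python
import torch

# Illustrative tensors only; the real inputs come from the Qwen2.5-VL processor.
input_ids = torch.arange(1405)                       # sequence of length 1405
attention_mask = torch.ones(1406, dtype=torch.long)  # mask of length 1406 (off by one)

# Raises the same class of error as in get_rope_index:
# IndexError: The shape of the mask [1406] at index 0 does not match
# the shape of the indexed tensor [1405] at index 0
selected = input_ids[attention_mask == 1]
```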

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Parameter Offload: Total persistent parameters: 848896 in 368 params
--- DEBUGGING prompt_inputs ---
Key: input_ids, Shape: torch.Size([1, 1411])
Key: attention_mask, Shape: torch.Size([1, 1411])
Key: pixel_values, Shape: torch.Size([5476, 1176])
Key: image_grid_thw, Shape: torch.Size([1, 3])

0%| | 0/4 [00:00<?, ?it/s]--- DEBUGGING prompt_inputs ---
Key: input_ids, Shape: torch.Size([1, 1402])
Key: attention_mask, Shape: torch.Size([1, 1402])
Key: pixel_values, Shape: torch.Size([5476, 1176])
Key: image_grid_thw, Shape: torch.Size([1, 3])
generation_config default values have been modified to match model-specific defaults: {'use_cache': False, 'temperature': 1e-06, 'repetition_penalty': 1.05, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643]}. If this is not desired, please set these values explicitly.
generation_config default values have been modified to match model-specific defaults: {'use_cache': False, 'temperature': 1e-06, 'repetition_penalty': 1.05, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643]}. If this is not desired, please set these values explicitly.
/ conda-envs/searchlm_cu121/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
warnings.warn(
/ conda-envs/searchlm_cu121/lib/python3.10/site-packages/torch/utils/checkpoint.py:87: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
warnings.warn(
[rank0]: Traceback (most recent call last):
[rank0]: File "/ unified/UnifiedReward-main/UnifiedReward-Think/src/open_r1/grpo.py", line 337, in
[rank0]: main(script_args, training_args, model_args)
[rank0]: File "/ unified/UnifiedReward-main/UnifiedReward-Think/src/open_r1/grpo.py", line 326, in main
[rank0]: trainer.train()
[rank0]: File "/ conda-envs/searchlm_cu121/lib/python3.10/site-packages/transformers/trainer.py", line 2237, in train
[rank0]: return inner_training_loop(
[rank0]: File "/ conda-envs/searchlm_cu121/lib/python3.10/site-packages/transformers/trainer.py", line 2578, in _inner_training_loop
[rank0]: tr_loss_step = self.training_step(model, inputs, num_items_in_batch)
[rank0]: File "/ conda-envs/searchlm_cu121/lib/python3.10/site-packages/transformers/trainer.py", line 3792, in training_step
[rank0]: loss = self.compute_loss(model, inputs, num_items_in_batch=num_items_in_batch)
[rank0]: File "/ unified/UnifiedReward-main/UnifiedReward-Think/src/open_r1/trainer/grpo_trainer.py", line 495, in compute_loss
[rank0]: prompt_completion_ids = unwrapped_model.generate(**prompt_inputs, generation_config=self.generation_config)
[rank0]: File "/ conda-envs/searchlm_cu121/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
[rank0]: return func(*args, **kwargs)
[rank0]: File "/ conda-envs/searchlm_cu121/lib/python3.10/site-packages/transformers/generation/utils.py", line 2633, in generate
[rank0]: result = self._sample(
[rank0]: File "/ conda-envs/searchlm_cu121/lib/python3.10/site-packages/transformers/generation/utils.py", line 3607, in _sample
[rank0]: model_inputs = self.prepare_inputs_for_generation(input_ids, **model_kwargs)
[rank0]: File "/ conda-envs/searchlm_cu121/lib/python3.10/site-packages/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py", line 1561, in prepare_inputs_for_generation
[rank0]: vision_positions, rope_deltas = self.model.get_rope_index(
[rank0]: File "/ conda-envs/searchlm_cu121/lib/python3.10/site-packages/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py", line 1057, in get_rope_index
[rank0]: input_ids = input_ids[attention_mask[i] == 1]
[rank0]: IndexError: The shape of the mask [1406] at index 0 does not match the shape of the indexed tensor [1405] at index 0

0%| | 0/4 [00:06<?, ?it/s]
[2025-09-23 05:40:36,349] [INFO] [launch.py:319:sigkill_handler] Killing subprocess 1295796
[2025-09-23 05:40:36,350] [INFO] [launch.py:319:sigkill_handler] Killing subprocess 1295797
[2025-09-23 05:40:36,565] [ERROR] [launch.py:325:sigkill_handler] ['/ conda-envs/searchlm_cu121/bin/python3.10', '-u', 'src/open_r1/grpo.py', '--local_rank=1', '--deepspeed', 'scripts/zero3.json', '--ddp_timeout', '180000000', '--output_dir', './checkpoints/UnifiedReward-Think-qwen-GRPO', '--model_name_or_path', '/ model/UnifiedReward-qwen-7b', '--dataset_name', '/ unified/UnifiedReward-main/UnifiedReward-Think/dataset/HPD/HPD_train_data_qwen1.json', '--max_prompt_length', '2048', '--max_completion_length', '1024', '--num_generations', '2', '--per_device_train_batch_size', '1', '--gradient_accumulation_steps', '1', '--learning_rate', '1e-6', '--logging_steps', '1', '--bf16', 'True', '--torch_dtype', 'bfloat16', '--report_to', 'none', '--gradient_checkpointing', 'true', '--attn_implementation', 'eager', '--max_pixels', '147456', '--save_steps', '40', '--save_total_limit', '8', '--save_only_model', 'false', '--num_train_epochs', '2'] exits with return code = 1
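The debug prints above show input_ids and attention_mask agreeing in length (1411 and 1402) when compute_loss hands them to generate, yet get_rope_index sees 1406 vs. 1405, so the mismatch appears to arise inside the generation loop. As a hedged diagnostic sketch (the helper name and call site are my own, not part of the UnifiedReward code), one could assert the shapes right before the generate call to confirm where the off-by-one is introduced:

```python
def check_prompt_shapes(prompt_inputs: dict) -> None:
    """Hypothetical debug guard: confirm input_ids and attention_mask agree on
    sequence length before model.generate(), so the off-by-one (1406 vs. 1405)
    can be attributed either to the trainer's inputs or to the generation loop."""
    ids_len = prompt_inputs["input_ids"].shape[1]
    mask_len = prompt_inputs["attention_mask"].shape[1]
    assert ids_len == mask_len, (
        f"sequence-length mismatch before generate: "
        f"input_ids={ids_len}, attention_mask={mask_len}"
    )

# Usage sketch in grpo_trainer.py's compute_loss, just before generation:
# check_prompt_shapes(prompt_inputs)
# prompt_completion_ids = unwrapped_model.generate(**prompt_inputs, generation_config=self.generation_config)
```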

Expected behavior

This seems to be a transformers version issue. Could you please take a look?
