[BUG]RuntimeError: stack expects each tensor to be equal size, but got [] at entry 0 and [1] at entry 6

When using torchrl's SyncDataCollector with a custom environment, the collector raises an error as soon as the environment's _step() method returns a "done" value of True:

/Users/xjf/miniforge3/envs/drive-into-llm/lib/python3.10/site-packages/torchrl/collectors/collectors.py:870: UserWarning: total_frames (1000) is not exactly divisible by frames_per_batch (30). This means 20 additional frames will be collected.To silence this message, set the environment variable RL_WARNINGS to False.
  warnings.warn(
/Users/xjf/miniforge3/envs/drive-into-llm/lib/python3.10/site-packages/torchrl/collectors/collectors.py:1429: UserWarning: An output with one or more elements was resized since it had shape [], which does not match the required output shape [1]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/Resize.cpp:38.)
  traj_ids = traj_ids.masked_scatter(traj_sop, new_traj)
Traceback (most recent call last):
  File "/Users/xjf/miniforge3/envs/drive-into-llm/lib/python3.10/site-packages/torchrl/collectors/collectors.py", line 1586, in rollout
    result = torch.stack(
  File "/Users/xjf/miniforge3/envs/drive-into-llm/lib/python3.10/site-packages/tensordict/base.py", line 658, in __torch_function__
    return TD_HANDLED_FUNCTIONS[func](*args, **kwargs)
  File "/Users/xjf/miniforge3/envs/drive-into-llm/lib/python3.10/site-packages/tensordict/_torch_func.py", line 737, in _stack
    out._stack_onto_(list_of_tensordicts, dim)
  File "/Users/xjf/miniforge3/envs/drive-into-llm/lib/python3.10/site-packages/tensordict/_td.py", line 2665, in _stack_onto_
    new_dest = torch.stack(
  File "/Users/xjf/miniforge3/envs/drive-into-llm/lib/python3.10/site-packages/tensordict/base.py", line 658, in __torch_function__
    return TD_HANDLED_FUNCTIONS[func](*args, **kwargs)
  File "/Users/xjf/miniforge3/envs/drive-into-llm/lib/python3.10/site-packages/tensordict/_torch_func.py", line 737, in _stack
    out._stack_onto_(list_of_tensordicts, dim)
  File "/Users/xjf/miniforge3/envs/drive-into-llm/lib/python3.10/site-packages/tensordict/_td.py", line 2665, in _stack_onto_
    new_dest = torch.stack(
RuntimeError: stack expects each tensor to be equal size, but got [] at entry 0 and [1] at entry 6

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/xjf/Gaode_Projects/busi_scene_cover/assert_assembly/fine_tunning_model.py", line 240, in <module>
    example2()
  File "/Users/xjf/Gaode_Projects/busi_scene_cover/assert_assembly/fine_tunning_model.py", line 232, in example2
    fine_tunning_model(ds, task_id, model_url_, url_,
  File "/Users/xjf/Gaode_Projects/busi_scene_cover/assert_assembly/fine_tunning_model.py", line 213, in fine_tunning_model
    fint_tunning(ft_data, model_path, json_tokenizer_file, save_dir)
  File "/Users/xjf/Gaode_Projects/busi_scene_cover/assert_assembly/fine_tunning_with_rl_v2.py", line 480, in fint_tunning
    for epoch, data in enumerate(collector):
  File "/Users/xjf/miniforge3/envs/drive-into-llm/lib/python3.10/site-packages/torchrl/collectors/collectors.py", line 341, in __iter__
    yield from self.iterator()
  File "/Users/xjf/miniforge3/envs/drive-into-llm/lib/python3.10/site-packages/torchrl/collectors/collectors.py", line 1256, in iterator
    tensordict_out = self.rollout()
  File "/Users/xjf/miniforge3/envs/drive-into-llm/lib/python3.10/site-packages/torchrl/_utils.py", line 661, in unpack_rref_and_invoke_function
    return func(self, *args, **kwargs)
  File "/Users/xjf/miniforge3/envs/drive-into-llm/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
  File "/Users/xjf/miniforge3/envs/drive-into-llm/lib/python3.10/site-packages/torchrl/collectors/collectors.py", line 1594, in rollout
    result = torch.stack(
  File "/Users/xjf/miniforge3/envs/drive-into-llm/lib/python3.10/site-packages/tensordict/base.py", line 658, in __torch_function__
    return TD_HANDLED_FUNCTIONS[func](*args, **kwargs)
  File "/Users/xjf/miniforge3/envs/drive-into-llm/lib/python3.10/site-packages/tensordict/_torch_func.py", line 737, in _stack
    out._stack_onto_(list_of_tensordicts, dim)
  File "/Users/xjf/miniforge3/envs/drive-into-llm/lib/python3.10/site-packages/tensordict/_td.py", line 2665, in _stack_onto_
    new_dest = torch.stack(
  File "/Users/xjf/miniforge3/envs/drive-into-llm/lib/python3.10/site-packages/tensordict/base.py", line 658, in __torch_function__
    return TD_HANDLED_FUNCTIONS[func](*args, **kwargs)
  File "/Users/xjf/miniforge3/envs/drive-into-llm/lib/python3.10/site-packages/tensordict/_torch_func.py", line 737, in _stack
    out._stack_onto_(list_of_tensordicts, dim)
  File "/Users/xjf/miniforge3/envs/drive-into-llm/lib/python3.10/site-packages/tensordict/_td.py", line 2665, in _stack_onto_
    new_dest = torch.stack(
RuntimeError: stack expects each tensor to be equal size, but got [] at entry 0 and [1] at entry 6


While debugging, I found that the error is raised for the key "traj_ids" when the collector stacks the per-step TensorDicts. If the "done" value is always False, everything works fine.
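For illustration, the shape mismatch can be reproduced with plain PyTorch, outside torchrl. This is a minimal sketch, not the collector's actual code: the tensors are hypothetical stand-ins, and it assumes that the end-of-trajectory mask derived from "done" has shape [1] while "traj_ids" has shape [], which is what the masked_scatter warning in the log suggests.

```python
import torch

# Hypothetical stand-ins for the collector's internal tensors.
traj_ids = torch.tensor(0)    # shape [], as for an env with an empty batch size
mask = torch.tensor([True])   # shape [1], assumed end-of-trajectory mask with an extra dim
new_id = torch.tensor([1])    # id assigned to the trajectory that starts next

# Out-of-place masked_scatter broadcasts input and mask together,
# so the result takes shape [1] instead of staying a scalar.
updated = traj_ids.masked_scatter(mask, new_id)
print(traj_ids.shape, updated.shape)

# Six scalar entries followed by one entry of shape [1] cannot be stacked.
steps = [traj_ids] * 6 + [updated]
try:
    torch.stack(steps)
    stack_error = None
except RuntimeError as err:
    stack_error = str(err)
print(stack_error)
```

If this is indeed the cause, a first thing to check would be whether the "done" tensor returned by _step() has exactly the shape declared in the environment's done spec; running check_env_specs from torchrl.envs.utils on the environment may help surface such a mismatch.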

[BUG]RuntimeError: stack expects each tensor to be equal size, but got [] at entry 0 and [1] at entry 6 #3137
