
When I set the batch size to 1, the shape of the feature from the PillarFeatureNet becomes [64] instead of [N, 64] #152

@qfwysw

Description


If you do not know the root cause of the problem / bug and wish someone to help you, please post according to this template:

Instructions To Reproduce the Issue:

  1. what changes you made (git diff) or what code you wrote:
     samples_per_gpu=1,
     workers_per_gpu=1,
  2. what exact command you run: `python tools/train.py examples/point_pillars/configs/kitti_point_pillars_mghead_syncbn.py`
  3. what you observed (including the full logs):
2021-09-14 16:50:33,540 - INFO - workflow: [('train', 5), ('val', 1)], max: 100 epochs
The shape of feature from the PillarFeatureNet
input_features.shape:  torch.Size([8550, 64])
The shape of feature from the PillarFeatureNet
input_features.shape:  torch.Size([10580, 64])
The shape of feature from the PillarFeatureNet
input_features.shape:  torch.Size([10413, 64])
The shape of feature from the PillarFeatureNet
input_features.shape:  torch.Size([9167, 64])
The shape of feature from the PillarFeatureNet
input_features.shape:  torch.Size([12000, 64])
The shape of feature from the PillarFeatureNet
input_features.shape:  torch.Size([12000, 64])
The shape of feature from the PillarFeatureNet
input_features.shape:  torch.Size([12000, 64])
The shape of feature from the PillarFeatureNet
input_features.shape:  torch.Size([10283, 64])
The shape of feature from the PillarFeatureNet
input_features.shape:  torch.Size([12000, 64])
The shape of feature from the PillarFeatureNet
input_features.shape:  torch.Size([12000, 64])
The shape of feature from the PillarFeatureNet
input_features.shape:  torch.Size([11906, 64])
The shape of feature from the PillarFeatureNet
input_features.shape:  torch.Size([64])
Traceback (most recent call last):
  File "/home/wensuinan/program/centerpoint/det_modified/tools/train.py", line 132, in <module>
    main()
  File "/home/wensuinan/program/centerpoint/det_modified/tools/train.py", line 127, in main
    logger=logger,
  File "/home/wensuinan/program/centerpoint/det_modified/det3d/torchie/apis/train.py", line 325, in train_detector
    trainer.run(data_loaders, cfg.workflow, cfg.total_epochs, local_rank=cfg.local_rank)
  File "/home/wensuinan/program/centerpoint/det_modified/det3d/torchie/trainer/trainer.py", line 537, in run
    epoch_runner(data_loaders[i], self.epoch, **kwargs)
  File "/home/wensuinan/program/centerpoint/det_modified/det3d/torchie/trainer/trainer.py", line 404, in train
    self.model, data_batch, train_mode=True, **kwargs
  File "/home/wensuinan/program/centerpoint/det_modified/det3d/torchie/trainer/trainer.py", line 363, in batch_processor_inline
    losses = model(example, return_loss=True)
  File "/home/wensuinan/anaconda3/envs/centerpoint/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/wensuinan/program/centerpoint/det_modified/det3d/models/detectors/two_stage.py", line 169, in forward
    out = self.single_det.forward_two_stage(example, return_loss, **kwargs)
  File "/home/wensuinan/program/centerpoint/det_modified/det3d/models/detectors/point_pillars.py", line 74, in forward_two_stage
    x = self.extract_feat(data)     # 3, 384, 248, 216
  File "/home/wensuinan/program/centerpoint/det_modified/det3d/models/detectors/point_pillars.py", line 28, in extract_feat
    input_features, data["coors"], data["batch_size"], data["input_shape"]
  File "/home/wensuinan/anaconda3/envs/centerpoint/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/wensuinan/program/centerpoint/det_modified/det3d/models/readers/pillar_encoder.py", line 198, in forward
    voxels = voxel_features[batch_mask, :]
IndexError: too many indices for tensor of dimension 1
  4. please also simplify the steps as much as possible so they do not require additional resources to run, such as a private dataset.
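The traceback shows that `voxel_features` reaches the batch-mask indexing in `pillar_encoder.py` as a 1-D tensor of shape `[64]` instead of the expected `[N, 64]`, so the two-index lookup `voxel_features[batch_mask, :]` fails. A minimal sketch of the failure mode and a defensive guard (the guard is a hypothetical workaround, not the upstream fix; one plausible cause is an unqualified `squeeze()` upstream that also drops the pillar dimension when a sample contains a single pillar, but that is an assumption):

```python
import torch

# Degenerate shape observed with batch size 1: the pillar dimension is gone.
voxel_features = torch.randn(64)
batch_mask = torch.ones(1, dtype=torch.bool)

# Two-index lookup on a 1-D tensor reproduces the reported error.
try:
    voxels = voxel_features[batch_mask, :]
except IndexError as e:
    print(e)  # too many indices for tensor of dimension 1

# Defensive guard: restore the expected [N, 64] layout before indexing.
if voxel_features.dim() == 1:
    voxel_features = voxel_features.unsqueeze(0)  # [64] -> [1, 64]
voxels = voxel_features[batch_mask, :]
print(voxels.shape)  # torch.Size([1, 64])
```

If the collapse does come from a bare `squeeze()`, replacing it with `squeeze(dim=-1)` (or whichever specific axis was intended) would remove only that axis and leave a single-pillar `[1, 64]` matrix intact.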

Expected behavior:

If there is no obvious error in the "what you observed" section above, please tell us the expected behavior.

If you expect the model to converge / work better, note that we do not give suggestions on how to train a new model. We will only help in one of these two conditions:
(1) You're unable to reproduce the results in the model zoo.
(2) It indicates a bug in Det3D.
