I encountered the following issue while configuring the training environment according to the QuickStart documentation provided in the RoboBrain 2.0 repository:
- I have completed environment setup, model conversion, and data preparation as described in the documentation, and I believe these steps were carried out correctly.
- I have also applied the module patches as instructed in the QuickStart documentation of the RoboBrain 2.0 repository.
However, when starting the training, I encountered the following error:
ModuleNotFoundError: No module named 'megatron_patch'
I haven’t found an effective solution so far.
Could you please advise on what might be causing this issue? Are there any other configuration items or dependencies I should check?
Below is the detailed error message:
[default2]:[rank2]: Traceback (most recent call last):
[default2]:[rank2]: File "/data2/user/RoboBrain_Project/FlagScale/./flagscale/train/train_qwen2_5_vl.py", line 797, in <module>
[default2]:[rank2]: pretrain(
[default2]:[rank2]: File "/data2/user/RoboBrain_Project/FlagScale/flagscale/train/train.py", line 1056, in pretrain
[default2]:[rank2]: build_train_valid_test_data_iterators(train_valid_test_dataset_provider)
[default2]:[rank2]: File "/data2/user/RoboBrain_Project/FlagScale/flagscale/train/train.py", line 3247, in build_train_valid_test_data_iterators
[default2]:[rank2]: train_dataloader, valid_dataloader, test_dataloader = build_train_valid_test_data_loaders(
[default2]:[rank2]: File "/data2/user/RoboBrain_Project/FlagScale/flagscale/train/train.py", line 3211, in build_train_valid_test_data_loaders
[default2]:[rank2]: train_ds, valid_ds, test_ds = build_train_valid_test_datasets(
[default2]:[rank2]: File "/data2/user/RoboBrain_Project/FlagScale/flagscale/train/train.py", line 3180, in build_train_valid_test_datasets
[default2]:[rank2]: return build_train_valid_test_datasets_provider(train_valid_test_num_samples)
[default2]:[rank2]: File "/data2/user/RoboBrain_Project/FlagScale/./flagscale/train/train_qwen2_5_vl.py", line 694, in train_valid_test_dataloaders_provider
[default2]:[rank2]: train_ds, valid_ds1, test_ds = datasets_provider(worker_config)
[default2]:[rank2]: File "/data2/user/RoboBrain_Project/FlagScale/./flagscale/train/train_qwen2_5_vl.py", line 605, in datasets_provider
[default2]:[rank2]: train_dataset = get_train_dataset(
[default2]:[rank2]: File "/data2/user/RoboBrain_Project/FlagScale/third_party/Megatron-LM/megatron/energon/task_encoder/loader.py", line 93, in get_train_dataset
[default2]:[rank2]: blend_mode, datasets = loader.get_datasets(
[default2]:[rank2]: File "/data2/user/RoboBrain_Project/FlagScale/third_party/Megatron-LM/megatron/energon/metadataset/dataset_loader.py", line 94, in get_datasets
[default2]:[rank2]: self.get_dataset(
[default2]:[rank2]: File "/data2/user/RoboBrain_Project/FlagScale/third_party/Megatron-LM/megatron/energon/metadataset/dataset_loader.py", line 68, in get_dataset
[default2]:[rank2]: return get_dataset_from_config(
[default2]:[rank2]: File "/data2/user/RoboBrain_Project/FlagScale/third_party/Megatron-LM/megatron/energon/dataset_config.py", line 92, in get_dataset_from_config
[default2]:[rank2]: dataset: BaseCoreDatasetFactory[T_sample] = load_config(
[default2]:[rank2]: File "/data2/user/RoboBrain_Project/FlagScale/third_party/Megatron-LM/megatron/energon/dataset_config.py", line 48, in load_config
[default2]:[rank2]: return parser.raw_to_instance(data, default_type)
[default2]:[rank2]: File "/data2/user/RoboBrain_Project/FlagScale/third_party/Megatron-LM/megatron/energon/typed_converter.py", line 154, in raw_to_instance
[default2]:[rank2]: cls = self._resolve_object(
[default2]:[rank2]: File "/data2/user/RoboBrain_Project/FlagScale/third_party/Megatron-LM/megatron/energon/typed_converter.py", line 79, in _resolve_object
[default2]:[rank2]: module = importlib.import_module(module_name)
[default2]:[rank2]: File "/home/share/.conda/envs/flagscale-train-wxr/lib/python3.10/importlib/__init__.py", line 126, in import_module
[default2]:[rank2]: return _bootstrap._gcd_import(name[level:], package, level)
[default2]:[rank2]: File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
[default2]:[rank2]: File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
[default2]:[rank2]: File "<frozen importlib._bootstrap>", line 992, in _find_and_load_unlocked
[default2]:[rank2]: File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
[default2]:[rank2]: File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
[default2]:[rank2]: File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
[default2]:[rank2]: File "<frozen importlib._bootstrap>", line 992, in _find_and_load_unlocked
[default2]:[rank2]: File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
[default2]:[rank2]: File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
[default2]:[rank2]: File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
[default2]:[rank2]: File "<frozen importlib._bootstrap>", line 992, in _find_and_load_unlocked
[default2]:[rank2]: File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
[default2]:[rank2]: File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
[default2]:[rank2]: File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
[default2]:[rank2]: File "<frozen importlib._bootstrap>", line 1004, in _find_and_load_unlocked
[default2]:[rank2]: ModuleNotFoundError: No module named 'megatron_patch'
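In the traceback above, the failing import is issued by `_resolve_object` in `typed_converter.py` while the dataset config is being loaded, so one thing that may be worth checking is whether `megatron_patch` is visible to the interpreter the training ranks use. The following is only a minimal diagnostic sketch, assuming it is run in the same conda environment and with the same `PYTHONPATH` as the training launch; the helper name `module_is_importable` is hypothetical and not part of FlagScale or Megatron-LM.

```python
import importlib.util
import sys


def module_is_importable(name: str) -> bool:
    """Return True if the top-level module `name` can be located on sys.path.

    Hypothetical helper for diagnosis only; it looks the module up
    without actually importing it.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec can raise if a partially initialised module is cached
        return False


if __name__ == "__main__":
    name = "megatron_patch"  # module named in the ModuleNotFoundError
    if module_is_importable(name):
        spec = importlib.util.find_spec(name)
        print(f"{name} found, origin: {spec.origin}")
    else:
        print(f"{name} is not importable; current sys.path entries:")
        for entry in sys.path:
            print("  ", entry)
```

If the module is reported as not importable, the printed `sys.path` entries show which directories the interpreter searched, which may help confirm whether the directory containing `megatron_patch` was ever added.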