Commit 469efb5

Adding remote code trust flag for Dataloader (#373)
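The Hugging Face (HF) datasets library requires trust_remote_code=True to be passed to load_dataset for datasets that ship their own loading code. The argument goes to the load_dataset call inside create_streaming_dataloader, not to the PyTorch DataLoader.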
1 parent 2abc6bd commit 469efb5

1 file changed: +1 -1 lines changed

3.test_cases/10.FSDP/model_utils/train_utils.py

Lines changed: 1 addition & 1 deletion
@@ -456,7 +456,7 @@ def create_streaming_dataloader(dataset,
                                 split=None):
     print(f"dataset={dataset}, name={name}")
     tokenizer = AutoTokenizer.from_pretrained(tokenizer)
-    data = load_dataset(dataset, name=name, streaming=True, split=split).shuffle(42+global_rank)
+    data = load_dataset(dataset, name=name, streaming=True, split=split, trust_remote_code=True).shuffle(42+global_rank)
     train_concat_dataset = ConcatTokensDataset(data, tokenizer, max_context_width, True)
     train_dataloader = DataLoader(train_concat_dataset,
                                   batch_size=batch_size,
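
The change hardcodes trust_remote_code=True, which lets the datasets library execute loading code shipped in the dataset repository. A possible alternative, sketched below, is to make the flag opt-in at the call site. The helper names build_load_dataset_kwargs and load_streaming_dataset are hypothetical and not part of train_utils.py. The sketch also assumes a datasets release whose load_dataset still accepts trust_remote_code; newer releases may reject the argument.

```python
def build_load_dataset_kwargs(name=None, split=None, trust_remote_code=False):
    """Collect keyword arguments for datasets.load_dataset.

    Hypothetical helper: trust_remote_code is only forwarded when the
    caller explicitly opts in, so the default stays conservative.
    """
    kwargs = {"name": name, "split": split, "streaming": True}
    if trust_remote_code:
        # Allows the library to run loading code from the dataset repo.
        kwargs["trust_remote_code"] = True
    return kwargs


def load_streaming_dataset(dataset, global_rank, **options):
    """Load a streaming dataset and shuffle it with a per-rank seed.

    Hypothetical wrapper mirroring the patched line in the diff.
    """
    # Imported lazily so the kwargs helper can be used without the
    # datasets package installed.
    from datasets import load_dataset

    data = load_dataset(dataset, **build_load_dataset_kwargs(**options))
    # Same seed scheme as the original code: 42 + global_rank.
    return data.shuffle(seed=42 + global_rank)


if __name__ == "__main__":
    # Only inspects the arguments; no dataset is downloaded here.
    print(build_load_dataset_kwargs(name="en", split="train",
                                    trust_remote_code=True))
```

With this shape, a training script could expose the flag as a command-line option and pass it through, rather than trusting remote code for every dataset by default.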
