1 file changed, 4 additions and 4 deletions

@@ -93,7 +93,7 @@ An example script to prepare data for GPT training is:
 python tools/preprocess_data.py \
     --input my-corpus.json \
     --output-prefix my-gpt2 \
-    --vocab gpt2-vocab.json \
+    --vocab-file gpt2-vocab.json \
     --dataset-impl mmap \
     --tokenizer-type GPT2BPETokenizer \
     --merge-file gpt2-merges.txt \
@@ -132,7 +132,7 @@ xz -d oscar-1GB.jsonl.xz
 python tools/preprocess_data.py \
     --input oscar-1GB.jsonl \
     --output-prefix my-gpt2 \
-    --vocab gpt2-vocab.json \
+    --vocab-file gpt2-vocab.json \
     --dataset-impl mmap \
     --tokenizer-type GPT2BPETokenizer \
     --merge-file gpt2-merges.txt \
@@ -192,13 +192,13 @@ DATA_ARGS=" \
     --data-path $DATA_PATH \
     "
 
-CMD="pretrain_gpt.py $GPT_ARGS $ OUTPUT_ARGS $DATA_ARGS"
+CMD="pretrain_gpt.py $GPT_ARGS $OUTPUT_ARGS $DATA_ARGS"
 
 N_GPUS=1
 
 LAUNCHER="deepspeed --num_gpus $N_GPUS"
 
-$LAUNCHER $ CMD
+$LAUNCHER $CMD
 ```
 
 Note, we replaced `python` with `deepspeed --num_gpus 1`. For multi-gpu training update `--num_gpus` to the number of GPUs you have.
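
To illustrate why the stray space after `$` matters, here is a minimal shell sketch (not part of the change itself). It builds the command string both ways and prints the result instead of launching training. The argument values and the `BROKEN_CMD` and `FULL_CMD` names are placeholder assumptions for illustration, not taken from the README.

```shell
# Placeholder argument groups; the values are illustrative assumptions.
GPT_ARGS="--num-layers 2"
OUTPUT_ARGS="--log-interval 10"
DATA_ARGS="--data-path my-gpt2_text_document"

# Corrected form: no space between "$" and the variable name.
CMD="pretrain_gpt.py $GPT_ARGS $OUTPUT_ARGS $DATA_ARGS"

# Form with the stray space: "$ OUTPUT_ARGS" stays a literal "$" followed by
# the word OUTPUT_ARGS, so the output arguments are never expanded.
BROKEN_CMD="pretrain_gpt.py $GPT_ARGS $ OUTPUT_ARGS $DATA_ARGS"

N_GPUS=1
LAUNCHER="deepspeed --num_gpus $N_GPUS"

# Print rather than run, so the sketch works without DeepSpeed installed.
FULL_CMD="$LAUNCHER $CMD"
echo "$FULL_CMD"
echo "$BROKEN_CMD"
```

Printing the assembled command before running it is a cheap way to confirm that every argument group was expanded as intended.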