Description
(torchscale) yehuicheng@bdp-gpu04:~/torchscale/examples/fairseq$ torchrun --nproc_per_node=8 --master_port 29501 --nnodes=1 train.py \
    /home/data/dataset/yehuicheng/LongNet_example/DNA_example/longnet_example \
    --num-workers 0 --activation-fn gelu --share-decoder-input-output-embed \
    --validate-interval-updates 1000 --save-interval-updates 1000 --no-epoch-checkpoints \
    --memory-efficient-fp16 --fp16-init-scale 4 \
    --arch transformer --task language_modeling --sample-break-mode none --tokens-per-sample 4096 \
    --optimizer adam --adam-betas "(0.9, 0.98)" --adam-eps 1e-08 --clip-norm 0.0 \
    --lr 5e-4 --lr-scheduler polynomial_decay --warmup-updates 750 \
    --dropout 0.1 --attention-dropout 0.1 --weight-decay 0.01 \
    --batch-size 4 --update-freq 1 --required-batch-size-multiple 1 \
    --total-num-update 50000 --max-update 50000 --seed 1 --ddp-backend=c10d \
    --flash-attention --segment-length [2048,4096] --dilated-ratio [1,2]
W1108 21:43:12.431143 140431967650432 torch/distributed/run.py:779]
W1108 21:43:12.431143 140431967650432 torch/distributed/run.py:779] *****************************************
W1108 21:43:12.431143 140431967650432 torch/distributed/run.py:779] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
W1108 21:43:12.431143 140431967650432 torch/distributed/run.py:779] *****************************************
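For context on the `--segment-length [2048,4096]` / `--dilated-ratio [1,2]` flags: in LongNet-style dilated attention, each segment length w is paired with a dilation ratio r, and within every segment of w tokens only every r-th token is kept before attention is computed. The sketch below only illustrates that pairing for the values in the command above; the `dilated_indices` helper and the standalone script are hypothetical, not torchscale code.

```python
# Illustrative sketch (not the torchscale implementation): which token indices
# each (segment_length, dilated_ratio) pair keeps for a 4096-token sample.
import torch


def dilated_indices(seq_len, segment_length, dilated_ratio):
    """Return, per segment, the indices retained after dilation."""
    groups = []
    for start in range(0, seq_len, segment_length):
        end = min(start + segment_length, seq_len)
        # keep every `dilated_ratio`-th token inside the segment
        groups.append(torch.arange(start, end, dilated_ratio))
    return groups


if __name__ == "__main__":
    seq_len = 4096  # matches --tokens-per-sample 4096
    for w, r in zip([2048, 4096], [1, 2]):  # --segment-length / --dilated-ratio
        groups = dilated_indices(seq_len, w, r)
        kept = sum(g.numel() for g in groups)
        print(f"segment={w} ratio={r}: {len(groups)} segments, {kept} tokens kept")
```

Running it shows the two sparsity levels requested here: (2048, 1) keeps all 4096 tokens split over two segments, while (4096, 2) keeps 2048 tokens in a single segment.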