
Commit fd4d31f
Author: Wu, Gangsheng
Message: update
1 parent 85e9b81

File tree
1 file changed: +3 -5 lines

llm_on_ray/finetune/finetune.py (3 additions, 5 deletions)
@@ -309,11 +309,9 @@ def load_model(config: Dict):
     model.generation_config.pad_token_id = 0
     model.generation_config.bos_token_id = 1
     model.generation_config.eos_token_id = 2
-    attn_softmax_bf16 = config["General"]["attn_softmax_bf16"]
-    if attn_softmax_bf16 and device == "hpu":
-        model.generation_config.attn_softmax_bf16
-    use_flash_attention = config["General"]["use_flash_attention"]
-    if use_flash_attention and device == "hpu":
+    if device == "hpu" and config["General"]["attn_softmax_bf16"]:
+        model.generation_config.attn_softmax_bf16 = True
+    if device == "hpu" and config["General"]["use_flash_attention"]:
         model.generation_config.use_flash_attention = True
         model.generation_config.flash_attention_recompute = False
         model.generation_config.flash_attention_causal_mask = False
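
In effect, the commit fixes a silent no-op: the removed line `model.generation_config.attn_softmax_bf16` evaluated the attribute without assigning anything to it, so the bf16 attention-softmax option never took effect on HPU. A minimal sketch of the corrected logic in isolation (the wrapper function name is illustrative, not from the repo; `config`, `device`, and the generation-config fields follow the diff):

    def apply_hpu_generation_flags(model, config, device):
        # Only Habana (HPU) runs honor these generation-config options.
        if device == "hpu" and config["General"]["attn_softmax_bf16"]:
            # The old code stated the attribute without assigning it (a no-op);
            # the fix assigns True so bf16 softmax is actually enabled.
            model.generation_config.attn_softmax_bf16 = True
        if device == "hpu" and config["General"]["use_flash_attention"]:
            model.generation_config.use_flash_attention = True
            model.generation_config.flash_attention_recompute = False
            model.generation_config.flash_attention_causal_mask = False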
