This repository was archived by the owner on Sep 23, 2025. It is now read-only.

Commit 3e6ccac

Merge branch 'main' into chat_template
Signed-off-by: minmingzhu <45281494+minmingzhu@users.noreply.github.com>
2 parents 3cb18dd + 9182907

File tree: 5 files changed (+9 lines, -2 lines)

Lines changed: 1 addition & 1 deletion

```diff
@@ -1 +1 @@
-finetune: gpt2, bigscience/bloom-560m, facebook/opt-125m, mosaicml/mpt-7b-chat, huggyllama/llama-7b
+finetune: gpt2, bigscience/bloom-560m, facebook/opt-125m, mosaicml/mpt-7b, huggyllama/llama-7b
```

docs/finetune_parameters.md

Lines changed: 1 addition & 0 deletions

```diff
@@ -7,6 +7,7 @@ The following are the parameters supported in the finetuning workflow.
 |Configuration Name| Default|Meaning|
 |-|-|-|
 |base_model| EleutherAI/gpt-j-6b|Path to pretrained model or model identifier from huggingface.co/models|
+|tokenizer_name|None|Path to pretrained tokenizer from huggingface.co/models. If not provided, the tokenizer will be loaded from the `base_model`.|
 |gpt_base_model|True|This parameter is for [Transformers#22482](https://github.com/huggingface/transformers/issues/22482). It needs to be set to True when the pretrained model is realted to gpt, otherwise it is False.|
 |output_dir|/tmp/llm-ray/output|The output directory to store the finetuned model|
 |checkpoint_dir|/tmp/llm-ray/checkpoint|The directory to store checkpoint|
```
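
These parameters live under the `General` section of the finetune config and are read as a nested dict inside `train_func` (see the `config["General"][...]` accesses in the next diff). A minimal sketch of such a fragment, built from the documented defaults plus the new `tokenizer_name`; the dict itself is illustrative, not copied from the repository:

```python
# Illustrative "General" config fragment; values other than the new
# tokenizer_name field mirror the defaults in the table above.
general_config = {
    "base_model": "EleutherAI/gpt-j-6b",
    "tokenizer_name": None,  # new: falls back to base_model when left unset
    "gpt_base_model": True,
    "output_dir": "/tmp/llm-ray/output",
    "checkpoint_dir": "/tmp/llm-ray/checkpoint",
}
```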

llm_on_ray/finetune/finetune.py

Lines changed: 5 additions & 1 deletion

```diff
@@ -155,6 +155,10 @@ def train_func(config: Dict[str, Any]):
 
     gradient_accumulation_steps = config["Training"].get("gradient_accumulation_steps", 1)
     base_model = config["General"]["base_model"]
+    if config["General"].get("tokenizer_name") is not None:
+        tokenizer_name = config["General"].get("tokenizer_name")
+    else:
+        tokenizer_name = base_model
     dataset_file = config["Dataset"]["train_file"]
 
     seed = config["Training"].get("seed")
@@ -171,7 +175,7 @@ def train_func(config: Dict[str, Any]):
 
     tokenizer = common.tokenizer.Tokenizer.registory.get("HuggingFaceTokenizer")()(
         config={
-            "name": base_model,
+            "name": tokenizer_name,
             "config": config["General"]["config"],
         }
     )
```
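
The added lines implement a simple fallback: use `General.tokenizer_name` when it is set, otherwise reuse `base_model` as the tokenizer identifier. A standalone sketch of the same resolution logic; the helper name `resolve_tokenizer_name` is illustrative and does not exist in the repository:

```python
from typing import Any, Dict


def resolve_tokenizer_name(general_cfg: Dict[str, Any]) -> str:
    """Return the tokenizer identifier to load: an explicit tokenizer_name
    if present, otherwise the base model identifier (mirrors the diff above)."""
    tokenizer_name = general_cfg.get("tokenizer_name")
    if tokenizer_name is not None:
        return tokenizer_name
    return general_cfg["base_model"]


# With tokenizer_name set, the explicit value wins.
print(resolve_tokenizer_name(
    {"base_model": "mosaicml/mpt-7b", "tokenizer_name": "EleutherAI/gpt-neox-20b"}
))  # -> EleutherAI/gpt-neox-20b

# Without tokenizer_name, the base model is used, preserving old behavior.
print(resolve_tokenizer_name({"base_model": "gpt2"}))  # -> gpt2
```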

llm_on_ray/finetune/finetune_config.py

Lines changed: 1 addition & 0 deletions

```diff
@@ -51,6 +51,7 @@ class DeltatunerConfig(BaseModel):
 
 class General(BaseModel):
     base_model: str
+    tokenizer_name: Optional[str] = None
     gpt_base_model: bool
     output_dir: str
     checkpoint_dir: Optional[str]
```
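
Declaring the field as `Optional[str] = None` keeps existing configs valid: when `tokenizer_name` is omitted, validation still succeeds and the value comes back as `None`, which `train_func` then maps to `base_model`. A minimal pydantic sketch, trimmed to a few of the fields shown in the diff rather than the full `General` model:

```python
from typing import Optional

from pydantic import BaseModel


class General(BaseModel):
    base_model: str
    tokenizer_name: Optional[str] = None  # new optional field from this commit
    gpt_base_model: bool
    output_dir: str


# Old-style config without tokenizer_name still validates; the field defaults to None.
cfg = General(base_model="gpt2", gpt_base_model=True, output_dir="/tmp/llm-ray/output")
print(cfg.tokenizer_name)  # None

# New-style config can point the tokenizer at a different Hugging Face repo.
cfg = General(
    base_model="mosaicml/mpt-7b",
    tokenizer_name="EleutherAI/gpt-neox-20b",
    gpt_base_model=False,
    output_dir="/tmp/llm-ray/output",
)
print(cfg.tokenizer_name)  # EleutherAI/gpt-neox-20b
```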

llm_on_ray/finetune/models/mpt-7b.yaml

Lines changed: 1 addition & 0 deletions

```diff
@@ -1,5 +1,6 @@
 General:
   base_model: mosaicml/mpt-7b
+  tokenizer_name: EleutherAI/gpt-neox-20b
   gpt_base_model: false
   output_dir: /tmp/llm-ray/output
   checkpoint_dir: /tmp/llm-ray/checkpoint
```
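
MPT-7B is documented by MosaicML as using the GPT-NeoX-20B tokenizer, so the YAML now names `EleutherAI/gpt-neox-20b` explicitly instead of letting the tokenizer default to the `mosaicml/mpt-7b` base model. A quick sketch of what that entry resolves to, assuming the `transformers` package is installed:

```python
# Load the tokenizer the config now points at; this is the same identifier
# written into mpt-7b.yaml above.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
print(tokenizer("Hello, MPT!").input_ids)
```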
