
@Jiang-Stan (Collaborator) commented May 8, 2023

| Model | 1 epoch PPL | 3 epoch PPL |
| --- | --- | --- |
| LLaMA-7b | 2.397 | 2.345 |
| LLaMA-65b | 2.304 | |
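
For relating the PPL numbers above to the loss curves, here is a minimal sketch. It assumes PPL is the exponential of the mean token-level cross-entropy loss (in nats), which is the usual convention but is not stated in this PR. The function names are hypothetical and not part of this repository.

```python
import math
from typing import Sequence


def perplexity_from_loss(mean_loss: float) -> float:
    """Convert a mean cross-entropy loss (nats per token) to perplexity."""
    return math.exp(mean_loss)


def loss_from_perplexity(ppl: float) -> float:
    """Inverse mapping: the mean loss that would produce a given perplexity."""
    return math.log(ppl)


def perplexity_from_token_losses(token_losses: Sequence[float]) -> float:
    """Average per-token losses first, then exponentiate.

    Averaging per-token perplexities instead would give a different,
    larger number, so the order of operations matters.
    """
    if not token_losses:
        raise ValueError("token_losses must not be empty")
    return math.exp(sum(token_losses) / len(token_losses))


if __name__ == "__main__":
    # Under the assumption above, the reported 7b values map to these losses.
    for ppl in (2.397, 2.345):
        print(f"PPL {ppl} -> mean loss {loss_from_perplexity(ppl):.4f}")
```

Under this assumption, a lower final value on the loss curve translates directly into a lower PPL, so the two views of the run should agree.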

LLaMA-7b 4bit QLoRA (lr=3e-4) finetune loss curve:
[Image: screenshot from 2023-05-12 14-27-44]

LLaMA-65b 4bit QLoRA (lr=1e-4) finetune loss curve (only 1 epoch so far):
[Image: screenshot from 2023-05-15 10-38-15]
