When evaluating a model on the mmlu dataset, why does mmlu_gen take several hours longer than mmlu_ppl? #401
amulil
Replies: 2 comments
-
May I ask whether the mmlu_ppl results you reproduced match the results published on the OpenCompass leaderboard?
-
In generation mode (mmlu_gen), the model tends to produce between 10 and 100 words per question, which noticeably slows down inference.
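To make the point above concrete, here is a rough back-of-the-envelope sketch, not a measurement. It assumes that PPL-style evaluation scores each of the four MMLU answer options with one forward pass and no decoding loop, while generation-style evaluation needs one sequential decoding step per new token. The function names `ppl_passes` and `gen_steps` are made up for illustration and are not part of OpenCompass.

```python
# Toy cost model: counts model invocations per question.
# Assumption (not taken from OpenCompass source): PPL mode scores every
# answer option once; generation mode decodes token by token.

def ppl_passes(num_options: int = 4) -> int:
    """Forward passes needed to score each candidate answer once."""
    if num_options < 1:
        raise ValueError("num_options must be at least 1")
    return num_options


def gen_steps(new_tokens: int) -> int:
    """Sequential decoding steps needed to emit `new_tokens` tokens."""
    if new_tokens < 1:
        raise ValueError("new_tokens must be at least 1")
    return new_tokens


# Compare the two modes for the 10 to 100 token range mentioned above.
for n in (10, 100):
    print(f"{n} new tokens: {gen_steps(n)} sequential steps "
          f"vs {ppl_passes()} scoring passes")
```

The counts are not directly comparable in wall-clock terms, since a scoring pass over a full prompt and a single decoding step cost different amounts. The sketch only shows that the generation loop grows with output length, whereas the number of scoring passes stays fixed per question.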
-
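As in the title.
Also, what exactly are the differences between these two evaluation variants, and which scenarios does each suit? Is mmlu_gen recommended for chat models and mmlu_ppl for non-chat models?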