Pull requests: huggingface/lighteval

Pull requests list

[EVAL] BIG-Bench Extra Hard
#1099 opened Dec 5, 2025 by jgyasu Draft
aime_avg was not added to TASKS_TABLE
#1098 opened Dec 4, 2025 by francesco-bertolotti
[EVAL] SciCode
#1086 opened Nov 27, 2025 by akshathmangudi
Enable loading data sets from files for custom tasks
#1083 opened Nov 24, 2025 by davebiagioni
3 tasks done
Evals on the hub
#1082 opened Nov 24, 2025 by NathanHB
Feature/tvd mi metric feature
#1080 opened Nov 22, 2025 by zrobertson466920
[EVAL] MultiChallenge new-task
#1075 opened Nov 21, 2025 by akshathmangudi
[EVAL] Long Horizon Execution new-task
#1074 opened Nov 21, 2025 by akshathmangudi
4 tasks done
graceful shutdown of vllm async bug
#1064 opened Nov 17, 2025 by f14-bertolotti
Adds Profbench new-task
#1041 opened Nov 6, 2025 by NathanHB
Fix PERPLEXITY task
#1037 opened Nov 4, 2025 by ScottHoang
Legal NLP tasks on Swiss data
#1032 opened Oct 31, 2025 by rolshoven
Add support to vllm==0.11.0
#1027 opened Oct 22, 2025 by anmarques
Wrap vllm inputs to compatible with VLLM>=0.10.2
#1003 opened Oct 2, 2025 by JIElite
Fix caching logic
#994 opened Sep 25, 2025 by jxmorris12
Fix deberta overflow error bug
#990 opened Sep 24, 2025 by amstu2
run slow tests against vllm and transformers main
#985 opened Sep 23, 2025 by NathanHB