v0.0.7

dkundel-openai released this 15 Sep 19:30


7802bf2

What's Changed

Evals: correctly pass temperature/max_tokens when using Responses API by @Maratyszcza in #174
Metal: move sampling to GPU by @Maratyszcza in #175
Metal: benchmark generation of 100 tokens instead of 1 by @Maratyszcza in #178
Metal: support generating multiple tokens at once by @Maratyszcza in #179
Adding prefill benchmarking for metal backend by @ibahmed-oai in #181
Metal: tune threadgroup sizes by @Maratyszcza in #180
Metal: Adding optimized dense matmul kernel to optimize prefill perf by @ibahmed-oai in #183
Metal: fused QKV projection (matmul+RoPE) kernel by @Maratyszcza in #184
[Bugfix] Capture stderr for python tool with uv as backend by @wuhang2014 in #182

New Contributors

@ibahmed-oai made their first contribution in #181
@wuhang2014 made their first contribution in #182

Full Changelog: v0.0.6...v0.0.7

Contributors

Maratyszcza, wuhang2014, and ibahmed-oai
