What's Changed
- add support for passing weight to the loss functions by @volker48 in #260
- fix: padding token not recognized, update transformers by @stephantul in #265
- Fix tag train documentation by @Lhemamou in #269
- chore: Added python 3.13 to pyproject and CI by @Pringled in #270
- feat: add classifier freezing by @stephantul in #274
- fix: remove windows tests by @stephantul in #277
- feat: add configurable pad token by @stephantul in #276
- feat: faster loading if model already cached by @stephantul in #278
- feat: add vocabulary quantization by @stephantul in #271
- fix: load faster, make quantization better by @stephantul in #279
- fix: F rule, A rule, update ruff by @stephantul in #281
- feat: Added embedding_dtype and vocabulary_quantization to config by @Pringled in #280
- fix: Disable MPS for Torch versions >=2.8.0 by @Pringled in #287
- feat: Add configurable pooling for distillation by @Pringled in #288
- chore: Deprecate apply_zipf and use_subword parameters by @Pringled in #289
- chore: Rename PoolingType to PoolingMode by @Pringled in #290
- docs: Update main docs by @Pringled in #291
- chore: Bump version by @Pringled in #292
Deprecation warnings ⚠️
apply_zipf and use_subword are now officially deprecated in distill
Full Changelog: v0.6.0...v0.7.0