
Optionally delegate classifiers to XGBoost for finetuning and inference #114

@JackHopkins

Description

Is your feature request related to a problem? Please describe.
LLMs are extremely inefficient at classification tasks. When enough labelled data is available, XGBoost is both cheaper and faster. We could use the aligned data collected from the teacher LLM to train an XGBoost model, which would be much faster to run at inference time.
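
To make the student side concrete, here is a minimal sketch, assuming (input, label) pairs aligned from the teacher LLM are already available; the hashing-based featurisation and all hyperparameters are illustrative assumptions, not a proposed design:

```python
# Sketch only: train an XGBoost student on teacher-LLM-labelled examples.
import xgboost as xgb
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.preprocessing import LabelEncoder

# Hypothetical aligned data collected from the teacher LLM.
texts = ["refund my order", "love this product", "where is my parcel"]
labels = ["complaint", "praise", "inquiry"]

vectorizer = HashingVectorizer(n_features=2**12)  # cheap, stateless featuriser
X = vectorizer.transform(texts)
y = LabelEncoder().fit_transform(labels)

# The student: a gradient-boosted decision forest.
student = xgb.XGBClassifier(n_estimators=100, max_depth=4)
student.fit(X, y)

# Prints the encoded class id for a new input.
print(student.predict(vectorizer.transform(["package never arrived"])))
```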

Describe the solution you'd like
When the output type denotes a classification task (i.e. where the goal is to sample one member of a union of literal types, or an enum), we optionally distil the teacher model into a decision forest using the XGBoost library.
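
Purely as a sketch of the detection side (the helper name `is_classification_task` is hypothetical, not existing Tanuki API), the annotation check could look like:

```python
# Hypothetical helper: decide whether a return annotation denotes a
# classification task, i.e. an Enum or a union of Literal values.
import enum
from typing import Literal, Union, get_args, get_origin

def is_classification_task(annotation) -> bool:
    # A plain Enum subclass is a classification target.
    if isinstance(annotation, type) and issubclass(annotation, enum.Enum):
        return True
    # A union counts only if every member is a Literal.
    if get_origin(annotation) is Union:
        return all(get_origin(arg) is Literal for arg in get_args(annotation))
    # A bare Literal[...] with several values also counts.
    return get_origin(annotation) is Literal

assert is_classification_task(Literal["spam", "ham"])
assert is_classification_task(Union[Literal["a"], Literal["b"]])
assert not is_classification_task(str)
```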

Additional context
We could represent student models as optional packages, somewhat like drivers, that the user installs via pip.

E.g. `pip3 install tanuki.py[xgboost]`
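
One way to wire that up, purely as a sketch of the packaging side (the version pins are assumptions):

```python
# setup.py sketch: ship the XGBoost student as an optional extra so that
# `pip3 install tanuki.py[xgboost]` pulls it in only when requested.
from setuptools import find_packages, setup

setup(
    name="tanuki.py",
    packages=find_packages(),
    extras_require={
        # Illustrative version pins for the optional student backend.
        "xgboost": ["xgboost>=1.7", "scikit-learn>=1.2"],
    },
)
```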
