Avoid model introspection by requiring users to provide a function that defines a model instance #199

@CameronBieganek

I think I mentioned this idea a few years ago on a different thread, but I never gave the proposal its own issue.

This is a breaking change, but I think it would provide a cleaner interface for hyperparameter tuning, and it would also address #174.

The idea is that instead of specifying the hyperparameters to be tuned with a quoted expression like :(estimator.leafsize), the user simply provides a function that creates a new model. Here is one way it could work:

function make_model(; _K, _leafsize)
    Pipeline(
        encoder = ContinuousEncoder(),
        estimator = KNNRegressor(K=_K, leafsize=_leafsize)
    )
end

domain = Domain(
    _K = (1, 3, 5, 7, 11),
    _leafsize = (5, 10, 15)
)

tunable_model = TunedModel(make_model, domain; strategy=Grid())
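
To make the mechanics concrete, here is a minimal sketch of how a grid strategy could consume such a function without inspecting the model. It is only an illustration under stated assumptions: the domain is a plain NamedTuple rather than the proposed Domain type, and ToyKNN, make_toy and grid_models are hypothetical names that are not part of MLJ.

```julia
# Hypothetical stand-in for a real model type (not MLJ's KNNRegressor).
struct ToyKNN
    K::Int
    leafsize::Int
end

# User-supplied constructor: keyword names must match the domain's names.
make_toy(; _K, _leafsize) = ToyKNN(_K, _leafsize)

# Assumption: the domain is represented as a plain NamedTuple of candidate values.
domain = (_K = (1, 3, 5, 7, 11), _leafsize = (5, 10, 15))

# Enumerate the Cartesian product of the candidate values and build one model
# per grid point by splatting the point into the constructor's keyword arguments.
function grid_models(make, domain::NamedTuple)
    ks = keys(domain)
    points = Iterators.product(values(domain)...)
    return vec([make(; NamedTuple{ks}(vals)...) for vals in points])
end

models = grid_models(make_toy, domain)
```

The strategy only ever calls the function with named values, so it never needs to know where K or leafsize live inside the returned model.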

I've prefixed the hyperparameter names with an underscore to emphasize that the only requirement is that the keyword argument names in make_model match the keyword argument names in Domain. The names themselves are arbitrary.

The downside of requiring the user-defined function to take keyword arguments is that you can't really use do-notation with TunedModel because, as far as I can tell, there is no way to define an anonymous function with keyword arguments using do-notation. An alternative interface would require the user-defined function to take a single positional argument with property destructuring (available since Julia 1.7), like this:

function make_model((; _K, _leafsize))
    Pipeline(
        encoder = ContinuousEncoder(),
        estimator = KNNRegressor(K=_K, leafsize=_leafsize)
    )
end

domain = Domain(
    _K = (1, 3, 5, 7, 11),
    _leafsize = (5, 10, 15)
)

tunable_model = TunedModel(make_model, domain; strategy=Grid())

This can then be expressed with do-notation as follows:

tunable_model = (
    TunedModel(domain; strategy=Grid()) do (; _K, _leafsize)
        Pipeline(
            encoder = ContinuousEncoder(),
            estimator = KNNRegressor(K=_K, leafsize=_leafsize)
        )
    end
)
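
For do-notation to work, the receiving function has to accept the user's function as its first positional argument. The sketch below shows that shape with a hypothetical helper named candidates and a plain NamedTuple domain. Neither is part of MLJ, and the returned NamedTuple merely stands in for a real model. It assumes Julia 1.7 or later for property destructuring.

```julia
# Hypothetical helper: `make` comes first so that callers can use a do-block.
# Each grid point is passed to `make` as a single NamedTuple argument.
function candidates(make, domain::NamedTuple)
    ks = keys(domain)
    points = Iterators.product(values(domain)...)
    return vec([make(NamedTuple{ks}(vals)) for vals in points])
end

# The do-block destructures the NamedTuple by property name.
specs = candidates((_K = (1, 3), _leafsize = (5, 10))) do (; _K, _leafsize)
    (K = _K, leafsize = _leafsize)  # stand-in for constructing a real model
end
```

Because destructuring is by name, the order of the fields in the do-block argument does not have to match the order in the domain.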

I believe this interface is generic enough to work for any hyperparameter tuning strategy.
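
As a rough illustration of that claim, a random-search strategy can use exactly the same user-supplied function. This is only a sketch: sample_models and make_spec are hypothetical names, the domain is again assumed to be a plain NamedTuple, and the returned NamedTuple stands in for a real model.

```julia
using Random

# Hypothetical random-search helper: draw one value per hyperparameter and
# pass the resulting NamedTuple to the user-supplied function.
function sample_models(make, domain::NamedTuple, n; rng = Random.default_rng())
    return [make(map(vals -> rand(rng, vals), domain)) for _ in 1:n]
end

# Same single-argument, property-destructuring signature as in the grid case.
make_spec((; _K, _leafsize)) = (K = _K, leafsize = _leafsize)

domain = (_K = (1, 3, 5, 7, 11), _leafsize = (5, 10, 15))
draws = sample_models(make_spec, domain, 4; rng = MersenneTwister(0))
```

Only the way points are chosen differs between strategies. The contract with the user's function, a set of named values in and a model out, stays the same.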
