Add context to IR.do_evaluate #20322
This adds a keyword-only `context` argument to the cudf_polars `IR.do_evaluate` method. The purpose is to provide access to special pieces of data that might be necessary for controlling an IR node's execution but don't belong on the IR node itself as non-child arguments. Specifically, we'd like to provide a CUDA `stream` argument, but we generalize that slightly and provide a system for passing arbitrary data.
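A minimal sketch of the keyword-only idea, for illustration only. `SketchContext` and `SketchNode` are hypothetical stand-ins, not the actual cudf_polars classes, and the `stream` field holds a placeholder string rather than a real CUDA stream:

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class SketchContext:
    """Hypothetical context object; not the real cudf_polars type."""

    stream: Any = None  # placeholder for a CUDA stream handle


class SketchNode:
    """Hypothetical stand-in for an IR node."""

    @classmethod
    def do_evaluate(
        cls, scale: int, child: list[int], *, context: SketchContext
    ) -> list[int]:
        # `context` comes after the bare `*`, so it can only be passed by
        # keyword and never collides with non-child or child arguments.
        assert context.stream is not None
        return [scale * x for x in child]


ctx = SketchContext(stream="stream-0")
result = SketchNode.do_evaluate(2, [1, 2, 3], context=ctx)

# Passing the context positionally is rejected by Python itself.
try:
    SketchNode.do_evaluate(2, [1, 2, 3], ctx)
    positional_rejected = False
except TypeError:
    positional_rejected = True
```

Keeping the context in a structured object rather than a plain dictionary means both the caller and the callee can see which fields are expected.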
5a77a80 has a POC for how this can be used. We add a [...]

Alternatively, rather than giving a [...]

Finally, we could drop the dataclass and just make it a dictionary. But I'd prefer to keep things structured where possible, so that both the functions and the callers of the function know what belongs in the context. We can attach an [...]
/merge
Description
This adds a keyword-only `context` argument to the cudf_polars `IR.do_evaluate` method. The purpose is to provide access to special pieces of data that might be necessary for controlling an IR node's execution but don't belong on the IR node itself as non-child arguments. Specifically, we'd like to provide a CUDA `stream` argument as part of #20228, but we generalize that slightly and provide a system for passing arbitrary data.

A few notes on the implementation:

- [...] `_callback` and passed into `ir.evaluate` / `evaluate_streaming` and from there to all the methods that require it.
- [...] `context` keyword only in `IR.do_evaluate(..., context)`. However, Dask's task graph doesn't really deal with that. It wants a tuple of `(function, arg1, arg2, ...)`. So that requires using `functools.partial(function, context=context)(arg1, arg2, ...)`.
- `Expr.evaluate` also takes a context, and it's a different type, `ExecutionContext` :( I can rename the IR variant if we want.

Just a draft for now, and probably not worth reviewing until I have a branch somewhere that combines CUDA streams with this to verify it meets our needs.
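The `functools.partial` workaround for tuple-style tasks described in the notes above could look roughly like the following. This is a sketch under stated assumptions: `do_evaluate`, `SketchContext`, and `run_task` are hypothetical names, and `run_task` is a toy stand-in for how a scheduler calls a `(function, arg1, arg2, ...)` tuple, not Dask's actual executor.

```python
import functools
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class SketchContext:
    """Hypothetical context object; not the real cudf_polars type."""

    stream: Any = None  # placeholder for a CUDA stream handle


def do_evaluate(left, right, *, context):
    """Hypothetical evaluation function with a keyword-only context."""
    return {"rows": left + right, "stream": context.stream}


def run_task(task):
    """Toy executor: call the first element with the rest as positional args."""
    func, *args = task
    return func(*args)


context = SketchContext(stream="stream-0")

# A tuple-style task has no slot for keyword arguments, so the context is
# bound ahead of time and the partial object becomes the task's callable.
task = (functools.partial(do_evaluate, context=context), [1, 2], [3])

output = run_task(task)
```

Binding the context with `functools.partial` leaves the positional arguments in the tuple untouched, so the task still has the shape a tuple-based graph expects.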