[FEA] Accelerate `compute_data_page_mask` using a custom kernel

**Is your feature request related to a problem? Please describe.**

Related to #19707 

The multithreaded loop in `compute_data_page_mask` can be accelerated with a (simple) custom CUDA kernel. Doing so would also let us avoid copying the `row_mask` column data to the host.

**Describe the solution you'd like**
A GPU-accelerated algorithm for `compute_data_page_mask`.
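
A rough sketch of what such a kernel could look like, not the actual libcudf code. It assumes that a data page is required when any row in its range is set in `row_mask`, and that each page's row range is available as an offsets array. The names `page_is_required`, `data_page_mask_kernel`, `data_page_mask_host`, and `page_row_offsets` are hypothetical. The kernel is only compiled under `nvcc`; the host function applies the same per-page logic so the result can be checked without a GPU.

```cpp
#include <cstddef>
#include <vector>

// The qualifiers vanish when this is compiled as plain C++ (no nvcc).
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Hypothetical per-page predicate (assumption, not the libcudf API):
// page `p` covers rows [page_row_offsets[p], page_row_offsets[p + 1]) and is
// required if any of those rows is set in `row_mask`.
HOST_DEVICE inline bool page_is_required(bool const* row_mask,
                                         std::size_t const* page_row_offsets,
                                         std::size_t p)
{
  for (std::size_t r = page_row_offsets[p]; r < page_row_offsets[p + 1]; ++r) {
    if (row_mask[r]) { return true; }
  }
  return false;
}

#ifdef __CUDACC__
// One thread per page. `row_mask` is read directly from device memory, so no
// device-to-host copy of the column data is needed.
__global__ void data_page_mask_kernel(bool const* row_mask,
                                      std::size_t const* page_row_offsets,
                                      std::size_t num_pages,
                                      bool* page_mask)
{
  std::size_t const p = blockIdx.x * static_cast<std::size_t>(blockDim.x) + threadIdx.x;
  if (p < num_pages) { page_mask[p] = page_is_required(row_mask, page_row_offsets, p); }
}
#endif

// Host reference using the same predicate, handy for validating the kernel.
inline std::vector<bool> data_page_mask_host(bool const* row_mask,
                                             std::vector<std::size_t> const& page_row_offsets)
{
  std::size_t const num_pages = page_row_offsets.empty() ? 0 : page_row_offsets.size() - 1;
  std::vector<bool> page_mask(num_pages, false);
  for (std::size_t p = 0; p < num_pages; ++p) {
    page_mask[p] = page_is_required(row_mask, page_row_offsets.data(), p);
  }
  return page_mask;
}
```

With one thread per page, the work per thread varies with page size. If that turns out to matter, a prefix sum over `row_mask` would reduce each page check to a single subtraction.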

**Describe alternatives you've considered**
The multithreaded CPU solution implemented in #19707.

**Additional context**
Originally posted by @mhaseeb123 in https://github.com/rapidsai/cudf/pull/19602/files#r2286725034

[FEA] Accelerate `compute_data_page_mask` using a custom kernel #19748
