Thank you for your great work! In the paper, I see that sparsity can reduce memory movement in the shift operation, but in the code, the shift operation, i.e., `ssl_cuda_kernel`, always copies or moves all channels. The sparsity therefore does not reduce the memory cost of the shift operation. So I wonder whether the shift implementation in inference mode should differ from the one used in training. If so, would you mind sharing the `ssl_cuda_kernel` implementation for inference mode? Thanks a lot!
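For context, here is a minimal NumPy sketch of what I have in mind by a sparsity-aware inference shift. This is purely my own illustration, not your `ssl_cuda_kernel`: the function name, the `(dy, dx)` per-channel shift format, and the zero-padding behavior are all assumptions. The idea is that channels whose (rounded) shift is `(0, 0)` incur no data movement at all, while only the remaining channels are copied:

```python
import numpy as np

def sparse_shift_inference(x, shifts):
    """Hypothetical inference-mode sparse shift (illustration only).

    x: array of shape (C, H, W).
    shifts: list of per-channel integer shifts (dy, dx), assumed already
    rounded for inference. Channels with shift (0, 0) are returned as
    views into x, so they cost no memory movement; only channels with a
    nonzero shift are actually copied (with zero padding at the border).
    Returns (list of per-channel outputs, number of channels moved).
    """
    H, W = x.shape[1], x.shape[2]
    out = []
    moved = 0
    for c, (dy, dx) in enumerate(shifts):
        if dy == 0 and dx == 0:
            out.append(x[c])  # zero shift: a view, no copy
            continue
        moved += 1
        shifted = np.zeros_like(x[c])
        # Copy only the overlapping region; the rest stays zero-padded.
        src = x[c][max(0, -dy):H - max(0, dy), max(0, -dx):W - max(0, dx)]
        shifted[max(0, dy):H - max(0, -dy), max(0, dx):W - max(0, -dx)] = src
        out.append(shifted)
    return out, moved
```

In a fused CUDA kernel the same idea would presumably mean skipping (or aliasing) the zero-shift channels instead of unconditionally copying every channel, which is where I would expect the memory-traffic savings from sparsity to come from.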