Describe the feature request
Having PowerInfer kernels compatible with a sparse weight cache would open up all of the models in sparse transformers to lazy weight loading, as well as faster inference kernels for skipMLP. A rough illustration of the idea is sketched below.
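For illustration only, here is a minimal Python sketch of what a lazily populated sparse weight cache could look like; the names (`SparseWeightCache`, `rows`, `skip_mlp_matvec`) are hypothetical and are not PowerInfer's or any existing library's API. The point is just that only the weight rows of neurons predicted to be active (the skipMLP case) would be pulled from disk and kept in memory.

```python
import numpy as np

class SparseWeightCache:
    """Hypothetical lazily-populated cache over a large weight matrix.

    Rows stay on disk (memory-mapped) until a forward pass needs them,
    which is what a sparse-activation kernel such as skipMLP could
    exploit: only rows of neurons predicted active are loaded into RAM.
    """

    def __init__(self, weight_file, shape, dtype=np.float16):
        # Memory-map the full weight matrix; nothing is read yet.
        self.disk = np.memmap(weight_file, dtype=dtype, mode="r", shape=shape)
        self.cache = {}  # row index -> in-memory copy of that row

    def rows(self, indices):
        """Return the requested rows, loading any missing ones from disk."""
        for i in indices:
            if i not in self.cache:
                self.cache[i] = np.array(self.disk[i])  # copy row into RAM
        return np.stack([self.cache[i] for i in indices])


def skip_mlp_matvec(cache, active_neurons, x):
    """Compute only the rows of W @ x for neurons predicted to be active."""
    w_active = cache.rows(active_neurons)  # (k, d) subset of the weight matrix
    return w_active @ x                    # (k,) partial output
```

A real integration would of course live in the inference kernels themselves rather than in Python, but the lazy-load-on-demand cache is the behaviour this request is asking for.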
Additional context
SJTU-IPADS/PowerInfer#93