Describe the feature request
Add support for llama.cpp inference and benchmarking.
Describe the solution you'd like
- Update `modelling_llama_skip.py` to support exporting the model to GGUF
- Add an inference path that dispatches to llama.cpp using the sparse transformers GGUF
- Update `run_benchmark.py` to support llama.cpp as a backend
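
The dispatch step could be sketched roughly as below. This is only an illustrative sketch, not the actual implementation: the function and config names (`run_inference`, `InferenceConfig`) are hypothetical, and the llama.cpp path assumes the `llama-cpp-python` bindings are available.

```python
# Hypothetical sketch of backend dispatch between the existing PyTorch
# path and a llama.cpp path. Names here are illustrative, not from the repo.
from dataclasses import dataclass


@dataclass
class InferenceConfig:
    backend: str = "torch"         # "torch" or "llama_cpp"
    gguf_path: str = "model.gguf"  # path to the sparse transformers GGUF export
    max_tokens: int = 64


def run_inference(prompt: str, cfg: InferenceConfig) -> str:
    if cfg.backend == "llama_cpp":
        # Lazy import so the PyTorch path does not require llama.cpp
        # bindings to be installed.
        from llama_cpp import Llama
        llm = Llama(model_path=cfg.gguf_path)
        out = llm(prompt, max_tokens=cfg.max_tokens)
        return out["choices"][0]["text"]
    elif cfg.backend == "torch":
        # Placeholder for the existing modelling_llama_skip.py inference path.
        return f"[torch backend] {prompt}"
    raise ValueError(f"unknown backend: {cfg.backend}")
```

`run_benchmark.py` could then accept a `--backend llama_cpp` style flag (name assumed) and route through the same dispatch, so both backends are measured on identical prompts.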