This repository was archived by the owner on Aug 30, 2024. It is now read-only.

I wish for a simpler way to run the model #230

@kolinfluence


I'm not well versed in Python. Where do I put the downloaded llama-2-7b-chat.Q4_0.gguf file?

I can get llama.cpp working easily on my laptop, but I can't seem to get this to work.

I cloned the Neural Speed repository with git clone, ran the pip install ..., saved the script below as run_model.py, and ran it with:

python run_model.py

from transformers import AutoTokenizer, TextStreamer
from intel_extension_for_transformers.transformers import AutoModelForCausalLM

# Specify the GGUF repo on Hugging Face
model_name = "TheBloke/Llama-2-7B-Chat-GGUF"
# Download the specific GGUF model file from the above repo
model_file = "llama-2-7b-chat.Q4_0.gguf"
# Make sure you have been granted access to this model on Hugging Face.
tokenizer_name = "meta-llama/Llama-2-7b-chat-hf"

prompt = "Once upon a time"
tokenizer = AutoTokenizer.from_pretrained(tokenizer_name, trust_remote_code=True)
inputs = tokenizer(prompt, return_tensors="pt").input_ids
streamer = TextStreamer(tokenizer)
model = AutoModelForCausalLM.from_pretrained(model_name, model_file=model_file)
outputs = model.generate(inputs, streamer=streamer, max_new_tokens=300)
(base) root@ubuntu:/usr/local/src/neural-speed# python run_model.py 
Traceback (most recent call last):
  File "/usr/local/src/neural-speed/run_model.py", line 2, in <module>
    from intel_extension_for_transformers.transformers import AutoModelForCausalLM, WeightOnlyQuantConfig
ImportError: cannot import name 'WeightOnlyQuantConfig' from 'intel_extension_for_transformers.transformers' (/root/miniconda3/lib/python3.11/site-packages/intel_extension_for_transformers/transformers/__init__.py)
(base) root@ubuntu:/usr/local/src/neural-speed# 
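
The traceback quotes a line 2 that imports `WeightOnlyQuantConfig`, while the script as pasted imports only `AutoModelForCausalLM`, so the file on disk may differ from the listing. One way to narrow this down is to check which names the installed package actually exposes before importing them. The following is a minimal sketch using only the standard library; the helper name `exported_names` is hypothetical and is not part of Neural Speed or intel_extension_for_transformers, and it only reports what is importable in the current environment, not why a name is absent.

```python
import importlib


def exported_names(module_name, wanted):
    """Map each wanted name to True/False, depending on whether the module
    can be imported and exposes that name as an attribute.

    Hypothetical diagnostic helper; not part of any library.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module itself is not importable, so none of its names are.
        return {name: False for name in wanted}
    return {name: hasattr(module, name) for name in wanted}


if __name__ == "__main__":
    # Module path and class names are taken from the script and traceback above.
    report = exported_names(
        "intel_extension_for_transformers.transformers",
        ["AutoModelForCausalLM", "WeightOnlyQuantConfig"],
    )
    for name, present in report.items():
        print(f"{name}: {'available' if present else 'missing'}")
```

If `WeightOnlyQuantConfig` shows up as missing, removing it from the import line in run_model.py so that the line matches the listing above should at least get past this particular ImportError. Whether the rest of the script then runs is not something this check can tell.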
