This repository was archived by the owner on Aug 30, 2024. It is now read-only.

Loading checkpoint shards takes too long #251

@irjawais

Description

When I load the "meta-llama/Meta-Llama-3-8B-Instruct" model like this:

```python
from transformers import AutoTokenizer, TextStreamer
from intel_extension_for_transformers.transformers import AutoModelForCausalLM

model_name = "meta-llama/Meta-Llama-3-8B-Instruct"  # Hugging Face model_id or local model
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
streamer = TextStreamer(tokenizer)
model = AutoModelForCausalLM.from_pretrained(model_name, load_in_4bit=True)
```

it hangs during "Loading checkpoint shards", and the only way to recover is to restart the instance.

Is there any issue with my setup?

My instance spec: Ubuntu, 32 GB RAM.
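
For reference, a variant I could try that should reduce peak RAM while the shards are read. This is only a sketch: `low_cpu_mem_usage` and `torch_dtype` are standard `transformers` `from_pretrained` arguments, and I am assuming the intel_extension_for_transformers wrapper forwards them; that is not confirmed anywhere in this issue.

```python
import torch
from transformers import AutoTokenizer, TextStreamer
from intel_extension_for_transformers.transformers import AutoModelForCausalLM

model_name = "meta-llama/Meta-Llama-3-8B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
streamer = TextStreamer(tokenizer)

# low_cpu_mem_usage avoids materializing a second full copy of the fp16
# weights in RAM while shards are loaded, which matters on a 32 GB host.
# Assumption: the 4-bit quantization path of this wrapper honors these
# standard transformers kwargs.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    load_in_4bit=True,
    low_cpu_mem_usage=True,
    torch_dtype=torch.float16,
)
```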
