Conversation

wade3han

Tested with my own fine-tuned 7B Alpaca model:

python inference.py \
    --model_name_or_path {model_path}
Instruction: Tell me about alpacas.
|    id | token    | log prob | prob
|  2499 | Al       | -15.960 | 0.00%
| 29886 | p        | -33.403 | 0.00%
|   562 | ac       | -32.065 | 0.00%
|   294 | as       | -24.586 | 0.00%
|   526 | are      | -20.448 | 0.00%
|   263 | a        | -17.845 | 0.00%
|  6606 | species  | -16.602 | 0.00%
|   310 | of       | -15.564 | 0.00%
|  4275 | South    | -11.832 | 0.00%
|  3082 | American | -22.230 | 0.00%
|  3949 | cam      | -12.354 | 0.00%
|   295 | el       | -34.635 | 0.00%
|   333 | id       | -19.849 | 0.00%
| 29892 | ,        | -20.313 | 0.00%
...
| 29889 | .        | -25.931 | 0.00%
|     2 | </s>     | -21.040 | 0.00%
Response: Alpacas are a species of South American camelid, related to the llama. They are smaller than llamas and typically have finer fiber. Alpacas are primarily bred for their fiber, which can be spun into soft and luxurious yarns. They are also used for their meat, which is similar to that of a chicken. Alpacas are social animals and live in herds with a dominant male leader.</s>

...

Largely influenced by https://github.com/kriskrisliu/stanford_alpaca/tree/krisliu
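
For reference, here is a minimal sketch of how per-token log-probabilities like the table above can be produced with the public Hugging Face transformers API (compute_transition_scores). The actual inference.py may do this differently; model_path is a placeholder, and the prompt here skips the Alpaca instruction template.

import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "path/to/alpaca-7b"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16).cuda()

inputs = tokenizer("Tell me about alpacas.", return_tensors="pt").to(model.device)
out = model.generate(
    **inputs,
    max_new_tokens=256,
    num_beams=4,
    return_dict_in_generate=True,
    output_scores=True,
)

# Per-token log-probabilities, aligned with the beam each token came from.
logprobs = model.compute_transition_scores(
    out.sequences, out.scores, out.beam_indices, normalize_logits=True
)
generated = out.sequences[0, inputs.input_ids.shape[1]:]
for tok, logp in zip(generated.tolist(), logprobs[0].tolist()):
    print(f"| {tok:5d} | {tokenizer.decode([tok]):8s} | {logp:.3f} | {math.exp(logp):.2%}")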

@MrRace

MrRace commented Apr 10, 2023

    indices = sequences[:, cut_idx:] + beam_sequence_indices
RuntimeError: The size of tensor a (114) must match the size of tensor b (259) at non-singleton dimension 1

Have you met an error like this? @wade3han

@wade3han
Author

No, I didn't encounter that error. Can you give me more context?

@MrRace

MrRace commented Apr 10, 2023

> No, I didn't encounter that error. Can you give me more context?

Just use:

instructions = [
    "模仿鲁迅的风格, 吐槽一下最近食堂饭菜涨价",  # i.e. "In Lu Xun's style, gripe about the recent rise in cafeteria food prices"
]
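
For what it's worth, the traceback says the two operands of the + disagree along dimension 1: sequences[:, cut_idx:] covers 114 positions while beam_sequence_indices covers 259, e.g. because generation stopped early on EOS or the prompt length differs from what cut_idx assumes. A toy reproduction and a defensive workaround follow (names taken from the traceback; this is a guess, not the author's fix):

import torch

# Hypothetical shapes: the sliced sequences end up shorter than the
# beam-index tensor that was built for them.
cut_idx = 6
sequences = torch.zeros(1, cut_idx + 114, dtype=torch.long)
beam_sequence_indices = torch.zeros(1, 259, dtype=torch.long)

try:
    indices = sequences[:, cut_idx:] + beam_sequence_indices  # (1, 114) + (1, 259)
except RuntimeError as e:
    print(e)  # The size of tensor a (114) must match the size of tensor b (259) ...

# Workaround sketch (an assumption, not the author's fix): add over the overlap only.
n = min(sequences.shape[1] - cut_idx, beam_sequence_indices.shape[1])
indices = sequences[:, cut_idx:cut_idx + n] + beam_sequence_indices[:, :n]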

@diichen

diichen commented Apr 13, 2023

Same problem here.

@magnificent1208

RuntimeError: CUDA error: CUBLAS_STATUS_INVALID_VALUE when calling cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)

The same error occurs when running inference with both llama-7b-hf and my fine-tuned model.
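
A debugging note rather than a confirmed fix: CUBLAS_STATUS_INVALID_VALUE from cublasSgemm is usually a late symptom, e.g. tensors reaching a matmul on mismatched devices or dtypes, and since CUDA kernels launch asynchronously the error is often reported far from the real failure. Forcing synchronous launches narrows it down:

import os
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"  # must be set before torch initializes CUDA

import torch
print(torch.__version__, torch.version.cuda)  # confirm the build matches your driver

# Common culprits to rule out: model and inputs on different devices or dtypes.
# model = model.cuda()
# input_ids = input_ids.to(model.device)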

@diichen

diichen commented Apr 14, 2023

Cool! The problem has been fixed.

@BaoBaoGitHub

BaoBaoGitHub commented Jul 20, 2023

Thanks for the code!

However, I had some problems running the code on my server with three 3090 GPUs (24 GB of VRAM each).
I fixed the out-of-memory error by commenting out the line model.cuda().
Then I fixed the error "Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0!" by commenting out the line num_beams=4,.

I know model.cuda() moves the whole model onto the first GPU.
But what happens when I comment out the line num_beams=4,? Why does that fix the error?
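
A likely explanation (a guess, without seeing exactly how the script places the model): num_beams=4 makes generate track four candidate sequences at once, which roughly quadruples the KV-cache memory used during generation, and each decoding step reorders tensors across beams. If the model ends up split across the three GPUs once model.cuda() is gone, that reordering can touch tensors on different devices in some transformers versions, hence the cuda:0/cuda:1 mismatch; plain greedy decoding avoids both the extra memory and the cross-beam gather. A minimal sketch of sharding the model explicitly (model_path is a placeholder; device_map="auto" requires accelerate):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "path/to/alpaca-7b"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Shard the fp16 weights across the three 24 GB cards instead of calling
# model.cuda(), which tries to put everything on cuda:0.
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,
    device_map="auto",
)

inputs = tokenizer("Tell me about alpacas.", return_tensors="pt").to("cuda:0")
# Greedy decoding; num_beams=4 would track 4 candidate sequences and use
# roughly 4x the KV-cache memory during generation.
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0], skip_special_tokens=True))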

