Add fusion-for-decoder-only for llama #733

binxuan · 2023-07-28T00:27:32Z

This PR adds a fusion-in-decoder implementation for decoder-only models in FasterTransformer. It encodes N contexts in parallel and then generates output conditioned on those N contexts. The implementation builds on the great work from this repo. We currently reuse the batch_size dimension for the encoding step. A follow-up technical report/paper will be released soon.
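To make the idea of reusing the batch dimension concrete, here is a minimal pure-Python sketch. It is not code from this PR: `PAD_ID`, `pad_batch`, `toy_encode`, and `fuse_caches` are hypothetical names, and `toy_encode` is only a stand-in for a real decoder forward pass. The description does not say how the per-context results are fused, so the concatenation step below is an assumption about one plausible approach.

```python
from typing import List, Tuple

PAD_ID = 0  # hypothetical padding token id, chosen for illustration only


def pad_batch(contexts: List[List[int]]) -> Tuple[List[List[int]], List[int]]:
    """Right-pad N token lists to a common length so they form one [N, max_len] batch."""
    max_len = max(len(c) for c in contexts)
    batch = [c + [PAD_ID] * (max_len - len(c)) for c in contexts]
    lengths = [len(c) for c in contexts]
    return batch, lengths


def toy_encode(row: List[int]) -> List[Tuple[int, int]]:
    """Stand-in for encoding one batch row: returns one cache entry per token.

    A real model would produce key/value tensors; here an entry is (position, token_id).
    """
    return [(pos, tok) for pos, tok in enumerate(row)]


def fuse_caches(caches: List[List[Tuple[int, int]]],
                lengths: List[int]) -> List[Tuple[int, int, int]]:
    """Assumed fusion step: concatenate per-context caches along the sequence axis,
    dropping padded positions, so generation can attend to all N contexts."""
    fused = []
    for ctx_idx, (cache, n) in enumerate(zip(caches, lengths)):
        fused.extend((ctx_idx, pos, tok) for pos, tok in cache[:n])
    return fused


contexts = [[11, 12, 13], [21, 22], [31, 32, 33, 34]]  # N = 3 contexts
batch, lengths = pad_batch(contexts)
# Rows are independent of each other, which is why they can share the batch dimension.
caches = [toy_encode(row) for row in batch]
fused = fuse_caches(caches, lengths)
```

In this sketch each context occupies one batch row during encoding, and only the generation step sees the combined cache.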


void-main and others added 24 commits April 23, 2023 12:12

get llama coded

f3bd8e6

make the code work :yay:

a32fc1d

fix llama rms ln

ce8700f

add bf16 support

91989cb

add triton model for streaming callback

4bc97c3

register RMS for bf16

a6d51ec

revert bf16

7a72ca3

revert bf16

9820565

bugfix

bfeebef

add megatron llama convert

0379cc5

Update src/fastertransformer/triton_backend/llama/LlamaTritonModelIns…

d65adf1

…tance.h Co-authored-by: Bram Wasti <bwasti@fb.com>

donot callback too frequnetly

cf1b9b1

add bf16

95afed4

make sure examples work for bf16

9aee02e

support bf16 conversion with bfloat 16 numpy ext

8ddac81

Merge branch 'main' of https://github.com/void-main/FasterTransformer …

694faec

…into main

bugfix

40fbe48

load layernorm_eps from config; change cb default to 5

f6cf9da

Merge branch 'main' of https://github.com/void-main/FasterTransformer …

d752088

…into main

update megatron convert script

da2ad14

fix callback issue

abd1e4d

Merge branch 'NVIDIA:main' into main

b942806

fix name

50fdb0c

add fusion-for-decoder-only for llama

dce9b65
