Releases · lucidrains/native-sparse-attention-pytorch
0.0.44
when doing interpolation of importance score, remask to 0 for illegal…
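A minimal sketch of what "remask to 0 for illegal" could mean here, assuming the interpolated importance scores are laid out per fine selection block and that causality defines which blocks are illegal; the function name, shapes, and masking rule are assumptions, not the repo's code:

```python
import torch

def remask_illegal(scores, fine_block_size):
    # a sketch: after interpolating compressed-block importance scores up to
    # the fine selection-block resolution, zero out scores for blocks a query
    # must not see, so interpolation artifacts cannot win top-k selection
    # scores: (batch, heads, q_len, num_fine_blocks) - an assumed layout
    q_len, num_blocks = scores.shape[-2:]
    q_idx = torch.arange(q_len, device = scores.device)
    block_start = torch.arange(num_blocks, device = scores.device) * fine_block_size
    # a block is illegal for a query if it starts beyond the query position
    illegal = block_start[None, :] > q_idx[:, None]   # (q_len, num_fine_blocks)
    return scores.masked_fill(illegal, 0.)
```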
0.0.43
default to one mem kv for compressed attn
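A sketch of how a learned memory key/value pair (defaulting to one, per this release) might be prepended to the compressed keys/values so that early queries, which have no compressed blocks behind them yet, still have something to attend to; the class and method names are assumptions:

```python
import torch
from torch import nn
from einops import repeat

class CompressedMemKV(nn.Module):
    # a sketch, not the repo's module: a single learned memory key/value
    # prepended to the compressed keys/values of the compressed branch
    def __init__(self, heads, dim_head, num_mem_kv = 1):  # default of one
        super().__init__()
        self.mem_kv = nn.Parameter(torch.zeros(2, heads, num_mem_kv, dim_head))

    def prepend(self, ck, cv):
        # ck, cv: (batch, heads, num_compressed_blocks, dim_head)
        mk, mv = repeat(self.mem_kv, 'kv h n d -> kv b h n d', b = ck.shape[0])
        return torch.cat((mk, ck), dim = -2), torch.cat((mv, cv), dim = -2)
```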
0.0.42
Full Changelog: 0.0.41...0.0.42
0.0.41
ready to be compared with full attention.
0.0.40
oops
0.0.39
do the differential topk gating in a more suboptimal way, but accommo…
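The exact gating scheme is truncated above, but a common way to make top-k block selection differentiable is a straight-through estimator: hard 0/1 gates in the forward pass, gradients through the soft scores in the backward pass. The sketch below shows that general technique under the assumption it is comparable to what this release does:

```python
import torch

def topk_gates(importance, k):
    # a straight-through sketch (an assumption about the technique, not the
    # repo's exact code): forward pass applies hard {0, 1} gates over the
    # top-k blocks, while gradients flow through the soft importance scores
    soft = importance.softmax(dim = -1)
    topk_idx = soft.topk(k, dim = -1).indices
    hard = torch.zeros_like(soft).scatter(-1, topk_idx, 1.)
    # forward value equals `hard`; backward sees the gradient of `soft`
    return hard + soft - soft.detach()
```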
0.0.38
Full Changelog: 0.0.36...0.0.38
0.0.37
account for learned memory key values in flex compress mask, also cle…
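A sketch of how learned memory key/values could be accounted for in a flex attention mask, using PyTorch's `torch.nn.attention.flex_attention` API (PyTorch 2.5+); the hyperparameters and the exact mask rule are illustrative assumptions:

```python
from torch.nn.attention.flex_attention import create_block_mask

# illustrative hyperparameters, not taken from the repo
num_mem_kv = 1
compress_block_size = 4

def compress_mask(b, h, q_idx, kv_idx):
    # the first num_mem_kv key positions are the learned memory key/values
    # and are always attendable; remaining positions are compressed blocks,
    # so their block index shifts by num_mem_kv, and a block becomes visible
    # once it lies fully in the query's past
    is_mem = kv_idx < num_mem_kv
    block_idx = kv_idx - num_mem_kv
    return is_mem | (((block_idx + 1) * compress_block_size - 1) <= q_idx)

seq_len = 1024
kv_len = seq_len // compress_block_size + num_mem_kv  # offset by the mem kv
block_mask = create_block_mask(
    compress_mask, B = None, H = None,
    Q_LEN = seq_len, KV_LEN = kv_len, device = 'cpu'
)
```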
0.0.36
refactor compressed pathway with gqa
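A sketch of a compressed attention pathway under grouped-query attention (GQA), where each group of query heads shares one compressed key/value head; causal masking is omitted for brevity and all names are assumptions:

```python
import torch
from einops import rearrange

def compressed_attend_gqa(q, ck, cv, kv_heads):
    # q: (batch, query_heads, seq, dim_head)
    # ck, cv: (batch, kv_heads, num_compressed_blocks, dim_head)
    b, qh, n, d = q.shape
    groups = qh // kv_heads
    # fold query heads into (groups, kv_heads) so each group of query heads
    # attends against the same compressed kv head
    q = rearrange(q, 'b (g h) n d -> b g h n d', g = groups)
    sim = torch.einsum('b g h i d, b h j d -> b g h i j', q, ck) * (d ** -0.5)
    attn = sim.softmax(dim = -1)
    out = torch.einsum('b g h i j, b h j d -> b g h i d', attn, cv)
    return rearrange(out, 'b g h n d -> b (g h) n d')
```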
0.0.35
deviate from the paper and allow for interpolation of the compressed …
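The NSA paper effectively ties the compression and selection block sizes when deriving fine-block importance from compressed attention. A sketch of the interpolation deviation described above, assuming the compressed-block scores are resampled to the fine selection-block resolution (linear interpolation and the shapes are assumptions):

```python
import torch.nn.functional as F

def interpolate_importance(csim, num_fine_blocks):
    # csim: (batch, heads, q_len, num_compressed_blocks) importance scores
    # flatten to (N, 1, L) so F.interpolate can resample the block dimension
    b, h, q_len, num_compressed = csim.shape
    scores = csim.reshape(b * h * q_len, 1, num_compressed)
    scores = F.interpolate(scores, size = num_fine_blocks, mode = 'linear', align_corners = False)
    return scores.reshape(b, h, q_len, num_fine_blocks)
```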