Commit 197112b

Commit message: fix

1 parent: 0e1f6fe

File tree: 2 files changed, +6 -3 lines

native_sparse_attention_pytorch/transformer.py

Lines changed: 5 additions & 2 deletions
@@ -214,12 +214,15 @@ def sample(

         cache = None

-        for _ in tqdm(range(sample_num_times)):
+        for ind in tqdm(range(sample_num_times)):
+            is_first = ind == 0

             logits, next_cache = self.forward(
                 out,
                 cache = cache,
-                return_cache = True
+                return_cache = True,
+                disable_flex = not is_first,
+                disable_triton_kernel = not is_first
             )

             if use_cache_kv:
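
For readability, this is the sampling loop as it stands after the patch, with the diff markers resolved. Every name here (tqdm, sample_num_times, out, use_cache_kv, self.forward and its keyword arguments) comes from the diff context; the comments interpreting the two new flags are an inference from their names, since the commit message says only "fix".

        cache = None

        for ind in tqdm(range(sample_num_times)):
            # only the first iteration processes the full prompt; later
            # iterations decode a single token against the cache
            is_first = ind == 0

            logits, next_cache = self.forward(
                out,
                cache = cache,
                return_cache = True,
                # inference from the flag names: skip the flex attention and
                # Triton kernel code paths on cached single-token steps
                disable_flex = not is_first,
                disable_triton_kernel = not is_first
            )

            if use_cache_kv:
                ...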

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 [project]
 name = "native-sparse-attention-pytorch"
-version = "0.0.61"
+version = "0.0.62"
 description = "Native Sparse Attention"
 authors = [
     { name = "Phil Wang", email = "lucidrains@gmail.com" }
