Commit e0c910b

[Hybrid] [Kernel] Fix chunk scan kernel when BLOCK_SIZE_DSTATE > 128 (#28295)
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
1 parent bf3ffb6 commit e0c910b

1 file changed, +1 -1 lines changed

vllm/model_executor/layers/mamba/ops/ssd_chunk_scan.py

Lines changed: 1 addition & 1 deletion
@@ -245,7 +245,7 @@ def _chunk_scan_fwd_kernel(
             )
             if not HAS_INITSTATES and (seq_idx != seq_idx_prev):
                 prev_states = tl.zeros(
-                    (BLOCK_SIZE_DSTATE, BLOCK_SIZE_K), dtype=C_ptr.dtype.element_ty
+                    (BLOCK_SIZE_K, BLOCK_SIZE_N), dtype=C_ptr.dtype.element_ty
                 )
             else:
                 prev_states = tl.load(
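
Note on the shape change: the hunk does not show how `prev_states` is used afterwards, so the following is only a sketch under a stated assumption, namely that the tile is later multiplied by a `C` tile of shape `(BLOCK_SIZE_M, BLOCK_SIZE_K)` inside a loop over the state dimension. Under that assumption, the zero-filled tile needs `BLOCK_SIZE_K` rows to be a valid right-hand operand, which the old shape only satisfies when `BLOCK_SIZE_DSTATE` happens to equal `BLOCK_SIZE_K`. The tile sizes and the `dot_shape` helper below are hypothetical and chosen for illustration; this is plain Python, not Triton.

```python
# Hypothetical tile sizes for illustration only; in the real kernel these are
# compile-time constants picked by the launch configuration.
BLOCK_SIZE_M = 64
BLOCK_SIZE_N = 64
BLOCK_SIZE_K = 32
BLOCK_SIZE_DSTATE = 256  # the > 128 case named in the commit title


def dot_shape(lhs, rhs):
    """Return the shape of a 2-D matrix product, or raise if inner dims differ."""
    (m, k_lhs), (k_rhs, n) = lhs, rhs
    if k_lhs != k_rhs:
        raise ValueError(f"inner dimensions differ: {lhs} @ {rhs}")
    return (m, n)


# Assumption (not visible in the diff): shape of the C tile used per K-step.
c_tile = (BLOCK_SIZE_M, BLOCK_SIZE_K)

old_zero_tile = (BLOCK_SIZE_DSTATE, BLOCK_SIZE_K)  # shape before this commit
new_zero_tile = (BLOCK_SIZE_K, BLOCK_SIZE_N)       # shape after this commit

# The new shape composes with the assumed C tile.
acc_shape = dot_shape(c_tile, new_zero_tile)

# The old shape does not, because 32 != 256 on the inner dimension.
try:
    dot_shape(c_tile, old_zero_tile)
    old_shape_ok = True
except ValueError:
    old_shape_ok = False
```

With these illustrative sizes the new tile yields a `(BLOCK_SIZE_M, BLOCK_SIZE_N)` result, while the old tile fails the inner-dimension check.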
