the usual

lucidrains · lucidrains · commit 949e71609df7 · 2025-02-19T08:03:39.000-08:00
diff --git a/README.md b/README.md
@@ -4,6 +4,35 @@
 
 Implementation of the sparse attention pattern proposed by the Deepseek team in their [Native Sparse Attention](https://arxiv.org/abs/2502.11089) paper
 
+## Install
+
+```bash
+$ pip install native-sparse-attention-pytorch
+```
+
+## Usage
+
+```python
+import torch
+from native_sparse_attention_pytorch import SparseAttention
+
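+attn = SparseAttention(
+    dim = 512,                   # model (token embedding) dimension
+    dim_head = 64,               # dimension of each attention head
+    heads = 8,                   # number of attention heads
+    sliding_window_size = 2,     # window size for the sliding-window (local) branch
+    compress_block_size = 4,     # block size for the compressed-attention branch
+    selection_block_size = 4,    # block size for the selected-attention branch
+    num_selected_blocks = 2      # number of blocks chosen in the selection branch
+)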
+
+tokens = torch.randn(2, 31, 512)  # (batch, seq_len, dim)
+
+attended = attn(tokens)
+
+assert tokens.shape == attended.shape  # output keeps the input shape
+```
+
 ## Citations
 
 ```bibtex