Commit bb85854

[Feat]Support UCM Sparse on cuda (#126)
* [Feat]Support UCM Sparse on cuda
* [DOCS]Add doc for format code.
1 parent 1eca8fb commit bb85854

6 files changed: +376 -47 lines changed

.gitignore

Lines changed: 2 additions & 1 deletion
@@ -48,4 +48,5 @@
 **/build/**
 **/output/**
 .venv/**
-**/__pycache__/**
+**/__pycache__/**
+*.egg-info/**
Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+# Contributing
+## Building and testing
+It’s recommended to set up a local development environment to build and test before you submit a PR.
+### Run lint locally
+Run the following commands to format your code before submitting:
+```bash
+# Choose a base dir (~/vllm-project/) and set up venv
+cd ~/vllm-project/
+python3 -m venv .venv
+source ./.venv/bin/activate
+
+# Clone UCM and install
+git clone https://github.com/ModelEngine-Group/unified-cache-management.git
+cd unified-cache-management
+
+# Install the lint requirements and enable the pre-commit hook
+pip install -r requirements-lint.txt
+
+# Run lint (the first run needs network access, via a proxy if necessary, to install the pre-commit deps)
+bash format.sh
+```
+### Run unit tests locally
+Run the unit tests locally with the following command:
+```bash
+python3 -m unittest discover -s test
+```
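
The `unittest discover -s test` command in the new contributing guide collects modules matching the default pattern `test*.py` under `test/`. As a rough illustration, a module it would pick up could look like the sketch below. The file name `test/test_slot_sketch.py` and the helper `is_valid_slot` are hypothetical and not part of UCM; only the `INVALID_SLOT = -1` constant mirrors `ucm_sparse/base.py`.

```python
# Hypothetical file: test/test_slot_sketch.py
import unittest

# Mirrors the constant in ucm_sparse/base.py; redefined here so the sketch
# runs without UCM or vLLM installed.
INVALID_SLOT = -1


def is_valid_slot(slot: int) -> bool:
    """Hypothetical helper: a slot is usable unless it is the sentinel."""
    return slot != INVALID_SLOT


class TestSlotSketch(unittest.TestCase):
    def test_sentinel_is_rejected(self):
        self.assertFalse(is_valid_slot(INVALID_SLOT))

    def test_regular_slot_is_accepted(self):
        self.assertTrue(is_valid_slot(0))
```

Discovery only needs `unittest.TestCase` subclasses with `test_`-prefixed methods, so no `unittest.main()` entry point is required in the module.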

docs/source/developer/index.md

Lines changed: 1 addition & 0 deletions
@@ -3,6 +3,7 @@
 :::{toctree}
 :maxdepth: 2
 architecture.md
+contributing.md
 add_connector.md
 nfs_connector.md
 performance_benchmark.md

unifiedcache/integration/vllm/ucm_sparse/base.py

Lines changed: 3 additions & 4 deletions
@@ -35,7 +35,6 @@
 import torch
 from vllm.distributed.kv_transfer import get_kv_transfer_group, has_kv_transfer_group
 from vllm.forward_context import ForwardContext
-from vllm_ascend.worker.npu_input_batch import CachedRequestState, InputBatch
 
 INVALID_SLOT = -1
 
@@ -194,9 +193,9 @@ def update_state_after_alloc(self, request: Request, num_blocks: int):
 
     def build_sparse_meta(
         self,
-        scheduler_output: SchedulerOutput,
-        requests: dict[str, CachedRequestState],
-        input_batch: InputBatch,
+        scheduler_output,
+        requests,
+        input_batch,
     ) -> UcmSparseMetadata:
         """
         Build the sparse metadata for this step.
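
The hunk above drops the `vllm_ascend` import together with the annotations that depended on it, presumably so that `base.py` can be imported in a CUDA environment where the Ascend plugin is not installed. One possible alternative, which is not what this commit does, is to keep the hints but import the Ascend types only for static type checkers. A minimal sketch, in which the class `SparseBaseSketch` and its return value are hypothetical stand-ins:

```python
from typing import TYPE_CHECKING, Any

if TYPE_CHECKING:
    # Evaluated only by static type checkers, never at runtime, so an
    # environment without vllm_ascend can still import this module.
    from vllm_ascend.worker.npu_input_batch import CachedRequestState, InputBatch


class SparseBaseSketch:
    """Hypothetical stand-in for the class that owns build_sparse_meta."""

    def build_sparse_meta(
        self,
        scheduler_output: Any,
        requests: "dict[str, CachedRequestState]",
        input_batch: "InputBatch",
    ) -> dict:
        # Quoted annotations are stored as plain strings and are not resolved
        # when the function is defined, so the missing names raise no error.
        return {"num_requests": len(requests)}
```

The trade-off is that the hints would still name Ascend-specific types, which may be misleading once the same method is also called from a CUDA worker; removing the annotations, as the commit does, avoids that.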
