[Feature] Add mooncake store #117

hufumans · 2025-08-26T06:25:35Z (edited by propanone1006)

Purpose

What this PR does / why we need it?

This PR integrates support for Mooncake Store as a unified cache backend in unified-cache-management.

It introduces a new UcmMooncakeStore connector that wraps MooncakeDistributedStore and exposes dump/load/lookup operations for KV cache tensors in vLLM and related systems.

This provides:

- Improved extensibility for distributed cache offloading
- Full async event loop support and task scheduling
- Compatibility with safetensors-based serialization
- A consistent interface aligned with UcmKVStoreBase
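
To illustrate the "async event loop and task scheduling" point, here is a minimal sketch of one way a connector could run blocking put/get calls on a background asyncio loop. Everything in it is an assumption made for illustration: `InMemoryStore` and `SketchConnector` are hypothetical names, payloads are plain bytes rather than tensors, and the method signatures are not those of UcmMooncakeStore, UcmKVStoreBase, or MooncakeDistributedStore.

```python
import asyncio
import threading
from concurrent.futures import Future
from typing import Dict, List, Optional


class InMemoryStore:
    """Hypothetical stand-in for a put/get key-value store (not the Mooncake API)."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)


class SketchConnector:
    """Illustrative connector: lookup is synchronous, dump/load return futures."""

    def __init__(self, store: InMemoryStore) -> None:
        self._store = store
        # A dedicated event loop runs in a daemon thread and schedules all tasks.
        self._loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self._loop.run_forever, daemon=True)
        self._thread.start()

    def lookup(self, block_ids: List[str]) -> List[bool]:
        # One boolean per block id: True if the store already holds that block.
        return [self._store.get(block_id) is not None for block_id in block_ids]

    def dump(self, block_id: str, payload: bytes) -> Future:
        # Schedule the write on the background loop; the caller may wait on the future.
        return asyncio.run_coroutine_threadsafe(self._dump(block_id, payload), self._loop)

    def load(self, block_id: str) -> Future:
        return asyncio.run_coroutine_threadsafe(self._load(block_id), self._loop)

    async def _dump(self, block_id: str, payload: bytes) -> None:
        # Blocking store calls go to the default executor so the loop stays responsive.
        await self._loop.run_in_executor(None, self._store.put, block_id, payload)

    async def _load(self, block_id: str) -> Optional[bytes]:
        return await self._loop.run_in_executor(None, self._store.get, block_id)

    def shutdown(self) -> None:
        # Stop the loop from its own thread, wait for the thread, then release the loop.
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join()
        self._loop.close()


if __name__ == "__main__":
    connector = SketchConnector(InMemoryStore())
    connector.dump("block-0", b"kv-bytes").result(timeout=5)
    print(connector.lookup(["block-0", "block-1"]))  # [True, False]
    print(connector.load("block-0").result(timeout=5))  # b'kv-bytes'
    connector.shutdown()
```

Returning `concurrent.futures.Future` objects lets a synchronous caller decide when to block, which is one common way to bridge a threaded inference engine and an asyncio-based backend.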

Modifications

This PR adds the following files:

- unifiedcache/ucm_connector/ucm_mooncake.py: Mooncake connector implementation.
- test/test_mooncake.py: unit tests for the dump/load/lookup logic; the tests load the Mooncake config from a dict.
- docs/source/getting-started/example/mooncake_conn.md: how to use Mooncake Store in UCM.
- docs/source/getting-started/images/:
  - mooncake_performance.png: performance with Mooncake.
  - mooncake_default_performance.png: default performance, used for comparison with Mooncake.

This PR adjusts the following files:

- docs/source/getting-started/example/index.md: add mooncake_conn.md.
- unifiedcache/ucm_connector/factory.py: add mooncake config.

Test

Unit test

This patch was tested via:

✅ Unit tests in test/test_mooncake.py:
- test_lookup_not_found
- test_lookup_found
- test_dump_once
- test_dump_repeated
- test_load_existing_data
- test_load_non_existent_data
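
For readers without access to test/test_mooncake.py, the following is a rough sketch of what tests with these names might check. It runs against a hypothetical in-memory `FakeStore` and `ToyConnector`, not the real connector. The expected behaviours (a repeated dump overwrites the earlier value, loading a missing key returns `None`) are assumptions chosen for the example, not statements about the actual implementation.

```python
import unittest
from typing import Dict, Optional


class FakeStore:
    """Hypothetical in-memory put/get store used in place of a real backend."""

    def __init__(self) -> None:
        self.data: Dict[str, bytes] = {}
        self.put_calls = 0

    def put(self, key: str, value: bytes) -> None:
        self.put_calls += 1
        self.data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        return self.data.get(key)


class ToyConnector:
    """Illustrative connector exposing lookup/dump/load over a put/get store."""

    def __init__(self, store: FakeStore) -> None:
        self._store = store

    def lookup(self, key: str) -> bool:
        return self._store.get(key) is not None

    def dump(self, key: str, value: bytes) -> None:
        self._store.put(key, value)

    def load(self, key: str) -> Optional[bytes]:
        return self._store.get(key)


class ToyConnectorTest(unittest.TestCase):
    def setUp(self) -> None:
        self.store = FakeStore()
        self.connector = ToyConnector(self.store)

    def test_lookup_not_found(self) -> None:
        self.assertFalse(self.connector.lookup("missing"))

    def test_lookup_found(self) -> None:
        self.connector.dump("k", b"v")
        self.assertTrue(self.connector.lookup("k"))

    def test_dump_once(self) -> None:
        self.connector.dump("k", b"v")
        self.assertEqual(self.store.put_calls, 1)

    def test_dump_repeated(self) -> None:
        # Assumed semantics: the last write wins and the key is stored once.
        self.connector.dump("k", b"v1")
        self.connector.dump("k", b"v2")
        self.assertEqual(self.connector.load("k"), b"v2")
        self.assertEqual(len(self.store.data), 1)

    def test_load_existing_data(self) -> None:
        self.connector.dump("k", b"v")
        self.assertEqual(self.connector.load("k"), b"v")

    def test_load_non_existent_data(self) -> None:
        # Assumed semantics: a missing key yields None rather than raising.
        self.assertIsNone(self.connector.load("missing"))
```

Saved to a file, the sketch can be run with `python -m unittest <file>`.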

Precision test

✅ Precision test in example/offline_inference.py

End to end test

This test follows the steps in mooncake_conn.md to start the vLLM server.

Model: QwQ-32B

| tokens | mooncake-first | mooncake-second | default |
|---|---|---|---|
| 2k | 1.9231491860002279 | 0.8265988459810615 | 0.5419427898712457 |
| 4k | 3.9460434830747544 | 1.5273493870627135 | 0.991630249004811 |
| 8k | 7.577957597002387 | 2.7632693520281464 | 2.0716467570047827 |
| 16k | 16.823639799049126 | 5.515289016952738 | 4.742832682048902 |
| 32k | 81.98759594326839 | 14.217441103421152 | 12.310140203218907 |
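
To make the relationship between the three columns easier to read, the short script below computes two ratios per row from the values reported in the table. The column names come from the table; the ratio names (`first_over_second`, `second_over_default`) are introduced here for illustration only, and no unit is assumed.

```python
# Values copied from the table above: (mooncake-first, mooncake-second, default).
results = {
    "2k": (1.9231491860002279, 0.8265988459810615, 0.5419427898712457),
    "4k": (3.9460434830747544, 1.5273493870627135, 0.991630249004811),
    "8k": (7.577957597002387, 2.7632693520281464, 2.0716467570047827),
    "16k": (16.823639799049126, 5.515289016952738, 4.742832682048902),
    "32k": (81.98759594326839, 14.217441103421152, 12.310140203218907),
}


def ratios(first: float, second: float, default: float) -> dict:
    """Return how the first run compares to the second, and the second to default."""
    return {
        "first_over_second": first / second,
        "second_over_default": second / default,
    }


for tokens, row in results.items():
    r = ratios(*row)
    print(
        f"{tokens:>3}  first/second={r['first_over_second']:.2f}  "
        f"second/default={r['second_over_default']:.2f}"
    )
```

On these numbers, the mooncake-second to default ratio shrinks from roughly 1.5 at 2k tokens to roughly 1.15 at 32k, while the mooncake-first to mooncake-second ratio grows from roughly 2.3 to roughly 5.8.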

Figure (with Mooncake): docqa_TTFT_QwQ-32B_MoonCake_connector_MoonCacke1


ygwpz · 2025-08-27T02:40:10Z

docs/source/images/mooncake_default_performance.png

what's this?

propanone1006 · 2025-08-28

This test measured performance with UCM Mooncake and with the default setup (without UCM) separately, without enabling prefix_caching. Should we merge the two charts, or present them some other way?

ygwpz · 2025-08-29T06:37:54Z

unifiedcache/ucm_connector/ucm_mooncake.py

+        # Mooncake only has get and put interfaces, this operation is not supported
+        pass
+
+    def shutdown(self):


Maybe this should be a __del__() method?
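
One common way to reconcile an explicit shutdown() with the suggested __del__() is to make shutdown idempotent and have __del__ delegate to it as a best-effort fallback. The sketch below shows that pattern under stated assumptions: `ClosableConnector` and `CountingBackend` are hypothetical names, and the backend's `close()` method is a placeholder rather than a Mooncake call.

```python
class CountingBackend:
    """Hypothetical backend that records how many times close() was called."""

    def __init__(self) -> None:
        self.closed_count = 0

    def close(self) -> None:
        self.closed_count += 1


class ClosableConnector:
    """Illustrative connector; `backend` is any object with a close() method."""

    def __init__(self, backend) -> None:
        self._backend = backend
        self._closed = False

    def shutdown(self) -> None:
        # Idempotent: safe to call explicitly and again from __del__.
        if self._closed:
            return
        self._closed = True
        self._backend.close()

    def __del__(self) -> None:
        # Best-effort cleanup. Finalizer timing is not guaranteed, so an
        # explicit shutdown() call remains the reliable path.
        try:
            self.shutdown()
        except Exception:
            pass


if __name__ == "__main__":
    backend = CountingBackend()
    connector = ClosableConnector(backend)
    connector.shutdown()
    connector.shutdown()  # second call is a no-op
    print(backend.closed_count)  # 1
```

The guard flag keeps the backend from being closed twice when a caller invokes shutdown() and the object is later garbage-collected.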


* fix issue#26 and issue#36 (#55)
* [Doc] Add vllm institution (#61)
* [CI] Add issue and pull request template; [Fix][Doc] Fix nfs doc error. (#62) (#64)
* [CI] Add issue template
* [CI] Add pr template
* [Fix][Doc] Fix nfs doc error, close #57
Co-authored-by: harrisonyhq <harrisonyhq@gmail.com>
* [Doc] update install doc using patch to build from source code (#68)
* [Feat] Merge 0.0.1 back into develop (#72)
* [CI] Add issue and pull request template; [Fix][Doc] Fix nfs doc error. (#62)
* [CI] Add issue template
* [CI] Add pr template
* [Fix][Doc] Fix nfs doc error, close #57
* [CI][Style] Add Github workflow for pre commit and format the codestyle (#70)
* [CI] Add github flow for pre-commit and unittest
* [Style] Fix typo and sytle problem in repo
---------
Co-authored-by: harrisonyhq <harrisonyhq@gmail.com>
* [Style] Fix codestyle problems and typo in develop (#75)
* [Style] Fix codestyle problems and typo
* [Fix] Fix CI bug
* [CI] Add workflow trigger on push
* [CI] Add support pyproject.toml to enable using python -m build to compile whl package
* ucm_sparse framework v1.0 (#79)
* [Fix] Fix cant find cmake error when using pip install -e .
* Revert "ucm_sparse framework v1.0 (#79)" (#82) This reverts commit b965dc8.
* [Feature] add Mooncake Store
* [Fix bug] fix docker build err and installation.md (#87)
* adapt deepseek (#89)
* [Feature][P/D] add example for disaggregated prefill (#90)
* [Perf] Pipelined ucmnfsstore (#97)
* pipelined ucmnfsstore
* update default stream number
* Revert "[Feature] add Mooncake Store" (#98)
* [Fix bug] fix uc_connector ut and change hash generation method
* [Fix] Fix .so build error (#104) [Fix] Fix so file import error in build and edit mode [Fix] format the code [Feat] Add device recognize function
* [Fix] Fix ascend compile error (#106)
* ESA 1.0 fix typo ESA: add vllm and vllm-ascend patch add vllm and vllm_ascend patch
* fix typo
* [fix] compatible with prefix cache
* add sparse_attn example
* add sparse_attn docs
* Modify start_load_kv (#103)
* [Fix] Fix duplicate create/commit errors upon preemption (#109)
* [refact] format
* adapt for vllm 0.9.1 (#113)
Co-authored-by: y00945504 <yuhui87@huawei.com>
* add patch
* fix: uc_connector,rm .gitkeep ucm_oceanstor.py
* rename vllm-adapt-2 to vllm-adapt-sparse
* [Fix] Fix spelling issues with PR templates (#119)
* remove load_tasks
* [bugfix] bugfix in ucmnfsstore (#123)
* trans task timeout support
* [Fix] posix file open interface bugfix
* add config parameter
* Fix rank handling in multi-node PP setup (#129)
* [Feat]Support UCM Sparse on cuda (#126)
* [Feat]Support UCM Sparse on cuda
* [DOCS]Add doc for format code.
* [Feature] Add mooncake store (#117)
* Temporary save
* [Feature] Monncake connector support both config and file
* [Doc] Add docs for Ucm Mooncake Connector
* [Feature] Add mooncake to ucm factory
* [Doc][Fix] Modify the description of configuration to match usage.
* [Feature] [Fix] Load Mooncake config from dict, when lack params, load from env config file.
* [Doc] update the performance and modify description.
* [Test] Example config file for Mooncake test `test_mooncake_env.py`.
* [Test] [Del] Removed unnecessary tests that do not match the current functionality
* [Feat!] [Del] Adjust the mooncake configuration method, remove the configuration file method, and only retain the parameter transmission method
* [Doc] [Fix] modifiy the performance figure of Mooncake Store.
* [Feat] add __del__() to shutdown all the mooncake components
---------
Co-authored-by: z00452769 <zhangyichen@huawei.com>
Co-authored-by: propanone1006 <1035097916@qq.com>
Co-authored-by: propanone1006 <1035067916@qq.com>
* [bugfix]modify mla dump (#128)
* modify mla dump
* fix ci problem
* [BugFix] aggregate work ouputs to decide dumped blocks
* [BugFix] Modify npu worker for aggregating modelrunner_outputs
* [CI] Add vllm patch for sparse in dockerfile (#134)
* [CI] Add vllm patch for sparse in dockerfile
* [Fix] Add patch in dockerfile and pip mirror.
* [Fix] Update version 0.0.2
* ESA: skip processing for short requests (#147)
* ucm_sparse: skip processing for short requests
* add comments
---------
Co-authored-by: flesher0813 <33923823+flesher0813@users.noreply.github.com>
Co-authored-by: harrisonyhq <harrisonyhq@gmail.com>
Co-authored-by: hek14 <1023129548@qq.com>
Co-authored-by: Chen Deng <120033622+propanone1006@users.noreply.github.com>
Co-authored-by: propanone1006 <1035067916@qq.com>
Co-authored-by: qyh111 <qiuyuhao1@huawei.com>
Co-authored-by: Mag1c.H <hemajun815@163.com>
Co-authored-by: t00939662 <tianxuehan@huawei.com>
Co-authored-by: Fate469434 <58885253+Fate469434@users.noreply.github.com>
Co-authored-by: y00945504 <yuhui87@huawei.com>
Co-authored-by: Zbm1996 <370478722@qq.com>
Co-authored-by: NaganooMei <290992347@qq.com>
Co-authored-by: NaganooMei <104300720+NaganooMei@users.noreply.github.com>
Co-authored-by: f00943869 <fenghao0720@outlook.com>
Co-authored-by: hufumans <113507465+hufumans@users.noreply.github.com>
Co-authored-by: z00452769 <zhangyichen@huawei.com>
Co-authored-by: propanone1006 <1035097916@qq.com>
Co-authored-by: zhou-haitao <74044944+zhou-haitao@users.noreply.github.com>
Co-authored-by: flesher0813 <1208954694@qq.com>
Co-authored-by: AooooooA-C <chenaozhu@outlook.com>

z00452769 and others added 6 commits August 14, 2025 14:45

Temporary save

95396e5

[Feature] Monncake connector support both config and file

5daf6fe

[Doc] Add docs for Ucm Mooncake Connector

74bfbe1

[Feature] Add mooncake to ucm factory

9f7cee2

[Doc][Fix] Modify the description of configuration to match usage.

4e8af97

[Feature] [Fix] Load Mooncake config from dict, when lack params, load from env config file.

8e873de

propanone1006 changed the title ~~Add mooncake store~~ [Feature] Add mooncake store Aug 26, 2025

propanone1006 requested a review from ygwpz August 26, 2025 08:16

[Doc] update the performance and modify description.

8d015cb

propanone1006 requested review from flesher0813 and ygwpz and removed request for flesher0813 and ygwpz August 27, 2025 02:15

ygwpz reviewed Aug 27, 2025

propanone1006 added 2 commits August 29, 2025 11:29

Merge branch 'develop' into develop_mooncake

2187a2e

[Test] Example config file for Mooncake test test_mooncake_env.py.

1691676

ygwpz reviewed Aug 29, 2025

propanone1006 added 4 commits August 29, 2025 14:50

[Test] [Del] Removed unnecessary tests that do not match the current functionality

315af32

[Feat!] [Del] Adjust the mooncake configuration method, remove the configuration file method, and only retain the parameter transmission method

b637287

[Doc] [Fix] modifiy the performance figure of Mooncake Store.

82c418e

[Feat] add __del__() to shutdown all the mooncake components

9d982ce

qyh111 merged commit ccbee78 into develop Aug 30, 2025
6 checks passed

ygwpz deleted the develop_mooncake branch September 1, 2025 08:19


5 participants
