[FIX] Fix the content of README.md (#130)
* [FIX] update readme content

* [FIX] update README

Signed-off-by: Feng Ren <[email protected]>
alogfans authored Mar 7, 2025
1 parent ac98ea9 commit 4d89151
Showing 1 changed file with 7 additions and 10 deletions.
17 changes: 7 additions & 10 deletions README.md
@@ -3,16 +3,13 @@
<h2 align="center">
A KVCache-centric Disaggregated Architecture for LLM Serving
</h2>
<a href="https://www.usenix.org/system/files/fast25-qin.pdf" target="_blank"><strong>Paper</strong></a>
| <a href="https://www.usenix.org/system/files/fast25_slides-qin.pdf" target="_blank"><strong>Slides</strong></a>
| <a href="FAST25-release/traces" target="_blank"><strong>Traces</strong></a>
| <a href="https://arxiv.org/abs/2407.00079" target="_blank"><strong>Technical Report</strong></a>
</div>
<br/>

| [**Paper**](https://www.usenix.org/system/files/fast25-qin.pdf)
| [**Slides**](https://www.usenix.org/system/files/fast25_slides-qin.pdf)
| [**Traces**](FAST25-release/traces)
| [**Technical Report**](https://arxiv.org/abs/2407.00079)
|


Mooncake is the serving platform for <a href="https://kimi.ai/"><img src="image/kimi.png" alt="icon" style="height: 16px; vertical-align: middle;"> Kimi</a>, a leading LLM service provided by <a href="https://www.moonshot.cn/"><img src="image/moonshot.jpg" alt="icon" style="height: 16px; vertical-align: middle;"> Moonshot AI</a>.
Now both the Transfer Engine and Mooncake Store are open-sourced!
This repository also hosts its technical report and the open-sourced traces.
@@ -70,9 +67,6 @@ With 40 GB of data (equivalent to the size of the KVCache generated by 128k toke
P2P Store is built on the Transfer Engine and supports sharing temporary objects between peer nodes in a cluster. P2P Store is ideal for scenarios like checkpoint transfer, where data needs to be rapidly and efficiently shared across a cluster.
**P2P Store has been used in the checkpoint transfer service of Moonshot AI.**

### Mooncake Store ([Guide](doc/en/mooncake-store-preview.md))
Mooncake Store is a distributed KVCache storage engine specialized for LLM inference. It offers object-level APIs (`Put`, `Get` and `Remove`), and we will soon release a new vLLM integration to demonstrate xPyD disaggregation. Mooncake Store is the central component of the KVCache-centric disaggregated architecture.

#### Highlights
- **Decentralized architecture.** P2P Store leverages a pure client-side architecture with global metadata managed by the etcd service.

@@ -83,6 +77,9 @@ Thanks to the high performance of Transfer Engine, P2P Stores can also distribut

![p2p-store.gif](image/p2p-store.gif)

### Mooncake Store ([Guide](doc/en/mooncake-store-preview.md))
Mooncake Store is a distributed KVCache storage engine specialized for LLM inference. It offers object-level APIs (`Put`, `Get` and `Remove`), and we will soon release a new vLLM integration to demonstrate xPyD disaggregation. Mooncake Store is the central component of the KVCache-centric disaggregated architecture.
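
To illustrate what object-level `Put`, `Get` and `Remove` semantics look like, here is a minimal sketch in Python. It assumes a hypothetical in-memory `ToyObjectStore` class; the class, its method names, and the key format are invented for illustration and are not the Mooncake Store API (the linked guide describes the real interface).

```python
from typing import Dict, Optional


class ToyObjectStore:
    """Hypothetical in-memory stand-in; NOT the Mooncake Store API."""

    def __init__(self) -> None:
        # Maps an object key to its immutable byte payload.
        self._objects: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        # Store (or overwrite) the whole object under the key.
        self._objects[key] = bytes(value)

    def get(self, key: str) -> Optional[bytes]:
        # Return the object, or None if the key is absent.
        return self._objects.get(key)

    def remove(self, key: str) -> bool:
        # Delete the object; report whether it existed.
        return self._objects.pop(key, None) is not None


store = ToyObjectStore()
store.put("prefix:0001", b"kv-bytes")   # assumed key format
cached = store.get("prefix:0001")
removed = store.remove("prefix:0001")
missing = store.get("prefix:0001")
```

In this sketch each call operates on a whole object identified by a key, which is the sense in which the interface is object-level; a distributed engine would additionally have to place and locate objects across nodes, which the toy class does not attempt.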

### vLLM Integration ([Guide v0.2](doc/en/vllm-integration-v0.2.md))
To optimize LLM inference, the vLLM community is working on supporting [disaggregated prefilling (PR 10502)](https://github.com/vllm-project/vllm/pull/10502). This feature separates the **prefill** phase from the **decode** phase into different processes. vLLM uses `nccl` and `gloo` as the transport layer by default, but it currently cannot efficiently decouple the two phases across different machines.

