-> ScaleLLM is currently in the active development stage and may not yet provide the optimal level of inference efficiency. We are fully dedicated to continuously enhancing its efficiency while also adding more features.
+ScaleLLM is currently undergoing active development. We are fully committed to consistently enhancing its efficiency while also incorporating additional features. We appreciate your understanding and look forward to delivering an even better solution.

-In the coming weeks, we have exciting plans to focus on [**_speculative decoding_**](https://github.com/orgs/vectorch-ai/projects/1) and [**_stateful conversation_**](https://github.com/orgs/vectorch-ai/projects/2), alongside further kernel optimizations. We appreciate your understanding and look forward to delivering an even better solution.
+Feel free to explore our [**_Roadmap_**](https://github.com/vectorch-ai/ScaleLLM/issues/84) for more details.
## Latest News:
-* [11/2023] - First [official release](https://github.com/vectorch-ai/ScaleLLM/releases/tag/v0.0.1) with support for popular open-source models.
-
+* [03/2024] - We've implemented several [advanced feature enhancements](https://github.com/vectorch-ai/ScaleLLM/releases/tag/v0.0.7), including support for CUDA graph, dynamic prefix cache, dynamic chunked prefill and speculative decoding.
+* [11/2023] - We're excited to announce the first release with support for popular open-source models. Check it out [here](https://github.com/vectorch-ai/ScaleLLM/releases/tag/v0.0.1).
## Table of contents
@@ -49,16 +47,6 @@ ScaleLLM is a cutting-edge inference system engineered for large language models
## Supported Models
-Please note that in order to use Yi models, you need to add `--model_type=Yi` to the command line. For example:
-```bash
-docker pull docker.io/vectorchai/scalellm:latest
-docker run -it --gpus=all --net=host --shm-size=1g \