Commit d87d51c

update r2.8 LLM table (#3782)
* update r2.8 LLM table
* update html pages accordingly
* append qwen3-30b-a3b
1 parent bba28a0 commit d87d51c

4 files changed, 72 additions and 5 deletions

README.md

Lines changed: 3 additions & 0 deletions
@@ -55,6 +55,8 @@ and the phenomenal high-quality reasoning model DeepSeek-R1.
 |Qwen| Qwen/Qwen-7B-Chat |||||
 |Qwen| Qwen/Qwen2-7B |||||
 |Qwen| Qwen/Qwen2.5-7B-Instruct |||||
+|Qwen| Qwen/Qwen3-14B |||| |
+|Qwen| Qwen/Qwen3-30B-A3B |||||
 |LLaVA| liuhaotian/llava-v1.5-7b |||||
 |GIT| microsoft/git-base |||||
 |Yuan| IEITYuan/Yuan2-102B-hf |||| |
@@ -66,6 +68,7 @@ and the phenomenal high-quality reasoning model DeepSeek-R1.
 |Phi| microsoft/Phi-4-mini-instruct |||| |
 |Phi| microsoft/Phi-4-multimodal-instruct |||| |
 |Whisper| openai/whisper-large-v2 |||||
+|Whisper| openai/whisper-large-v3 |||| |
 |Maira| microsoft/maira-2 |||||
 |Jamba| ai21labs/Jamba-v0.1 |||||
 |DeepSeek| deepseek-ai/DeepSeek-V2.5-1210 |||||

docs/_static/htmls/tbl_deepspeed.html

Lines changed: 19 additions & 1 deletion
@@ -188,6 +188,18 @@
 <td><p style="text-align: center; vertical-align: middle;"></p></td>
 <td><p style="text-align: center; vertical-align: middle;"></p></td>
 </tr>
+<tr class="row-even">
+<td><p>Qwen</p></td>
+<td><p>Qwen/Qwen3-14B</p></td>
+<td><p style="text-align: center; vertical-align: middle;"></p></td>
+<td><p style="text-align: center; vertical-align: middle;"></p></td>
+</tr>
+<tr class="row-odd">
+<td><p>Qwen</p></td>
+<td><p>Qwen/Qwen3-30B-A3B</p></td>
+<td><p style="text-align: center; vertical-align: middle;"></p></td>
+<td><p style="text-align: center; vertical-align: middle;"></p></td>
+</tr>
 <tr class="row-even">
 <td><p>GIT</p></td>
 <td><p>microsoft/git-base</p></td>
@@ -231,12 +243,18 @@
 <td><p style="text-align: center; vertical-align: middle;"></p></td>
 </tr>
 <tr class="row-odd">
+<td><p>Whisper</p></td>
+<td><p>openai/whisper-large-v3</p></td>
+<td><p style="text-align: center; vertical-align: middle;"></p></td>
+<td><p style="text-align: center; vertical-align: middle;"></p></td>
+</tr>
+<tr class="row-even">
 <td><p>DeepSeek</p></td>
 <td><p>deepseek-ai/DeepSeek-V2.5-1210</p></td>
 <td><p style="text-align: center; vertical-align: middle;"></p></td>
 <td><p style="text-align: center; vertical-align: middle;"></p></td>
 </tr>
-<tr class="row-even">
+<tr class="row-odd">
 <td><p>DeepSeek</p></td>
 <td><p>meituan/DeepSeek-R1-Channel-INT8</p></td>
 <td><p style="text-align: center; vertical-align: middle;"></p></td>

docs/_static/htmls/tbl_single.html

Lines changed: 27 additions & 3 deletions
@@ -274,6 +274,22 @@
 <td><p style="text-align: center; vertical-align: middle;"></p></td>
 <td><p style="text-align: center; vertical-align: middle;"></p></td>
 </tr>
+<tr class="row-odd">
+<td><p>Qwen</p></td>
+<td><p>Qwen/Qwen3-14B</p></td>
+<td><p style="text-align: center; vertical-align: middle;"></p></td>
+<td><p style="text-align: center; vertical-align: middle;"></p></td>
+<td><p style="text-align: center; vertical-align: middle;"></p></td>
+<td><p style="text-align: center; vertical-align: middle;"></p></td>
+</tr>
+<tr class="row-even">
+<td><p>Qwen</p></td>
+<td><p>Qwen/Qwen3-30B-A3B</p></td>
+<td><p style="text-align: center; vertical-align: middle;"></p></td>
+<td><p style="text-align: center; vertical-align: middle;"></p></td>
+<td><p style="text-align: center; vertical-align: middle;"></p></td>
+<td><p style="text-align: center; vertical-align: middle;"></p></td>
+</tr>
 <tr class="row-odd">
 <td><p>LLaVA</p></td>
 <td><p>liuhaotian/llava-v1.5-7b</p></td>
@@ -363,30 +379,38 @@
 <td><p style="text-align: center; vertical-align: middle;"></p></td>
 </tr>
 <tr class="row-even">
+<td><p>Whisper</p></td>
+<td><p>openai/whisper-large-v3</p></td>
+<td><p style="text-align: center; vertical-align: middle;"></p></td>
+<td><p style="text-align: center; vertical-align: middle;"></p></td>
+<td><p style="text-align: center; vertical-align: middle;"></p></td>
+<td><p style="text-align: center; vertical-align: middle;"></p></td>
+</tr>
+<tr class="row-odd">
 <td><p>Maira</p></td>
 <td><p>microsoft/maira-2</p></td>
 <td><p style="text-align: center; vertical-align: middle;"></p></td>
 <td><p style="text-align: center; vertical-align: middle;"></p></td>
 <td><p style="text-align: center; vertical-align: middle;"></p></td>
 <td><p style="text-align: center; vertical-align: middle;"></p></td>
 </tr>
-<tr class="row-odd">
+<tr class="row-even">
 <td><p>Jamba</p></td>
 <td><p>ai21labs/Jamba-v0.1</p></td>
 <td><p style="text-align: center; vertical-align: middle;"></p></td>
 <td><p style="text-align: center; vertical-align: middle;"></p></td>
 <td><p style="text-align: center; vertical-align: middle;"></p></td>
 <td><p style="text-align: center; vertical-align: middle;"></p></td>
 </tr>
-<tr class="row-even">
+<tr class="row-odd">
 <td><p>DeepSeek</p></td>
 <td><p>deepseek-ai/DeepSeek-V2.5-1210</p></td>
 <td><p style="text-align: center; vertical-align: middle;"></p></td>
 <td><p style="text-align: center; vertical-align: middle;"></p></td>
 <td><p style="text-align: center; vertical-align: middle;"></p></td>
 <td><p style="text-align: center; vertical-align: middle;"></p></td>
 </tr>
-<tr class="row-odd">
+<tr class="row-even">
 <td><p>DeepSeek</p></td>
 <td><p>meituan/DeepSeek-R1-Channel-INT8</p></td>
 <td><p style="text-align: center; vertical-align: middle;"></p></td>

examples/cpu/llm/inference/README.md

Lines changed: 23 additions & 1 deletion
@@ -41,6 +41,8 @@ and the phenomenal high-quality reasoning model [DeepSeek-R1](#223-deepseek-r1-6
 |Qwen| Qwen/Qwen-7B-Chat |||||
 |Qwen| Qwen/Qwen2-7B |||||
 |Qwen| Qwen/Qwen2.5-7B-Instruct |||||
+|Qwen| Qwen/Qwen3-14B |||| |
+|Qwen| Qwen/Qwen3-30B-A3B |||||
 |LLaVA| liuhaotian/llava-v1.5-7b |||||
 |GIT| microsoft/git-base |||||
 |Yuan| IEITYuan/Yuan2-102B-hf |||| |
@@ -52,6 +54,7 @@ and the phenomenal high-quality reasoning model [DeepSeek-R1](#223-deepseek-r1-6
 |Phi| microsoft/Phi-4-mini-instruct |||| |
 |Phi| microsoft/Phi-4-multimodal-instruct |||| |
 |Whisper| openai/whisper-large-v2 |||||
+|Whisper| openai/whisper-large-v3 |||| |
 |Maira| microsoft/maira-2 |||||
 |Jamba| ai21labs/Jamba-v0.1 |||||
 |DeepSeek| deepseek-ai/DeepSeek-V2.5-1210 |||||
@@ -91,13 +94,16 @@ and the phenomenal high-quality reasoning model [DeepSeek-R1](#223-deepseek-r1-6
 |Qwen| Qwen/Qwen-7B-Chat |||
 |Qwen| Qwen/Qwen2-7B |||
 |Qwen| Qwen/Qwen2.5-7B-Instruct |||
+|Qwen| Qwen/Qwen3-14B |||
+|Qwen| Qwen/Qwen3-30B-A3B |||
 |GIT| microsoft/git-base |||
 |Phi| microsoft/phi-2 |||
 |Phi| microsoft/Phi-3-mini-4k-instruct |||
 |Phi| microsoft/Phi-3-mini-128k-instruct |||
 |Phi| microsoft/Phi-3-medium-4k-instruct |||
 |Phi| microsoft/Phi-3-medium-128k-instruct |||
 |Whisper| openai/whisper-large-v2 |||
+|Whisper| openai/whisper-large-v3 |||
 |DeepSeek| deepseek-ai/DeepSeek-V2.5-1210 |||
 |DeepSeek| meituan/DeepSeek-R1-Channel-INT8 | ||

@@ -474,7 +480,7 @@ Please add the `quantization_config` field to the end of the file as below.
 + "bits": 8,
 + "group_size": -1
 + }
-}
+}
 ```

 - Use the following command to run the test.
@@ -510,6 +516,22 @@ There are some model-specific requirements to be aware of, as follows:

 - For Llava models from remote hub, additional setup is required, i.e., `bash ./tools/prepare_llava.sh`.

+- For INT8 quantized Qwen/Qwen3-30B-A3B model, a `quantization_config` field needs to be added in `config.json`.
+
+```diff
+"transformers_version": "4.46.3",
+"use_cache": true,
+"v_head_dim": 128,
+- "vocab_size": 129280
++ "vocab_size": 129280,
++ "quantization_config": {
++ "quant_method": "int8",
++ "bits": 8,
++ "group_size": -1
++ }
+}
+```
+
 ## 2.3 Instructions for Running Multimodal LLMs

 Multimodal LLMs are large language models capable of processing multiple types of inputs,
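
The `config.json` change described in the last hunk above (adding a `quantization_config` field for the INT8 model) could also be applied with a small script instead of a hand edit. The following is a minimal sketch, not part of this commit or of the project's tooling: the function names `add_quantization_config` and `patch_config_file` are hypothetical, and the only values taken from the diff are the three keys inside `quantization_config`.

```python
import json
from pathlib import Path

# Field values copied from the README hunk; everything else is illustrative.
INT8_QUANTIZATION_CONFIG = {
    "quant_method": "int8",
    "bits": 8,
    "group_size": -1,
}


def add_quantization_config(config: dict) -> dict:
    """Return a copy of `config` with the INT8 `quantization_config` field set.

    The input dict is left unmodified. On a dict without the key, the new
    field is appended after the existing keys, matching the README's
    "end of the file" placement.
    """
    updated = dict(config)
    updated["quantization_config"] = dict(INT8_QUANTIZATION_CONFIG)
    return updated


def patch_config_file(path: Path) -> None:
    """Hypothetical helper: rewrite a model's config.json in place."""
    config = json.loads(path.read_text(encoding="utf-8"))
    patched = add_quantization_config(config)
    path.write_text(json.dumps(patched, indent=2) + "\n", encoding="utf-8")
```

Assuming the model has been downloaded locally, a call such as `patch_config_file(Path("path/to/model") / "config.json")` would apply the change; the directory shown is a placeholder. Rewriting through `json` keeps the file valid JSON, which avoids the missing-comma mistake that the manual edit in the hunk has to guard against.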
