Commit 11132fe
Add relu2 to kernel and python api
Signed-off-by: Neta Zmora <96238833+nzmora-nvidia@users.noreply.github.com>
Fixes and UT
Signed-off-by: Neta Zmora <96238833+nzmora-nvidia@users.noreply.github.com>
Use trtllm moe for relu2 mlp case
Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
Fix the runGemmProfile
Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
Replace the FP8 fused MoE backend
Before: torch.ops.auto_deploy.triton_quant_fp8_moe
After: torch.ops.auto_deploy.trtllm_quant_fp8moe_fused
Signed-off-by: Neta Zmora <96238833+nzmora-nvidia@users.noreply.github.com>
Code refactoring
Signed-off-by: Neta Zmora <96238833+nzmora-nvidia@users.noreply.github.com>
syntax error fixes
Signed-off-by: Neta Zmora <96238833+nzmora-nvidia@users.noreply.github.com>
remove dead code
Signed-off-by: Neta Zmora <96238833+nzmora-nvidia@users.noreply.github.com>
fix moe operator function name
Signed-off-by: Neta Zmora <96238833+nzmora-nvidia@users.noreply.github.com>
Add skips if not hopper+
Signed-off-by: Neta Zmora <96238833+nzmora-nvidia@users.noreply.github.com>
remove unused code
Signed-off-by: Neta Zmora <96238833+nzmora-nvidia@users.noreply.github.com>
1 parent 6e8037a commit 11132fe
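For readers unfamiliar with the activation named in the commit title, here is a minimal reference sketch. It assumes "relu2" denotes squared ReLU, `max(x, 0)^2`, and that the "relu2 mlp case" means a non-gated (up-projection, activation, down-projection) expert. The names `relu2`, `matvec`, and `mlp_expert` are hypothetical illustrations and are not the functions this commit adds.

```python
# Assumption: "relu2" means squared ReLU, relu2(x) = max(x, 0) ** 2.
# All names here are hypothetical; they are not TensorRT-LLM APIs.

def relu2(x: float) -> float:
    """Squared ReLU: zero for negative inputs, x*x otherwise."""
    r = max(x, 0.0)
    return r * r


def matvec(w, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(wi * vi for wi, vi in zip(row, v)) for row in w]


def mlp_expert(x, w_up, w_down):
    """Non-gated expert MLP: down-projection of relu2(up-projection of x)."""
    hidden = [relu2(h) for h in matvec(w_up, x)]
    return matvec(w_down, hidden)
```

A plain-Python reference like this can be handy when checking a fused kernel's output on small inputs, although the unit tests in this commit may compare against a different reference.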
File tree
9 files changed, +722 -31 lines changed
- cpp/tensorrt_llm
  - cutlass_extensions/include/cutlass_extensions/epilogue/thread
  - kernels/cutlass_kernels
    - include
    - moe_gemm
  - thop
- tensorrt_llm/_torch
  - auto_deploy
    - custom_ops/fused_moe
    - transform/library
  - custom_ops
- tests/unittest/_torch/auto_deploy/unit/singlegpu/custom_ops
cpp/tensorrt_llm/cutlass_extensions/include/cutlass_extensions/epilogue/thread/fused_activations.h
Lines changed: 24 additions & 0 deletions
@@ -59,6 +59,30 @@ (+24, new lines 62-85)
@@ -28,6 +28,7 @@ (+1, new line 31)
Lines changed: 1 addition & 0 deletions
@@ -954,6 +954,7 @@ (+1, new line 957)
Lines changed: 2 additions & 0 deletions
@@ -2307,6 +2307,8 @@ (+2, new lines 2310-2311)
@@ -259,8 +259,8 @@ (+2 -2)
@@ -328,6 +328,9 @@ (+3)
@@ -337,8 +340,16 @@ (+10 -2)
@@ -375,7 +386,7 @@ (+1 -1)
@@ -474,8 +485,8 @@ (+2 -2)
@@ -541,7 +552,9 @@ (+3 -1)
@@ -652,7 +665,8 @@ (+2 -1)
@@ -661,6 +675,7 @@ (+1)
@@ -715,14 +730,14 @@ (+2 -2)