
Conversation

LRY89757
Contributor

  1. Optimize the data path: from List -> CPU Tensor -> List -> rpc_param -> GPU Tensor down to List -> rpc_param -> GPU Tensor
  2. Wrap the async forward pass only once
  3. Only the rank-0 worker runs the sampler and returns its result
  4. Pass the rpc param to worker 0 only, instead of to all workers; worker 0 then broadcasts the param to the other workers via NCCL.
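The control flow described above can be sketched as a minimal pure-Python simulation. This is not the actual ColossalAI implementation: `Worker`, `rpc_param`, `broadcast`, and `run_request` are hypothetical stand-ins, there is no real RPC, NCCL, or GPU work, and the "sampler" is a greedy argmax placeholder. It only illustrates the shape of the design: the driver talks to worker 0 alone, worker 0 fans the param out, and only rank 0 returns a sampled result.

```python
# Hypothetical stand-ins for the real RPC/NCCL machinery; pure-Python simulation.
class Worker:
    def __init__(self, rank, world_size):
        self.rank = rank
        self.world_size = world_size
        self.param = None

    def forward(self, param):
        # Every rank runs the (tensor-parallel) forward pass on the same param.
        self.param = param
        logits = [x * 2 for x in param["input_ids"]]  # stand-in for model output
        # Step 3: only the rank-0 worker runs the sampler and returns a result.
        if self.rank == 0:
            return self.sample(logits)
        return None

    def sample(self, logits):
        # Greedy "sampler" stand-in: index of the largest logit.
        return max(range(len(logits)), key=logits.__getitem__)


def broadcast(src_worker, workers):
    # Step 4 stand-in: worker 0 broadcasts the rpc param to the other ranks.
    # The real code would do this with NCCL (e.g. torch.distributed.broadcast).
    for w in workers:
        if w.rank != src_worker.rank:
            w.param = src_worker.param


def run_request(workers, input_ids):
    # Step 1: build the rpc param directly from the Python list
    # (List -> rpc_param), skipping the intermediate CPU tensor round-trip.
    rpc_param = {"input_ids": input_ids}
    # Step 4: the driver passes the param to worker 0 only...
    workers[0].param = rpc_param
    # ...and worker 0 broadcasts it to everyone else.
    broadcast(workers[0], workers)
    results = [w.forward(w.param) for w in workers]
    # Step 3: only rank 0's sampler output is returned to the caller.
    return results[0]


workers = [Worker(rank, world_size=4) for rank in range(4)]
print(run_request(workers, [1, 5, 3]))  # -> 1 (index of the largest logit)
```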

Performance is still not good enough and needs further optimization.

@LRY89757 LRY89757 requested a review from a team as a code owner May 27, 2024 07:16
@LRY89757 LRY89757 changed the title [Infer] Inference Distributed RPC Framework Optimization [WIP][Infer] Inference Distributed RPC Framework Optimization May 27, 2024
@LRY89757 LRY89757 added the tensor-parallel related to the tensor-parallel feature label May 27, 2024
Labels: colossal-inference, tensor-parallel