Could you provide the config for Sparse Upcycling? #2

@sharkdrop

Describe the feature

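@Adlith I had the pleasure of reading your paper and found it very inspiring. Could you share the Sparse Upcycling config used for the ablation experiment in Table 3, for example the MoE Layers, Core Expert Number, and Universal Expert Number?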

Will you implement it?

  • I would like to implement this feature and create a PR!

Metadata

Labels: enhancement (New feature or request)
