Question: Task ordering strategy during training stages 2 and 3 #105

@shin-wn

Description

Thank you for the great work on this project.

I noticed that tasks are trained in a fixed order rather than being shuffled:

  • In stage 2, tasks follow the fixed sequence t2m -> m2t -> predict within each epoch
  • In stage 3, tasks appear to be processed in the order they are defined in the JSON file
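For concreteness, here is a minimal sketch (not the repo's actual code; the task names are taken from above, everything else is illustrative) contrasting the current fixed-order schedule with the shuffled alternative I'm asking about:

```python
import random

# Task names mirror the ones above; batch counts are illustrative.
TASKS = ["t2m", "m2t", "predict"]

def fixed_order_schedule(batches_per_task):
    """Current behavior: all batches of one task, then the next."""
    return [t for t in TASKS for _ in range(batches_per_task)]

def shuffled_schedule(batches_per_task, seed=0):
    """Alternative: interleave tasks by shuffling the full schedule,
    so each batch's task is effectively sampled at random while the
    per-task batch counts stay identical to the fixed-order case."""
    schedule = fixed_order_schedule(batches_per_task)
    rng = random.Random(seed)
    rng.shuffle(schedule)
    return schedule
```

Since the loss function is shared across tasks, either schedule should be drop-in compatible with the training loop; the only difference is the order in which task batches are seen.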

Since motion is treated as discrete data, similar to text, the same loss function can be used across tasks, so shuffling should be feasible. This makes me wonder about the following:

  1. Was there a specific reason for not shuffling tasks during training?
  2. Have you found better results with this fixed-order approach compared to random task selection?
  3. Did you experiment with randomly selecting tasks in pretraining (stage 2) and instruction tuning (stage 3)?

I'm curious to learn more about the design decisions behind this approach. Looking forward to hearing your insights!
