Inquiry: Fusion Encoder in Downstream Task #8

@JungMinKyun

First of all, thank you very much for your research. I'm really interested in this work.

I have a question regarding the "Fusion Encoder" component of the downstream task,
specifically in the context of the linear probe evaluation.

My understanding is that, for the linear probe, you freeze the pre-trained encoder,
attach a classifier head on top, and then perform classification.
In the code, I see that you load pre-trained weights for the Image and Audio encoders, but for the Fusion Encoder you initialize it randomly, freeze it, and then train only the classifier.

I wonder whether it might make sense to load pre-trained weights for the Fusion Encoder as well, or at least allow it to be learnable if it is randomly initialized.
(If I am misunderstanding something here, please tell me.)
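
To make the concern concrete, here is a minimal sketch in plain Python of the kind of check I have in mind. It is not the repository's code: the parameter names (`image_encoder`, `audio_encoder`, `fusion_encoder`, `classifier`) and both helper functions are hypothetical, and the key sets are stand-ins for the model's parameter names, the checkpoint's keys, and the parameters passed to the optimizer. It simply flags any module that is both randomly initialized and frozen.

```python
# Hypothetical parameter names; the real repository may name its modules differently.
def audit_modules(model_keys, checkpoint_keys, trainable_keys):
    """Group parameters by top-level module and report, per module,
    whether all of its weights are in the checkpoint and whether any are trained."""
    report = {}
    for key in model_keys:
        module = key.split(".", 1)[0]
        entry = report.setdefault(module, {"pretrained": True, "trainable": False})
        if key not in checkpoint_keys:
            entry["pretrained"] = False  # at least one weight keeps its random init
        if key in trainable_keys:
            entry["trainable"] = True    # at least one weight receives updates
    return report


def random_and_frozen(report):
    """Modules that are neither loaded from the checkpoint nor trained."""
    return sorted(
        name for name, entry in report.items()
        if not entry["pretrained"] and not entry["trainable"]
    )


# Stand-in key sets mirroring the setup described above (assumed, not measured).
model_keys = [
    "image_encoder.weight", "audio_encoder.weight",
    "fusion_encoder.weight", "classifier.weight",
]
checkpoint_keys = {"image_encoder.weight", "audio_encoder.weight"}
trainable_keys = {"classifier.weight"}

report = audit_modules(model_keys, checkpoint_keys, trainable_keys)
flagged = random_and_frozen(report)
print(flagged)  # ['fusion_encoder']
```

Under these assumed key sets only the Fusion Encoder is flagged: the classifier is also randomly initialized, but it is trained, so it is not reported.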

Thank you for your time. I would greatly appreciate any clarification or correction.
