This repository was archived by the owner on Nov 1, 2024. It is now read-only.

How to get word embeddings from the trained TransCoder model? #49

@sushantkumar007007


I am trying to extract word embeddings for the tokens in various tokenized (.tok) files. I have preprocessed the datasets using the preprocessing pipeline suggested in TransCoder. I have also trained the model, and I can also use the pretrained TransCoder model. From either one, I want to extract the embedding matrix and the embedding vectors of the tokens in each tokenized file.
The authors plotted a t-SNE visualization of cross-lingual token embeddings, which they obtained by encoding programming language tokens into TransCoder's lookup table.
Could the authors explain how this was done? I would also like to extract the embeddings of these tokens.
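
To make the question concrete, here is a minimal sketch of what I understand "encoding tokens into the lookup table" to mean: indexing rows of the embedding matrix by token id. This is not the authors' method. The function name `lookup_token_embeddings`, the `word2id` mapping, and the `<unk>` symbol are all hypothetical placeholders, and the toy matrix stands in for the real embedding weights.

```python
from typing import Dict, List, Sequence


def lookup_token_embeddings(
    embedding_matrix: Sequence,   # row i is the vector for token id i
    word2id: Dict[str, int],      # vocabulary mapping (hypothetical name)
    tokens: List[str],
    unk_token: str = "<unk>",     # assumed unknown-token symbol
) -> Dict[str, Sequence]:
    """Return {token: vector} by indexing rows of the lookup table."""
    vectors = {}
    for token in tokens:
        # Fall back to the unknown-token row for out-of-vocabulary tokens.
        idx = word2id.get(token, word2id.get(unk_token))
        if idx is None:
            continue  # neither the token nor unk_token is in the vocabulary
        vectors[token] = embedding_matrix[idx]
    return vectors


# Toy stand-ins for the real embedding matrix and vocabulary.
toy_matrix = [[0.0, 0.1], [0.2, 0.3], [0.4, 0.5]]
toy_vocab = {"<unk>": 0, "def": 1, "return": 2}

result = lookup_token_embeddings(toy_matrix, toy_vocab, ["def", "foo"])
```

With a real checkpoint, I assume the matrix would be the embedding weight tensor from the model's state dict and `word2id` would come from the saved dictionary, but I have not confirmed the exact key names. If the vocabulary is built on BPE units, the tokens from the .tok files would presumably need the same BPE segmentation before the lookup.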
