This repository was archived by the owner on Nov 1, 2024. It is now read-only.
I am trying to extract word embeddings for the tokens in various tokenized (.tok) files. I preprocessed the datasets using the preprocessing pipeline suggested in TransCoder. I have also trained the model, and I can also use the pretrained TransCoder model, to extract the embedding matrix and the embedding vectors of the tokens in each tokenized file.
The authors plotted a t-SNE visualization of cross-lingual token embeddings, which they obtained by encoding programming language tokens into TransCoder's lookup table.
Could the authors explain how this was done? I would also like to extract the embeddings of these tokens.
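To make the question concrete, here is a minimal sketch of the kind of lookup I have in mind. It is not TransCoder's actual code: it assumes the embedding matrix has already been exported as a list of rows and that a vocabulary maps each token to its row index. The names `embed_tokens`, `vocab`, `embedding_matrix`, and the `<unk>` fallback token are all hypothetical and only there for illustration.

```python
from typing import Dict, List, Sequence


def embed_tokens(
    tokens: Sequence[str],
    vocab: Dict[str, int],
    embedding_matrix: Sequence[Sequence[float]],
    unk_token: str = "<unk>",  # assumed fallback token, not confirmed by the repo
) -> List[List[float]]:
    """Return one embedding vector per token by indexing rows of the matrix."""
    unk_index = vocab.get(unk_token)
    vectors = []
    for tok in tokens:
        # Fall back to the unknown-token row when the token is not in the vocabulary.
        idx = vocab.get(tok, unk_index)
        if idx is None:
            raise KeyError(f"token {tok!r} not in vocabulary and no {unk_token!r} entry")
        vectors.append(list(embedding_matrix[idx]))
    return vectors


# Toy data standing in for a real vocabulary and lookup table (hypothetical values).
vocab = {"<unk>": 0, "def": 1, "return": 2, "int": 3}
embedding_matrix = [
    [0.0, 0.0],  # <unk>
    [0.1, 0.2],  # def
    [0.3, 0.4],  # return
    [0.5, 0.6],  # int
]

# One line of a .tok file is assumed to be whitespace-separated tokens.
line = "def foo return"
vectors = embed_tokens(line.split(), vocab, embedding_matrix)
```

With real data, the resulting vectors for each language's tokens could then be passed to a t-SNE implementation for plotting. What I am missing is how to get the actual vocabulary and lookup table out of the trained or pretrained model.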