No auxiliary tasks like BERT's next-sentence prediction were used for any model described here.
But in PEER, the [CLS] token is used as the protein-level embedding for ProtBert. Since ProtBert was pretrained without a next-sentence prediction objective, its [CLS] token never receives a sequence-level training signal, so it may not be able to represent the whole sequence.
For ProtBert, should we use the same strategy as for ESM (i.e., mean pooling over all residues) to get a fairer comparison?
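To illustrate what I mean, here is a minimal sketch of mean pooling over residue embeddings for ProtBert (this is not the PEER code; the model name is the Rostlab release on Hugging Face and the sequence is a toy example):

```python
# Sketch: residue-mean pooling for ProtBert, mirroring the ESM strategy.
import re
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("Rostlab/prot_bert", do_lower_case=False)
model = BertModel.from_pretrained("Rostlab/prot_bert")
model.eval()

sequence = "MKTAYIAKQR"  # toy sequence; ProtBert expects space-separated residues
sequence = " ".join(re.sub(r"[UZOB]", "X", sequence))

inputs = tokenizer(sequence, return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state  # (1, seq_len, 1024)

# Mask out [CLS]/[SEP]/padding so only residue positions enter the mean.
mask = inputs["attention_mask"].clone()
mask[:, 0] = 0                                                              # [CLS]
mask[torch.arange(mask.size(0)), inputs["attention_mask"].sum(1) - 1] = 0  # [SEP]
mask = mask.unsqueeze(-1).float()

mean_pooled = (hidden * mask).sum(1) / mask.sum(1)  # residue-mean embedding
cls_pooled = hidden[:, 0]                           # current [CLS]-based embedding
```

Swapping `cls_pooled` for `mean_pooled` would make the ProtBert setup match the pooling used for ESM, if that is the fairer comparison.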