Skip to content

选择albert模型tokenizer加载错误 #11

@Wang-Daoji

Description

@Wang-Daoji

The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization.
The tokenizer class you load from this checkpoint is 'BertTokenizer'.
The class this function is called from is 'AlbertTokenizer'.
Traceback (most recent call last):
File "/home/efsz/localCode/Pytorch-NLU/test/tc/tet_tc_base_multi_label.py", line 73, in
lc.process()
File "/home/efsz/localCode/Pytorch-NLU/pytorch_nlu/pytorch_textclassification/tcRun.py", line 32, in process
self.corpus = Corpus(self.config, self.logger)
File "/home/efsz/localCode/Pytorch-NLU/pytorch_nlu/pytorch_textclassification/tcData.py", line 25, in init
self.tokenizer = self.load_tokenizer(self.config)
File "/home/efsz/localCode/Pytorch-NLU/pytorch_nlu/pytorch_textclassification/tcData.py", line 206, in load_tokenizer
tokenizer = PRETRAINED_MODEL_CLASSES[config.model_type][1].from_pretrained(config.pretrained_model_name_or_path)
File "/home/efsz/anaconda3/envs/nlu/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1854, in from_pretrained
return cls._from_pretrained(
File "/home/efsz/anaconda3/envs/nlu/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2017, in _from_pretrained
tokenizer = cls(*init_inputs, **init_kwargs)
File "/home/efsz/anaconda3/envs/nlu/lib/python3.10/site-packages/transformers/models/albert/tokenization_albert.py", line 183, in init
self.sp_model.Load(vocab_file)
File "/home/efsz/anaconda3/envs/nlu/lib/python3.10/site-packages/sentencepiece/init.py", line 905, in Load
return self.LoadFromFile(model_file)
File "/home/efsz/anaconda3/envs/nlu/lib/python3.10/site-packages/sentencepiece/init.py", line 310, in LoadFromFile
return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)
TypeError: not a string

其他模型好像没问题,但是albert的tokenizer加载会报这个错

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions