Issue with loading Additional Entities #19

@seanaedmiston

Description

I have tried to load additional entities as per the README by running preprocess_all. Everything appears to run fine; however, when I try to load the refined model afterwards with something like:

refined = Refined(
    model_file_or_model=data_dir + "/wikipedia_model_with_numbers/model.pt",
    model_config_file_or_model_config=data_dir + "/wikipedia_model_with_numbers/config.json",
    entity_set="wikidata",
    data_dir=data_dir,
    use_precomputed_descriptions=False,
    download_files=False,
    preprocessor=preprocessor
)

I get an error like:

Traceback (most recent call last):
  File "/home/azureuser/Hafnia/email_ee/email_refined.py", line 91, in <module>
    refined = Refined(
  File "/home/azureuser/ReFinED/src/refined/inference/processor.py", line 100, in __init__
    self.model = RefinedModel.from_pretrained(
  File "/home/azureuser/ReFinED/src/refined/model_components/refined_model.py", line 643, in from_pretrained
    model.load_state_dict(checkpoint, strict=False)
  File "/home/azureuser/.pyenv/versions/venv3108/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1671, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for RefinedModel:
        size mismatch for entity_typing.linear.weight: copying a param with shape torch.Size([1369, 768]) from checkpoint, the shape in current model is torch.Size([1447, 768]).
        size mismatch for entity_typing.linear.bias: copying a param with shape torch.Size([1369]) from checkpoint, the shape in current model is torch.Size([1447]).
        size mismatch for entity_disambiguation.classifier.weight: copying a param with shape torch.Size([1, 1372]) from checkpoint, the shape in current model is torch.Size([1, 1450]).

To the best of my understanding, this is because the number of classes in the Wikidata dump has changed since the original model was trained. (class_to_label.json now has 1446 entries.) Is there any way to accommodate this without completely retraining the model?
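One possible workaround (a sketch only, not an official ReFinED API) is to filter the checkpoint before calling load_state_dict, dropping any parameters whose shapes no longer match the freshly built model, and then re-initialising or fine-tuning the affected heads (here entity_typing.linear and entity_disambiguation.classifier). The helper name filter_matching_params below is hypothetical:

```python
def filter_matching_params(checkpoint, model_state):
    """Keep only checkpoint entries whose name and shape match the model.

    Works with torch tensors (via their .shape attribute) or with plain
    shape tuples, so the filtering logic can be tested without torch.
    Returns the filtered dict plus the names that were dropped.
    """
    def shape_of(value):
        # torch.Tensor exposes .shape; plain tuples are used as-is
        return tuple(getattr(value, "shape", value))

    kept, dropped = {}, []
    for name, value in checkpoint.items():
        if name in model_state and shape_of(value) == shape_of(model_state[name]):
            kept[name] = value
        else:
            dropped.append(name)
    return kept, dropped


# With a real model and checkpoint you would then do something like:
#   kept, dropped = filter_matching_params(checkpoint, model.state_dict())
#   model.load_state_dict(kept, strict=False)
# The layers listed in `dropped` keep their fresh random initialisation,
# so they would need fine-tuning before predictions are meaningful.
```

Note this only gets the model to load; the mismatched typing/disambiguation layers will produce untrained outputs until they are fine-tuned on the new class set.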
