
A glitch in WN18RR data #13

@navdeepkjohal

Dear Authors,

I found a small glitch in the WN18RR data provided in this repository. Although data/wn18rr/entity.dict lists 40943 entities, only 40559 of them actually appear in train.txt. That leaves 40943 - 40559 = 384 entities that never occur in train.txt and are present only in valid.txt and test.txt. The model therefore has to do zero-shot inference for these entities at validation/test time, which might have adversely affected the performance of your model. For instance, entity id 14501545 does not occur in train.txt although it is listed in the entities.dict file.
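For anyone who wants to reproduce the count, here is a minimal sketch of the check. It rests on a few assumptions that are not confirmed above: that train.txt holds one tab-separated `head<TAB>relation<TAB>tail` triple per line, that the dictionary file holds one `index<TAB>entity` pair per line, and that the dictionary file name is passed in by the caller (`dict_name` is a hypothetical parameter, as are all function names below).

```python
from pathlib import Path


def entities_in_triples(lines):
    """Collect head and tail entities from 'head<TAB>relation<TAB>tail' lines.

    The three-column, tab-separated layout is an assumption about train.txt.
    """
    seen = set()
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        head, _relation, tail = parts
        seen.update((head, tail))
    return seen


def entities_in_dict(lines):
    """Collect entity names from 'index<TAB>entity' lines (assumed layout)."""
    names = set()
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) == 2:
            names.add(parts[1])
    return names


def unseen_entities(data_dir, dict_name="entities.dict"):
    """Return entities listed in the dictionary file but absent from train.txt."""
    data_dir = Path(data_dir)
    with open(data_dir / dict_name, encoding="utf-8") as f:
        all_entities = entities_in_dict(f)
    with open(data_dir / "train.txt", encoding="utf-8") as f:
        train_entities = entities_in_triples(f)
    return all_entities - train_entities
```

Calling `unseen_entities("data/wn18rr")` and taking `len(...)` of the result should give the number of entities that the model never sees during training, provided the file layouts match the assumptions above.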

Apologies if I missed something, or my interpretation is wrong.

Best
Navdeep
