Training data ordering bias #45

@lukebarton

Depending on the order in which the training examples are provided, the model can be extremely biased. Is this expected?

https://colab.research.google.com/drive/1nkzYd9m6Q2aIx5v3coepbmvC-mipGeSm

swim, expect zing
[('zoob', 0.7240370727788529), ('zing', 0.275962927221147)]
[('zing', 0.5323253870010376), ('zoob', 0.4676746129989624)] zoobs (cats) first
meow, expect zoob
[('zing', 0.5290462532355265), ('zoob', 0.47095374676447355)]
[('zoob', 0.8305825838620241), ('zing', 0.16941741613797592)] zoobs (cats) first

When the zoob label is added to the classifier first, confidence in the expected label swings by ~26 and ~36 percentage points, which flips the top answer in both cases. The prediction depends on the order of the training input.
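To make the size of the swing easy to check, here is a minimal sketch that recomputes it from the scores pasted above. It does not call the classifier. The names `RESULTS`, `other_order`, `zoobs_first` and `ordering_swing` are made up for this illustration, and I am assuming the unannotated line in each pair is the run where zoobs were not added first.

```python
# Scores copied verbatim from the output above.
# Assumption: the unannotated line in each pair is the run where zoobs
# were NOT added first; "other_order" is a made-up name for that run.
RESULTS = {
    "swim": {
        "expected": "zing",
        "other_order": [("zoob", 0.7240370727788529), ("zing", 0.275962927221147)],
        "zoobs_first": [("zing", 0.5323253870010376), ("zoob", 0.4676746129989624)],
    },
    "meow": {
        "expected": "zoob",
        "other_order": [("zing", 0.5290462532355265), ("zoob", 0.47095374676447355)],
        "zoobs_first": [("zoob", 0.8305825838620241), ("zing", 0.16941741613797592)],
    },
}


def ordering_swing(expected, run_a, run_b):
    """Return (change in the expected label's score, whether the top label changed)."""
    score_a = dict(run_a)[expected]
    score_b = dict(run_b)[expected]
    # Pick the top label by score rather than relying on list order.
    top_a = max(run_a, key=lambda pair: pair[1])[0]
    top_b = max(run_b, key=lambda pair: pair[1])[0]
    return score_b - score_a, top_a != top_b


for query, r in RESULTS.items():
    swing, flipped = ordering_swing(r["expected"], r["other_order"], r["zoobs_first"])
    print(f"{query}: {swing * 100:+.1f} points for '{r['expected']}', top label flipped: {flipped}")
```

For these numbers the swing comes out at about +25.6 points for swim and +36.0 points for meow, and the top label flips in both cases. The same helper could be pointed at fresh classifier output to check whether a fix makes the result independent of training order.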

Labels

bug: Something isn't working
