@xelandernt
Contributor

@xelandernt xelandernt commented Oct 7, 2025

Description of Changes

In #174, which was supposed to resolve #173, I only fixed part of the problem. The PlackettLuce model still produces the following results on the main branch:

from openskill.models import PlackettLuce

model = PlackettLuce()

player_1 = model.rating()
player_2 = model.rating()

print(player_1)
print(player_2)


ranks = [1, 1]

new_ratings = model.rate([[player_1], [player_2]], ranks=ranks)

print(new_ratings) 
# [[PlackettLuceRating(mu=27.635389493140497, sigma=8.06590141354368)], [PlackettLuceRating(mu=27.635389493140497, sigma=8.06590141354368)]]

--> Both players gain rating even in a tie: mu rises from the default 25.0 to about 27.64.

I read through the paper, compared it with the code, and I believe there was a small mistake made here:

image

I fixed this, and the fix resolves the issue.
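
A regression check for this behavior could look like the sketch below. It is not part of this PR: `mu_gains` and `tie_is_neutral` are hypothetical helpers, ratings are represented as plain `(mu, sigma)` tuples rather than openskill objects, and it assumes the default starting values mu = 25.0 and sigma = 25/3 as well as the expectation that a tie between identically rated players should leave mu unchanged. The "after" values are the ones reported on main above.

```python
from typing import List, Sequence, Tuple

# Hypothetical stand-in for a rating object: (mu, sigma).
Rating = Tuple[float, float]


def mu_gains(before: Sequence[Rating], after: Sequence[Rating]) -> List[float]:
    """Per-player change in mu between two snapshots of the same players."""
    return [new[0] - old[0] for old, new in zip(before, after)]


def tie_is_neutral(
    before: Sequence[Rating], after: Sequence[Rating], tol: float = 1e-9
) -> bool:
    """True if no player's mu moved by more than tol.

    Assumption: identically rated players who tie should keep their mu.
    """
    return all(abs(gain) <= tol for gain in mu_gains(before, after))


# Assumed defaults for a fresh rating: mu = 25.0, sigma = 25 / 3.
before = [(25.0, 25 / 3), (25.0, 25 / 3)]

# Values reported on main for ranks=[1, 1] in the description above.
after_on_main = [
    (27.635389493140497, 8.06590141354368),
    (27.635389493140497, 8.06590141354368),
]

print(mu_gains(before, after_on_main))
print(tie_is_neutral(before, after_on_main))  # False: both players gained mu
```

With openskill installed, the tuples could instead be built from real results, for example `[(r.mu, r.sigma) for team in new_ratings for r in team]`.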

How has this affected benchmarks?

Please Double Check!!

  • no changes in draw.ipynb
  • no changes in rank.ipynb
  • no changes in win.ipynb
  • changes in benchmark.py only for the PlackettLuce model

The benchmark is now slightly worse, but this is consistent with the change in performance of all the other models in #174.


Benchmark Results:
               Rating System Benchmark Results - Margin Comparison               
┏━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Model                  ┃ Margin ┃ Accuracy       ┃ Predictions ┃ Avg Time (s) ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩
│ PlackettLuce           │ 0.0    │ 62.30% ± 1.58% │ 1342/2154   │ 1.76         │
│ PlackettLuce           │ 1.0    │ 64.53% ± 1.32% │ 1390/2154   │ 1.66         │
│ ThurstoneMostellerPart │ 0.0    │ 59.89% ± 0.88% │ 1290/2154   │ 1.69         │
│ ThurstoneMostellerPart │ 1.0    │ 61.00% ± 0.48% │ 1314/2154   │ 1.66         │
│ ThurstoneMostellerFull │ 0.0    │ 65.32% ± 1.47% │ 1407/2154   │ 1.71         │
│ ThurstoneMostellerFull │ 1.0    │ 65.18% ± 1.24% │ 1404/2154   │ 1.69         │
│ BradleyTerryFull       │ 0.0    │ 62.30% ± 1.58% │ 1342/2154   │ 1.77         │
│ BradleyTerryFull       │ 1.0    │ 63.97% ± 0.97% │ 1378/2154   │ 1.66         │
│ BradleyTerryPart       │ 0.0    │ 62.30% ± 1.58% │ 1342/2154   │ 1.57         │
│ BradleyTerryPart       │ 1.0    │ 63.97% ± 0.97% │ 1378/2154   │ 1.58         │
└────────────────────────┴────────┴────────────────┴─────────────┴──────────────┘
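
As a quick sanity check on the table, the Accuracy column appears to be the Predictions ratio expressed as a percentage. The sketch below recomputes it from the counts copied from the table. `accuracy_percent` is a hypothetical helper, and the ± spread is not reproduced because the table does not contain the data needed to derive it.

```python
# (model, margin) -> (correct, total, reported accuracy in %), copied from the table above.
rows = {
    ("PlackettLuce", 0.0): (1342, 2154, 62.30),
    ("PlackettLuce", 1.0): (1390, 2154, 64.53),
    ("ThurstoneMostellerPart", 0.0): (1290, 2154, 59.89),
    ("ThurstoneMostellerPart", 1.0): (1314, 2154, 61.00),
    ("ThurstoneMostellerFull", 0.0): (1407, 2154, 65.32),
    ("ThurstoneMostellerFull", 1.0): (1404, 2154, 65.18),
    ("BradleyTerryFull", 0.0): (1342, 2154, 62.30),
    ("BradleyTerryFull", 1.0): (1378, 2154, 63.97),
    ("BradleyTerryPart", 0.0): (1342, 2154, 62.30),
    ("BradleyTerryPart", 1.0): (1378, 2154, 63.97),
}


def accuracy_percent(correct: int, total: int) -> float:
    """Share of correct predictions as a percentage, rounded to two decimals."""
    return round(100.0 * correct / total, 2)


for (model, margin), (correct, total, reported) in rows.items():
    computed = accuracy_percent(correct, total)
    # Flag any row where the recomputed value disagrees with the reported one.
    status = "ok" if abs(computed - reported) < 0.005 else "MISMATCH"
    print(f"{model:<24} margin={margin} computed={computed:.2f}% reported={reported:.2f}% {status}")
```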


Issue(s) Resolved

Fully fixes #173

Affirmation

By submitting this Pull Request or typing my (user)name below,
I affirm the Developer Certificate of Origin
with respect to all commits and content included in this PR,
and understand I am releasing the same under openskill.py's MIT license.

I certify the above statement is true and correct: xelandernt

@codecov

codecov bot commented Oct 7, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 100.00%. Comparing base (d5226bc) to head (c5db88a).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff            @@
##              main      #176   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           10        10           
  Lines         2047      2047           
  Branches       513       513           
=========================================
  Hits          2047      2047           


@xelandernt
Contributor Author

@vivekjoshy I have double-checked all the benchmarks and everything is as described in the PR.

If you have time, could you please have a look at this PR?

@vivekjoshy
Owner

vivekjoshy commented Oct 28, 2025

Thanks! I kinda forgot about this PR, I'll go over it soon!

@vivekjoshy
Owner

I have reviewed this. I'll merge this for now and push a fix that improves accuracy in a few days.

@vivekjoshy vivekjoshy merged commit 874ebc5 into vivekjoshy:main Nov 1, 2025
43 checks passed
@vivekjoshy
Owner

Thank you btw for the fixes you put in :)


Development

Successfully merging this pull request may close these issues.

Ties give unexpected results

2 participants