The paper *Uncertainty in Bayesian Leave-One-Out Cross-Validation Based Model Comparison* describes when the normal approximation based on `elpd_diff` and `se_diff` can be expected to be well calibrated. The paper includes case studies, with code at https://users.aalto.fi/~ave/casestudies/LOO_uncertainty/loo_uncertainty.html. In that code, the probability that a model has a worse elpd_loo than the best model is computed separately, and the text comments on when we can expect the normal approximation to be well calibrated. We could add this probability, plus a diagnostic message, to the `loo_compare()` output.
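For concreteness, here is a minimal sketch of how `p_worse` could be computed from the existing `loo_compare()` output under the normal approximation; the object names `loo1`..`loo4` and the exact handling of the best-model row are assumptions on my part:

```r
library(loo)

# Sketch: probability that each model has worse elpd_loo than the best,
# under the normal approximation N(elpd_diff, se_diff^2).
cmp <- loo_compare(loo1, loo2, loo3, loo4)  # assumes loo1..loo4 exist
# elpd_diff is <= 0 relative to the best model, so
# P(true difference < 0) = pnorm(-elpd_diff / se_diff)
p_worse <- pnorm(-cmp[, "elpd_diff"] / cmp[, "se_diff"])
p_worse[1] <- NA  # best model: elpd_diff = se_diff = 0, so undefined
round(p_worse, 2)
```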
Example outputs would look like this.
Small data:

```r
> loo_compare(loo1, loo2, loo3, loo4)
    elpd_diff se_diff p_worse diag_pnorm
M_4       0.0     0.0      NA N < 100
M_3      -3.9     1.6    0.99 N < 100
M_1      -4.0     2.4    0.95 N < 100
M_2      -4.7     2.5    0.97 N < 100
```
Outliers (the khat threshold depends on the number of observations):

```r
> loo_compare(M_1, M_2, M_3)
    elpd_diff se_diff p_worse diag_pnorm
M_3       0.0     0.0      NA
M_2     -13.5     9.8    0.92 khat_diff > 0.54
M_1     -78.1    20.8    1.00 khat_diff > 0.54
```
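As a reference point, a plausible sample-size-dependent threshold (my assumption, following the usual Pareto-khat convention; it matches the 0.54 shown above for n around 150) would be:

```r
# Assumed threshold rule for khat of the pointwise elpd differences,
# following the common Pareto-khat convention min(1 - 1/log10(n), 0.7).
khat_threshold <- function(n) min(1 - 1 / log10(n), 0.7)
khat_threshold(150)  # ~0.54, matching the message above
```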
All good:

```r
> loo_compare(M_1t, M_2t, M_3t)
     elpd_diff se_diff p_worse diag_pnorm
M_3t       0.0     0.0      NA
M_2t     -44.8     8.6    1.00
M_1t    -118.7    15.9    1.00
```
Models that have very similar predictions:

```r
> loo_compare(loo(M_3), loo(M_4))
    elpd_diff se_diff p_worse diag_pnorm
M_4       0.0     0.0      NA
M_3      -2.2     3.2    0.76 similar predictions
```
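A rough sketch of how the `diag_pnorm` messages could be assembled; all thresholds here are my assumptions for illustration (small n, large khat of the pointwise elpd differences, and a small `elpd_diff` indicating similar predictions), not final rules:

```r
# Sketch: build one diagnostic string per model (thresholds are
# illustrative assumptions, not the proposed final rules).
diag_pnorm <- function(n, khat_diff, elpd_diff) {
  thr <- min(1 - 1 / log10(n), 0.7)
  if (n < 100) return("N < 100")
  if (khat_diff > thr) return(sprintf("khat_diff > %.2f", thr))
  if (abs(elpd_diff) < 4) return("similar predictions")
  ""
}
diag_pnorm(n = 150, khat_diff = 0.9, elpd_diff = -13.5)
#> [1] "khat_diff > 0.54"
```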
I have a branch that clarifies the diagnostics and illustrates that some things probably need more thought. For now I added `diag_pnorm` as a string, which meant I had to switch to using a `data.frame`, and I want to show one digit for `elpd_diff` and `se_diff` but two digits for `p_worse`.
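For the mixed-digit printing, one possible approach (a sketch; the column values are taken from the small-data example above) is to format each column to character before building the `data.frame`:

```r
# Sketch: one digit for elpd_diff/se_diff, two digits for p_worse,
# diag_pnorm as a plain string column.
out <- data.frame(
  elpd_diff  = sprintf("%.1f", c(0.0, -3.9, -4.0, -4.7)),
  se_diff    = sprintf("%.1f", c(0.0, 1.6, 2.4, 2.5)),
  p_worse    = c("NA", sprintf("%.2f", c(0.99, 0.95, 0.97))),
  diag_pnorm = "N < 100",  # recycled to all four rows
  row.names  = c("M_4", "M_3", "M_1", "M_2")
)
print(out, right = FALSE)
```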