Add normal approximation based probabilities to loo_compare #299

@avehtari

The paper "Uncertainty in Bayesian Leave-One-Out Cross-Validation Based Model Comparison" describes when the normal approximation based on elpd_diff and se_diff can be expected to be well calibrated. The paper includes case studies, with the code at https://users.aalto.fi/~ave/casestudies/LOO_uncertainty/loo_uncertainty.html. In that code, the probability that a model has worse elpd_loo than the best model is computed separately, and the text comments on when we can expect the normal approximation to be well calibrated. We could add the probability that a model has worse elpd_loo, together with a diagnostic message, to the loo_compare() output.
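As a rough illustration of the normal approximation, the sketch below computes the probability from elpd_diff and se_diff with `pnorm()`. The helper name `p_worse()` is hypothetical and not part of loo, and the formula is an assumption: it reproduces the p_worse values in the "Small data" example of this issue, but the branch may compute them differently.

```r
# Values copied from the "Small data" example in this issue.
cmp <- data.frame(
  elpd_diff = c(0.0, -3.9, -4.0, -4.7),
  se_diff   = c(0.0, 1.6, 2.4, 2.5),
  row.names = c("M_4", "M_3", "M_1", "M_2")
)

# Hypothetical helper (not a loo function): probability, under a normal
# approximation with mean elpd_diff and sd se_diff, that the true elpd
# difference to the reference model is below zero.
p_worse <- function(elpd_diff, se_diff) {
  p <- pnorm(0, mean = elpd_diff, sd = se_diff)
  p[se_diff == 0] <- NA_real_  # reference row: nothing to compare against
  p
}

cmp$p_worse <- p_worse(cmp$elpd_diff, cmp$se_diff)
print(round(cmp, 2))
```

The reference model in the first row gets NA because it is compared with itself.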

Example outputs would look like

Small data:

> loo_compare(loo1, loo2, loo3, loo4)
    elpd_diff se_diff p_worse diag_pnorm
M_4       0.0     0.0      NA    N < 100
M_3      -3.9     1.6    0.99    N < 100
M_1      -4.0     2.4    0.95    N < 100
M_2      -4.7     2.5    0.97    N < 100

Outliers (khat threshold depends on the number of observations):

> loo_compare(M_1, M_2, M_3)
    elpd_diff se_diff p_worse       diag_pnorm
M_3       0.0     0.0      NA                 
M_2     -13.5     9.8    0.92 khat_diff > 0.54
M_1     -78.1    20.8    1.00 khat_diff > 0.54

All good:

> loo_compare(M_1t, M_2t, M_3t)
     elpd_diff se_diff p_worse diag_pnorm
M_3t       0.0     0.0      NA           
M_2t     -44.8     8.6    1.00           
M_1t    -118.7    15.9    1.00           

Models that have very similar predictions:

> loo_compare(loo(M_3), loo(M_4))
    elpd_diff se_diff p_worse          diag_pnorm
M_4       0.0     0.0      NA                    
M_3      -2.2     3.2    0.76 similar predictions

I have a branch that clarifies the diagnostics and illustrates some things that probably need more thought. I have now added diag_pnorm as a string, which meant I had to switch to using a data.frame, and I want to show one digit for elpd_diff and se_diff but two digits for p_worse.
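One possible way to get per-column digits in a data.frame that also holds a string column is to format the numeric columns to character before printing. This is only a sketch: `format_compare()` is a hypothetical helper, not a loo function, and the values are copied from the "Outliers" example above.

```r
# Values copied from the "Outliers" example in this issue.
cmp <- data.frame(
  elpd_diff  = c(0.0, -13.5, -78.1),
  se_diff    = c(0.0, 9.8, 20.8),
  p_worse    = c(NA, 0.92, 1.00),
  diag_pnorm = c("", "khat_diff > 0.54", "khat_diff > 0.54"),
  row.names  = c("M_3", "M_2", "M_1"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: convert numeric columns to fixed-point strings,
# one decimal for elpd_diff and se_diff, two decimals for p_worse.
format_compare <- function(x) {
  out <- x
  out$elpd_diff <- formatC(x$elpd_diff, format = "f", digits = 1)
  out$se_diff   <- formatC(x$se_diff, format = "f", digits = 1)
  out$p_worse   <- ifelse(is.na(x$p_worse), "NA",
                          formatC(x$p_worse, format = "f", digits = 2))
  out
}

fc <- format_compare(cmp)
print(fc)
```

The trade-off is that the formatted columns are character, so this step belongs in a print method and should not change the stored object.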
