Commit df3d01f

Fix links and standardize section titles in coverage documentation
1 parent d228c7d commit df3d01f

5 files changed, 74 additions and 14 deletions

doc/irm/apo.qmd

Lines changed: 3 additions & 3 deletions
@@ -24,7 +24,7 @@ init_notebook_mode(all_interactive=True)

 ## APO Pointwise Coverage

-The simulations are based on the [make_irm_data_discrete_treatments](https://docs.doubleml.org/stable/api/api.html#datasets-module)-DGP with $500$ observations. Due to the linearity of the DGP, Lasso and Logit Regression are nearly optimal choices for the nuisance estimation.
+The simulations are based on the [make_irm_data_discrete_treatments](https://docs.doubleml.org/stable/api/datasets.html#dataset-generators)-DGP with $500$ observations. Due to the linearity of the DGP, Lasso and Logit Regression are nearly optimal choices for the nuisance estimation.

 ::: {.callout-note title="Metadata" collapse="true"}

@@ -80,7 +80,7 @@ generate_and_show_styled_table(

 ## APOS Coverage

-The simulations are based on the [make_irm_data_discrete_treatments](https://docs.doubleml.org/stable/api/api.html#datasets-module)-DGP with $500$ observations. Due to the linearity of the DGP, Lasso and Logit Regression are nearly optimal choices for the nuisance estimation.
+The simulations are based on the [make_irm_data_discrete_treatments](https://docs.doubleml.org/stable/api/datasets.html#dataset-generators)-DGP with $500$ observations. Due to the linearity of the DGP, Lasso and Logit Regression are nearly optimal choices for the nuisance estimation.

 The non-uniform results (coverage, CI length and bias) refer to averaged values over all quantiles (point-wise confidence intervals).

@@ -136,7 +136,7 @@ generate_and_show_styled_table(

 ## Causal Contrast Coverage

-The simulations are based on the [make_irm_data_discrete_treatments](https://docs.doubleml.org/stable/api/api.html#datasets-module)-DGP with $500$ observations. Due to the linearity of the DGP, Lasso and Logit Regression are nearly optimal choices for the nuisance estimation.
+The simulations are based on the [make_irm_data_discrete_treatments](https://docs.doubleml.org/stable/api/datasets.html#dataset-generators)-DGP with $500$ observations. Due to the linearity of the DGP, Lasso and Logit Regression are nearly optimal choices for the nuisance estimation.

 The non-uniform results (coverage, CI length and bias) refer to averaged values over all quantiles (point-wise confidence intervals).
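
The "non-uniform" results described in these hunks are averages of pointwise quantities over several parameters. Below is a minimal sketch of that aggregation for one repetition. It assumes that bias is reported as a mean absolute deviation, which the diff does not state. `pointwise_summary` is a hypothetical helper, not part of the repository, and all numbers are made up.

```python
from statistics import fmean


def pointwise_summary(theta_true, estimates, ci_lower, ci_upper):
    """Average pointwise coverage, CI length and absolute bias over all parameters."""
    covered = [
        1.0 if lower <= theta <= upper else 0.0
        for theta, lower, upper in zip(theta_true, ci_lower, ci_upper)
    ]
    return {
        "coverage": fmean(covered),
        "ci_length": fmean(u - l for l, u in zip(ci_lower, ci_upper)),
        # Assumption: bias is summarised as the mean absolute deviation.
        "bias": fmean(abs(e - t) for e, t in zip(estimates, theta_true)),
    }


# Made-up values for three parameters (e.g. three treatment levels).
theta_true = [0.0, 1.0, 2.0]
estimates = [0.25, 0.75, 2.75]
ci_lower = [-0.5, 0.5, 2.25]
ci_upper = [0.5, 1.5, 3.25]

summary = pointwise_summary(theta_true, estimates, ci_lower, ci_upper)
print(summary)
```

In a full simulation these per-repetition summaries would be averaged again over all repetitions.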

doc/irm/iivm.qmd

Lines changed: 2 additions & 2 deletions
@@ -22,9 +22,9 @@ from utils.style_tables import generate_and_show_styled_table
 init_notebook_mode(all_interactive=True)
 ```

-## LATE Coverage
+## Coverage

-The simulations are based on the [make_iivm_data](https://docs.doubleml.org/stable/api/generated/doubleml.datasets.make_iivm_data.html)-DGP with $500$ observations. Due to the linearity of the DGP, Lasso and Logit Regression are nearly optimal choices for the nuisance estimation.
+The simulations are based on the [make_iivm_data](https://docs.doubleml.org/stable/api/generated/doubleml.irm.datasets.make_iivm_data.html)-DGP with $500$ observations. Due to the linearity of the DGP, Lasso and Logit Regression are nearly optimal choices for the nuisance estimation.

 ::: {.callout-note title="Metadata" collapse="true"}
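
Coverage in these pages is the share of simulation repetitions in which the confidence interval contains the true parameter. The toy sketch below shows that loop with a sample mean standing in for the DoubleML estimator. It is not the repository's simulation code: `simulate_coverage`, the Gaussian data, the true value and the number of repetitions are assumptions for illustration. Only the $500$ observations are taken from the text.

```python
import random
from statistics import NormalDist, fmean, stdev


def simulate_coverage(n_obs=500, n_rep=200, level=0.95, theta=0.5, seed=42):
    """Share of repetitions whose normal-approximation CI contains theta."""
    rng = random.Random(seed)
    # Two-sided critical value, about 1.96 for level=0.95.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    hits = 0
    for _ in range(n_rep):
        sample = [rng.gauss(theta, 1.0) for _ in range(n_obs)]
        estimate = fmean(sample)
        std_err = stdev(sample) / n_obs ** 0.5
        if estimate - z * std_err <= theta <= estimate + z * std_err:
            hits += 1
    return hits / n_rep


coverage = simulate_coverage()
print(f"empirical coverage: {coverage:.3f}")
```

With a well-behaved estimator the empirical coverage should be close to the nominal level, which is what the coverage tables in these documents report for each learner combination.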

doc/irm/irm.qmd

Lines changed: 65 additions & 5 deletions
@@ -22,9 +22,9 @@ from utils.style_tables import generate_and_show_styled_table
 init_notebook_mode(all_interactive=True)
 ```

-## ATE Coverage
+## Coverage

-The simulations are based on the [make_irm_data](https://docs.doubleml.org/stable/api/generated/doubleml.datasets.make_irm_data.html)-DGP with $500$ observations. Due to the linearity of the DGP, Lasso and Logit Regression are nearly optimal choices for the nuisance estimation.
+The simulations are based on the [make_irm_data](https://docs.doubleml.org/stable/api/generated/doubleml.irm.datasets.make_irm_data.html)-DGP with $500$ observations. Due to the linearity of the DGP, Lasso and Logit Regression are nearly optimal choices for the nuisance estimation.

 ::: {.callout-note title="Metadata" collapse="true"}

@@ -37,6 +37,8 @@ print(metadata_df.T.to_string(header=False))

 :::

+### ATE
+
 ```{python}
 #| echo: false

@@ -78,9 +80,9 @@ generate_and_show_styled_table(
 ```


-## ATTE Coverage
+### ATTE

-As for the ATE, the simulations are based on the [make_irm_data](https://docs.doubleml.org/stable/api/generated/doubleml.datasets.make_irm_data.html)-DGP with $500$ observations.
+As for the ATE, the simulations are based on the [make_irm_data](https://docs.doubleml.org/stable/api/generated/doubleml.irm.datasets.make_irm_data.html)-DGP with $500$ observations.

 ::: {.callout-note title="Metadata" collapse="true"}

@@ -135,7 +137,7 @@ generate_and_show_styled_table(

 ## Sensitivity

-The simulations are based on the [make_confounded_irm_data](https://docs.doubleml.org/stable/api/generated/doubleml.datasets.make_confounded_irm_data.html#doubleml.datasets.make_confounded_irm_data)-DGP with $5,000$ observations. Since the DGP includes an unobserved confounder, we would expect a bias in the ATE estimates, leading to low coverage of the true parameter.
+The simulations are based on the [make_confounded_irm_data](https://docs.doubleml.org/stable/api/generated/doubleml.irm.datasets.make_confounded_irm_data.html#doubleml.datasets.make_confounded_irm_data)-DGP with $5,000$ observations. Since the DGP includes an unobserved confounder, we would expect a bias in the ATE estimates, leading to low coverage of the true parameter.

 The confounding is set such that both sensitivity parameters are approximately $cf_y=cf_d=0.1$, such that the robustness value $RV$ should be approximately $10\%$.
 Further, the corresponding confidence intervals are one-sided (since the direction of the bias is unknown), such that only one side should approximate the corresponding coverage level (here only the lower coverage is relevant since the bias is positive). Note that for the coverage level the value of $\rho$ has to be correctly specified, such that the coverage level will be generally (significantly) larger than the nominal level under the conservative choice of $|\rho|=1$.
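
The one-sided coverage described in the context lines above can be illustrated with a short sketch. It assumes that "lower coverage" means the share of repetitions whose lower bound lies at or below the true parameter, and "upper coverage" the share whose upper bound lies at or above it. `one_sided_coverage` is a hypothetical helper and the bounds are made-up numbers, not simulation output.

```python
def one_sided_coverage(theta_true, lower_bounds, upper_bounds):
    """Return (lower coverage, upper coverage) over all repetitions."""
    n_rep = len(lower_bounds)
    lower_cov = sum(lower <= theta_true for lower in lower_bounds) / n_rep
    upper_cov = sum(theta_true <= upper for upper in upper_bounds) / n_rep
    return lower_cov, upper_cov


# Made-up bounds from four repetitions with a positively biased estimator:
# the intervals sit mostly above the true value.
theta_true = 5.0
lower_bounds = [4.8, 5.1, 4.9, 4.7]
upper_bounds = [5.9, 6.2, 6.0, 5.8]

lower_cov, upper_cov = one_sided_coverage(theta_true, lower_bounds, upper_bounds)
print(lower_cov, upper_cov)  # 0.75 1.0
```

With a positive bias the upper bound covers the true value almost trivially, so only the lower coverage is informative in this toy case.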
@@ -245,3 +247,61 @@ generate_and_show_styled_table(
     coverage_highlight_cols=coverage_highlight_cols_sens
 )
 ```
+
+
+## Tuning
+
+The simulations are based on the [make_irm_data](https://docs.doubleml.org/stable/api/generated/doubleml.irm.datasets.make_irm_data.html)-DGP with $1000$ observations. This is only an example as the untuned version just relies on the default configuration.
+
+::: {.callout-note title="Metadata" collapse="true"}
+
+```{python}
+#| echo: false
+metadata_file = '../../results/irm/irm_ate_tune_metadata.csv'
+metadata_df = pd.read_csv(metadata_file)
+print(metadata_df.T.to_string(header=False))
+```
+
+:::
+
+### ATE
+
+```{python}
+#| echo: false
+
+# set up data
+df_ate_tune_cov = pd.read_csv("../../results/irm/irm_ate_tune_coverage.csv", index_col=None)
+
+assert df_ate_tune_cov["repetition"].nunique() == 1
+n_rep_ate_tune_cov = df_ate_tune_cov["repetition"].unique()[0]
+
+display_columns_ate_tune_cov = ["Learner g", "Learner m", "Tuned", "Bias", "CI Length", "Coverage",]
+```
+
+
+```{python}
+#| echo: false
+
+generate_and_show_styled_table(
+    main_df=df_ate_tune_cov,
+    filters={"level": 0.95},
+    display_cols=display_columns_ate_tune_cov,
+    n_rep=n_rep_ate_tune_cov,
+    level_col="level",
+    coverage_highlight_cols=["Coverage"]
+)
+```
+
+
+```{python}
+#| echo: false
+
+generate_and_show_styled_table(
+    main_df=df_ate_tune_cov,
+    filters={"level": 0.9},
+    display_cols=display_columns_ate_tune_cov,
+    n_rep=n_rep_ate_tune_cov,
+    level_col="level",
+    coverage_highlight_cols=["Coverage"]
+)
+```
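
The table chunks added in this hunk follow one pattern: check that all rows share a single repetition count, filter by confidence level, and keep the display columns. Below is a dependency-free sketch of that pattern. `rows_for_level` is a hypothetical stand-in for what the project's `generate_and_show_styled_table` helper is called with, not its implementation. The column names come from the diff, while the learner names and numbers are made up.

```python
def rows_for_level(rows, level, display_cols):
    """Return (n_rep, table) for one confidence level.

    All rows must share one repetition count. Rows are then filtered by
    `level` and reduced to the requested display columns.
    """
    repetitions = {row["repetition"] for row in rows}
    if len(repetitions) != 1:
        raise ValueError(f"expected one repetition count, got {sorted(repetitions)}")
    n_rep = repetitions.pop()
    table = [
        {col: row[col] for col in display_cols}
        for row in rows
        if row["level"] == level
    ]
    return n_rep, table


# Made-up result rows; real values would come from the coverage CSV.
results = [
    {"Learner g": "A", "Learner m": "B", "Tuned": False, "level": 0.95, "Coverage": 0.94, "repetition": 100},
    {"Learner g": "A", "Learner m": "B", "Tuned": True, "level": 0.95, "Coverage": 0.95, "repetition": 100},
    {"Learner g": "A", "Learner m": "B", "Tuned": False, "level": 0.9, "Coverage": 0.89, "repetition": 100},
]

n_rep, table = rows_for_level(results, 0.95, ["Learner g", "Learner m", "Tuned", "Coverage"])
print(n_rep, table)
```

Taking the repetition count from the same data that is filtered keeps the reported number of repetitions tied to the table being displayed.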

doc/irm/irm_cate.qmd

Lines changed: 2 additions & 2 deletions
@@ -22,9 +22,9 @@ from utils.style_tables import generate_and_show_styled_table
 init_notebook_mode(all_interactive=True)
 ```

-## CATE Coverage
+## Coverage

-The simulations are based on the [make_heterogeneous_data](https://docs.doubleml.org/stable/api/generated/doubleml.datasets.make_heterogeneous_data.html)-DGP with $2000$ observations. The groups are defined based on the first covariate, analogously to the [CATE IRM Example](https://docs.doubleml.org/stable/examples/py_double_ml_cate.html), but rely on [LightGBM](https://lightgbm.readthedocs.io/en/latest/index.html) to estimate nuisance elements (due to time constraints).
+The simulations are based on the [make_heterogeneous_data](https://docs.doubleml.org/stable/api/generated/doubleml.irm.datasets.make_heterogeneous_data.html)-DGP with $2000$ observations. The groups are defined based on the first covariate, analogously to the [CATE IRM Example](https://docs.doubleml.org/stable/examples/py_double_ml_cate.html), but rely on [LightGBM](https://lightgbm.readthedocs.io/en/latest/index.html) to estimate nuisance elements (due to time constraints).

 The non-uniform results (coverage, CI length and bias) refer to averaged values over all groups (point-wise confidence intervals).

doc/irm/irm_gate.qmd

Lines changed: 2 additions & 2 deletions
@@ -22,9 +22,9 @@ from utils.style_tables import generate_and_show_styled_table
 init_notebook_mode(all_interactive=True)
 ```

-## GATE Coverage
+## Coverage

-The simulations are based on the [make_heterogeneous_data](https://docs.doubleml.org/stable/api/generated/doubleml.datasets.make_heterogeneous_data.html)-DGP with $500$ observations. The groups are defined based on the first covariate, analogously to the [GATE IRM Example](https://docs.doubleml.org/stable/examples/py_double_ml_gate.html), but rely on [LightGBM](https://lightgbm.readthedocs.io/en/latest/index.html) to estimate nuisance elements (due to time constraints).
+The simulations are based on the [make_heterogeneous_data](https://docs.doubleml.org/stable/api/generated/doubleml.irm.datasets.make_heterogeneous_data.html)-DGP with $500$ observations. The groups are defined based on the first covariate, analogously to the [GATE IRM Example](https://docs.doubleml.org/stable/examples/py_double_ml_gate.html), but rely on [LightGBM](https://lightgbm.readthedocs.io/en/latest/index.html) to estimate nuisance elements (due to time constraints).

 The non-uniform results (coverage, CI length and bias) refer to averaged values over all groups (point-wise confidence intervals).
