Commit aba4913

Merge pull request #839 from cmu-delphi/add_more_details_to_quidel_doc
Update Quidel doc
2 parents f08d5a6 + f2bc972 commit aba4913

1 file changed

+43
-37
lines changed

docs/api/covidcast-signals/quidel.md

Lines changed: 43 additions & 37 deletions
Original file line number | Diff line number | Diff line change
@@ -20,7 +20,7 @@ grand_parent: COVIDcast Epidata API
2020
* **Earliest issue available:** July 29, 2020
2121
* **Number of data revisions since May 19, 2020:** 1
2222
* **Date of last change:** October 22, 2020
23-
* **Available for:** hrr, msa, state (see [geography coding docs](../covidcast_geography.md))
23+
* **Available for:** county, hrr, msa, state, HHS, nation (see [geography coding docs](../covidcast_geography.md))
2424
* **Time type:** day (see [date format docs](../covidcast_times.md))
2525
* **License:** [CC BY](../covidcast_licensing.md#creative-commons-attribution)
2626

@@ -68,60 +68,66 @@ $$
6868
p = \frac{100 x}{n}
6969
$$
7070

71-
We estimate p across 3 temporal-spatial aggregation schemes:
71+
We estimate p across 6 temporal-spatial aggregation schemes:
72+
- daily, at the county level;
7273
- daily, at the MSA (metropolitan statistical area) level;
7374
- daily, at the HRR (hospital referral region) level;
74-
- daily, at the state level.
75+
- daily, at the state level;
76+
- daily, at the HHS level;
77+
- daily, at the US national level.
7578

76-
**MSA and HRR levels**: In a given MSA or HRR, suppose $$N$$ COVID tests are taken
77-
in a certain time period, $$X$$ is the number of tests taken with positive
78-
results.
79+
#### Standard Error
7980

80-
For raw signals:
81-
- if $$N \geq 50$$, we simply use:
81+
We assume the estimates for each time point follow a binomial distribution. The
82+
estimated standard error then is:
8283

8384
$$
84-
p = \frac{100 X}{N}
85+
\text{se} = 100 \sqrt{ \frac{\frac{p}{100}(1- \frac{p}{100})}{N} }
8586
$$
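
The standard error formula above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `percent_positive` and `standard_error` are hypothetical and are not taken from the Quidel pipeline; `p` is a percentage (0 to 100) and `n` is the number of tests, as in the formulas.

```python
import math


def percent_positive(x: int, n: int) -> float:
    """Percentage of positive tests, p = 100 * x / n (hypothetical helper)."""
    return 100.0 * x / n


def standard_error(p: float, n: int) -> float:
    """Binomial standard error on the percentage scale.

    Converts p to a proportion, applies sqrt(q * (1 - q) / n),
    and scales the result back to percentage points.
    """
    q = p / 100.0
    return 100.0 * math.sqrt(q * (1.0 - q) / n)


if __name__ == "__main__":
    p = percent_positive(10, 100)
    print(p, standard_error(p, 100))
```

With 10 positives out of 100 tests, the estimate is 10% and the standard error is about 3 percentage points.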
8687

87-
For smoothed signals, before taking the temporal pooling average,
88-
- if $$N \geq 50$$, we also use:
88+
#### Smoothing
89+
90+
We apply two kinds of smoothing to produce the smoothed signals:
91+
92+
##### Temporal Smoothing
93+
Smoothed estimates are formed by pooling data over time. That is, daily, for
94+
each location, we first pool all data available in that location over the last 7
95+
days, and we then recompute everything described in the two subsections above.
96+
97+
Pooling in this way makes estimates available in more geographic areas, as many areas
98+
report very few tests per day, but have enough data to report when 7 days are considered.
99+
100+
##### Geographical Smoothing
101+
102+
**County, MSA and HRR levels**: In a given county, MSA or HRR, suppose $$N$$ COVID tests
103+
are taken in a certain time period, of which $$X$$ return positive
104+
results.
105+
106+
107+
For smoothed signals, after taking the temporal pooling,
108+
- if $$N \geq 50$$, we still use:
89109
$$
90110
p = \frac{100 X}{N}
91111
$$
92-
- if $$25 \leq N < 50$$, we lend $$50 - N$$ fake samples from its home state to shrink the
112+
- if $$25 \leq N < 50$$, we borrow $$50 - N$$ fake samples from its parent state to shrink the
93113
estimate to the state's mean, which means:
94114
$$
95115
p = 100 \left( \frac{N}{50} \frac{X}{N} + \frac{50 - N}{50} \frac{X_s}{N_s} \right)
96116
$$
97117
where $$N_s, X_s$$ are the number of COVID tests and the number of COVID tests
98-
taken with positive results taken in its home state in the same time period.
118+
with positive results in its parent state in the same time period.
119+
A parent state is defined as the state with the largest proportion of the population
120+
in this county/MSA/HRR.
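
The two cases above can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: the function name `smoothed_percent_positive` and the constants `MIN_OBS` and `POOL_MIN` are hypothetical, and returning `None` when $$N < 25$$ is an assumption standing in for whatever happens to such locations downstream.

```python
from typing import Optional

MIN_OBS = 50   # sample size at which no borrowing is needed
POOL_MIN = 25  # smallest sample size for which borrowing is applied


def smoothed_percent_positive(
    x: int, n: int, x_state: int, n_state: int
) -> Optional[float]:
    """Percent positive for a county/MSA/HRR after temporal pooling.

    x, n: positives and tests in the location.
    x_state, n_state: positives and tests in its parent state.
    """
    if n >= MIN_OBS:
        return 100.0 * x / n
    if n >= POOL_MIN:
        if n_state <= 0:
            return None  # nothing to borrow from (assumed handling)
        weight = n / MIN_OBS  # share of the 50 samples that are local
        local_rate = x / n
        state_rate = x_state / n_state
        return 100.0 * (weight * local_rate + (1.0 - weight) * state_rate)
    return None  # too few tests; not estimated in this sketch


if __name__ == "__main__":
    print(smoothed_percent_positive(4, 40, 200, 1000))
```

For example, a location with 4 positives out of 40 tests (10%) in a state at 20% positivity is pulled up to roughly 12%, because 10 of the 50 effective samples come from the state.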
99121

100-
**State level**: the states with fewer than 50 tests are discarded. For the
101-
rest of the states with sufficient samples,
122+
Counties with sample sizes smaller than 50 are merged into megacounties for
123+
the raw signals; counties with sample sizes smaller than 25 are merged into megacounties for
124+
the smoothed signals.
102125

126+
**State level, HHS level, National level**: locations with fewer than 50 tests are discarded. For the remaining locations,
103127
$$
104128
p = \frac{100 X}{N}
105129
$$
106130

107-
#### Standard Error
108-
109-
We assume the estimates for each time point follow a binomial distribution. The
110-
estimated standard error then is:
111-
112-
$$
113-
\text{se} = 100 \sqrt{ \frac{\frac{p}{100}(1- \frac{p}{100})}{N} }
114-
$$
115-
116-
#### Smoothing
117-
118-
Smoothed estimates are formed by pooling data over time. That is, daily, for
119-
each location, we first pool all data available in that location over the last 7
120-
days, and we then recompute everything described in the last two
121-
subsections. Pooling in this way makes estimates available in more geographic
122-
areas, as many areas report very few tests per day, but have enough data to
123-
report when 7 days are considered.
124-
125131
### Lag and Backfill
126132

127133
Because testing centers may report their data to Quidel several days after they
@@ -142,13 +148,13 @@ This data source is based on data provided to us by a lab testing company. They
142148

143149
### Missingness
144150

145-
When fewer than 50 tests are reported in a state on a specific day, no data is
151+
When fewer than 50 tests are reported in a state, an HHS region, or the US as a whole on a specific day, no data is
146152
reported for that area on that day; an API query for all reported states on that
147153
day will not include it.
148154

149-
When fewer than 50 tests are reported in an HRR or MSA on a specific day, and
150-
not enough samples can be filled in from the parent state, no data is reported
151-
for that area on that day; an API query for all reported geographic areas on
155+
When fewer than 50 tests are reported in a county, HRR, or MSA on a specific day and,
156+
for smoothed signals specifically, not enough samples can be borrowed from the parent state,
157+
no data is reported for that area on that day; an API query for all reported geographic areas on
152158
that day will not include it.
153159

154160
## Flu Tests
