docs/api/covidcast-signals/google-symptoms.md
+17 -7 (17 additions & 7 deletions)
@@ -19,9 +19,17 @@ nav_order: 1
 ## Overview
 
 This data source is based on the [COVID-19 Search Trends symptoms
-dataset](https://console.cloud.google.com/marketplace/product/bigquery-public-datasets/covid19-search-trends?hl=en-GB). Using
-this search data, we estimate the volume of searches mapped to symptom sets related
-to COVID-19. The resulting daily dataset for each region shows the average relative frequency of searches for each symptom set. The signals are measured in arbitrary units that are normalized for overall search users in the region and scaled by the maximum value of the normalized popularity within a geographic region across a specific time range. **Values are comparable across signals in the same location but NOT across geographic regions**. For example, within a state, we can compare `s01_smoothed_search` and `s02_smoothed_search`. However, we cannot compare `s01_smoothed_search` between states. Larger numbers represent increased relative popularity of symptom-related searches.
+We use this data to estimate the volume of web searches related
+to COVID-19 and H5N1 highly-pathogenic avian influenza (HPAI).
+
+The resulting daily dataset for each location shows the average relative frequency of searches for sets of specific symptoms.
+The signals are measured in arbitrary units that are normalized for overall search users in the location and scaled by the maximum value of the normalized popularity within a location across a specific time range.
+Larger numbers represent increased relative popularity of symptom-related searches.
+
+**Values are comparable across signals in the same location, but NOT between locations or between geographic region types**.
+For example, within a state, we can compare `s01_smoothed_search` and `s02_smoothed_search`.
+However, we cannot compare `s01_smoothed_search` between states, or between a state and a county.
 
 Between May 13 2024 and August 6 2024, [signal values were much lower](#limitations) compared to previous time periods due to a data outage.
 
@@ -36,7 +44,9 @@ Between May 13 2024 and August 6 2024, [signal values were much lower](#limitati
 *_s07_: Conjunctivitis, Red eye, Epiphora, Eye pain, Rheum
 *_scontrol_: Type 2 diabetes, Urinary tract infection, Hair loss, Candidiasis, Weight gain
 
-The symptoms were combined in sets that showed positive correlation with cases, especially after Omicron was declared a variant of concern by the WHO. Note that symptoms in _scontrol_ are not COVID-19 related, and this symptom set can be used as a negative control.
+The symptoms were combined in sets _s01_-_s06_ that showed positive correlation with COVID-19 cases, especially after Omicron was declared a variant of concern by the WHO.
+Symptom set _s07_ is designed to track novel eye-related symptoms of H5N1.
+Note that symptoms in _scontrol_ are not COVID-19 or H5N1 related, and this symptom set can be used as a negative control.
 
 Until January 20, 2022, we had separate signals for symptoms Anosmia, Ageusia, and their sum.
 
@@ -118,16 +128,16 @@ The data was unfortunately not recoverable and the dip can not be repaired, but
 
 When daily volume in a region does not meet quality or privacy thresholds, set
 by Google, no daily value is reported. Weekly data may be available from Google
-in these cases, but we do not yet support importation using weekly data.
+in these cases, but we do not yet support weekly data.
 
 Google uses differential privacy, which adds artificial noise to the raw
 datasets to avoid identifying any individual persons without affecting the
 quality of results.
 
 Google normalizes and scales time series values to determine the relative
 popularity of symptoms in searches within each geographical region individually.
-This means that the resulting values of symptom set popularity are **NOT**
-comparable across geographic regions, while the values of different symptom sets are comparable within the same location.
+This means that Delphi's computed symptom set popularity values are **NOT**
+comparable _between_ geographic regions or region types, but are comparable within the same location.
 
 Standard errors and sample sizes are not available for this data source.
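
The comparability rule in the changed text follows from the per-location scaling it describes. Below is a minimal Python sketch of that reasoning, not Google's or Delphi's actual pipeline. The function name `scale_location`, the top value of 100, the input numbers, and the choice to take the maximum over all symptom series in a location are assumptions made for illustration.

```python
from typing import Dict, List


def scale_location(
    counts: Dict[str, List[float]],
    users: List[float],
    top: float = 100.0,  # assumed scale ceiling, for illustration only
) -> Dict[str, List[float]]:
    """Normalize and scale the symptom series of ONE location (hypothetical helper)."""
    # Step 1: normalize each day's search count by that day's overall search users.
    normalized = {
        name: [c / u for c, u in zip(series, users)]
        for name, series in counts.items()
    }
    # Step 2: scale by the largest normalized value seen in this location over
    # the whole time range. Pooling all series here is an assumption.
    peak = max(v for series in normalized.values() for v in series)
    return {
        name: [top * v / peak for v in series]
        for name, series in normalized.items()
    }


# Two made-up locations: B has ten times the search rate of A.
state_a = scale_location({"s01": [10, 20], "s02": [5, 10]}, users=[1000, 1000])
state_b = scale_location({"s01": [100, 200], "s02": [50, 100]}, users=[1000, 1000])

# Within a location, s01 vs s02 keeps its true ratio (2:1 on each day).
# Between locations, the tenfold difference is gone: both peak at `top`.
print(state_a)
print(state_b)
```

Because each location is divided by its own peak, the scaled series of the two made-up states come out the same even though their underlying search rates differ tenfold. This is why `s01_smoothed_search` can be compared with `s02_smoothed_search` inside one state but not between states, or between a state and a county.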