Collector
Collect internet search volumes from the Google Trends Timelines for Health endpoint.
See the schema for more about the API. Only the getTimelinesForHealth endpoint is used here.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
scope_dir | str | Directory containing the | 'scope' |
key_dir | str | Directory containing a | '.' |
terms_per_batch | int | Maximum terms to include in each collection batch. In theory the API's maximum is 30, but more than 1 does not seem to work. | 1 |
wait_time | float | Seconds to wait between each batch. | 0.1 |
version | str | Version of the service API. | 'v1beta' |
Specification
To process in batches, search terms and locations must be specified in separate files (terms.txt and locations.txt), stored in the scope_dir directory.
These should contain one term or location code per line.
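As a minimal sketch of the layout described above, the following writes a scope directory containing terms.txt and locations.txt, then reads both back one entry per line. The read_scope helper and the sample entries are illustrative assumptions, not part of the package; how the collector itself parses these files is not shown here.

```python
from pathlib import Path
from tempfile import TemporaryDirectory


def read_scope(scope_dir):
    """Hypothetical helper: read terms and locations, one entry per line."""
    scope = Path(scope_dir)
    entries = {}
    for name in ("terms", "locations"):
        lines = (scope / f"{name}.txt").read_text(encoding="utf-8").splitlines()
        # Assumption: blank lines and surrounding whitespace are ignored.
        entries[name] = [line.strip() for line in lines if line.strip()]
    return entries["terms"], entries["locations"]


with TemporaryDirectory() as tmp:
    # Build a throwaway scope directory with sample entries.
    scope_dir = Path(tmp) / "scope"
    scope_dir.mkdir()
    (scope_dir / "terms.txt").write_text("cough\n/m/01b_21\n", encoding="utf-8")
    (scope_dir / "locations.txt").write_text("US\nUS-NY\n", encoding="utf-8")
    terms, locations = read_scope(scope_dir)

print(terms)
print(locations)
```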
Collection Process
Initializing this class retrieves the Google API service, stores the developer key, and points to the scope directory.
The process_batches() method reads in the terms and locations, and collects them in batches over the specified time frame.
Results from each batch are stored in the batches property, which can be retrieved if process_batches() does not complete (for example, if the daily rate limit is reached).
The collect() method collects a single batch, and can be used on its own.
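The batching behavior described above can be sketched roughly as follows. This is not the package's implementation: the DailyLimitReached error, the run_batches and chunk helpers, the stand-in collect function, and the iteration order (locations outside, term batches inside) are all assumptions made for illustration. The point is only that results gathered before a failure remain available.

```python
import time


class DailyLimitReached(Exception):
    """Hypothetical error standing in for a rate-limit failure."""


def chunk(items, size):
    """Split items into consecutive lists of at most `size` entries."""
    return [items[i : i + size] for i in range(0, len(items), size)]


def run_batches(terms, locations, collect, terms_per_batch=1, wait_time=0.1):
    """Collect every (location, term batch) pair, keeping partial results."""
    batches = []  # plays the role of the `batches` property
    try:
        for location in locations:
            for term_batch in chunk(terms, terms_per_batch):
                batches.append(collect(location, {"terms": term_batch}))
                time.sleep(wait_time)  # pause between batches
    except DailyLimitReached:
        # Stop early; whatever was collected so far stays in `batches`.
        pass
    return batches


def make_fake_collect(limit):
    """Build a stand-in for collect() that fails after `limit` calls."""
    calls = 0

    def fake_collect(location, params):
        nonlocal calls
        if calls >= limit:
            raise DailyLimitReached
        calls += 1
        return {"location": location, "terms": params["terms"]}

    return fake_collect


partial = run_batches(
    ["cough", "fever"], ["US", "US-NY"], make_fake_collect(limit=3), wait_time=0
)
print(len(partial))
```

Here four batches are planned but the stand-in fails on the fourth call, so three results are kept.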
Examples:
Source code in gtrends_collection/collector.py
collect(location, params)
Collect a single batch.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
location | str | Country (e.g., | required |
params | dict[str, list[str] \| str] | A dictionary with the following entries: | required |
Examples:
# collect a small, custom sample
data = collector.collect(
    "US-NY",
    {
        "terms": ["cough", "/m/01b_21"],
        "timelineResolution": "month",
        "time_startDate": "2014-01-01",
        "time_endDate": "2024-01-01",
    },
)
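When building such a dictionary programmatically, it can help to check the dates before making a request. The sketch below assembles a dictionary with the same keys as the example above; build_params is a hypothetical helper, not a function provided by the package, and the date checks are an assumption about what is worth validating locally.

```python
from datetime import date


def build_params(terms, start, end, resolution="week"):
    """Hypothetical helper: assemble a params dict shaped like the example."""
    start_date = date.fromisoformat(start)  # raises ValueError on malformed dates
    end_date = date.fromisoformat(end)
    if start_date > end_date:
        raise ValueError("start must not be after end")
    return {
        "terms": list(terms),
        "timelineResolution": resolution,
        "time_startDate": start_date.isoformat(),
        "time_endDate": end_date.isoformat(),
    }


params = build_params(["cough", "/m/01b_21"], "2014-01-01", "2024-01-01", "month")
print(params)
```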
Returns:
Type | Description |
---|---|
DataFrame | A |
process_batches(start=None, end=None, resolution='week', override_terms=None, override_location=None)
Processes collection batches from scope.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
start | str \| None | First date to collect from; | None |
end | str \| None | Last date to collect from; | None |
resolution | str | Collection resolution; | 'week' |
override_terms | str | List of terms to collect instead of those in scope. Useful for changing collection order or filling out select terms. | None |
override_location | str | List of locations to collect from instead of those in scope. | None |
Examples:
# collect across all scope-defined terms and locations in 2024
data = collector.process_batches("2024-01-01", "2024-12-31")
Returns:
Type | Description |
---|---|
DataFrame | A |
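One way override_terms might be used to fill out select terms is to work out which scope terms have no results yet and pass only those. The sketch below shows that bookkeeping in plain Python; missing_terms is a hypothetical helper, and how already-collected terms are read out of earlier results is left as an assumption.

```python
def missing_terms(scope_terms, collected_terms):
    """Return scope terms with no collected results, preserving scope order."""
    done = set(collected_terms)
    return [term for term in scope_terms if term not in done]


scope_terms = ["cough", "fever", "/m/01b_21"]
collected_terms = ["cough"]  # assumed to come from earlier results
todo = missing_terms(scope_terms, collected_terms)
print(todo)

# The remaining terms could then be passed along, for example:
# data = collector.process_batches("2024-01-01", "2024-12-31", override_terms=todo)
```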