Collector
Collect internet search volumes from the Google Trends timeline for health endpoint.
See the schema for more about the API. Only the getTimelinesForHealth endpoint is used here.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
scope_dir | str | Directory containing the terms.txt and locations.txt files. | 'scope' |
key_dir | str | Directory containing a | '.' |
terms_per_batch | int | Maximum terms to include in each collection batch. In theory the API accepts up to 30, but batches of more than 1 do not seem to work. | 1 |
wait_time | float | Seconds to wait between each batch. | 0.1 |
version | str | Version of the service API. | 'v1beta' |
Specification
To process in batches, search terms and locations must be specified in separate files (terms.txt and locations.txt), stored in the scope_dir directory.
These should contain one term or location code per line.
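As a rough illustration of the layout described above, the following sketch writes a throwaway scope directory and reads it back one entry per line. The helper read_scope_file and the sample location codes are assumptions made for this example; the package's own file-reading logic may differ.

```python
import tempfile
from pathlib import Path


def read_scope_file(path: Path) -> list[str]:
    """Return the non-empty lines of a scope file, stripped of whitespace.

    Hypothetical helper for illustration; not part of gtrends_collection.
    """
    lines = path.read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


# Build a temporary scope directory with one term / location code per line.
scope_dir = Path(tempfile.mkdtemp()) / "scope"
scope_dir.mkdir()
(scope_dir / "terms.txt").write_text("cough\n/m/01b_21\n", encoding="utf-8")
(scope_dir / "locations.txt").write_text("US-NY\nUS-CA\n", encoding="utf-8")

terms = read_scope_file(scope_dir / "terms.txt")
locations = read_scope_file(scope_dir / "locations.txt")
print(terms)
print(locations)
```

Skipping blank lines is a choice made in this sketch so that a trailing newline does not produce an empty term.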
Collection Process
Initializing this class retrieves the Google API service, stores the developer key, and points to the scope directory.
The process_batches() method reads in the terms and locations, and collects them in batches over the specified time frame.
Results from each batch are stored in the batches property, which can be read if process_batches() does not complete (for example, when the daily rate limit is reached).
The collect() method collects a single batch, and can be used on its own.
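The batching behavior described above can be sketched without any network access. BatchRunner, chunk, and make_fake_collect below are hypothetical stand-ins written for this example, not the package's Collector; the loop order (locations outside, term batches inside) is also an assumption. The point is only to show why appending each result as it arrives leaves partial results available after a failure.

```python
import time
from typing import Callable


def chunk(items: list[str], size: int) -> list[list[str]]:
    """Split items into consecutive lists of at most `size` elements."""
    return [items[i : i + size] for i in range(0, len(items), size)]


class BatchRunner:
    """Hypothetical stand-in for illustration; not the package's Collector."""

    def __init__(
        self,
        collect_fn: Callable[[str, list[str]], dict],
        terms_per_batch: int = 1,
        wait_time: float = 0.1,
    ) -> None:
        self.collect_fn = collect_fn
        self.terms_per_batch = terms_per_batch
        self.wait_time = wait_time
        self.batches: list[dict] = []

    def process_batches(self, terms: list[str], locations: list[str]) -> list[dict]:
        for location in locations:
            for batch_terms in chunk(terms, self.terms_per_batch):
                # Store each result immediately so it survives a later failure.
                self.batches.append(self.collect_fn(location, batch_terms))
                time.sleep(self.wait_time)
        return self.batches


def make_fake_collect(limit: int) -> Callable[[str, list[str]], dict]:
    """Return a fake collect function that fails after `limit` calls."""
    state = {"calls": 0}

    def fake_collect(location: str, terms: list[str]) -> dict:
        state["calls"] += 1
        if state["calls"] > limit:
            raise RuntimeError("simulated daily rate limit")
        return {"location": location, "terms": terms}

    return fake_collect


runner = BatchRunner(make_fake_collect(limit=3), terms_per_batch=1, wait_time=0.0)
try:
    runner.process_batches(["cough", "fever"], ["US-NY", "US-CA"])
except RuntimeError:
    pass  # the run stopped early, but earlier batches are still stored

print(len(runner.batches))  # 3 batches completed before the simulated limit
```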
Examples:
Source code in gtrends_collection/collector.py
collect(location, params)
Collect a single batch.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
location | str | Country (e.g., | required |
params | dict[str, list[str] \| str] | A dictionary with the following entries: | required |
Examples:
# collect a small, custom sample
data = collector.collect(
    "US-NY",
    {
        "terms": ["cough", "/m/01b_21"],
        "timelineResolution": "month",
        "time_startDate": "2014-01-01",
        "time_endDate": "2024-01-01",
    },
)
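A params dictionary like the one in the example above can be assembled and sanity-checked before any request is sent. The sketch below uses only the four keys that appear in that example; build_params is a hypothetical helper, not part of gtrends_collection, and the date checks are an assumption about what is worth validating locally.

```python
from datetime import date


def build_params(
    terms: list[str], start: str, end: str, resolution: str = "week"
) -> dict:
    """Assemble a params dict shaped like the example above.

    Hypothetical helper for illustration; not part of gtrends_collection.
    """
    if not terms:
        raise ValueError("at least one term is required")
    # date.fromisoformat raises ValueError on malformed YYYY-MM-DD strings.
    start_date = date.fromisoformat(start)
    end_date = date.fromisoformat(end)
    if start_date > end_date:
        raise ValueError("start must not be after end")
    return {
        "terms": list(terms),
        "timelineResolution": resolution,
        "time_startDate": start_date.isoformat(),
        "time_endDate": end_date.isoformat(),
    }


params = build_params(["cough", "/m/01b_21"], "2014-01-01", "2024-01-01", "month")
print(params)
```

The resulting dictionary could then be passed as the second argument to collect().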
Returns:
Type | Description |
---|---|
DataFrame | A |
process_batches(start=None, end=None, resolution='week', override_terms=None, override_location=None)
Processes collection batches from scope.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
start | str \| None | First date to collect from; | None |
end | str \| None | Last date to collect from; | None |
resolution | str | Collection resolution; | 'week' |
override_terms | str | List of terms to collect instead of those in scope. Useful for changing collection order or filling out select terms. | None |
override_location | str | List of locations to collect from instead of those in scope. | None |
Examples:
# collect across all scope-defined terms and locations in 2024
data = collector.process_batches("2024-01-01", "2024-12-31")
Returns:
Type | Description |
---|---|
DataFrame | A |