Get a list of measures (a codebook) that are included in a data collection project.

Usage

dcf_variables(project = ".", exclude = c("geography", "time", "age"), ...)

Arguments

project

Path to a local project, the GitHub account and repository name ("{account_name}/{repo_name}") of a remote project, or a report as returned from dcf_report.

exclude

A character vector of variable names to exclude from the list (usually ID columns).

...

Additional arguments passed to dcf_report.
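
As a rough illustration of what exclude does, the sketch below drops rows by name from a small stand-in data frame. The data frame is hypothetical and only mimics the name column of the result; the filtering shown is an assumption about the effect described above, not the function's actual implementation.

```r
# Hypothetical stand-in for an unfiltered variable list; only `name` matters here.
vars <- data.frame(
  name = c("geography", "time", "age", "epic_covid", "epic_flu"),
  stringsAsFactors = FALSE
)

# Default value of `exclude`, as shown in the Usage section.
exclude <- c("geography", "time", "age")

# Keep only rows whose name is not listed in `exclude`.
measures <- vars[!vars$name %in% exclude, , drop = FALSE]
measures$name
```

Under this reading, passing a longer vector to exclude would simply remove more rows from the list.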

Value

A tibble with one row per variable and the following columns:

name: Name of the variable, as it appears in the data file.
type: The value's storage type.
n: Number of non-missing observations within the file.
duplicates: Number of duplicated values within the file.
missing: Number of missing values within the file.
project_type: The project type, either source or bundle.
data_format: The orientation of the data, either wide or tall.
file: The file containing the variable; a path relative to the project root.
short_name: Short name, if included in measure info.
long_name: Long name, if included in measure info.
short_description: Short description, if included in measure info.
long_description: Long description, if included in measure info.
measure_type: A higher-level description of the measure than its storage type (e.g., count versus integer), if included in measure info.
unit: How a single value should be interpreted (e.g., per 100k people for a rate per 100k people), if included in measure info.
time_resolution: The measure's collection frequency, if included in measure info.
category: The measure's category, if included in measure info.
subcategory: The measure's subcategory, if included in measure info.
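
The n and missing columns can be combined to gauge how complete each variable is. The sketch below is a minimal example that uses a few rows copied from the example output on this page (only names that are printed in full) rather than a live call. It assumes that n + missing gives the total number of observations per variable; the total and prop_missing columns are added for illustration and are not part of the returned tibble.

```r
# Rows copied from the example output on this page; in practice `vars`
# would be the tibble returned by dcf_variables().
vars <- data.frame(
  name = c("epic_covid", "epic_flu", "epic_rsv", "gtrends_rsv", "wastewater_flua"),
  n = c(7074L, 7074L, 6943L, 17524L, 8561L),
  missing = c(22097L, 22097L, 22228L, 11647L, 20610L),
  stringsAsFactors = FALSE
)

# Assumption: n counts non-missing observations, so n + missing is the total.
vars$total <- vars$n + vars$missing
vars$prop_missing <- vars$missing / vars$total

# Order from most to least complete.
vars[order(vars$prop_missing), c("name", "total", "prop_missing")]
```

Because the result is an ordinary tibble, the same columns can also be used with standard subsetting or dplyr verbs.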

See also

Other data user interface functions: dcf_data(), dcf_report()

Examples

dcf_variables("dissc-yale/pophive_demo")
#> # A tibble: 49 × 17
#>    name            type      n duplicates missing project_type data_format file 
#>    <chr>           <chr> <int>      <int>   <int> <chr>        <chr>       <chr>
#>  1 epic_all_encou… inte…  7074      23245   22097 bundle       wide        data…
#>  2 epic_covid      inte…  7074      27936   22097 bundle       wide        data…
#>  3 epic_flu        inte…  7074      27925   22097 bundle       wide        data…
#>  4 epic_rsv        inte…  6943      28643   22228 bundle       wide        data…
#>  5 gtrends_rsv_va… float 17524      22165   11647 bundle       wide        data…
#>  6 gtrends_naloxo… float 17524      19924   11647 bundle       wide        data…
#>  7 gtrends_overdo… float 17524      12632   11647 bundle       wide        data…
#>  8 gtrends_rsv     float 17524      14629   11647 bundle       wide        data…
#>  9 wastewater_cov… float 10976      18624   18195 bundle       wide        data…
#> 10 wastewater_flua float  8561      25213   20610 bundle       wide        data…
#> # ℹ 39 more rows
#> # ℹ 9 more variables: short_name <chr>, long_name <chr>,
#> #   short_description <chr>, long_description <chr>, measure_type <chr>,
#> #   unit <chr>, time_resolution <chr>, category <chr>, subcategory <chr>