Get a list of measures (a codebook) that are included in a data collection project.

Usage

dcf_variables(project = ".", exclude = c("geography", "time", "age"), ...)

Arguments

project

Path to a local project, the GitHub account and repository name ("{account_name}/{repo_name}") of a remote project, or a report as returned from dcf_report.

exclude

A character vector of variable names to exclude from the list (usually ID columns).

...

Additional arguments passed to dcf_report.

Value

A tibble containing variables:

name: Name of the variable, as it appears in the data file.
type: The value's storage type.
n: Number of non-missing observations within the file.
duplicates: Number of duplicated values within the file.
missing: Number of missing values within the file.
project_type: The project type, either source or bundle.
data_format: The orientation of the data, either wide or tall.
file: The file containing the variable; a path relative to the project root.
short_name: Short name, if included in measure info.
long_name: Long name, if included in measure info.
short_description: Short description, if included in measure info.
long_description: Long description, if included in measure info.
measure_type: Higher-level description of type than storage type (e.g., count versus integer), if included in measure info.
unit: How a single value should be interpreted (e.g., per 100k people for a rate per 100k people), if included in measure info.
time_resolution: The measure's collection frequency, if included in measure info.
category: The measure's category, if included in measure info.
subcategory: The measure's subcategory, if included in measure info.
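
As a rough illustration of how the returned columns might be used, the sketch below computes the share of missing values per variable and lists variables that lack a measure type. It does not call dcf_variables; the `codebook` data frame is a hypothetical stand-in with made-up values and only a few of the columns described above. It also assumes that fields absent from measure info appear as NA, which this page does not confirm.

```r
# Hypothetical stand-in for a dcf_variables() result: the values are made up
# and only a subset of the documented columns is included.
codebook <- data.frame(
  name = c("measure_a", "measure_b", "measure_c"),
  type = c("integer", "float", "float"),
  n = c(80L, 50L, 100L),
  missing = c(20L, 50L, 0L),
  measure_type = c("count", "rate", NA), # assumption: absent info shows as NA
  stringsAsFactors = FALSE
)

# Share of each variable's observations that are missing
# (non-missing plus missing gives the total number of rows).
codebook$prop_missing <- codebook$missing / (codebook$n + codebook$missing)

# Names of variables with no measure type recorded.
undocumented <- codebook$name[is.na(codebook$measure_type)]

# Variables ordered from most to least complete.
by_completeness <- codebook[order(codebook$prop_missing), c("name", "prop_missing")]
print(by_completeness)
```

The same steps should apply to a real result, since a tibble supports the data frame operations used here.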

See also

Other data user interface functions: dcf_data(), dcf_report()

Examples

dcf_variables("dissc-yale/pophive_demo")
#> # A tibble: 49 × 17
#>    name            type      n duplicates missing project_type data_format file 
#>    <chr>           <chr> <int>      <int>   <int> <chr>        <chr>       <chr>
#>  1 epic_all_encou… inte…  7074      22205   21057 bundle       wide        data…
#>  2 epic_covid      inte…  7074      26896   21057 bundle       wide        data…
#>  3 epic_flu        inte…  7074      26885   21057 bundle       wide        data…
#>  4 epic_rsv        inte…  6943      27603   21188 bundle       wide        data…
#>  5 gtrends_rsv_va… float 16848      21790   11283 bundle       wide        data…
#>  6 gtrends_naloxo… float 16848      19559   11283 bundle       wide        data…
#>  7 gtrends_overdo… float 16848      12268   11283 bundle       wide        data…
#>  8 gtrends_rsv     float 16848      14265   11283 bundle       wide        data…
#>  9 wastewater_cov… float 10621      17852   17510 bundle       wide        data…
#> 10 wastewater_flua float  8203      24233   19928 bundle       wide        data…
#> # ℹ 39 more rows
#> # ℹ 9 more variables: short_name <chr>, long_name <chr>,
#> #   short_description <chr>, long_description <chr>, measure_type <chr>,
#> #   unit <chr>, time_resolution <chr>, category <chr>, subcategory <chr>