An R package to establish and work within a data collection framework.
Installation
# install.packages("remotes")
remotes::install_github("dissc-yale/dcf")Get Started
Projects
A data collection project ultimately consists of source and bundle projects:
Start by initializing the overall project:
dcf_init("collection_project")Then add a source project, which will ingest data from a single source, and produce a standardized data file:
dcf_add_source("source_a", "collection_project")And add a bundle project, which will use the standardized source files to produce a data product:
dcf_add_bundle("bundle_a", "collection_project")Processing
Once the source and bundle scripts have been written, the project can be built:
dcf_build("collection_project")This runs dcf_process on each sub-project, and dcf_check_source on each source, then writes a report to collection_project/report.json.gz, which includes processing details (like logs and timing) and metadata from the standardized data files.