Convert original Work In Data cleaned files to a parquet dataset.
Usage
wid_reformat(original_dir, reformat_dir, selection = NULL,
isic_prefixes = list(), cores = parallel::detectCores() - 2,
overwrite = FALSE)
Arguments
- original_dir
Directory containing cleaned data.
- reformat_dir
Directory to save the dataset files in.
- selection
Character vector specifying a subset of files to include (e.g.,
c("AGO_2008_IBEP", "ALB_20.._LFS", "2024")
).- isic_prefixes
A list mapping country_year_survey IDs to ISIC prefixes (
30_
,31_
, or40_
, for revisions3
,3.1
, or4
).- cores
Number of CPU cores to use during processing.
- overwrite
Logical; if
TRUE
, will rewrite existing partitions.