Convert Long File Paths to Simple Paths
to_simple_names(paths, method = 1L, get_base = NULL, sha1_digits = 4)
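Argument | Description |
---|---|
paths | character vector containing file paths |
method | naming scheme: 1L (default) numbers the files consecutively (file_01, file_02, ...); 2L names each file after the SHA-1 hash of its base name |
get_base | function taking a character vector as input and returning a character vector as output. If not given (NULL), the base name is the file name without its extension |
sha1_digits | number of SHA-1 digits used in the file name when method = 2L |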
character vector of the same length as paths
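To make the shape of the return value concrete, here is a minimal base-R sketch that mimics the output documented for method = 1L. The helper `simple_names_sketch` is hypothetical and is not the package's implementation; it assumes that files are numbered in input order, that numbers are zero-padded to at least two digits, and that the original extension is kept.

```r
# Hypothetical helper (not part of the package): mimics the documented
# output of method = 1L under the assumptions stated above.
simple_names_sketch <- function(paths) {
  if (length(paths) == 0L) {
    return(character(0))
  }
  # Extension without the leading dot, "" if there is none
  ext <- tools::file_ext(paths)
  # Zero-pad the running number to at least two digits
  width <- max(2L, nchar(length(paths)))
  out <- paste0("file_", formatC(seq_along(paths), width = width, flag = "0"))
  # Re-attach the extension where one exists
  has_ext <- nzchar(ext)
  out[has_ext] <- paste0(out[has_ext], ".", ext[has_ext])
  out
}

simple_names_sketch(c("v1_ugly_name_1.doc", "v1_very_ugly_name.xml"))
```

The result has one element per input path, in the same order, so it can be used directly as a lookup alongside the original paths.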
paths <- c(
  "v1_ugly_name_1.doc",
  "v1_very_ugly_name.xml",
  "v2_ugly_name_1.docx",
  "v2_very_ugly_name.xmlx"
)

to_simple_names(paths, method = 1L)
#> [1] "file_01.doc"  "file_02.xml"  "file_03.docx" "file_04.xmlx"

writeLines(sort(to_simple_names(paths, method = 2L)))
#> file_2ecd.xml
#> file_3f3a.xmlx
#> file_82f1.doc
#> file_f400.docx

# All sha1 are different because all base names (file name without extension
# by default) are different. If you want to give the same sha1 to files that
# correspond to each other but have a different extension, set the function
# that extracts the "base name" of the file:
get_base <- function(x) kwb.utils::removeExtension(gsub("^v\\d+_", "", x))

writeLines(sort(to_simple_names(paths, method = 2L, get_base = get_base)))
#> file_3abc.xml
#> file_3abc.xmlx
#> file_d71a.doc
#> file_d71a.docx

# Now the file names that have the same base name (neglecting the prefix
# v1_ or v2_) get the same sha1 and thus appear as groups in the sorted
# file list
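Before passing a custom get_base function, it can help to check which files it would put into the same group. The sketch below does this in base R only. `get_base_sketch` is a hypothetical stand-in that uses `tools::file_path_sans_ext()` in place of `kwb.utils::removeExtension()`, on the assumption that both strip the extension in the same way for file names like these.

```r
paths <- c(
  "v1_ugly_name_1.doc",
  "v1_very_ugly_name.xml",
  "v2_ugly_name_1.docx",
  "v2_very_ugly_name.xmlx"
)

# Hypothetical base-R variant of the get_base function from the example:
# drop the version prefix (v1_, v2_, ...), then drop the extension
get_base_sketch <- function(x) {
  tools::file_path_sans_ext(gsub("^v\\d+_", "", x))
}

# Files sharing a base name end up in the same list element
groups <- split(paths, get_base_sketch(paths))
print(groups)
```

Each list element corresponds to one group of files that would receive the same sha1 when a function with this behaviour is passed as get_base.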