Decreasingly sorted frequencies of strings, by default weighted by their length. This function can be used to find the most "important" folder paths in terms of frequency and length.
sorted_importance(x, weighted = TRUE)
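Argument | Description |
---|---|
x | vector of character strings |
weighted | if TRUE (the default), the frequency of each string is multiplied by its length; if FALSE, the plain frequency is returned |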
Named integer vector (of class table) containing the decreasingly sorted importance values of the elements in x. The importance of a string is either its frequency in x (if weighted is FALSE) or the product of this frequency and the string length (if weighted is TRUE).
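The calculation described above can be sketched in a few lines of base R. This is only an illustration of the stated rule (frequency, optionally multiplied by string length, sorted decreasingly); the name `importance_sketch` is hypothetical and the code is not taken from the package source, which may differ in detail.

```r
# Hypothetical re-implementation for illustration only; not the package's code
importance_sketch <- function(x, weighted = TRUE) {
  # Frequency of each distinct string, as a named integer vector of class table
  freq <- table(x)

  # Optionally weight each frequency by the length of the corresponding string
  if (weighted) {
    freq <- freq * nchar(names(freq))
  }

  # Most "important" strings first
  sort(freq, decreasing = TRUE)
}

strings <- c("a", "a", "a", "bc", "bc", "cdefg")

weighted_result <- importance_sketch(strings)
unweighted_result <- importance_sketch(strings, weighted = FALSE)

print(weighted_result)
print(unweighted_result)
```

With weighting, "cdefg" ranks first although it occurs only once, because 1 occurrence times 5 characters exceeds 3 occurrences of the one-character string "a".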
strings <- c("a", "a", "a", "bc", "bc", "cdefg")

(importance <- kwb.pathdict:::sorted_importance(strings))
#> x
#> cdefg    bc     a
#>     5     4     3

# Check that each input element is mentioned in the output
all(unique(strings) %in% names(importance))
#> [1] TRUE

# weighted = FALSE just returns the frequencies of strings in x
(importance <- kwb.pathdict:::sorted_importance(strings, weighted = FALSE))
#> x
#>     a    bc cdefg
#>     3     2     1
#> [1] TRUE

# You may use the function to assess the "importance" of directory paths
kwb.pathdict:::sorted_importance(dirname(kwb.pathdict:::example_paths()))
#> x
#>    //very/long/path/to/the/projects/project-1/wp-1/input
#>                                                      106
#>      //very/long/path/to/the/projects/project-2/Berichte
#>                                                      102
#> //very/long/path/to/the/projects/project-1/wp-1/analysis
#>                                                       56
#>   //very/long/path/to/the/projects/project-1/wp 2/output
#>                                                       54
#>    //very/long/path/to/the/projects/project-1/wp 2/input
#>                                                       53
#>      //very/long/path/to/the/projects/project-2/Grafiken
#>                                                       51
#>         //very/long/path/to/the/projects/project-2/Daten
#>                                                       48