Frequency of Value Combinations in Data Frame Columns
Source: R/applyFilterCriteria.R
fieldSummary.Rd
Usage
fieldSummary(x, groupBy = names(x)[-1L], lengthColumn = "", na = "Unknown")
Arguments
- x
data frame
- groupBy
character vector naming the columns (fields) in x to be included in the evaluation. Default: names of all columns in x except the first one (assuming it could be an ID column).
- lengthColumn
optional. Name of the column in x to be summed up.
- na
optional. Value to be treated as NA. Default: "Unknown"
Examples
n <- 1000L
sample_replace <- function(x, ...) sample(x, size = n, replace = TRUE, ...)
x <- data.frame(
  pipe_id = 1:n,
  material = sample_replace(c("clay", "concrete", "other")),
  age_cat = sample_replace(c("young", "old")),
  length = as.integer(rnorm(n, 50)),
  stringsAsFactors = FALSE
)
fieldSummary(x)
#>    material age_cat length Count Percentage
#> 1      clay   young     46     1        0.1
#> 2      clay     old     47     3        0.3
#> 3  concrete     old     47     3        0.3
#> 4     other     old     47     3        0.3
#> 5      clay   young     47     7        0.7
#> 6  concrete   young     47     5        0.5
#> 7     other   young     47     1        0.1
#> 8      clay     old     48    11        1.1
#> 9  concrete     old     48    24        2.4
#> 10    other     old     48    26        2.6
#> 11     clay   young     48    26        2.6
#> 12 concrete   young     48    23        2.3
#> 13    other   young     48    23        2.3
#> 14     clay     old     49    62        6.2
#> 15 concrete     old     49    61        6.1
#> 16    other     old     49    63        6.3
#> 17     clay   young     49    42        4.2
#> 18 concrete   young     49    57        5.7
#> 19    other   young     49    52        5.2
#> 20     clay     old     50    54        5.4
#> 21 concrete     old     50    58        5.8
#> 22    other     old     50    48        4.8
#> 23     clay   young     50    52        5.2
#> 24 concrete   young     50    72        7.2
#> 25    other   young     50    59        5.9
#> 26     clay     old     51    28        2.8
#> 27 concrete     old     51    20        2.0
#> 28    other     old     51    25        2.5
#> 29     clay   young     51    20        2.0
#> 30 concrete   young     51    19        1.9
#> 31    other   young     51    23        2.3
#> 32     clay     old     52     1        0.1
#> 33 concrete     old     52     6        0.6
#> 34    other     old     52     4        0.4
#> 35     clay   young     52     8        0.8
#> 36 concrete   young     52     6        0.6
#> 37    other   young     52     4        0.4
fieldSummary(x, "age_cat")
#>   age_cat Count Percentage
#> 1     old   500         50
#> 2   young   500         50
fieldSummary(x, "material")
#>   material Count Percentage
#> 1     clay   315       31.5
#> 2 concrete   354       35.4
#> 3    other   331       33.1
fieldSummary(x, "material", lengthColumn = "length")
#>   material Count length Percentage
#> 1     clay   315  15604       31.5
#> 2 concrete   354  17527       35.4
#> 3    other   331  16389       33.1
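To make the Count, Percentage and summed length columns easier to interpret, the following is a minimal base R sketch of a comparable summary. It is not the implementation of fieldSummary(): the helper name count_combinations, its argument name data and the example data frame pipes are hypothetical. It assumes that Percentage is the share of all rows, which is consistent with the output shown above, and it does not reproduce the na argument (aggregate() simply drops rows with NA in a grouping column).

```r
# Hypothetical helper, not part of the package: count value combinations
# in the columns named by groupBy, using base R only.
count_combinations <- function(data, groupBy, lengthColumn = "") {
  # One row per observed combination, with the number of rows in each
  counts <- aggregate(
    x = data.frame(Count = rep(1L, nrow(data))),
    by = data[groupBy],
    FUN = sum
  )
  # Optionally add the sum of lengthColumn within each combination
  if (nzchar(lengthColumn)) {
    sums <- aggregate(x = data[lengthColumn], by = data[groupBy], FUN = sum)
    counts <- merge(counts, sums, by = groupBy)
  }
  # Assumption: percentage relative to the total number of rows
  counts$Percentage <- 100 * counts$Count / nrow(data)
  counts
}

# Small deterministic example (hypothetical data)
pipes <- data.frame(
  pipe_id = 1:4,
  material = c("clay", "clay", "concrete", "clay"),
  length = c(10L, 20L, 30L, 40L),
  stringsAsFactors = FALSE
)
count_combinations(pipes, "material", lengthColumn = "length")
```

With this input, clay occurs in three of four rows (75 percent, summed length 70) and concrete in one (25 percent, summed length 30).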