Do Data Frame Rows Match Given Criteria? — matchesCriteria • kwb.utils

Checks which rows of a data frame match the given criteria.

Usage

matchesCriteria(
  Data,
  criteria = NULL,
  na.to.false = FALSE,
  add.details = FALSE,
  dbg = TRUE
)

Arguments

Data: data frame
criteria: character vector of conditions in which the column names of Data (e.g. A) can appear unquoted, e.g. "A == 'x'"
na.to.false: if TRUE (the default is FALSE), NA values in the resulting vector are replaced with FALSE
add.details: if TRUE (the default is FALSE), a matrix containing the evaluation of each criterion is returned in the attribute details
dbg: if TRUE (the default), a message is shown for each criterion in criteria, stating for how many rows in Data the criterion is TRUE and for how many rows it is FALSE
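
The following base R sketch illustrates how criteria given as strings could be evaluated against a data frame. It is not the implementation used by kwb.utils: the helper name evaluateCriteria is hypothetical, and the sketch assumes that each criterion is parsed and evaluated with the columns of Data in scope and that the results are combined with a logical AND, which is consistent with the examples on this page.

```r
# Hypothetical helper, not part of kwb.utils: a minimal sketch of evaluating
# criteria strings against the columns of a data frame.
evaluateCriteria <- function(Data, criteria, na.to.false = FALSE) {
  # Evaluate each criterion string with the columns of Data in scope
  results <- lapply(criteria, function(criterion) {
    as.logical(eval(parse(text = criterion), envir = Data))
  })
  # Assumption: a row is selected only if every criterion is TRUE for it
  selected <- Reduce(`&`, results, rep(TRUE, nrow(Data)))
  # Optionally treat "unknown" (NA) as "not matching"
  if (na.to.false) {
    selected[is.na(selected)] <- FALSE
  }
  selected
}

Data <- data.frame(A = c("x", "y", "z", NA),
                   B = c( NA,   2,   3, 4))
criteria <- c("A %in% c('y', 'z')", "B %in% 1:3")

evaluateCriteria(Data, criteria)
# [1] FALSE  TRUE  TRUE FALSE

# A comparison with NA yields NA unless na.to.false = TRUE
evaluateCriteria(Data, "B > 2")
# [1]    NA FALSE  TRUE  TRUE
evaluateCriteria(Data, "B > 2", na.to.false = TRUE)
# [1] FALSE FALSE  TRUE  TRUE
```

Note that %in% never returns NA, whereas comparison operators such as > and == do, so whether na.to.false matters depends on how the criteria are written.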

Value

vector of logical containing TRUE at positions representing rows in Data fulfilling the conditions and FALSE elsewhere
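
When a logical vector like this is used for row indexing, NA elements deserve attention. The base R sketch below (independent of kwb.utils; the variable names are chosen for illustration only) shows that an NA in the index produces a row consisting entirely of NA, which is the situation that replacing NA with FALSE avoids.

```r
Data <- data.frame(A = c("x", "y", "z", NA),
                   B = c( NA,   2,   3, 4))

# A comparison involving NA yields NA in the logical vector
selected <- Data$B > 2

# Indexing with NA returns an additional row filled with NA
with_na <- Data[selected, ]
nrow(with_na)
# [1] 3

# Replacing NA with FALSE keeps only rows known to fulfil the condition
selected[is.na(selected)] <- FALSE
without_na <- Data[selected, ]
nrow(without_na)
# [1] 2
```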

Examples

# Define an example data frame
Data <- data.frame(A = c("x", "y", "z", NA),
                   B = c( NA,   2,   3, 4))

# Define one or more criteria
criteria <- c("A %in% c('y', 'z')", "B %in% 1:3")

# For which rows are the criteria met (vector of logical)?
matchesCriteria(Data, criteria, dbg = FALSE)
#> [1] FALSE  TRUE  TRUE FALSE

# You may use the function in the context of indexing:
Data[matchesCriteria(Data, criteria), ]
#> Evaluating A %in% c("y", "z") ...
#>   is TRUE for       2 rows ( 50.0 %),
#>     FALSE for       2 rows ( 50.0 %) and
#>        NA for       0 rows (  0.0 %).
#>   Selected rows now: 2
#> Evaluating B %in% 1:3 ...
#>   is TRUE for       2 rows ( 50.0 %),
#>     FALSE for       2 rows ( 50.0 %) and
#>        NA for       0 rows (  0.0 %).
#>   Selected rows now: 2
#>   A B
#> 2 y 2
#> 3 z 3

# Filtering for non-NA values
D1 <- Data[matchesCriteria(Data, "! is.na(A) & ! is.na(B)"), ]
#> Evaluating !is.na(A) & !is.na(B) ...
#>   is TRUE for       2 rows ( 50.0 %),
#>     FALSE for       2 rows ( 50.0 %) and
#>        NA for       0 rows (  0.0 %).
#>   Selected rows now: 2

# The same result is returned by passing the conditions as separate criteria:
D2 <- Data[matchesCriteria(Data, c("! is.na(A)", "! is.na(B)")), ]
#> Evaluating !is.na(A) ...
#>   is TRUE for       3 rows ( 75.0 %),
#>     FALSE for       1 rows ( 25.0 %) and
#>        NA for       0 rows (  0.0 %).
#>   Selected rows now: 3
#> Evaluating !is.na(B) ...
#>   is TRUE for       3 rows ( 75.0 %),
#>     FALSE for       1 rows ( 25.0 %) and
#>        NA for       0 rows (  0.0 %).
#>   Selected rows now: 2

identical(D1, D2)
#> [1] TRUE