Get Changes of Rows That Are Duplicated in Selected Columns

Usage

getChangesOfDuplicates(df, columns, add_columns = columns)

Arguments

df

a data frame

columns

names of columns in df in which to look for duplicate value combinations

add_columns

names of additional columns to include in the output even if their values do not change. Defaults to columns.

Value

A list of data frames, with one element for each value combination in columns that occurs more than once in df. Each element is a data frame with all rows of df that share that value combination. By default, the data frame contains the columns given in columns plus those columns of df whose values differ between these rows.
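
The selection logic described above can be illustrated with a minimal base R sketch. This is not the package's implementation: the helper name changes_of_duplicates_sketch is hypothetical, it ignores add_columns, and it does not produce the .n column shown in the examples below. It only shows how rows might be grouped by duplicated value combinations and how unchanged columns might be dropped.

```r
# Hypothetical helper for illustration only (not the package's code).
changes_of_duplicates_sketch <- function(df, columns) {
  # Build one key per row from the values in the selected columns
  keys <- do.call(paste, c(unname(as.list(df[columns])), sep = "\r"))

  # Keep only the keys that occur more than once
  dup_keys <- sort(unique(keys[duplicated(keys)]))

  lapply(dup_keys, function(key) {
    rows <- df[keys == key, , drop = FALSE]

    # Among the remaining columns, keep those with more than one distinct value
    others <- setdiff(names(rows), columns)
    changed <- others[vapply(
      rows[others], function(x) length(unique(x)) > 1, logical(1)
    )]

    result <- rows[c(columns, changed)]
    rownames(result) <- NULL
    result
  })
}

df <- data.frame(
  id = 1:7,
  name = c("one", "one", "two", "two", "three", "three", "three"),
  type = c("A", "A", "B", "C", "D", "D", "D"),
  size = c(10, 11, 12, 12, 13, 13, 14),
  height = c(1, 1, 2, 3, 4, 4, 5)
)

by_name <- changes_of_duplicates_sketch(df, "name")
by_name_type <- changes_of_duplicates_sketch(df, c("name", "type"))
```

With this df, by_name has three elements and by_name_type has two, because the combinations of name "two" with types "B" and "C" each occur only once.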

Examples

df <- data.frame(
  id = 1:7, 
  name = c("one", "one", "two", "two", "three", "three", "three"), 
  type = c("A", "A", "B", "C", "D", "D", "D"),
  size = c(10, 11, 12, 12, 13, 13, 14),
  height = c(1, 1, 2, 3, 4, 4, 5)
)

df
#>   id  name type size height
#> 1  1   one    A   10      1
#> 2  2   one    A   11      1
#> 3  3   two    B   12      2
#> 4  4   two    C   12      3
#> 5  5 three    D   13      4
#> 6  6 three    D   13      4
#> 7  7 three    D   14      5

getChangesOfDuplicates(df, "name")
#> [[1]]
#>   name id size .n
#> 1  one  1   10  1
#> 2  one  2   11  1
#> 
#> [[2]]
#>    name id size height .n
#> 1 three  5   13      4  1
#> 2 three  6   13      4  1
#> 3 three  7   14      5  1
#> 
#> [[3]]
#>   name id type height .n
#> 1  two  3    B      2  1
#> 2  two  4    C      3  1
#> 
getChangesOfDuplicates(df, c("name", "type"))
#> [[1]]
#>   name type id size .n
#> 1  one    A  1   10  1
#> 2  one    A  2   11  1
#> 
#> [[2]]
#>    name type id size height .n
#> 1 three    D  5   13      4  1
#> 2 three    D  6   13      4  1
#> 3 three    D  7   14      5  1
#>