Get Changes of Rows That Are Duplicated in Selected Columns
Source: R/utils_berlin.R
Arguments
- df
a data frame
- columns
names of columns in df in which to look for duplicate value combinations
- add_columns
names of additional columns that shall appear in the output even if there are no changes in these columns
Value
A list of data frames, with one element for each value combination
in columns that occurs more than once in df. Each element is a data
frame containing all rows of df that share that value combination.
By default, each data frame contains the columns given in columns
plus those columns of df whose values are not identical across these
rows.
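The selection logic described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the function name sketch_changes_of_duplicates is hypothetical, the list elements come out in order of first occurrence rather than in the order shown in the Examples, and the .n column that appears in the real output is not reproduced.

```r
# Hypothetical base R sketch of the behaviour described under "Value".
# It is NOT the implementation of getChangesOfDuplicates().
sketch_changes_of_duplicates <- function(df, columns, add_columns = NULL) {
  # One key per row, built from the value combination in `columns`
  key_parts <- unname(as.list(df[columns]))
  keys <- do.call(paste, c(key_parts, sep = "\r"))

  # Value combinations that occur more than once
  dup_keys <- unique(keys[duplicated(keys)])

  lapply(dup_keys, function(key) {
    rows <- df[keys == key, , drop = FALSE]
    other <- setdiff(names(rows), columns)

    # A column counts as changed if it holds more than one distinct value
    is_changed <- vapply(
      rows[other],
      function(x) length(unique(x)) > 1,
      logical(1)
    )

    # Key columns first, then changed columns, then explicitly requested ones
    keep <- unique(c(columns, other[is_changed], add_columns))
    rows[, keep, drop = FALSE]
  })
}

# Same example data as in the Examples section
df <- data.frame(
  id = 1:7,
  name = c("one", "one", "two", "two", "three", "three", "three"),
  type = c("A", "A", "B", "C", "D", "D", "D"),
  size = c(10, 11, 12, 12, 13, 13, 14),
  height = c(1, 1, 2, 3, 4, 4, 5)
)

by_name <- sketch_changes_of_duplicates(df, "name")
by_name_type <- sketch_changes_of_duplicates(df, c("name", "type"))
with_type <- sketch_changes_of_duplicates(df, "name", add_columns = "type")
```

In this sketch, passing add_columns = "type" keeps the type column in every element, including the one for "one", where type does not change.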
Examples
df <- data.frame(
id = 1:7,
name = c("one", "one", "two", "two", "three", "three", "three"),
type = c("A", "A", "B", "C", "D", "D", "D"),
size = c(10, 11, 12, 12, 13, 13, 14),
height = c(1, 1, 2, 3, 4, 4, 5)
)
df
#>   id  name type size height
#> 1  1   one    A   10      1
#> 2  2   one    A   11      1
#> 3  3   two    B   12      2
#> 4  4   two    C   12      3
#> 5  5 three    D   13      4
#> 6  6 three    D   13      4
#> 7  7 three    D   14      5
getChangesOfDuplicates(df, "name")
#> [[1]]
#>   name id size .n
#> 1  one  1   10  1
#> 2  one  2   11  1
#>
#> [[2]]
#>    name id size height .n
#> 1 three  5   13      4  1
#> 2 three  6   13      4  1
#> 3 three  7   14      5  1
#>
#> [[3]]
#>   name id type height .n
#> 1  two  3    B      2  1
#> 2  two  4    C      3  1
#>
getChangesOfDuplicates(df, c("name", "type"))
#> [[1]]
#>   name type id size .n
#> 1  one    A  1   10  1
#> 2  one    A  2   11  1
#>
#> [[2]]
#>    name type id size height .n
#> 1 three    D  5   13      4  1
#> 2 three    D  6   13      4  1
#> 3 three    D  7   14      5  1
#>