Get Changes of Rows That Are Duplicated in Selected Columns
Source: R/utils_berlin.R, getChangesOfDuplicates.Rd
Arguments
- df
a data frame
- columns
names of columns in df in which to look for duplicate value combinations
- add_columns
names of additional columns that shall appear in the output even if there are no changes in these columns
Value
A list of data frames. The list has as many elements as there are different value combinations in the selected columns (argument columns) that appear more than once in df. Each element is a data frame with all rows from df that share the same value combination in these columns. By default, each data frame contains the columns given in columns and those columns of df whose values differ between at least two of these rows.
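The selection logic described under Value can be illustrated with a small base R sketch. This is not the package's implementation: the function name sketchChangesOfDuplicates is hypothetical, the sketch only follows the description above, and it does not add the .n column that appears in the output under Examples.

```r
# Hypothetical helper, NOT the package's implementation: a base R sketch
# of the behaviour described in the Value section.
sketchChangesOfDuplicates <- function(df, columns, add_columns = NULL) {
  # One key string per row, built from the value combination in `columns`
  keys <- do.call(paste, c(unname(as.list(df[columns])), sep = "\r"))

  # Group the rows by key and keep only groups with more than one row
  groups <- split(df, keys)
  groups <- groups[vapply(groups, nrow, integer(1)) > 1L]

  lapply(unname(groups), function(group) {
    # Columns in which the rows of this group do not all agree
    changed <- names(group)[vapply(
      group, function(x) length(unique(x)) > 1L, logical(1)
    )]
    # Key columns first, then the requested extras, then changed columns
    result <- group[unique(c(columns, add_columns, changed))]
    rownames(result) <- NULL
    result
  })
}
```

Called as sketchChangesOfDuplicates(df, "name") on the data frame from Examples, the sketch returns three data frames whose columns are name, id, size; name, id, size, height; and name, id, type, height. Passing add_columns = "type" additionally keeps type in every element, even where it does not change.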
Examples
df <- data.frame(
  id = 1:7,
  name = c("one", "one", "two", "two", "three", "three", "three"),
  type = c("A", "A", "B", "C", "D", "D", "D"),
  size = c(10, 11, 12, 12, 13, 13, 14),
  height = c(1, 1, 2, 3, 4, 4, 5)
)
df
#>   id  name type size height
#> 1  1   one    A   10      1
#> 2  2   one    A   11      1
#> 3  3   two    B   12      2
#> 4  4   two    C   12      3
#> 5  5 three    D   13      4
#> 6  6 three    D   13      4
#> 7  7 three    D   14      5
getChangesOfDuplicates(df, "name")
#> [[1]]
#>   name id size .n
#> 1  one  1   10  1
#> 2  one  2   11  1
#>
#> [[2]]
#>    name id size height .n
#> 1 three  5   13      4  1
#> 2 three  6   13      4  1
#> 3 three  7   14      5  1
#>
#> [[3]]
#>   name id type height .n
#> 1  two  3    B      2  1
#> 2  two  4    C      3  1
#>
getChangesOfDuplicates(df, c("name", "type"))
#> [[1]]
#>   name type id size .n
#> 1  one    A  1   10  1
#> 2  one    A  2   11  1
#>
#> [[2]]
#>    name type id size height .n
#> 1 three    D  5   13      4  1
#> 2 three    D  6   13      4  1
#> 3 three    D  7   14      5  1
#>