Filter duplicated rows by the columns ...

filter_duplicates_dplyr(data, ...)



a data.frame-like object


columns (unquoted) used to identify duplicated rows


a tibble, a subset of data, containing only the rows that are duplicated with respect to the columns indicated by ..., with an additional column .ndups giving the number of duplicated rows in each group. The returned data are sorted by those columns, so that duplicated rows can be inspected together.
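
The behaviour described above can be approximated with a few dplyr verbs. The following is only a minimal sketch under assumptions: the name filter_duplicates_sketch and the toy data are hypothetical, and the package's actual implementation may differ. It groups by the selected columns, counts the rows per group, keeps groups with more than one row, and sorts by the same columns.

```r
library(dplyr)

# Hypothetical re-implementation for illustration only;
# not necessarily how filter_duplicates_dplyr() is written.
filter_duplicates_sketch <- function(data, ...) {
  data %>%
    as_tibble() %>%
    group_by(...) %>%            # one group per combination of the chosen columns
    mutate(.ndups = n()) %>%     # number of rows sharing that combination
    filter(.ndups > 1L) %>%      # keep only combinations that occur more than once
    arrange(...)                 # sort so that duplicates sit next to each other
}

# Toy data (made up): id 1 is unique, id 2 occurs twice, id 3 three times
toy <- tibble(
  id    = c(3, 1, 2, 3, 2, 3),
  value = c("a", "b", "c", "d", "e", "f")
)

dups <- filter_duplicates_sketch(toy, id)
dups
```

In this sketch the result stays grouped by the selected columns, which matches the Groups line in the printed output of the example; add ungroup() at the end if a plain tibble is preferred.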


filter_duplicates_dplyr(ggplot2::diamonds, carat, cut, price)
#> # A tibble: 24,957 × 11
#> # Groups:   carat, cut, price [6,477]
#>    carat cut     color clarity depth table price     x     y     z .ndups
#>    <dbl> <ord>   <ord> <ord>   <dbl> <dbl> <int> <dbl> <dbl> <dbl>  <int>
#>  1   0.2 Premium E     VS2      59.8    62   367  3.79  3.77  2.26      7
#>  2   0.2 Premium E     VS2      59      60   367  3.81  3.78  2.24      7
#>  3   0.2 Premium E     VS2      61.1    59   367  3.81  3.78  2.32      7
#>  4   0.2 Premium E     VS2      59.7    62   367  3.84  3.8   2.28      7
#>  5   0.2 Premium F     VS2      62.6    59   367  3.73  3.71  2.33      7
#>  6   0.2 Premium D     VS2      62.3    60   367  3.73  3.68  2.31      7
#>  7   0.2 Premium D     VS2      61.7    60   367  3.77  3.72  2.31      7
#>  8   0.2 Ideal   E     VS2      59.7    55   367  3.86  3.84  2.3       3
#>  9   0.2 Ideal   D     VS2      61.5    57   367  3.81  3.77  2.33      3
#> 10   0.2 Ideal   E     VS2      62.2    57   367  3.76  3.73  2.33      3
#> # ℹ 24,947 more rows