Remove Duplicate Rows from R Data Frame

In this tutorial, we will learn how to remove duplicate rows from an R data frame.

To remove duplicate rows from an R data frame, use the unique() function with the following syntax:

newDataFrame = unique(redundantDataFrame)

where

  • redundantDataFrame is the data frame with duplicate rows.
  • newDataFrame is the data frame with all the duplicate rows removed.
  • unique is the base R function (not a keyword) that returns its argument with duplicate rows removed.
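
As a minimal sketch of this syntax, assume a small hypothetical data frame whose second and third rows are identical. The column names id and name are made up for illustration.

```r
# Hypothetical data frame; rows 2 and 3 are identical
redundantDataFrame <- data.frame(id = c(1, 2, 2), name = c("a", "b", "b"))

# unique() returns a new data frame; the original is left unchanged
newDataFrame <- unique(redundantDataFrame)

nrow(redundantDataFrame)  # 3
nrow(newDataFrame)        # 2
```

Because unique() does not modify its argument, the result must be assigned to a variable (or back to the same name) to be kept.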

Example 1 – Remove Duplicate Rows in R Data Frame

In this example, we will create a data frame in which one row is a duplicate of another. We shall use the unique() function to remove the duplicate row.

> DF1 = data.frame(C1= c(1, 5, 14, 1, 54), C2= c(9, 15, 85, 9, 42), C3= c(8, 7, 42, 8, 16))
> DF1
  C1 C2 C3
1  1  9  8
2  5 15  7
3 14 85 42
4  1  9  8
5 54 42 16
>

Row 1 and Row 4 are duplicates. When we run the unique() function, it keeps the first occurrence of each row (Row 1 here) and removes any later duplicates (Row 4).

> DF2 = unique(DF1)
> DF2
  C1 C2 C3
1  1  9  8
2  5 15  7
3 14 85 42
5 54 42 16
>
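
To see which rows unique() treats as duplicates, base R's duplicated() function returns a logical vector that marks every row repeating an earlier one. The sketch below re-creates DF1 from this example; the name DF3 is introduced here only for illustration. It also shows that the surviving rows keep their original row names (1, 2, 3, 5), which can be renumbered by assigning NULL to rownames().

```r
DF1 <- data.frame(C1 = c(1, 5, 14, 1, 54),
                  C2 = c(9, 15, 85, 9, 42),
                  C3 = c(8, 7, 42, 8, 16))

# TRUE marks a row that repeats an earlier row
duplicated(DF1)   # FALSE FALSE FALSE  TRUE FALSE

# Dropping the flagged rows keeps the same rows as unique(DF1)
DF3 <- DF1[!duplicated(DF1), ]

# Renumber the rows 1 to 4 instead of 1, 2, 3, 5
rownames(DF3) <- NULL
```

Resetting the row names is optional; it only matters if later code relies on consecutive row numbers.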

Conclusion

In this R Tutorial, we have learnt how to remove duplicate rows from an R data frame using the unique() function.