Remove rows of R Data Frame with one or more NAs
In this tutorial, we will learn how to remove rows of a data frame that contain one or more NAs as column values.
To remove rows of a data frame with one or more NAs, use the complete.cases() function as shown below:
resultDF = myDataframe[complete.cases(myDataframe),]
where
myDataframe is the data frame containing rows with one or more NAs
resultDF is the resulting data frame, containing only the rows that have no NA
Example 1 – Remove rows with NA in Data Frame
In this example, we will create a data frame with some of the rows containing NAs.
> DF1 = data.frame(x = c(9, NA, 7, 4), y = c(4, NA, NA, 21))
> DF1
   x  y
1  9  4
2 NA NA
3  7 NA
4  4 21
In the second row we have all the column values as NA. In the third row, we have some columns with NA and some with numbers.
Now, we will use the complete.cases() function to remove the rows of the data frame that contain NAs:
> resultDF = DF1[complete.cases(DF1), ]
> resultDF
  x  y
1 9  4
4 4 21
resultDF contains only the rows in which none of the values is NA.
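To see why this works, it can help to look at what complete.cases() returns on its own. The following is a minimal sketch that rebuilds DF1 from this example; the variable name keep is introduced here only for illustration and is not part of the original example.

```r
# Rebuild the data frame from Example 1
DF1 = data.frame(x = c(9, NA, 7, 4), y = c(4, NA, NA, 21))

# complete.cases() returns one logical value per row:
# TRUE when the row has no NA, FALSE otherwise
keep = complete.cases(DF1)  # 'keep' is an illustrative name
print(keep)

# Indexing the rows with that logical vector keeps only the TRUE rows
resultDF = DF1[keep, ]
print(resultDF)
```

For DF1, keep is TRUE only for the first and fourth rows, which are the two rows that remain in resultDF.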
Remove rows of R Data Frame with all NAs
In the previous example, complete.cases() kept only the rows without any missing values. In this section, we will remove only the rows in which every value is NA, and keep the rows that have some NAs but not all.
To remove the rows of a data frame in which all values are NA, use data frame subsetting as shown below:
resultDF = mydataframe[rowSums(is.na(mydataframe)) < ncol(mydataframe), ]
where
mydataframe is the data frame containing rows in which all values are NA
resultDF is the resulting data frame, with the all-NA rows removed
Let us understand what we have done here. First, we count the NAs in each row and compare that count with the number of columns in the data frame. If the count is less than the number of columns, the row has at least one value that is not NA, so we keep it. Rows whose NA count equals the number of columns are dropped.
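The steps described above can be sketched one at a time, as below. This is only an illustration using the same DF1 as the examples in this tutorial; the intermediate names naMatrix, naCount and notAllNA are assumptions made for readability. The sketch calls is.na() on the whole data frame, which covers every column.

```r
# Data frame from the examples in this tutorial
DF1 = data.frame(x = c(9, NA, 7, 4), y = c(4, NA, NA, 21))

# Step 1: logical matrix, TRUE wherever a value is NA
naMatrix = is.na(DF1)

# Step 2: number of NAs in each row (each TRUE counts as 1)
naCount = rowSums(naMatrix)
print(naCount)

# Step 3: TRUE for rows where at least one value is not NA
notAllNA = naCount < ncol(DF1)
print(notAllNA)

# Step 4: keep only those rows
resultDF = DF1[notAllNA, ]
print(resultDF)
```

For DF1 the NA counts are 0, 2, 1 and 0, and the data frame has 2 columns, so only the second row is dropped.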
Example 2 – Remove rows with all NAs in Data Frame
In this example, we will create a data frame with some of the rows containing NAs.
> DF1 = data.frame(x = c(9, NA, 7, 4), y = c(4, NA, NA, 21))
> DF1
   x  y
1  9  4
2 NA NA
3  7 NA
4  4 21
In the second row we have all the column values as NA. In the third row, we have some columns with NA and some with numbers.
Now, we will use data frame subsetting to remove the rows of the data frame in which all values are NA.
> resultDF = DF1[rowSums(is.na(DF1)) < ncol(DF1), ]
> resultDF
  x  y
1 9  4
3 7 NA
4 4 21
resultDF contains every row except those in which all values are NA. The third row is kept because only one of its values is NA.
Conclusion
In this R Tutorial, we learned how to remove rows of a data frame that contain one or more NAs as column values, and how to remove only the rows in which all values are NA.