What is the difference between na.omit and complete.cases in R?


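The na.omit function removes every row that contains at least one missing value from a data frame, and complete.cases gives the same result when it is applied to the whole data frame and used as a row index. The main difference is that complete.cases returns a logical vector (one TRUE or FALSE per row) instead of a filtered data frame, so it can be computed on selected columns only and then used to subset the full data frame. The example below illustrates the difference.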
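
As a minimal sketch of this difference, the snippet below uses a small hypothetical data frame `d` (it is not part of the example that follows) to show what each function returns: na.omit gives back a filtered data frame, while complete.cases gives back a logical vector that still has to be used as a row index.

```r
# Hypothetical three-row data frame, used only for illustration
d <- data.frame(a = c(1, NA, 3), b = c("u", "v", NA))

# na.omit returns the data frame with the incomplete rows dropped
omitted <- na.omit(d)

# complete.cases returns one TRUE/FALSE value per row
keep <- complete.cases(d)

print(class(omitted))  # "data.frame"
print(keep)            # TRUE FALSE FALSE
print(d[keep, ])       # the same single row that na.omit(d) keeps
```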

Example

Consider the below data frame:

> set.seed(2584)
> x <- sample(c(NA, 2, 8, 6, 5, 4), 20, replace = TRUE)
> y <- sample(c(NA, 5, 25), 20, replace = TRUE)
> df <- data.frame(x, y)
> df

Output

    x  y
1  NA 25
2   5  5
3   8 NA
4   6  5
5   4 NA
6   4  5
7   6 NA
8   4 NA
9   4  5
10  8  5
11  8  5
12  6 25
13  5 25
14  6  5
15  5  5
16  4  5
17 NA 25
18  8 NA
19  4 NA
20  8  5

Applying na.omit to df:

Example

> na.omit(df)

Output

   x  y
2  5  5
4  6  5
6  4  5
9  4  5
10 8  5
11 8  5
12 6 25
13 5 25
14 6  5
15 5  5
16 4  5
20 8  5

Applying complete.cases to df:

Example

> df[complete.cases(df),]

Output

   x  y
2  5  5
4  6  5
6  4  5
9  4  5
10 8  5
11 8  5
12 6 25
13 5 25
14 6  5
15 5  5
16 4  5
20 8  5
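
The two outputs above contain the same rows. One difference that the printed output does not show is that na.omit also records which rows it dropped in an `na.action` attribute, while plain subsetting with complete.cases does not. The sketch below illustrates this on a small hypothetical data frame `d`, which is an assumption made for illustration and not the `df` from the example.

```r
# Hypothetical four-row data frame, used only for illustration
d <- data.frame(x = c(NA, 5, 8, 6), y = c(25, 5, NA, 5))

a <- na.omit(d)                # filtered data frame
b <- d[complete.cases(d), ]    # same rows, selected by a logical index

# Both approaches keep the same rows and the same values
same_rows   <- identical(rownames(a), rownames(b))
same_values <- identical(a$x, b$x) && identical(a$y, b$y)
print(same_rows)    # TRUE
print(same_values)  # TRUE

# Only the na.omit result remembers which rows were removed
print(as.integer(attr(a, "na.action")))  # 1 3
print(is.null(attr(b, "na.action")))     # TRUE
```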

Applying complete.cases to df to remove only the rows with missing values in column 1:

Example

> df[complete.cases(df[,1]),]

Output

   x  y
2  5  5
3  8 NA
4  6  5
5  4 NA
6  4  5
7  6 NA
8  4 NA
9  4  5
10 8  5
11 8  5
12 6 25
13 5 25
14 6  5
15 5  5
16 4  5
18 8 NA
19 4 NA
20 8  5
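
For a single column, complete.cases marks a row as complete exactly when that value is not NA, so the same filter can also be written with `is.na`. The sketch below checks this on a small hypothetical data frame `d`, introduced here only for illustration.

```r
# Hypothetical three-row data frame, used only for illustration
d <- data.frame(x = c(NA, 5, 8), y = c(25, NA, 5))

# Filter on column 1 with complete.cases
by_complete_cases <- d[complete.cases(d[, 1]), ]

# Equivalent filter on the same column with is.na
by_is_na <- d[!is.na(d$x), ]

print(identical(by_complete_cases, by_is_na))  # TRUE
print(by_complete_cases$y)                     # NA 5 (NAs in y are kept)
```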

Applying complete.cases to df to remove only the rows with missing values in column 2:

Example

> df[complete.cases(df[,2]),]

Output

    x  y
1  NA 25
2   5  5
4   6  5
6   4  5
9   4  5
10  8  5
11  8  5
12  6 25
13  5 25
14  6  5
15  5  5
16  4  5
17 NA 25
20  8  5
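
The same idea extends to more than one column: complete.cases can be given a subset of columns, and negating its result shows the rows that would be dropped. The following is a minimal sketch on a hypothetical data frame `d` with columns `id`, `score`, `grade` and `note`. All of these names are assumptions made for illustration.

```r
# Hypothetical data frame with four columns, used only for illustration
d <- data.frame(
  id    = 1:5,
  score = c(10, NA, 30, NA, 50),
  grade = c("A", "B", NA, NA, "C"),
  note  = c(NA, "ok", "ok", NA, NA)
)

# TRUE for rows that are complete in score and grade; NAs in note are ignored
ok <- complete.cases(d[, c("score", "grade")])

kept    <- d[ok, ]    # rows to keep
dropped <- d[!ok, ]   # rows that fail the check, useful for inspection

print(kept$id)            # 1 5
print(dropped$id)         # 2 3 4
print(nrow(na.omit(d)))   # 0, because every row has an NA in some column
```

In this sketch na.omit would discard every row, while the column-restricted complete.cases filter keeps the two rows that are complete in the columns of interest.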

Updated on: 21-Nov-2020
