How to subset a data.table object in R by specifying columns that contain NA?


To subset a data.table object by specifying columns that contain NA, we can follow the steps below −

  • First of all, create a data.table object with some columns containing NAs.

  • Then, use is.na along with the subset function to keep the rows in which the specified columns contain NA.

Example

Create the data.table object

Let’s create a data.table object as shown below −

library(data.table)
x1<-sample(c(NA,round(rnorm(2),2)),25,replace=TRUE)
x2<-sample(c(NA,round(rnorm(3),2)),25,replace=TRUE)
x3<-sample(c(NA,round(rnorm(3),2)),25,replace=TRUE)
x4<-sample(c(NA,round(rnorm(2),2)),25,replace=TRUE)
DT<-data.table(x1,x2,x3,x4)
DT

Output

On executing, the above script generates the output below (this output will vary on your system due to randomization) −

     x1     x2   x3    x4
1:  -2.34 -0.57  NA    NA
2:  -2.34 -0.57 -0.85 -0.47
3:   NA   -0.57  NA   -0.47
4:  -2.34 -0.57 -0.84  0.69
5:   NA   -0.57  1.82  0.69
6:   1.14 -2.03  1.82  NA
7:  -2.34  NA   -0.84  NA
8:   1.14  0.63 -0.85  NA
9:   NA    NA   -0.84 -0.47
10:  1.14  NA    NA   -0.47
11: -2.34  NA   -0.84  NA
12:  NA    NA   -0.85  NA
13:  1.14  0.63 -0.84  NA
14: -2.34  0.63 -0.84  NA
15: -2.34 -2.03  1.82  NA
16:  NA   -2.03  1.82  NA
17:  NA    NA    NA   -0.47
18:  1.14 -2.03  NA    NA
19:  NA    0.63  1.82  NA
20: -2.34  NA    1.82 -0.47
21:  1.14  0.63  NA    NA
22:  1.14  NA   -0.85 -0.47
23: -2.34 -2.03  NA   -0.47
24:  1.14  0.63  1.82 -0.47
25: -2.34  NA    NA    0.69
    x1     x2     x3   x4
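
Because the values come from sample and rnorm, each run gives a different table. One way to get a repeatable table is to fix the random seed before sampling. The following is a minimal sketch, assuming an arbitrary seed value of 123 and the names DT_fixed, y1 and y2, none of which come from the example above −

```r
library(data.table)

set.seed(123)  # arbitrary seed, chosen only for illustration
x1 <- sample(c(NA, round(rnorm(2), 2)), 25, replace = TRUE)
x2 <- sample(c(NA, round(rnorm(3), 2)), 25, replace = TRUE)
DT_fixed <- data.table(x1, x2)

# Resetting the same seed and repeating the same calls reproduces the values
set.seed(123)
y1 <- sample(c(NA, round(rnorm(2), 2)), 25, replace = TRUE)
y2 <- sample(c(NA, round(rnorm(3), 2)), 25, replace = TRUE)
identical(x1, y1) && identical(x2, y2)
```

The last line checks that both runs drew the same values, so any subset taken from DT_fixed can be reproduced later.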

Subset data.table object by specifying columns having NAs

Use is.na along with the subset function to keep the rows of DT in which column x1 or column x2 contains NA, as shown below −

library(data.table)
# DT is the data.table created in the previous step; re-creating it here
# would draw new random values and the output below would no longer match
subset(DT,is.na(x1)|is.na(x2))

Output

     x1   x2    x3    x4
1:   NA  -0.57 NA   -0.47
2:   NA  -0.57 1.82  0.69
3:  -2.34 NA  -0.84  NA
4:   NA   NA  -0.84 -0.47
5:   1.14 NA   NA   -0.47
6:  -2.34 NA  -0.84  NA
7:   NA   NA  -0.85  NA
8:   NA  -2.03 1.82  NA
9:   NA   NA   NA   -0.47
10:  NA   0.63 1.82  NA
11: -2.34 NA   1.82 -0.47
12:  1.14 NA  -0.85 -0.47
13: -2.34 NA   NA    0.69
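
The same filter can also be written with data.table's own bracket syntax, and the columns to check can be given as a character vector. The following is a minimal sketch on a small hand-written table; the values and the names cols, has_na, by_bracket and by_names are illustrative assumptions, not part of the example above −

```r
library(data.table)

# Small hand-written table; values are illustrative only
DT <- data.table(x1 = c(NA, 1.14, -2.34, NA),
                 x2 = c(-0.57, NA, 0.63, NA),
                 x3 = c(1.82, -0.84, NA, -0.85))

# Same condition as subset(DT, is.na(x1) | is.na(x2)), passed as the i argument
by_bracket <- DT[is.na(x1) | is.na(x2)]

# Columns to check, named in a character vector
cols <- c("x1", "x2")

# TRUE for every row with at least one NA among the chosen columns
has_na <- rowSums(is.na(DT[, cols, with = FALSE])) > 0
by_names <- DT[has_na]

# Both approaches should select the same rows
all.equal(by_bracket, by_names)
```

The character-vector form may be convenient when the list of columns is long or is built at run time, since the condition does not have to be typed out column by column.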

Updated on: 15-Nov-2021
