How to extract unique rows by categorical column of a data.table object in R?


If a data.table object contains categorical data and some of its rows are duplicated, we might want to extract only the unique rows from that object.

To extract unique rows by one or more columns of a data.table object, we can use the unique function and name the columns in its by argument. Rows that repeat an earlier combination of values in those columns are dropped, and the first occurrence of each combination is kept. The examples below show how the extraction is done.
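As a minimal sketch of how the by argument changes the result, the snippet below uses a small hand-made table (the object DT and its values are made up for illustration and are not part of the examples that follow). Passing only the categorical column keeps one row per category, while passing both columns keeps one row per combination of values.

```r
library(data.table)

# Hypothetical toy data, chosen so the duplicates are easy to see
DT <- data.table(grp = c("A", "A", "B", "B", "A"),
                 Score = c(1, 1, 2, 3, 1))

# One row per distinct value of grp (the first row of each category is kept)
by_grp <- unique(DT, by = "grp")

# One row per distinct (grp, Score) combination
by_both <- unique(DT, by = c("grp", "Score"))

by_grp
by_both
```

Here by_grp has two rows (one for A and one for B) and by_both has three, because B appears with two different scores.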

Example 1

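The following snippet creates a data.table object −

library(data.table)
# grp is a random categorical column (letters A to D), Score is a random Poisson count.
# No seed is set, so the values differ on every run.
grp<-sample(LETTERS[1:4],20,replace=TRUE)
Score<-rpois(20,5)
DT1<-data.table(grp,Score)
DT1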

The following data.table object is created −

    grp Score
 1:   D     3
 2:   B     3
 3:   B     4
 4:   B     3
 5:   D     5
 6:   B     7
 7:   B     4
 8:   D     1
 9:   A     4
10:   A     3
11:   B     2
12:   A     5
13:   B     4
14:   A     5
15:   D     4
16:   D     3
17:   D     4
18:   D     7
19:   B     3
20:   B     2

To extract unique rows in DT1, add the following code to the above snippet −

unique(DT1,by=c("grp","Score"))

Output

Running all of the above snippets as a single program produces the following output −

    grp Score
 1:   D     3
 2:   B     3
 3:   B     4
 4:   D     5
 5:   B     7
 6:   D     1
 7:   A     4
 8:   A     3
 9:   B     2
10:   A     5
11:   D     4
12:   D     7
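To check how many rows were dropped in Example 1, one option is to count the distinct combinations with uniqueN and the repeated rows with duplicated, both of which accept the same by argument in data.table. The sketch below types in the Example 1 values from the printed table so that the result is reproducible; it is a verification aid under that assumption, not part of the original snippet.

```r
library(data.table)

# Example 1 data typed in from the printed table, so results are reproducible
DT1 <- data.table(
  grp   = c("D", "B", "B", "B", "D", "B", "B", "D", "A", "A",
            "B", "A", "B", "A", "D", "D", "D", "D", "B", "B"),
  Score = c(3, 3, 4, 3, 5, 7, 4, 1, 4, 3,
            2, 5, 4, 5, 4, 3, 4, 7, 3, 2)
)

# Number of distinct (grp, Score) combinations
n_unique <- uniqueN(DT1, by = c("grp", "Score"))

# Number of rows that repeat an earlier (grp, Score) combination
n_dup <- sum(duplicated(DT1, by = c("grp", "Score")))

n_unique  # 12, the number of rows kept by unique()
n_dup     # 8, the number of rows dropped by unique()
```

The two counts add up to the 20 rows of DT1.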

Example 2

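The following snippet creates a data.table object −

# Assumes library(data.table) has already been loaded, as in Example 1.
# The values are random and no seed is set, so they differ on every run.
Category<-sample(c("Low","Medium","High"),20,replace=TRUE)
Price<-sample(1:10,20,replace=TRUE)
DT2<-data.table(Category,Price)
DT2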

The following data.table object is created −

    Category Price
 1:     High     7
 2:   Medium     5
 3:      Low     1
 4:   Medium     5
 5:   Medium     5
 6:   Medium     8
 7:      Low     2
 8:   Medium     4
 9:   Medium     7
10:   Medium     3
11:   Medium     4
12:   Medium    10
13:     High     7
14:   Medium     3
15:      Low     8
16:      Low     2
17:      Low     6
18:   Medium     2
19:     High     6
20:     High     4

To extract unique rows in DT2, add the following code to the above snippet −

unique(DT2,by=c("Category","Price"))

Output

Running all of the above snippets as a single program produces the following output −

    Category Price
 1:     High     7
 2:   Medium     5
 3:      Low     1
 4:   Medium     8
 5:      Low     2
 6:   Medium     4
 7:   Medium     7
 8:   Medium     3
 9:   Medium    10
10:      Low     8
11:      Low     6
12:   Medium     2
13:     High     6
14:     High     4
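If only one row per category is wanted, the categorical column can be passed to by on its own. The sketch below re-creates the Example 2 values from the printed table (an assumption made so the result is reproducible) and also uses the fromLast argument of unique for data.table objects, which keeps the last row of each category instead of the first.

```r
library(data.table)

# Example 2 data typed in from the printed table, so results are reproducible
DT2 <- data.table(
  Category = c("High", "Medium", "Low", "Medium", "Medium", "Medium", "Low",
               "Medium", "Medium", "Medium", "Medium", "Medium", "High", "Medium",
               "Low", "Low", "Low", "Medium", "High", "High"),
  Price    = c(7, 5, 1, 5, 5, 8, 2,
               4, 7, 3, 4, 10, 7, 3,
               8, 2, 6, 2, 6, 4)
)

# First row seen for each Category
first_rows <- unique(DT2, by = "Category")

# Last row seen for each Category
last_rows <- unique(DT2, by = "Category", fromLast = TRUE)

first_rows
last_rows
```

Both results have three rows, one per category. With this data the first rows carry the prices 7 (High), 5 (Medium) and 1 (Low), and the last rows carry 4 (High), 2 (Medium) and 6 (Low).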

Updated on: 11-Nov-2021
