How to subset a data frame based on vector values in R?


If a data frame has a column whose values overlap with the values in a vector, we can subset the data frame to the rows where that column matches any value in the vector. This is done with single square brackets and the %in% operator. The %in% operator returns TRUE for each value in the data frame column that appears in the vector and FALSE otherwise, and the square brackets keep only the rows marked TRUE. Check out the examples below to understand how it works.
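Before the randomly generated examples, it may help to look at the logical vector that %in% produces. The following is a minimal sketch on a small fixed data set; the names scores, keep, and mask are hypothetical and used only for illustration.

```r
# Hypothetical data, not taken from the examples in this article
scores <- data.frame(id = c(1, 2, 3, 4, 5),
                     grp = c("a", "b", "c", "a", "d"))
keep <- c("a", "c")

# %in% returns one TRUE/FALSE per element of the left-hand side
mask <- scores$grp %in% keep
mask

# Use the logical vector as the row index; the empty column index keeps all columns
result <- scores[mask, ]
result
```

Here mask is TRUE for rows 1, 3 and 4, so only those rows remain in result, and they keep their original row numbers.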

Example 1

Consider the below data frame df1 and vector v1 −

> x1<-rpois(20,2)
> x2<-rnorm(20)
> df1<-data.frame(x1,x2)
> df1

Output

x1 x2
1 2 -1.0627997
2 4 -0.2159125
3 1 0.2443734
4 3 -1.3513780
5 3 1.7359994
6 1 1.2563915
7 1 -0.8998470
8 2 0.4187820
9 1 2.6305826
10 4 -0.8040052
11 4 0.4067659
12 3 -1.7879203
13 3 1.7214544
14 2 -0.4699642
15 2 0.3626548
16 4 1.3013632
17 2 -0.2983836
18 1 1.8943313
19 1 1.5637219
20 2 0.8786897

Sub-setting the data frame df1 based on values in vector v1 −

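> v1<-c(1,2,3)   # assumed: v1 is not defined in the original; these values are consistent with the output below
> df1[df1$x1 %in% v1,]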

Output

x1 x2
1 2 -1.0627997
3 1 0.2443734
4 3 -1.3513780
5 3 1.7359994
6 1 1.2563915
7 1 -0.8998470
8 2 0.4187820
9 1 2.6305826
12 3 -1.7879203
13 3 1.7214544
14 2 -0.4699642
15 2 0.3626548
17 2 -0.2983836
18 1 1.8943313
19 1 1.5637219
20 2 0.8786897

Example 2

> y1<-sample(LETTERS[1:5],20,replace=TRUE)
> y2<-rpois(20,2)
> y3<-rpois(20,5)
> df2<-data.frame(y1,y2,y3)
> df2

Output

y1 y2 y3
1 C 0 5
2 A 2 5
3 A 2 1
4 D 1 6
5 B 0 4
6 E 6 9
7 E 0 5
8 C 1 9
9 D 1 6
10 D 2 6
11 A 4 5
12 D 1 6
13 E 1 5
14 E 2 6
15 C 5 4
16 A 0 3
17 D 2 5
18 B 1 10
19 E 3 3
20 A 2 1

Sub-setting the data frame df2 based on values in vector v2 −

> v2<-c("A","B","C","D")
> df2[df2$y1 %in% v2,]

Output

y1 y2 y3
1 C 0 5
2 A 2 5
3 A 2 1
4 D 1 6
5 B 0 4
8 C 1 9
9 D 1 6
10 D 2 6
11 A 4 5
12 D 1 6
15 C 5 4
16 A 0 3
17 D 2 5
18 B 1 10
20 A 2 1
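The same logical vector can be negated with ! to get the complementary subset, that is, the rows whose column value does not appear in the vector. This is a minimal sketch on a small fixed data frame; the names df, v, and excluded are hypothetical and separate from df2 and v2 above.

```r
# Hypothetical data, not the article's df2
df <- data.frame(y1 = c("A", "E", "C", "E", "B"),
                 y2 = c(2, 6, 1, 0, 1))
v <- c("A", "B", "C", "D")

# Negate the logical vector to keep rows whose y1 is NOT in v
excluded <- df[!(df$y1 %in% v), ]
excluded
```

Only the rows with y1 equal to "E" are returned, and as in the outputs above they keep their original row numbers (2 and 4).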

Updated on: 04-Mar-2021
