How to select values less than or greater than a specific percentile from an R data frame column?


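Percentiles divide a sorted set of numeric values into 100 groups of equal size; the pth percentile is the value below which roughly p percent of the values fall. Because we can compute percentiles for a numeric column of an R data frame, we can also select the values of that column that lie below or above a given percentile. For this purpose, we can use the quantile function, which takes the percentile as a probability between 0 and 1 (for example, 0.75 for the 75th percentile).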
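
As a quick illustration of what quantile returns before it is used for filtering, here is a minimal sketch. The vector v and the variable median_cutoff are hypothetical names chosen for this example and are not part of the data frame used in the rest of the article.

```r
# Hypothetical sample values, not taken from the article's data frame
v <- c(1, 2, 3, 4, 5)

# A single probability gives a single cutoff, labelled "50%"
quantile(v, 0.5)

# Several probabilities can be passed at once
quantile(v, c(0.25, 0.75))

# names = FALSE drops the label and returns a bare number
median_cutoff <- quantile(v, 0.5, names = FALSE)
median_cutoff
```

With R's default quantile method, the cutoff is interpolated between neighbouring sorted values when it does not fall exactly on one of them, so it need not be a value that occurs in the column.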

Example

Consider the below data frame −

set.seed(111)
x<-sample(0:15,30,replace=TRUE)
df<-data.frame(x)
df

Output

    x
1  13
2  10
3   3
4   2
5  14
6   8
7  10
8   4
9   2
10 11
11  7
12  9
13  0
14 12
15 12
16  9
17  3
18 14
19  7
20 13
21  9
22  8
23  7
24 10
25 15
26  0
27 10
28  6
29  4
30 14

Finding values that are less than a certain percentile for x in df (df has only one column, so each result is returned as a plain vector) −

df[df$x<quantile(df$x,0.75),]

[1] 10 3 2 8 10 4 2 11 7 9 0 9 3 7 9 8 7 10 0 10 6 4

df[df$x<quantile(df$x,0.25),]

[1] 3 2 4 2 0 3 0 4

df[df$x<quantile(df$x,0.50),]

[1] 3 2 8 4 2 7 0 3 7 8 7 0 6 4

df[df$x<quantile(df$x,0.90),]

[1] 13 10 3 2 8 10 4 2 11 7 9 0 12 12 9 3 7 13 9 8 7 10 0 10 6
[26] 4

df[df$x<quantile(df$x,0.95),]

[1] 13 10 3 2 8 10 4 2 11 7 9 0 12 12 9 3 7 13 9 8 7 10 0 10 6
[26] 4

df[df$x<quantile(df$x,0.99),]

[1] 13 10 3 2 14 8 10 4 2 11 7 9 0 12 12 9 3 14 7 13 9 8 7 10 0
[26] 10 6 4 14

df[df$x<quantile(df$x,0.10),]

[1] 0 0

df[df$x<quantile(df$x,0.05),]

[1] 0 0

df[df$x<quantile(df$x,0.20),]

[1] 3 2 2 0 3 0

df[df$x<quantile(df$x,0.30),]

[1] 3 2 4 2 0 3 0 6 4
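
The comparisons above use <, so values exactly equal to the percentile are left out. The following sketch shows how the result changes with <=. It assumes the column values are exactly those printed in the Output section (they are typed in by hand here so the result does not depend on the random number generator), and the names cutoff, below and at_or_below are introduced only for this illustration.

```r
# Values copied from the data frame printed in the Output section
x <- c(13, 10, 3, 2, 14, 8, 10, 4, 2, 11, 7, 9, 0, 12, 12,
       9, 3, 14, 7, 13, 9, 8, 7, 10, 15, 0, 10, 6, 4, 14)
df <- data.frame(x)

# Compute the cutoff once and reuse it
cutoff <- quantile(df$x, 0.50, names = FALSE)

below <- df[df$x < cutoff, ]         # strictly below the 50th percentile
at_or_below <- df[df$x <= cutoff, ]  # also keeps values equal to it

length(below)        # 14, matching the 0.50 output above
length(at_or_below)  # 17, because three values equal the cutoff of 9
```

Storing the cutoff in a variable also avoids recomputing the quantile when the same threshold is used in more than one filter.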

Finding values that are greater than a certain percentile for x in df −

df[df$x>quantile(df$x,0.05),]

[1] 13 10 3 2 14 8 10 4 2 11 7 9 12 12 9 3 14 7 13 9 8 7 10 15 10
[26] 6 4 14

df[df$x>quantile(df$x,0.10),]

[1] 13 10 3 14 8 10 4 11 7 9 12 12 9 3 14 7 13 9 8 7 10 15 10 6 4
[26] 14

df[df$x>quantile(df$x,0.20),]

[1] 13 10 14 8 10 4 11 7 9 12 12 9 14 7 13 9 8 7 10 15 10 6 4 14

df[df$x>quantile(df$x,0.25),]

[1] 13 10 14 8 10 11 7 9 12 12 9 14 7 13 9 8 7 10 15 10 6 14

df[df$x>quantile(df$x,0.60),]

[1] 13 14 11 12 12 14 13 15 14

df[df$x>quantile(df$x,0.70),]

[1] 13 14 11 12 12 14 13 15 14

df[df$x>quantile(df$x,0.75),]

[1] 13 14 12 12 14 13 15 14

df[df$x>quantile(df$x,0.80),]

[1] 13 14 14 13 15 14

df[df$x>quantile(df$x,0.90),]

[1] 15

df[df$x>quantile(df$x,0.95),]

[1] 15
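
All of the results above are plain vectors because the data frame has a single column and R drops the data frame structure by default. If the filtered rows should stay a data frame, with their original row names, one option is drop = FALSE or subset. This is a minimal sketch on a small hypothetical data frame; scores, cutoff, kept and kept2 are names made up for the example.

```r
# Hypothetical data frame with a single numeric column
scores <- data.frame(x = c(5, 1, 9, 3, 7))

# 50th percentile of the column; for these values it is 5
cutoff <- quantile(scores$x, 0.50, names = FALSE)

# drop = FALSE keeps the data frame structure and the row names
kept <- scores[scores$x > cutoff, , drop = FALSE]
kept

# subset() selects the same rows and also returns a data frame
kept2 <- subset(scores, x > cutoff)
kept2
```

Here kept contains rows 3 and 5 (the values 9 and 7), which makes it easy to trace each selected value back to its position in the original data.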

Updated on: 08-Oct-2020
