How to replace the outliers with 5th and 95th percentile values in R?


There is no single definition of an outlier; researchers and analysts often set the cutoffs themselves. One common choice is to treat values below the 5th percentile as lower outliers and values above the 95th percentile as upper outliers, and to replace each of them with the corresponding percentile value. The squish function of the scales package does this when the two percentiles, computed with quantile, are passed as its range, as shown in the examples below.

Example1

library(scales)
x1 <- 1:10
x1 <- squish(x1, quantile(x1, c(0.05, 0.95)))
x1

Output

[1] 1.45 2.00 3.00 4.00 5.00 6.00 7.00 8.00 9.00 9.55
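
As a cross-check on this output, the same clamping can be sketched in base R. This is an illustrative equivalent, not a description of how scales implements squish internally. With its default settings, quantile() interpolates between neighbouring observations, which gives 1.45 and 9.55 for 1:10; pmax() and pmin() then pull any value outside those bounds onto them. The names bounds and clamped are introduced here only for illustration.

```r
x <- 1:10

# 5th and 95th percentiles; names = FALSE drops the "5%"/"95%" labels
bounds <- quantile(x, c(0.05, 0.95), names = FALSE)
bounds

# Raise values below the lower bound, then lower values above the upper bound
clamped <- pmin(pmax(x, bounds[1]), bounds[2])
clamped
```

Only the first and last elements change, because they are the only values lying outside the two percentiles.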

Example2

x2 <- c(-5, rnorm(78), 5)
x2

Output

[1] -5.00000000 -0.39993096 -0.11249038 1.06589235 1.17195813 0.15677178
[7] -0.08325310 0.57986817 -0.05529031 0.13352083 1.00608625 -0.86860404
[13] 0.53672576 -0.15262216 -0.81247587 -0.31263625 -1.51127713 -1.59689010
[19] -0.11242962 -1.08234352 -0.04935398 -0.65185804 -1.10369370 0.68732306
[25] 1.83448401 1.08689945 -1.20674408 -1.25753553 0.03354570 0.67981025
[31] 0.24871123 -1.49969111 1.19287825 1.04406030 -1.31756416 0.10204579
[37] 1.48272096 0.97661717 0.50006441 -1.36247153 0.99895292 -0.49534106
[43] -0.24105508 0.35006991 -2.16041158 -1.12644863 2.23190981 -0.51413222
[49] 0.03760280 -1.12237961 -1.54094088 -0.37365780 0.02138277 1.97702046
[55] 0.37190626 -0.59456892 -0.06652980 -1.04453387 -0.50884324 0.85025142
[61] -0.66718350 -0.69703588 0.44922344 0.64238500 -1.11403189 0.66251032
[67] 0.79601219 -0.74801795 -0.10957126 -0.90781918 -2.13721781 1.43186180
[73] -0.32571115 -0.97929747 1.10822193 0.94719910 0.58934102 -1.29942407
[79] 3.83469537 5.00000000

Example

x2 <- squish(x2, quantile(x2, c(0.05, 0.95)))
x2

Output

[1] -1.54373835 -0.39993096 -0.11249038 1.06589235 1.17195813 0.15677178
[7] -0.08325310 0.57986817 -0.05529031 0.13352083 1.00608625 -0.86860404
[13] 0.53672576 -0.15262216 -0.81247587 -0.31263625 -1.51127713 -1.54373835
[19] -0.11242962 -1.08234352 -0.04935398 -0.65185804 -1.10369370 0.68732306
[25] 1.83448401 1.08689945 -1.20674408 -1.25753553 0.03354570 0.67981025
[31] 0.24871123 -1.49969111 1.19287825 1.04406030 -1.31756416 0.10204579
[37] 1.48272096 0.97661717 0.50006441 -1.36247153 0.99895292 -0.49534106
[43] -0.24105508 0.35006991 -1.54373835 -1.12644863 1.84161083 -0.51413222
[49] 0.03760280 -1.12237961 -1.54094088 -0.37365780 0.02138277 1.84161083
[55] 0.37190626 -0.59456892 -0.06652980 -1.04453387 -0.50884324 0.85025142
[61] -0.66718350 -0.69703588 0.44922344 0.64238500 -1.11403189 0.66251032
[67] 0.79601219 -0.74801795 -0.10957126 -0.90781918 -1.54373835 1.43186180
[73] -0.32571115 -0.97929747 1.10822193 0.94719910 0.58934102 -1.29942407
[79] 1.84161083 1.84161083
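
Because rnorm draws new random numbers on every run, the values in Example 2 will differ from one session to the next. A minimal sketch of one way to make the example repeatable and to see which elements will be replaced before overwriting the vector is given below. The seed value 123 is an arbitrary assumption, and bounds, below and above are names introduced here for illustration.

```r
set.seed(123)  # arbitrary seed (an assumption); any fixed value makes the draw repeatable
x2 <- c(-5, rnorm(78), 5)

# Percentile cutoffs, computed before the vector is modified
bounds <- quantile(x2, c(0.05, 0.95), names = FALSE)

# Logical flags for the elements that fall outside the cutoffs
below <- x2 < bounds[1]
above <- x2 > bounds[2]

sum(below)             # number of values that would be raised to the 5th percentile
sum(above)             # number of values that would be lowered to the 95th percentile
which(below | above)   # positions of the affected elements
```

With 80 distinct values, four fall below the 5th percentile and four above the 95th, which matches the eight replaced entries in the output of Example 2.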

Example3

x3 <- c(-50, rpois(198, 5), 50)
x3

Output

[1] -50 5 4 8 6 2 1 6 3 5 7 7 8 5 8 8 5 8
[19] 3 2 3 0 5 6 2 6 6 2 7 5 9 4 5 3 9 7
[37] 4 3 6 5 2 4 9 5 7 1 2 4 2 3 5 5 6 1
[55] 5 7 1 9 6 3 5 4 3 9 5 4 6 8 4 4 6 4
[73] 5 2 4 5 5 7 8 6 3 5 8 5 8 5 2 5 2 8
[91] 6 6 5 7 2 2 5 5 4 3 5 3 7 2 4 6 8 6
[109] 3 4 9 2 2 2 4 4 6 6 5 5 3 5 3 6 6 4
[127] 6 4 4 5 9 6 2 1 3 8 5 7 5 6 6 5 7 2
[145] 8 8 6 5 3 4 5 10 6 6 3 6 2 7 7 5 8 7
[163] 7 3 4 8 4 4 6 8 3 6 4 10 4 3 5 4 4 5
[181] 4 5 4 5 4 5 6 8 2 5 12 12 3 6 5 4 4 5
[199] 5 50

Example

x3 <- squish(x3, quantile(x3, c(0.05, 0.95)))
x3

Output

[1] 2 5 4 8 6 2 2 6 3 5 7 7 8 5 8 8 5 8 3 2 3 2 5 6 2 6 6 2 7 5 9 4 5 3 9 7 4
[38] 3 6 5 2 4 9 5 7 2 2 4 2 3 5 5 6 2 5 7 2 9 6 3 5 4 3 9 5 4 6 8 4 4 6 4 5 2
[75] 4 5 5 7 8 6 3 5 8 5 8 5 2 5 2 8 6 6 5 7 2 2 5 5 4 3 5 3 7 2 4 6 8 6 3 4 9
[112] 2 2 2 4 4 6 6 5 5 3 5 3 6 6 4 6 4 4 5 9 6 2 2 3 8 5 7 5 6 6 5 7 2 8 8 6 5
[149] 3 4 5 9 6 6 3 6 2 7 7 5 8 7 7 3 4 8 4 4 6 8 3 6 4 9 4 3 5 4 4 5 4 5 4 5 4
[186] 5 6 8 2 5 9 9 3 6 5 4 4 5 5 9
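
To illustrate why this replacement can matter, the sketch below compares the standard deviation of a vector like the one in Example 3 before and after the extreme values are pulled in. It is only an illustration under stated assumptions: the seed 42 is arbitrary, the name x3_clamped is introduced here, and base R pmin()/pmax() stand in for squish so that the snippet runs without loading scales.

```r
set.seed(42)  # arbitrary seed (an assumption) so the draw is repeatable
x3 <- c(-50, rpois(198, 5), 50)

bounds <- quantile(x3, c(0.05, 0.95), names = FALSE)

# Clamp to the percentile bounds (base R stand-in for squish)
x3_clamped <- pmin(pmax(x3, bounds[1]), bounds[2])

# Spread before and after the replacement
c(before = sd(x3), after = sd(x3_clamped))

# The clamped vector now spans exactly the two percentiles
range(x3_clamped)
```

The two planted values of -50 and 50 dominate the spread of the original vector, so the standard deviation drops once they are replaced.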

Example4

x4 <- c(-50, rexp(48, 3.1), 50)
x4

Output

[1] -50.00000000 0.46067329 0.15298747 0.22637363 0.23424447
[6] 0.15467335 0.37455989 0.07762013 0.33175821 0.09303333
[11] 0.03806199 0.20649621 0.22883480 0.49089164 0.82497712
[16] 0.04780089 0.05156566 0.35638257 0.37319578 0.71100713
[21] 0.08649528 0.31543159 0.02263685 0.00963146 0.44814049
[26] 0.34506738 0.29533295 0.13803055 0.05497129 0.03901786
[31] 0.01818446 0.78122217 0.04863415 0.33353520 0.39530353
[36] 0.05385106 0.19991695 0.16913554 0.01549729 0.15901185
[41] 0.65120205 0.36483214 0.18226180 0.20708671 0.01590697
[46] 1.01257680 0.42223292 0.17291614 0.15793390 50.00000000

Example

x4 <- squish(x4, quantile(x4, c(0.05, 0.95)))
x4

Output

[1] 0.01568165 0.46067329 0.15298747 0.22637363 0.23424447 0.15467335
[7] 0.37455989 0.07762013 0.33175821 0.09303333 0.03806199 0.20649621
[13] 0.22883480 0.49089164 0.80528739 0.04780089 0.05156566 0.35638257
[19] 0.37319578 0.71100713 0.08649528 0.31543159 0.02263685 0.01568165
[25] 0.44814049 0.34506738 0.29533295 0.13803055 0.05497129 0.03901786
[31] 0.01818446 0.78122217 0.04863415 0.33353520 0.39530353 0.05385106
[37] 0.19991695 0.16913554 0.01568165 0.15901185 0.65120205 0.36483214
[43] 0.18226180 0.20708671 0.01590697 0.80528739 0.42223292 0.17291614
[49] 0.15793390 0.80528739
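
The pattern repeated in the examples above can be wrapped in a small function and applied to every column of a data frame. The sketch below is one possible way to do that, not part of the scales package: cap_outliers, its probs and na.rm arguments, and the data frame df are all hypothetical names and data made up for illustration. It assumes scales is installed, and it relies on quantile() accepting na.rm = TRUE and on squish() leaving non-finite values such as NA untouched by default.

```r
library(scales)

# Hypothetical helper (not part of scales): clamp a numeric vector
# to its own lower and upper percentiles.
cap_outliers <- function(x, probs = c(0.05, 0.95), na.rm = TRUE) {
  bounds <- quantile(x, probs = probs, na.rm = na.rm, names = FALSE)
  squish(x, bounds)
}

# Made-up data: one planted outlier at each end of column a,
# a missing value and one large outlier in column b
df <- data.frame(
  a = c(-50, 1:18, 50),
  b = c(NA, seq(0.1, 1.8, by = 0.1), 100)
)

# Apply the helper to every column while keeping the data frame structure
df[] <- lapply(df, cap_outliers)
df
```

Setting na.rm = TRUE matters here because quantile() stops with an error when the input contains missing values and na.rm is FALSE.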

Updated on: 08-Feb-2021
