How to remove rows from an R data frame based on the frequency of values in a grouping column?
To remove rows from an R data frame based on how often each value occurs in a grouping column, we can follow these steps:
- First, create a data frame.
- Then, group the rows with group_by and keep only the groups whose row count meets the desired threshold with filter and n(), all from the dplyr package.
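The two steps can be wrapped in a small helper. This is only a sketch: the function name `drop_rare_groups`, its `min_n` parameter, and the toy data frame are hypothetical and not part of the original example, and it assumes the grouping column is called `Group`.

```r
library(dplyr)

# Hypothetical helper: keep only rows whose Group value occurs
# at least `min_n` times in the data frame.
drop_rare_groups <- function(data, min_n) {
  data %>%
    group_by(Group) %>%        # one group per distinct Group value
    filter(n() >= min_n) %>%   # n() is the number of rows in the current group
    ungroup()                  # return an ungrouped tibble
}

# Toy data (an assumption for illustration): "A" occurs 3 times, "B" once
toy <- data.frame(Group = c("A", "A", "A", "B"), Value = 1:4)
kept <- drop_rare_groups(toy, min_n = 2)
print(kept)
```

Here the single "B" row is dropped because its group has fewer than 2 rows, while all three "A" rows are kept.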
Create the data frame
Let's create a data frame as shown below:
> Group<-sample(c("I","II","III","IV"),20,replace=TRUE)
> Rank<-sample(1:10,20,replace=TRUE)
> df<-data.frame(Group,Rank)
> df
On executing, the above script generates the output below (this output will vary on your system due to randomization):
   Group Rank
1     IV    7
2      I    8
3     IV    2
4      I    9
5    III    9
6     IV    5
7     II    8
8    III    2
9    III    3
10     I    6
11    II    3
12    II    1
13    IV    7
14   III    4
15   III    5
16    IV    3
17    II    2
18   III    8
19     I    5
20   III    4
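Before filtering, it can help to see how often each group occurs. The sketch below rebuilds the data frame from the values printed above (typed in by hand rather than sampled, so the result is reproducible) and counts the rows per group with base R's `table()`.

```r
# Data frame with the same values as the printed output above
Group <- c("IV", "I", "IV", "I", "III", "IV", "II", "III", "III", "I",
           "II", "II", "IV", "III", "III", "IV", "II", "III", "I", "III")
Rank <- c(7, 8, 2, 9, 9, 5, 8, 2, 3, 6, 3, 1, 7, 4, 5, 3, 2, 8, 5, 4)
df <- data.frame(Group, Rank)

# Count how many rows each group has
counts <- table(df$Group)
print(counts)
# Counts: I = 4, II = 4, III = 7, IV = 5
```

Only groups III and IV have more than 4 rows, so a frequency threshold of "more than 4" would keep those two groups and drop I and II.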
Removing rows from data frame based on frequencies in grouping column
Load the dplyr package and keep only the rows of df whose Group value occurs more than 4 times; rows belonging to less frequent groups are removed:
> Group<-sample(c("I","II","III","IV"),20,replace=TRUE)
> Rank<-sample(1:10,20,replace=TRUE)
> df<-data.frame(Group,Rank)
> library(dplyr)
> df %>% group_by(Group) %>% filter(n()>4)
# A tibble: 12 x 2
# Groups:   Group [2]
   Group  Rank
   <chr> <int>
 1 IV        7
 2 IV        2
 3 III       9
 4 IV        5
 5 III       2
 6 III       3
 7 IV        7
 8 III       4
 9 III       5
10 IV        3
11 III       8
12 III       4
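The same filtering can also be done without dplyr. The following is a minimal base R sketch, not the method used above: it rebuilds the example data by hand, counts the groups with `table()`, and subsets with `%in%`. The variable names `counts`, `frequent`, and `result` are chosen here for illustration.

```r
# Data frame with the same values as the example above
Group <- c("IV", "I", "IV", "I", "III", "IV", "II", "III", "III", "I",
           "II", "II", "IV", "III", "III", "IV", "II", "III", "I", "III")
Rank <- c(7, 8, 2, 9, 9, 5, 8, 2, 3, 6, 3, 1, 7, 4, 5, 3, 2, 8, 5, 4)
df <- data.frame(Group, Rank)

# Frequency of each group, then the names of groups with more than 4 rows
counts <- table(df$Group)
frequent <- names(counts)[counts > 4]

# Keep only rows whose group is among the frequent ones
result <- df[df$Group %in% frequent, ]
print(result)
```

Unlike the dplyr version, this returns a plain data frame and keeps the original row names, which makes it easy to see which rows were dropped.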