How to find the number of unique values of multiple categorical columns based on one categorical column in R?
To find the number of unique values in several categorical columns for each level of another categorical column, follow these steps:
- Create a data frame.
- Use the summarise_each() function together with n_distinct() from the dplyr package to count the unique values within each level of the grouping column.
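To make the idea concrete, here is a minimal sketch on a hypothetical toy data frame. The names toy, g, and v are illustrative only and not part of the original example, and the sketch assumes the dplyr package is installed.

```r
library(dplyr)

# Hypothetical toy data: 'g' is the grouping column, 'v' the column to count
toy <- data.frame(g = c("a", "a", "a", "b", "b"),
                  v = c("X", "Y", "X", "Z", "Z"))

# n_distinct() counts the unique values of a vector
n_distinct(toy$v)   # 3 unique values overall: X, Y, Z

# Applied per group: "a" holds X and Y (2 unique), "b" holds only Z (1 unique)
toy_counts <- toy %>%
  group_by(g) %>%
  summarise(v = n_distinct(v))

toy_counts
```

The grouped result has one row per level of g, which is the same shape the full example produces for x.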
Create the data frame
Let's create a data frame with one grouping column (x) and two categorical columns (C1 and C2):
    x <- sample(c("First", "Second", "Third", "Fourth", "Fifth", "Sixth", "Seventh", "Eighth", "Nineth", "Tenth"), 25, replace = TRUE)
    C1 <- sample(LETTERS[1:4], 25, replace = TRUE)
    C2 <- sample(letters[1:4], 25, replace = TRUE)
    df <- data.frame(x, C1, C2)
    df
Running the script produces output like the following (your values will differ because sample() draws at random):
             x C1 C2
    1  Seventh  B  a
    2    Third  C  c
    3   Nineth  A  a
    4    Third  D  c
    5  Seventh  D  d
    6   Fourth  A  c
    7  Seventh  B  a
    8    Third  D  a
    9  Seventh  D  c
    10   First  A  a
    11  Eighth  D  d
    12   Tenth  C  b
    13   Fifth  A  c
    14  Second  A  c
    15  Fourth  B  d
    16  Nineth  C  b
    17   Fifth  D  a
    18   First  A  a
    19   Tenth  B  a
    20  Nineth  A  b
    21   Third  B  b
    22   Tenth  A  a
    23   Fifth  A  a
    24   Sixth  D  b
    25   First  A  c
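Because sample() draws at random, the data frame changes on every run. If a repeatable example is wanted, one option is to fix the random seed before sampling. The sketch below assumes an arbitrary seed value of 123 and a helper vector named levels_x, neither of which comes from the original script.

```r
# Fix the seed so sample() returns the same draws on every run
# (123 is an arbitrary, assumed value; any integer works)
set.seed(123)

# Assumed helper vector holding the possible values of x
levels_x <- c("First", "Second", "Third", "Fourth", "Fifth",
              "Sixth", "Seventh", "Eighth", "Nineth", "Tenth")

x  <- sample(levels_x, 25, replace = TRUE)
C1 <- sample(LETTERS[1:4], 25, replace = TRUE)
C2 <- sample(letters[1:4], 25, replace = TRUE)
df <- data.frame(x, C1, C2)

# Quick structural checks: number of rows/columns and column names
dim(df)
names(df)
```

With the seed fixed, rerunning the whole script in a fresh session reproduces the same data frame, which makes the grouped counts easier to compare between runs.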
Find the number of unique values based on a categorical column
Use the n_distinct() and summarise_each() functions of the dplyr package to count the unique values in C1 and C2 for each value of x:
    x <- sample(c("First", "Second", "Third", "Fourth", "Fifth", "Sixth", "Seventh", "Eighth", "Nineth", "Tenth"), 25, replace = TRUE)
    C1 <- sample(LETTERS[1:4], 25, replace = TRUE)
    C2 <- sample(letters[1:4], 25, replace = TRUE)
    df <- data.frame(x, C1, C2)
    library(dplyr)
    df %>% group_by(x) %>% summarise_each(funs(n_distinct(.)))
Output
    # A tibble: 10 x 3
       x          C1    C2
       <chr>   <int> <int>
     1 Eighth      1     1
     2 Fifth       2     2
     3 First       1     2
     4 Fourth      2     2
     5 Nineth      2     2
     6 Second      1     1
     7 Seventh     2     3
     8 Sixth       1     1
     9 Tenth       3     2
    10 Third       3     3
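Recent dplyr releases mark summarise_each() and funs() as deprecated in favour of summarise() combined with across(). The sketch below shows one way the same counts might be obtained with that interface. It assumes dplyr 1.0.0 or later and, so that the result can be compared with the output above, rebuilds df by hand from the rows printed earlier instead of sampling again. The name result is illustrative.

```r
library(dplyr)

# Data copied from the data frame printed earlier in this article
df <- data.frame(
  x  = c("Seventh", "Third", "Nineth", "Third", "Seventh", "Fourth", "Seventh",
         "Third", "Seventh", "First", "Eighth", "Tenth", "Fifth", "Second",
         "Fourth", "Nineth", "Fifth", "First", "Tenth", "Nineth", "Third",
         "Tenth", "Fifth", "Sixth", "First"),
  C1 = c("B", "C", "A", "D", "D", "A", "B", "D", "D", "A", "D", "C", "A", "A",
         "B", "C", "D", "A", "B", "A", "B", "A", "A", "D", "A"),
  C2 = c("a", "c", "a", "c", "d", "c", "a", "a", "c", "a", "d", "b", "c", "c",
         "d", "b", "a", "a", "a", "b", "b", "a", "a", "b", "c")
)

# Group by x, then apply n_distinct() to each of the listed columns
result <- df %>%
  group_by(x) %>%
  summarise(across(c(C1, C2), n_distinct))

result
```

Naming the columns explicitly in across() keeps the call readable when the data frame contains other columns that should not be counted; a selection helper such as everything() could be used instead if every non-grouping column should be summarised.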