How to find the two factor interaction variables in an R data frame?
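If we have a data frame called df that contains four columns, say x, y, z, and a, then the two-factor interaction columns are xy, xz, xa, yz, ya, and za. There is one interaction for each unordered pair of columns, so a data frame with n columns has choose(n, 2) = n(n-1)/2 of them. To list the two-factor interaction variables that can be created from a data frame's columns, we can apply the combn function to the column names, as shown in the examples below.
Consider the following data frame (the values are drawn at random, so your output will differ) −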
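The four-column case described above can be checked directly. The following is a minimal sketch: the data frame values are hypothetical and chosen only for illustration, and an empty collapse string is used so the names match the xy, xz, ... form used in the text.

```r
# Hypothetical four-column data frame matching the x, y, z, a example
df <- data.frame(x = 1:3, y = 4:6, z = 7:9, a = 10:12)

# All unordered pairs of column names, each pair pasted into one string
pairs <- combn(colnames(df), 2, FUN = paste, collapse = "")
print(pairs)

# The number of pairs equals choose(n, 2) for n columns
print(length(pairs))
print(choose(ncol(df), 2))
```

Here combn passes each pair of names to paste, and the collapse argument is forwarded to paste through combn's extra arguments.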
Example
x1<-rpois(20,2)
x2<-rpois(20,2)
x3<-rpois(20,1)
x4<-rpois(20,2)
x5<-rpois(20,5)
x6<-rpois(20,2)
df1<-data.frame(x1,x2,x3,x4,x5,x6)
df1
Output
   x1 x2 x3 x4 x5 x6
1   3  1  1  2  5  0
2   1  2  3  4  6  0
3   3  2  1  4  5  1
4   1  2  0  2  3  3
5   0  0  2  1  4  3
6   4  1  0  8  3  0
7   3  2  1  0  8  3
8   3  2  1  2  6  3
9   4  4  0  1  5  0
10  1  1  1  3  3  2
11  3  2  0  4  3  1
12  0  0  2  1  4  2
13  4  4  0  2  3  3
14  2  3  0  3  3  1
15  1  4  3  1  8  2
16  2  3  1  1  4  2
17  2  3  0  2  4  3
18  2  5  1  1 10  3
19  0  2  0  1  9  3
20  0  3  0  1  4  2
Finding two factor interaction variables in df1 −
combn(colnames(df1),2,FUN=paste,collapse='_')
 [1] "x1_x2" "x1_x3" "x1_x4" "x1_x5" "x1_x6" "x2_x3" "x2_x4" "x2_x5" "x2_x6"
[10] "x3_x4" "x3_x5" "x3_x6" "x4_x5" "x4_x6" "x5_x6"
Example
y1<-round(rnorm(20),2)
y2<-round(rnorm(20),2)
y3<-round(rnorm(20),2)
y4<-round(rnorm(20),2)
y5<-round(rnorm(20),2)
y6<-round(rnorm(20),2)
df2<-data.frame(y1,y2,y3,y4,y5,y6)
df2
Output
      y1    y2    y3    y4    y5    y6
1   0.37 -0.25 -2.60  1.56 -0.64 -0.80
2   0.68  0.65  2.06 -0.54  0.16 -0.22
3   0.51 -0.37  0.16 -2.23 -0.42  0.52
4  -0.01 -0.32  1.65 -2.59  1.01 -1.86
5  -0.65 -0.56 -0.41 -0.88  0.50 -0.66
6  -0.42  0.55  0.26  0.02 -1.52 -0.34
7  -0.89 -0.91 -1.28  0.26 -1.27 -1.04
8   0.12  0.59 -0.80 -1.24  1.57 -0.53
9  -0.26 -1.09  0.65 -0.40  0.18  0.16
10 -1.10 -0.70  2.30  0.31 -0.46 -0.16
11 -0.42 -0.06 -0.76  0.45  0.28 -0.10
12 -0.07  2.08 -0.17 -0.16 -0.54  2.06
13 -0.91  0.37 -1.19 -2.44 -0.45  0.46
14  0.74  1.06  0.42  0.85 -0.12 -0.21
15  1.51  0.29 -0.14  0.28  0.76 -0.45
16  0.11 -0.66 -1.70  1.88 -1.16  1.05
17  0.49  0.44 -1.38 -0.39 -1.47 -1.12
18  0.67 -0.29  1.40  0.80 -0.25  1.23
19  0.45  1.57  1.34  1.75  0.25 -0.89
20  1.05  0.23 -0.06 -0.29  1.50  1.20
Finding two factor interaction variables in df2 −
combn(colnames(df2),2,FUN=paste,collapse='_')
 [1] "y1_y2" "y1_y3" "y1_y4" "y1_y5" "y1_y6" "y2_y3" "y2_y4" "y2_y5" "y2_y6"
[10] "y3_y4" "y3_y5" "y3_y6" "y4_y5" "y4_y6" "y5_y6"
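The calls above return only the names of the interaction variables. If the interaction columns themselves are needed, one possible approach is sketched below. It assumes that a two-factor interaction between numeric columns means their element-wise product, which is the usual convention in regression but is not something the examples above compute. The data frame and the names int_names, int_values, and df_int are hypothetical.

```r
# Small hypothetical data frame (values chosen only for illustration)
df <- data.frame(x1 = c(1, 2, 3), x2 = c(0, 1, 2), x3 = c(2, 2, 1))

# Names of all two-factor interactions, built as in the examples above
int_names <- combn(colnames(df), 2, FUN = paste, collapse = "_")

# Assumption: the interaction of two numeric columns is their product.
# combn() returns a matrix with one column per pair of names.
int_values <- combn(colnames(df), 2,
                    FUN = function(p) df[[p[1]]] * df[[p[2]]])
colnames(int_values) <- int_names

# Append the interaction columns to the original data frame
df_int <- cbind(df, int_values)
print(df_int)
```

Because both combn calls walk through the pairs in the same order, the names line up with the computed columns.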