How to find the column that has the largest sum in R?
To find the column that has the largest sum, compute the sum of every column with colSums, sort those sums in decreasing order with sort, and take the first element of the result, which is the largest sum together with its column name. For example, if we have a data frame called df that contains multiple numeric columns, then the column that has the largest sum can be found by using the command −
str(sort(colSums(df[,1:length(df)]),decreasing=TRUE)[1])
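The one-line command can be easier to follow when it is split into its steps. The sketch below does that on a small made-up data frame (the name df and its values are assumptions used only for illustration), and also shows which.max as a possible alternative that avoids sorting.

```r
# Small hypothetical data frame, used only for illustration
df <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6), c = c(0, 1, 2))

# Step 1: one sum per column, returned as a named numeric vector
sums <- colSums(df)

# Step 2: sort the sums from largest to smallest
sorted_sums <- sort(sums, decreasing = TRUE)

# Step 3: keep the first element, i.e. the largest sum with its column name
largest <- sorted_sums[1]
print(largest)

# Alternative without sorting: which.max() gives the position of the maximum,
# and names() extracts the corresponding column name
largest_name <- names(which.max(sums))
print(largest_name)
```

Here the column sums are 6, 15 and 3, so both approaches point to column b. If only the column name is needed, the which.max form is usually enough; the sort form is handy when the full ranking of columns is also of interest.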
Example 1
Consider the below data frame −
> x1<-rpois(20,5)
> x2<-rpois(20,5)
> x3<-rpois(20,5)
> x4<-rpois(20,5)
> df1<-data.frame(x1,x2,x3,x4)
> df1
Output
   x1 x2 x3 x4
1   3  4  4  5
2   6 10  3  3
3   6  5  2  5
4   7  6  2 13
5   4  7  7  3
6   2  4  3  4
7   5  7  2  2
8   1  2  8  3
9  10  1  3  2
10  6  4  8  5
11  6  7  2  2
12  6  3  4  6
13  8  6  8  5
14  4  6  1  6
15  3  1  7 10
16  4  3  6  8
17  1  1  8  8
18  6  6  5  6
19  7  3  2  6
20  6  6  4  5
Finding the column that has the largest sum in df1 −
> str(sort(colSums(df1[,1:length(df1)]),decreasing=TRUE)[1])
Output
 Named num 107
 - attr(*, "names")= chr "x4"
Example 2
> y1<-rnorm(20)
> y2<-rnorm(20)
> y3<-rnorm(20)
> df2<-data.frame(y1,y2,y3)
> df2
Output
            y1          y2          y3
1  -0.67247167 -0.03504090 -0.66697231
2  -0.68074045 -0.25805863  0.84996560
3   0.69900478 -1.88632900 -0.72983709
4  -1.18607010  1.41421023  1.13006070
5  -0.32133261 -0.63577768 -0.11396980
6  -1.32619037  0.61646926  0.89315793
7   0.01712191 -1.07839179 -0.34707437
8   0.16517472 -0.80356200  0.37064564
9   2.52589496 -0.37596219 -0.36734004
10 -0.14817698 -0.11656378 -2.23320356
11 -0.53926289  0.21150137 -0.20352309
12  0.22330625  0.04340639  0.50600645
13 -0.82293233  0.22586452 -0.82058059
14 -0.38483674 -0.38651706 -1.33218404
15 -0.33143327 -0.12833993 -0.33432244
16  0.40020483 -0.58673910 -0.51292024
17 -2.66155329 -0.66032907 -0.98167877
18 -1.49012484  0.91082996 -0.68865703
19 -2.17102582  1.49218359 -0.03119144
20 -0.28752746 -0.27363896 -0.59666780
Finding the column that has the largest sum in df2 −
> str(sort(colSums(df2[,1:length(df2)]),decreasing=TRUE)[1])
Output
 Named num -2.31
 - attr(*, "names")= chr "y2"
Example 3
> z1<-rnorm(20,5,1.2)
> z2<-rnorm(20,5,1.2)
> z3<-rnorm(20,5,1.2)
> df3<-data.frame(z1,z2,z3)
> df3
Output
         z1       z2       z3
1  4.195753 5.237520 4.718239
2  5.406601 5.467189 5.656534
3  4.107268 4.206512 5.002071
4  4.273912 3.318249 3.851186
5  5.658334 4.044090 5.726887
6  5.794366 6.746781 5.573617
7  5.858288 6.643365 3.670364
8  5.996933 3.587626 3.603394
9  4.828025 5.512565 7.352176
10 5.232532 6.235726 2.827798
11 1.632488 6.318988 5.206436
12 4.033981 7.281025 5.996814
13 4.611700 6.482257 2.515667
14 5.551795 4.824941 4.938571
15 7.026488 5.153775 3.043448
16 4.917164 6.888027 6.673310
17 5.164733 5.986679 4.329136
18 5.114344 2.379626 6.442586
19 5.254078 5.369151 4.240947
20 7.874268 5.076189 7.012805
Finding the column that has the largest sum in df3 −
> str(sort(colSums(df3[,1:length(df3)]),decreasing=TRUE)[1])
Output
 Named num 107
 - attr(*, "names")= chr "z2"
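All three examples use data frames whose columns are entirely numeric and complete. As a rough sketch of how the same idea might be wrapped for less tidy data, the helper below keeps only numeric columns and ignores missing values before summing. The function name largest_sum_column, the data frame df_mixed and its values are hypothetical and not part of the original examples.

```r
# Hypothetical helper: returns the largest column sum, named by its column
largest_sum_column <- function(df, na.rm = TRUE) {
  # Keep numeric columns only; colSums() fails on character or factor columns
  numeric_cols <- df[, sapply(df, is.numeric), drop = FALSE]
  # One sum per remaining column, skipping NA values when na.rm is TRUE
  sums <- colSums(numeric_cols, na.rm = na.rm)
  # which.max() picks the first maximum if several columns tie
  sums[which.max(sums)]
}

# Made-up data with a character column and a missing value
df_mixed <- data.frame(
  id = c("r1", "r2", "r3"),
  p = c(1.5, NA, 2.5),
  q = c(1, 1, 1)
)
result <- largest_sum_column(df_mixed)
print(result)
```

With na.rm = TRUE the sums are 4 for p and 3 for q, so the helper returns p. Whether missing values should be dropped or should make the sum NA depends on the analysis, which is why it is exposed as an argument here.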