Articles by Nizamuddin Siddiqui
How to remove rows from data frame in R based on grouping value of a particular column?
If an R data frame has a grouping column and one of the group values is not useful for our analysis, we might want to remove all the rows that contain that value before proceeding. The same applies when one of the values is repeated and we want to get rid of it. In this situation, we can subset the data frame using negation and single square brackets.
Example
set.seed(1212) x
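A minimal sketch of the negation-based subsetting described above. The data frame `df` and its columns `group` and `score` are hypothetical and are not taken from the original example.

```r
# Hypothetical data: 'group' is the grouping column, 'score' holds the values
df <- data.frame(
  group = c("A", "B", "C", "A", "B", "C"),
  score = c(10, 20, 30, 40, 50, 60)
)

# Keep every row whose group is not "C" (negated comparison inside single brackets)
df_no_c <- df[df$group != "C", ]

# To drop several group values at once, negate %in%
df_only_a <- df[!(df$group %in% c("B", "C")), ]
```

If the grouping column can contain NA, the `%in%` form is the safer choice, because `!=` produces NA for those rows and single-bracket indexing then returns rows filled with NA.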
How to remove everything before values starting after underscore from column values of an R data frame?
If a column in an R data frame contains string values that share a common part separated from the rest by an underscore, the common part lengthens every value without adding information. In that case it is wise to remove the common part together with the underscore from all the values at once. This makes the data easier to read and simplifies the analysis. For this purpose, we can use the gsub function.
Consider the below data frame:
Example
set.seed(191) ID
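The gsub approach can be sketched as follows. The column values here are hypothetical, and the pattern `.*_` is one possible choice, not necessarily the one used in the original article.

```r
# Hypothetical data: every ID carries the common prefix "ID_"
df <- data.frame(ID = c("ID_101", "ID_102", "ID_103"),
                 stringsAsFactors = FALSE)

# ".*_" matches everything up to and including the last underscore;
# replacing the match with "" leaves only the part after it
df$ID_clean <- gsub(".*_", "", df$ID)
```

Because `.*` is greedy, a value with several underscores keeps only the text after the last one. To cut at the first underscore instead, `sub("^[^_]*_", "", df$ID)` is an alternative.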
How to perform fisher test in R?
Fisher's exact test helps us to understand whether there is a significant non-random relationship between categorical variables. It is applied to contingency tables because these tables represent the frequencies for categorical variables; since a matrix has the same form, we can apply it to a matrix as well. In R, we can use the fisher.test function to perform the test.
Example
M1
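A small sketch of calling fisher.test on a matrix. The counts and the dimension names below are made up for illustration only.

```r
# Hypothetical 2x2 table of counts, stored as a matrix
M1 <- matrix(c(3, 1, 1, 3), nrow = 2,
             dimnames = list(Guess = c("Milk", "Tea"),
                             Truth = c("Milk", "Tea")))

# Fisher's exact test of independence between rows and columns
result <- fisher.test(M1)

# The p-value is stored in the returned "htest" object
p_value <- result$p.value
```

A small p-value would suggest that the row and column classifications are not independent. Printing `result` also shows the odds ratio estimate and its confidence interval for a 2x2 table.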
How to find the groupwise mean and save it in a data frame object in R?
We often need groupwise means in data analysis, especially where analysis of variance techniques are used, because these techniques help us to compare different groups based on their measures of central tendency and measures of variation. Groupwise means can be found with the aggregate function, whose output can be saved in a data frame object. In the below examples, we can see how it can be done and also check the final object type.
Example
Consider the below data frame:
set.seed(109) Salary
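A minimal sketch of the aggregate approach. The `Group` column and the salary figures are hypothetical; only the name `Salary` appears in the original excerpt.

```r
# Hypothetical data: two groups with two salaries each
df <- data.frame(
  Group  = c("A", "A", "B", "B"),
  Salary = c(100, 200, 300, 500)
)

# Mean of Salary within each level of Group; the result is a data frame
group_means <- aggregate(Salary ~ Group, data = df, FUN = mean)

# Check the type of the returned object
result_class <- class(group_means)
```

The formula interface keeps the column names in the output, so `group_means` has one row per group with columns `Group` and `Salary`.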
How to change the size of correlation coefficient value in correlation matrix plot using corrplot in R?
The size of the correlation coefficient values in a correlation matrix plot created with the corrplot function is controlled by a scaling factor whose default is 1; values below 1 make the numbers smaller. To change this size, we need to use the number.cex argument. For example, if we want to decrease the size to half then we can use number.cex = 0.5.
Example
Consider the below matrix:
set.seed(99) M
corrplot(cor(M), addCoef.col="black")
Changing the size of the correlation coefficient value to 0.75:
corrplot(cor(M), addCoef.col="black", number.cex=0.75)
Changing the size of the correlation coefficient value to 0.30:
corrplot(cor(M), addCoef.col="black", number.cex=0.30)
How to create a data frame with combinations of values in R?
Suppose we have two values, 0 and 1, and want every ordered pair of them. There are 4 such combinations: (0,0), (1,0), (0,1), (1,1). In R, we can use the expand.grid function to create these combinations. Its result is already a data frame, so it can be saved directly; wrapping it in as.data.frame is optional.
Example
df1
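A short sketch of expand.grid with the values 0 and 1. The column names `x`, `y` and `z` and the object `df2` are hypothetical.

```r
# All ordered pairs of 0 and 1: 2 * 2 = 4 rows
df1 <- expand.grid(x = c(0, 1), y = c(0, 1))

# Adding a third column gives 2 * 2 * 2 = 8 rows
df2 <- expand.grid(x = c(0, 1), y = c(0, 1), z = c(0, 1))

# expand.grid returns a data frame directly
is_df <- is.data.frame(df1)
```

The first column varies fastest, so the rows of `df1` come out in the order (0,0), (1,0), (0,1), (1,1).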
How to create a contingency table with sum on the margins from an R data frame?
The row and column sums on the margins of a contingency table are useful because they feed into calculations such as odds ratios and probabilities. If an R data frame has factor columns, we can create a contingency table for it and add the marginal sums by using the addmargins function.
Example
Consider the below data frame:
x1
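A minimal sketch of addmargins applied to a table built from two factor columns. The factor levels and values are made up for illustration.

```r
# Hypothetical data frame with two factor columns
df <- data.frame(
  x1 = factor(c("A", "A", "B", "B", "B")),
  x2 = factor(c("Yes", "No", "Yes", "Yes", "No"))
)

# Cross-tabulate the two factors
tab <- table(df$x1, df$x2)

# Append a "Sum" row and a "Sum" column holding the marginal totals
tab_margins <- addmargins(tab)
```

The bottom-right cell of `tab_margins` holds the grand total, which equals the number of rows in `df`.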
How to create a sequence of dates by using starting date in R?
The seq function is a convenient way to create regular sequences, and this also applies to sequences of dates. For dates, the starting value must be stored in date format (for example with as.Date) so that R can recognize the input type and create the appropriate vector. If the date is left as a plain character string, seq cannot interpret it and returns an error.
Examples
x1
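A small sketch of building date sequences from a starting date. The starting date and the object names are arbitrary choices for illustration.

```r
# Convert the starting date to the Date class first
start <- as.Date("2020-01-01")

# Five consecutive days beginning at the starting date
daily <- seq(start, by = "day", length.out = 5)

# Three dates spaced one month apart
monthly <- seq(start, by = "month", length.out = 3)
```

The `by` argument also accepts "week" and "year", as well as forms such as "2 weeks".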
How to find the absolute pairwise difference among values of a vector in R?
If a vector contains five values then there will be ten pairwise differences. For example, suppose we have the five numbers 1 to 5; the pairwise combinations of these values are (1,2), (1,3), (1,4), (1,5), (2,3), (2,4), (2,5), (3,4), (3,5), (4,5). To find the absolute pairwise differences, we need to take the difference within each of these combinations and then its absolute value, hence the result will be 1, 2, 3, 4, 1, 2, 3, 1, 2, 1.
Example
x1
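The pairwise calculation above can be sketched with combn, which is one possible approach and not necessarily the one used in the original article.

```r
# The five values from the description
x1 <- 1:5

# combn enumerates every pair; FUN is applied to each pair in turn
abs_diffs <- combn(x1, 2, FUN = function(pair) abs(diff(pair)))

# Number of pairs: choose(5, 2) = 10
n_pairs <- length(abs_diffs)
```

The pairs are visited in the order (1,2), (1,3), ..., (4,5), so `abs_diffs` follows the same order as the list in the text. `as.vector(dist(x1))` is a shorter alternative for numeric vectors.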
How to display R-squared value on scatterplot with regression model line in R?
The R-squared value is the coefficient of determination. It gives us the proportion (or percentage) of variation in the dependent variable that is explained by the independent variable. To display this value on a scatterplot with the regression model line without help from any package, we can use the plot function together with the abline and legend functions.
Consider the below data frame:
Example
set.seed(1234) x
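A minimal base R sketch of the plot, abline and legend combination. The simulated data, the seed and the legend position are assumptions made for illustration.

```r
# Hypothetical data: y depends linearly on x plus random noise
set.seed(1)
x <- 1:20
y <- 2 * x + rnorm(20)

# Fit the simple linear regression and extract R-squared
model <- lm(y ~ x)
r2 <- summary(model)$r.squared

# Scatterplot, fitted line, and R-squared shown in a borderless legend
plot(x, y, main = "Scatterplot with regression line")
abline(model)
legend("topleft", legend = paste("R-squared =", round(r2, 3)), bty = "n")
```

Setting `bty = "n"` removes the legend box, so the value reads like a plain annotation on the plot.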